Gary Wu

Automating Slate.js Editors from a Chrome Extension

Slate.js editors don’t respond to standard browser automation. execCommand, dispatchEvent, direct DOM writes — none of it works the way you’d expect. The text appears in the DOM but the host application reads empty state on submit. This article documents every approach that failed, why each one failed, and the one technique that reliably gets text into Slate’s internal model from a Chrome MV3 extension service worker.

The Problem

You’re building a browser automation platform. Your Chrome MV3 extension receives JSON step programs from a remote server, executes them against web pages via content scripts, and reports results back. It works perfectly for standard inputs, standard buttons, standard forms.

Then you encounter a Slate.js rich text editor.

Slate is not a <textarea>. It’s not even a regular contenteditable div. It’s a React-managed DOM where the visible elements are a projection of an internal data model. The critical constraint: the host application reads Slate’s internal model on submit, not the DOM. If your text is in the DOM but not in the model, the application sees an empty editor.

This creates a uniquely hard automation problem. With a regular <input>, you can set .value and dispatch a change event. With a regular contenteditable, you can use execCommand('insertText'). With Slate, you need to go through Slate’s own event processing pipeline so that:

  1. The editor’s internal model (a tree of Node objects) is updated
  2. The onChange callback fires
  3. The host application’s React state is updated in the same render cycle

Miss any one of these steps and the submit button reads stale state.

What makes this worse in a Chrome extension

Chrome MV3 extensions run content scripts in an isolated world — a separate JavaScript context from the page. You share the DOM, but you don’t share the window object, constructors, or event listeners. An InputEvent created in the isolated world is literally a different class than the InputEvent the page’s JavaScript knows about.

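React’s event delegation, where all event handlers are registered on the root element, runs in the page’s JavaScript context. Because the DOM is shared, events dispatched from the isolated world are still delivered to those listeners, but in the tests described below they never triggered Slate’s handlers.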


Background: Chrome Extension World Isolation

Chrome extensions have two JavaScript execution environments for interacting with page content. Understanding the distinction is the single most important concept in this article.

Isolated World (default for content scripts)

┌─────────────────────────────────────────────────┐
│                  Web Page                         │
│                                                   │
│  ┌─────────────┐        ┌──────────────────────┐ │
│  │ Page JS      │        │ Content Script       │ │
│  │              │        │ (Isolated World)     │ │
│  │ window.*     │        │                      │ │
│  │ React        │        │ window.* (different) │ │
│  │ Slate editor │        │ No React access      │ │
│  │              │        │ No Slate access      │ │
│  └──────┬───────┘        └──────────┬───────────┘ │
│         │                           │              │
│         └─────────┬─────────────────┘              │
│                   │                                │
│            ┌──────▼──────┐                         │
│            │   Shared    │                         │
│            │    DOM      │                         │
│            └─────────────┘                         │
└─────────────────────────────────────────────────────┘

Your content script can read and write the DOM: it can call querySelector and click(), and it can read textContent. But it cannot reach the page’s window object, React, or the Slate editor instance.

MAIN World (via chrome.scripting.executeScript)

// From the service worker (background.js):
chrome.scripting.executeScript({
  target: { tabId },
  world: "MAIN",       // <-- runs in page's JS context
  func: (value) => {
    // This code has full access to the page's JavaScript:
    // - window.* is the page's window
    // - React fiber trees are accessible
    // - Event constructors are the same ones React listens for
    // - Slate's editor instance is reachable
  },
  args: ["hello world"],
});

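MAIN world scripts run as if they were part of the page. Events created here are real page events. React processes them normally. At runtime, this is done with chrome.scripting.executeScript, which extension contexts such as the service worker can call but content scripts cannot. (A content script can also be declared with "world": "MAIN" in the manifest on Chrome 111 and later, but it then has no access to extension APIs.)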

Key insight: The isolated world vs MAIN world distinction is invisible when automating simple HTML forms. It only becomes critical when the target uses a JavaScript framework (React, Vue, Angular) that manages its own event system. Slate.js is the hardest case because it combines React’s synthetic events with a custom data model that diverges from the DOM.

Tradeoffs

|                | Isolated World                       | MAIN World                                              |
|----------------|--------------------------------------|---------------------------------------------------------|
| Access         | DOM only                             | DOM + page JS                                           |
| Chrome APIs    | Full (chrome.runtime, etc.)          | None                                                    |
| Security       | Extension code is protected          | Page can see/interfere with your code                   |
| Event dispatch | Events invisible to React            | Events processed normally                               |
| Invocation     | Content script runs automatically    | Must use chrome.scripting.executeScript from SW         |
| Communication  | Direct chrome.runtime.sendMessage    | Must use DOM events or postMessage to talk to extension |
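
The "Communication" row can be illustrated with a small sketch. MAIN-world code has no chrome.* APIs, so anything it wants to report later has to cross the shared window. The tag name `SLATE_BRIDGE` and both function names are hypothetical, chosen only for this example; they are not part of any Chrome API.

```javascript
// Hypothetical tag so the content script can ignore unrelated page messages.
const BRIDGE_TAG = "SLATE_BRIDGE";

// MAIN world side: wrap a result in a tagged envelope.
function makeBridgeMessage(payload) {
  return { tag: BRIDGE_TAG, payload };
}

// Isolated world side: accept only messages posted by this window that carry the tag.
// Returns the payload, or null if the message should be ignored.
function readBridgeMessage(event, ownWindow) {
  if (event.source !== ownWindow) return null;
  const data = event.data;
  if (!data || typeof data !== "object" || data.tag !== BRIDGE_TAG) return null;
  return data.payload;
}

// Wiring (browser only, shown as comments so the helpers stay runnable anywhere):
//   MAIN world:
//     window.postMessage(makeBridgeMessage({ ok: true }), window.location.origin);
//   Content script:
//     window.addEventListener("message", (e) => {
//       const payload = readBridgeMessage(e, window);
//       if (payload) chrome.runtime.sendMessage(payload);
//     });
```

Any script on the page can post a message with the same tag, so the payload should be treated as untrusted input. When the MAIN-world function is invoked through chrome.scripting.executeScript, its return value already reaches the caller, so a bridge like this is only needed for messages sent after that call has returned.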

Background: Slate.js DOM Architecture

Before examining why each approach fails, you need to understand how Slate renders to the DOM.

The empty state

When a Slate editor is empty, the DOM looks like this:

<div data-slate-editor="true" contenteditable="true" role="textbox">
  <div data-slate-node="element">
    <span data-slate-node="text">
      <span data-slate-zero-width="z" data-slate-length="0">
        &#xFEFF;         <!-- zero-width no-break space -->
      </span>
    </span>
  </div>
  <div data-slate-placeholder="true"
       contenteditable="false"
       style="position: absolute; pointer-events: none; ...">
    What do you want to create?   <!-- placeholder text -->
  </div>
</div>

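Key elements: the root div marked data-slate-editor="true" is the contenteditable surface; data-slate-node="element" and data-slate-node="text" wrap each element and text node of the model; the data-slate-zero-width span marks an empty text node and holds only a zero-width no-break space; and the data-slate-placeholder div is a non-editable, absolutely positioned overlay that carries the placeholder text.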

After text is inserted

When text exists in the editor:

<div data-slate-editor="true" contenteditable="true" role="textbox">
  <div data-slate-node="element">
    <span data-slate-node="text">
      <span data-slate-string="true">
        A cat wearing a tiny hat   <!-- actual content -->
      </span>
    </span>
  </div>
  <!-- placeholder div is now hidden via style -->
</div>

The critical difference: data-slate-string="true" replaces data-slate-zero-width. The placeholder div is hidden but still present in the DOM.

The model behind the DOM

Slate’s actual data model is a tree of Node objects:

// Simplified Slate model
interface Editor {
  children: Node[];
  selection: Range | null;
  onChange: () => void;
}

interface Element {
  type: string;
  children: Node[];
}

interface Text {
  text: string;
  // ...marks (bold, italic, etc.)
}

The DOM is a projection of this model. When Slate processes a beforeinput event, it:

  1. Reads the inputType and data from the event
  2. Calls the appropriate transform on the model (e.g., Editor.insertText(editor, data))
  3. Triggers onChange
  4. React re-renders the DOM from the new model state

If you modify the DOM directly without going through this pipeline, the model and DOM diverge. Slate’s model still says “empty” while the DOM shows your text. The host application reads the model.
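
The divergence can be reproduced with a toy model. This is a deliberately tiny stand-in, not Slate's real implementation: `createToyEditor` and its methods are hypothetical names, and the "DOM" is just a string.

```javascript
// Toy stand-in for an editor model; NOT Slate's real implementation.
function createToyEditor(onChange) {
  const editor = {
    children: [{ type: "paragraph", children: [{ text: "" }] }],
    // Mirrors steps 2 and 3 above: transform the model, then notify.
    insertText(data) {
      editor.children[0].children[0].text += data;
      onChange();
    },
    // What a host application would read on submit.
    string() {
      return editor.children
        .map((el) => el.children.map((t) => t.text).join(""))
        .join("\n");
    },
  };
  return editor;
}

let changes = 0;
const toy = createToyEditor(() => { changes += 1; });

// A direct "DOM" write: the rendered text changes, the model does not.
const renderedDom = "A cat wearing a tiny hat";
const afterDomWrite = toy.string(); // still the empty string

// Going through the pipeline updates the model and fires onChange once.
toy.insertText("A cat wearing a tiny hat");
const afterPipeline = toy.string();
```

After the direct write, `renderedDom` holds the text but `afterDomWrite` is empty and `changes` is still 0, which is the state the host application sees on submit.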


What Failed (and Why)

Each of these approaches was tested against a production Slate.js editor (Google Flow, which uses Slate for its prompt input). Each one fails for a specific, instructive reason.

Attempt 1: execCommand('insertText') from Isolated World

// In content.js (isolated world)
function setSlateText(el, text) {
  el.focus();
  document.execCommand("selectAll", false);
  document.execCommand("delete", false);
  document.execCommand("insertText", false, text);
  el.dispatchEvent(new InputEvent("input", {
    bubbles: true,
    cancelable: true,
  }));
}

What happens: Text appears visually in the editor. But it renders inside the data-slate-zero-width span (the empty-state marker), not in a data-slate-string span. The Slate model is unchanged.

Why it fails: execCommand modifies the DOM directly through the browser’s built-in editing commands. These commands bypass Slate’s event pipeline entirely. Slate never sees a beforeinput event, never updates its model, never fires onChange. The host app reads the model on submit and sees an empty editor.

Key insight: execCommand is a DOM operation, not a JavaScript event. It doesn’t trigger React’s synthetic event system regardless of which world you’re in. Even in MAIN world, execCommand('insertText') does the wrong thing — see Attempt 4.
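
To tell this failure apart from a real insertion, it helps to check which span the text ended up in. The helper below is a sketch: `classifySlateInsertion` is a hypothetical name, and "model" here means only that Slate rendered the text into a [data-slate-string] span, which is the same DOM-based signal used for verification later in this article.

```javascript
// Returns "model" if the text is rendered in [data-slate-string] spans,
// "dom-only" if it only shows up inside a zero-width marker span,
// and "missing" otherwise. Works on any object exposing querySelectorAll.
function classifySlateInsertion(editorEl, text) {
  const textOf = (selector) =>
    Array.from(editorEl.querySelectorAll(selector))
      .map((n) => n.textContent)
      .join("");

  if (textOf("[data-slate-string]").includes(text)) return "model";
  if (textOf("[data-slate-zero-width]").includes(text)) return "dom-only";
  return "missing";
}
```

A "dom-only" result after Attempt 1 matches the behavior described above: the text is visible but sits in the empty-state marker.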

Attempt 2: dispatchEvent(new InputEvent('beforeinput')) from Isolated World

// In content.js (isolated world)
const el = document.querySelector("div[data-slate-editor='true']");
el.focus();
el.dispatchEvent(new InputEvent("beforeinput", {
  inputType: "insertText",
  data: "A cat wearing a tiny hat",
  bubbles: true,
  cancelable: true,
}));

What happens: Nothing. The editor state doesn’t change at all.

Why it fails: This is the Chrome extension world isolation problem at its core. The InputEvent constructor in the isolated world is a different class than the InputEvent in the page’s JavaScript context. When this event bubbles up the DOM, React’s event delegation (which runs in the page’s context) doesn’t recognize it as a real beforeinput event.

More precisely: React uses addEventListener on the root element (in the page’s context) to listen for beforeinput. The browser’s event dispatch does deliver the event — DOM events cross the world boundary. But React’s handler checks properties and behavior of the event object, and the isolated-world event object doesn’t satisfy those checks. Slate’s onDOMBeforeInput handler in the Editable component never fires.
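
One way to separate "the event was never delivered" from "the event was delivered but ignored" is to record what a page-side listener actually sees. The sketch below is a diagnostic aid under that assumption; `traceEvents` is a hypothetical helper, meant to be run in MAIN world against `document` before dispatching from the content script.

```javascript
// Records what a listener on `target` observes for the given event types.
// `target` is anything with addEventListener (the page's document in a browser).
function traceEvents(target, types) {
  const seen = [];
  const registered = types.map((type) => {
    const handler = (e) => {
      seen.push({
        type: e.type,
        isTrusted: e.isTrusted,
        inputType: e.inputType ?? null, // only present on InputEvent
        data: e.data ?? null,
      });
    };
    // Capture phase, so the trace runs before handlers deeper in the tree.
    target.addEventListener(type, handler, { capture: true });
    return [type, handler];
  });

  // Call stop() to detach all trace listeners.
  const stop = () =>
    registered.forEach(([type, handler]) =>
      target.removeEventListener(type, handler, { capture: true })
    );

  return { seen, stop };
}
```

If `seen` fills up while the editor stays unchanged, delivery is not the problem and the cause lies in how the handlers treat the event.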

Attempt 3: React Fiber Walk + editor.insertText() from MAIN World

// In chrome.scripting.executeScript({ world: "MAIN" })
const el = document.querySelector("div[data-slate-editor='true']");

// Walk the React fiber tree to find the Slate editor instance
const fiberKey = Object.keys(el).find(k => k.startsWith("__reactFiber$"));
let fiber = el[fiberKey];

// Walk up the fiber tree looking for the editor in memoizedProps or stateNode
while (fiber) {
  const editor = fiber.memoizedProps?.editor
    || fiber.stateNode?.editor
    || fiber.memoizedState?.memoizedState?.editor;
  if (editor?.insertText) {
    editor.insertText("A cat wearing a tiny hat");
    break;
  }
  fiber = fiber.return;
}

What happens: The text appears in the editor. The Slate model IS updated — editor.insertText() modifies the internal node tree correctly. Slate’s DOM rendering updates to show a data-slate-string span.

But on submit, the host application still reads the old (empty) value.

Why it fails: This is the subtlest failure. The issue is when onChange fires relative to the host app’s React state update cycle.

The host application (Google Flow, in this case) stores the prompt text in its own React state, updated via Slate’s onChange callback:

// Simplified host app code
const [prompt, setPrompt] = useState("");

<Slate
  editor={editor}
  onChange={() => {
    const text = Editor.string(editor, []);
    setPrompt(text);  // Host app tracks prompt in its own state
  }}
>
  <Editable />
</Slate>

When you call editor.insertText() directly, onChange fires. But the state update may be batched by React and not committed before the submit handler reads prompt. There’s also the possibility that onChange is wrapped in a useCallback with stale closure variables, or that the state update is inside a startTransition and treated as low-priority.

The result: Slate’s model has the text, the DOM shows the text, but the host app’s submit handler reads its own React state which hasn’t been updated yet.

Key insight: Calling Slate API methods directly bypasses the host application’s event handling, not just Slate’s. Even if Slate is happy, the app wrapping Slate may not be.

Attempt 4: execCommand('insertText') from MAIN World

// In chrome.scripting.executeScript({ world: "MAIN" })
const el = document.querySelector("div[data-slate-editor='true']");
el.focus();
document.execCommand("insertText", false, "A cat wearing a tiny hat");

What happens: The text is inserted, but into the wrong element. The cursor lands in the [data-slate-placeholder] div — the “What do you want to create?” overlay. Text appears inside the placeholder element, which is contenteditable="false", creating an impossible DOM state. The text can’t be deleted. The application rejects the submission with “prompt must be valid.”

Why it fails: When you call el.focus() on the Slate editor root, the browser places the cursor at the first focusable position. But Slate’s placeholder div ([data-slate-placeholder]) is absolutely positioned over the content area. Depending on the layout, focus() may place the cursor in the placeholder div rather than in the actual content area (the [data-slate-node="element"] div).

execCommand('insertText') inserts text at the current cursor position. If the cursor is in the placeholder, the text goes into the placeholder. Since the placeholder is contenteditable="false", the result is a corrupted DOM that Slate can’t reconcile with its model.

Attempt 5: Clicking the wrong submit button

This isn’t a Slate failure, but it’s part of the same debugging session and it’s a common automation trap.

After successfully typing text (using the working approach below), the automation needed to click a submit button:

// The operator step
{ "op": "clickLast", "selector": "button:has(i.google-symbols)" }

What happens: The text is cleared and no generation starts.

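Why it fails: The Google Flow interface has multiple buttons with Material Symbols icons, including a settings button, a submit button, and the clear (x) button that appears once text has been typed.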

clickLast("button:has(i.google-symbols)") matches all of them and picks the last one in DOM order, which is the clear button (x). The input is erased.


What Works

InputEvent('beforeinput') dispatched from MAIN world, with proper cursor placement via simulated mouse events.

Here’s why this is the correct approach:

  1. MAIN world ensures the InputEvent constructor is the page’s own InputEvent. React and Slate process it normally.
  2. Mouse events before dispatch establish Slate’s internal selection. Slate tracks its own selection state (separate from window.getSelection()). Without a proper Slate selection, beforeinput has nowhere to insert text.
  3. beforeinput with inputType: "insertText" follows Slate’s own input processing path. Slate handles this event in its Editable component’s onDOMBeforeInput handler, which calls Editor.insertText() internally. This updates the model, triggers onChange, and the host app’s state is updated in the same React batch.

// In chrome.scripting.executeScript({ world: "MAIN" })
async function typeIntoSlate(selector, value) {
  const el = document.querySelector(selector);
  if (!el) throw new Error(`Slate editor not found: ${selector}`);

  // Step 1: Establish Slate's selection with real mouse coordinates
  const rect = el.getBoundingClientRect();
  const cx = rect.left + 10;
  const cy = rect.top + rect.height / 2;

  el.focus();

  el.dispatchEvent(new MouseEvent("mousedown", {
    bubbles: true, cancelable: true, button: 0,
    clientX: cx, clientY: cy,
  }));
  el.dispatchEvent(new MouseEvent("mouseup", {
    bubbles: true, cancelable: true, button: 0,
    clientX: cx, clientY: cy,
  }));
  el.dispatchEvent(new MouseEvent("click", {
    bubbles: true, cancelable: true, button: 0,
    clientX: cx, clientY: cy,
  }));

  await new Promise(r => setTimeout(r, 150));

  // Step 2: Dispatch beforeinput — Slate's handler processes this natively
  el.dispatchEvent(new InputEvent("beforeinput", {
    inputType: "insertText",
    data: value,
    bubbles: true,
    cancelable: true,
  }));

  await new Promise(r => setTimeout(r, 150));

  // Step 3: Verify the text landed in Slate's model (not just the DOM)
  const slateStr = Array.from(el.querySelectorAll("[data-slate-string]"))
    .map(n => n.textContent)
    .join("");

  return slateStr.includes(value);
}

Why the mouse events matter

You might think el.focus() is enough. It isn’t. Here’s what focus() alone does:

  1. The browser gives keyboard focus to the contenteditable div
  2. window.getSelection() may or may not point inside the editor
  3. Slate’s internal editor.selection may still be null

Slate updates its selection state in response to mouse events, not just focus. The Editable component has onMouseDown and onClick handlers that call ReactEditor.toSlateRange() to convert the browser selection into a Slate selection. Without this, editor.selection is null, and beforeinput processing with a null selection is a no-op.

The coordinates matter too. Using rect.left + 10 (slightly inside the left edge) and rect.top + rect.height / 2 (vertical center) ensures the click lands in the content area, not on the placeholder overlay.
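
The coordinate choice can be factored into small helpers and checked before dispatching. This is a sketch: `pickClickPoint` and `pointHitsPlaceholder` are hypothetical names, and the document is passed in as a parameter so the check can be exercised with a stub.

```javascript
// Picks a point slightly inside the left edge, vertically centered.
// The inset is capped at half the width so the point stays inside narrow editors.
function pickClickPoint(rect, inset = 10) {
  return {
    x: rect.left + Math.min(inset, rect.width / 2),
    y: rect.top + rect.height / 2,
  };
}

// True if the topmost element at (x, y) is the placeholder or lives inside it.
function pointHitsPlaceholder(doc, x, y) {
  const hit = doc.elementFromPoint(x, y);
  return Boolean(hit && hit.closest("[data-slate-placeholder]"));
}

// Browser usage:
//   const { x, y } = pickClickPoint(el.getBoundingClientRect());
//   if (pointHitsPlaceholder(document, x, y)) { /* pick another point */ }
```

When the placeholder keeps the pointer-events: none style shown in the empty-state markup, elementFromPoint skips it and the check returns false; it is mainly useful for hosts that restyle the placeholder.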

Why beforeinput and not input

Slate processes beforeinput events, not input events. The beforeinput event is cancellable and fires before the browser modifies the DOM. Slate intercepts it, prevents the default browser behavior, and makes its own model changes. The input event fires after the browser has already modified the DOM — too late for Slate’s pipeline.

This is the modern approach for many contenteditable editors, though not all: ProseMirror, covered below, observes DOM mutations instead. The Input Events specification defines beforeinput as the event for intercepting rich text input before the DOM changes. Slate has used it as its primary input mechanism since around 2020.
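
Because beforeinput is cancelable, the return value of dispatchEvent gives a cheap hint about whether anything intercepted it: dispatchEvent returns false when a listener called preventDefault(). The helper name below is hypothetical, and the result is only a hint; the text still needs to be verified in the editor afterwards.

```javascript
// Dispatches a cancelable event and reports whether any listener
// called preventDefault() on it.
function dispatchAndCheckHandled(target, event) {
  const notCanceled = target.dispatchEvent(event);
  return !notCanceled;
}

// Browser usage (MAIN world):
//   const handled = dispatchAndCheckHandled(el, new InputEvent("beforeinput", {
//     inputType: "insertText", data: "hello", bubbles: true, cancelable: true,
//   }));
```

A false result does not prove the event was ignored, since a handler may process the input without canceling it, and a true result does not prove the model changed.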


The Full Implementation

Here’s the production implementation from a Chrome MV3 extension service worker that dispatches typeSlate steps to Slate.js editors. This is the actual code running in a browser automation platform.

The service worker (background.js)

// In the service worker's step dispatcher
async function sendStep(tabId, step) {
  // typeSlate MUST run in MAIN world — isolated-world events
  // can't trigger React's synthetic event system
  if (step.op === "typeSlate") {
    try {
      const [{ result }] = await chrome.scripting.executeScript({
        target: { tabId },
        world: "MAIN",
        func: async (selector, value) => {
          const el = document.querySelector(selector);
          if (!el) {
            return {
              ok: false,
              error: `typeSlate: selector not found: ${selector}`,
            };
          }

          // Focus and click to establish Slate's selection.
          // Real coordinates place the cursor in the content area,
          // not on the placeholder overlay.
          const rect = el.getBoundingClientRect();
          el.focus();
          const cx = rect.left + 10;
          const cy = rect.top + rect.height / 2;

          el.dispatchEvent(new MouseEvent("mousedown", {
            bubbles: true, cancelable: true, button: 0,
            clientX: cx, clientY: cy,
          }));
          el.dispatchEvent(new MouseEvent("mouseup", {
            bubbles: true, cancelable: true, button: 0,
            clientX: cx, clientY: cy,
          }));
          el.dispatchEvent(new MouseEvent("click", {
            bubbles: true, cancelable: true, button: 0,
            clientX: cx, clientY: cy,
          }));

          await new Promise(r => setTimeout(r, 150));

          // Fire beforeinput — Slate's own handler processes this,
          // updating model AND triggering onChange so the host app's
          // React state is properly updated.
          el.dispatchEvent(new InputEvent("beforeinput", {
            inputType: "insertText",
            data: value,
            bubbles: true,
            cancelable: true,
          }));

          await new Promise(r => setTimeout(r, 150));

          // Verify via [data-slate-string] nodes, not el.textContent
          const slateStr = Array.from(
            el.querySelectorAll("[data-slate-string]")
          )
            .map(n => n.textContent)
            .join("");

          const inserted = slateStr.includes(value);

          return {
            ok: inserted,
            source: "beforeinput",
            textAfter: slateStr,
            error: inserted
              ? null
              : `typeSlate: text not in slate strings (got: "${slateStr.slice(0, 80)}")`,
          };
        },
        args: [step.selector, step.value],
      });

      if (!result) return { ok: false, error: "typeSlate: no result" };

      return {
        ok: result.ok,
        error: result.ok ? undefined : result.error,
        outputs: {
          typeSlate_source: result.source ?? null,
          typeSlate_text: result.textAfter ?? null,
        },
        logs: [
          `typeSlate: source=${result.source} inserted=${result.ok} text="${(result.textAfter ?? "").slice(0, 60)}"`,
        ],
      };
    } catch (e) {
      return { ok: false, error: "typeSlate: " + e.message };
    }
  }

  // Other ops dispatch to content script normally...
}

The operator step (JSON DSL)

{
  "op": "typeSlate",
  "selector": "div[data-slate-editor='true']",
  "value": "A photorealistic cat wearing a tiny top hat, studio lighting"
}

Why the content script throws on typeSlate

In the content script, typeSlate is defined as an immediate error:

const ops = {
  async typeSlate() {
    throw new Error(
      "typeSlate must be intercepted by background SW — " +
      "content script should never receive this op"
    );
  },
  // ... other ops
};

This is a safety guard. If the service worker’s step router fails to intercept typeSlate and forwards it to the content script, the content script throws rather than silently producing broken behavior. Fail loud, not wrong.
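
The routing decision itself can be kept in one small, testable function. This is a sketch of how the service worker might decide where a step runs; `MAIN_WORLD_OPS` and `routeStep` are hypothetical names, and only typeSlate is known from this article to need MAIN world.

```javascript
// Ops that must run in the page's MAIN world.
// Only "typeSlate" comes from the article; extend the set as needed.
const MAIN_WORLD_OPS = new Set(["typeSlate"]);

// Decides where a step should execute. Throws on malformed steps so routing
// mistakes surface in the service worker instead of in the content script.
function routeStep(step) {
  if (!step || typeof step.op !== "string") {
    throw new Error("routeStep: step.op must be a string");
  }
  return MAIN_WORLD_OPS.has(step.op) ? "main" : "content";
}
```

With this in place, the content-script guard above becomes a second line of defense rather than the only one.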


Verification: Reading Slate’s Real Content

After dispatching beforeinput, you need to verify the text actually made it into Slate’s model. The obvious approach — reading el.textContent — is wrong.

Why el.textContent lies

// DON'T do this:
const text = el.textContent;
// Returns: "What do you want to create?A cat wearing a tiny hat"
//           ^-- placeholder text ----^^-- actual content --------^

el.textContent concatenates text from ALL child nodes, including the [data-slate-placeholder] div. Even when the placeholder is visually hidden (via CSS), it’s still in the DOM. You get placeholder text glued to actual content.

The correct approach: [data-slate-string] elements

// DO this:
const slateStr = Array.from(el.querySelectorAll("[data-slate-string]"))
  .map(n => n.textContent)
  .join("");

// Returns: "A cat wearing a tiny hat"

[data-slate-string="true"] elements only exist when Slate has real text content. They replace the [data-slate-zero-width] elements that mark empty nodes. If [data-slate-string] elements exist and contain your text, the Slate model has been updated.

Checking for the empty state

function isSlateEmpty(editorEl) {
  // If zero-width markers exist and no string nodes exist, the editor is empty
  const hasZeroWidth = editorEl.querySelector("[data-slate-zero-width]") !== null;
  const hasString = editorEl.querySelector("[data-slate-string]") !== null;
  return hasZeroWidth && !hasString;
}
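
These checks combine well with polling, so that verification does not depend on a fixed sleep being long enough. The helper below is a generic sketch; `waitFor` is a hypothetical name and the default timings are arbitrary.

```javascript
// Polls `predicate` until it returns a truthy value or the timeout elapses.
// Resolves to true on success and false on timeout.
async function waitFor(predicate, timeoutMs = 2000, intervalMs = 50) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (predicate()) return true;
    if (Date.now() >= deadline) return false;
    await new Promise((r) => setTimeout(r, intervalMs));
  }
}

// Browser usage, in place of a fixed 150 ms sleep after dispatching beforeinput:
//   const ok = await waitFor(() =>
//     Array.from(el.querySelectorAll("[data-slate-string]"))
//       .map((n) => n.textContent)
//       .join("")
//       .includes(value)
//   );
```

The predicate is always evaluated at least once, so a zero timeout still performs a single check.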

Finding the Right Submit Button

After typing text into a Slate editor, you typically need to click a submit button. When the application uses icon buttons (e.g., Google Material Symbols), CSS selectors are unreliable.

The problem with CSS selectors

// Matches ALL buttons with a Material Symbols icon
document.querySelectorAll("button:has(i.google-symbols)")
// Returns e.g.: [settings_btn, submit_btn, clear_btn]

In Google Flow, typing text into the editor causes a clear (x) button to appear. If your automation uses clickLast("button:has(i.google-symbols)"), it clicks the clear button and erases the input.

The solution: match by icon text content

Google Material Symbols renders icons using ligatures. The <i> element’s text content is the icon name:

<button>
  <i class="google-symbols">arrow_forward</i>  <!-- submit -->
</button>
<button>
  <i class="google-symbols">close</i>          <!-- clear -->
</button>
<button>
  <i class="google-symbols">settings</i>       <!-- settings -->
</button>

Match on the icon name:

function clickIcon(iconName) {
  const btn = Array.from(document.querySelectorAll("button"))
    .filter(el => {
      if (el.offsetParent === null) {
        const s = getComputedStyle(el);
        if (s.position !== "fixed") return false;
      }
      return true;
    })
    .find(b =>
      b.querySelector("i.google-symbols")?.textContent?.trim() === iconName
    );

  if (!btn) throw new Error(`clickIcon: icon "${iconName}" not found`);

  btn.scrollIntoView({ behavior: "smooth", block: "center" });
  btn.click();
}

// Usage:
clickIcon("arrow_forward");  // clicks only the submit button

The operator step in JSON DSL:

{ "op": "clickIcon", "icon": "arrow_forward" }

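This approach does not depend on DOM order, so buttons that appear dynamically, such as the clear button, cannot change which button gets clicked.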


Patterns for Other Rich Text Editors

The Slate.js approach generalizes. Here’s how to adapt it for other rich text editor frameworks.

Tiptap / ProseMirror

Tiptap (built on ProseMirror) is simpler to automate because execCommand('insertText') works from the isolated world:

// In content.js (isolated world) — works fine
function setTiptapText(el, text) {
  el.focus();
  document.execCommand("selectAll", false);
  document.execCommand("delete", false);
  document.execCommand("insertText", false, text);
  el.dispatchEvent(new InputEvent("input", {
    bubbles: true,
    cancelable: true,
  }));
}

ProseMirror’s view layer watches the editable DOM with a MutationObserver and parses changes back into its document model, rather than relying only on beforeinput. This means execCommand mutations are visible to ProseMirror even from the isolated world.

Draft.js

Draft.js (Facebook’s editor, now in maintenance mode) uses React’s synthetic onChange. From MAIN world:

// In chrome.scripting.executeScript({ world: "MAIN" })
const el = document.querySelector("[data-editor]");
el.focus();

// Draft.js reads from native beforeinput too (in modern mode)
el.dispatchEvent(new InputEvent("beforeinput", {
  inputType: "insertText",
  data: value,
  bubbles: true,
  cancelable: true,
}));

Draft.js in legacy mode may require the React fiber approach (walking __reactFiber$* keys), as it relied on onChange from contenteditable rather than beforeinput.

CodeMirror 6

CodeMirror 6 handles beforeinput similarly to Slate:

// In chrome.scripting.executeScript({ world: "MAIN" })
const el = document.querySelector(".cm-content");
el.focus();

el.dispatchEvent(new InputEvent("beforeinput", {
  inputType: "insertText",
  data: value,
  bubbles: true,
  cancelable: true,
}));

Comparison table

| Editor                | Automation Approach                  | World Required | execCommand Works? | Notes                                    |
|-----------------------|--------------------------------------|----------------|--------------------|------------------------------------------|
| Slate.js              | beforeinput + mouse events           | MAIN           | No                 | Must establish Slate selection first     |
| Tiptap/ProseMirror    | execCommand('insertText')            | Isolated OK    | Yes                | Uses MutationObserver, not beforeinput   |
| Draft.js (modern)     | beforeinput                          | MAIN           | No                 | Similar to Slate                         |
| Draft.js (legacy)     | React fiber walk                     | MAIN           | No                 | Need to find ContentState                |
| CodeMirror 6          | beforeinput                          | MAIN           | Partial            | Simpler; no React wrapper typically      |
| Monaco                | editor.setValue() via fiber          | MAIN           | No                 | Not contenteditable; uses own rendering  |
| Plain contenteditable | execCommand('insertText')            | Isolated OK    | Yes                | Simplest case                            |
| Standard <textarea>   | .value = + dispatchEvent('input')    | Isolated OK    | N/A                | No rich text                             |

Key insight: The harder the editor works to maintain its own model (separate from the DOM), the harder it is to automate. Slate and Draft.js are the hardest because they’re React-managed AND model-driven. ProseMirror/Tiptap are easier because they observe DOM mutations. Plain contenteditable is trivial.
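
Before choosing a strategy from the table, the automation has to know which editor it is facing. The sketch below guesses the family from DOM markers. The Slate and CodeMirror selectors come from this article; `.ProseMirror`, `.DraftEditor-root`, and `.monaco-editor` are class names these editors commonly render and should be treated as assumptions to verify against the target page. `detectEditorKind` is a hypothetical name.

```javascript
// One selector per editor family, checked in order.
const EDITOR_SIGNATURES = [
  ["slate", "[data-slate-editor='true']"],
  ["prosemirror", ".ProseMirror"],   // assumption: default ProseMirror/Tiptap class
  ["codemirror6", ".cm-content"],
  ["draftjs", ".DraftEditor-root"],  // assumption: default Draft.js wrapper class
  ["monaco", ".monaco-editor"],      // assumption: default Monaco container class
];

// Returns the first family whose marker matches `el` or one of its ancestors,
// falling back to "contenteditable" or "unknown".
function detectEditorKind(el) {
  for (const [kind, selector] of EDITOR_SIGNATURES) {
    if (el.closest(selector)) return kind;
  }
  return el.isContentEditable ? "contenteditable" : "unknown";
}
```

The result can then index into a table of strategies, with "unknown" treated as a hard failure rather than a silent fallback.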


Anti-Patterns

| Don’t | Do Instead | Why |
|-------|------------|-----|
| Use execCommand('insertText') on Slate editors | Dispatch beforeinput from MAIN world | execCommand bypasses Slate’s event pipeline; text enters DOM but not the model |
| Dispatch events from isolated world for React apps | Use chrome.scripting.executeScript({ world: "MAIN" }) | Isolated-world events are invisible to React’s synthetic event system |
| Call editor.insertText() directly via fiber walk | Let beforeinput flow through Slate’s handler | Direct API calls may not trigger the host app’s onChange state update |
| Use el.textContent to verify Slate content | Query [data-slate-string] elements | textContent includes the hidden placeholder text |
| Use el.focus() alone before dispatching input | Simulate mousedown/mouseup/click with real coordinates | Slate needs mouse events to initialize its internal selection state |
| Use CSS selectors like :has(i.google-symbols) for icon buttons | Match by icon textContent (e.g., "arrow_forward") | CSS matches all icon buttons; text content is unique per icon |
| Let content script handle MAIN-world operations | Throw an error in content script; handle in service worker | Fail loud rather than silently producing broken behavior |
| Use setTimeout as the only timing mechanism | Verify the expected DOM state after each step | Race conditions vary by machine; verification is deterministic |
| Assume focus() places cursor in content area | Use coordinates to click inside the editor bounds | Focus may land on placeholder div (absolutely positioned overlay) |
| Use clickLast for buttons that appear dynamically | Use semantic selectors (clickIcon, clickText) | New buttons (clear, settings) change DOM order unpredictably |

Small Examples

Detecting a Slate editor on the page

function findSlateEditor() {
  return document.querySelector("div[data-slate-editor='true']");
}

function isSlateEditor(el) {
  return el?.getAttribute("data-slate-editor") === "true";
}

Clearing a Slate editor before typing

// In MAIN world
async function clearSlate(selector) {
  const el = document.querySelector(selector);
  if (!el) return false;

  el.focus();

  // Select all content (the equivalent of Ctrl+A)
  document.execCommand("selectAll", false);

  // Let the selection change settle before deleting
  await new Promise(r => setTimeout(r, 50));

  // Delete the selection through Slate's beforeinput pipeline
  el.dispatchEvent(new InputEvent("beforeinput", {
    inputType: "deleteContentBackward",
    bubbles: true,
    cancelable: true,
  }));

  await new Promise(r => setTimeout(r, 100));

  // Report success only if the editor is back in its empty state
  return isSlateEmpty(el);
}

function isSlateEmpty(el) {
  return el.querySelector("[data-slate-zero-width]") !== null
    && el.querySelector("[data-slate-string]") === null;
}

Reading current Slate content

function getSlateText(editorEl) {
  return Array.from(editorEl.querySelectorAll("[data-slate-string]"))
    .map(n => n.textContent)
    .join("");
}

function getSlateContent(editorEl) {
  // Returns structured content (paragraphs, not just flat text)
  const paragraphs = editorEl.querySelectorAll("[data-slate-node='element']");
  return Array.from(paragraphs).map(p => {
    const strings = p.querySelectorAll("[data-slate-string]");
    return Array.from(strings).map(s => s.textContent).join("");
  });
}

Waiting for Slate to be ready

async function waitForSlate(selector, timeoutMs = 10000) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const el = document.querySelector(selector);
    if (el && el.getAttribute("data-slate-editor") === "true") {
      // Check that Slate has initialized (has child nodes)
      if (el.querySelector("[data-slate-node]")) return el;
    }
    await new Promise(r => setTimeout(r, 200));
  }
  return null;
}

Type and submit as a single operation

// Operator DSL steps for typing into Slate and submitting
const steps = [
  { "op": "waitForSelector", "selector": "div[data-slate-editor='true']" },
  {
    "op": "typeSlate",
    "selector": "div[data-slate-editor='true']",
    "value": "A photorealistic cat in a tiny hat, studio lighting"
  },
  { "op": "sleep", "ms": 300 },
  { "op": "clickIcon", "icon": "arrow_forward" }
];

Handling multi-line input in Slate

// In MAIN world — insert text with line breaks
async function typeMultilineSlate(selector, lines) {
  const el = document.querySelector(selector);
  if (!el) return false;

  // Establish selection
  const rect = el.getBoundingClientRect();
  el.focus();
  el.dispatchEvent(new MouseEvent("mousedown", {
    bubbles: true, cancelable: true, button: 0,
    clientX: rect.left + 10, clientY: rect.top + rect.height / 2,
  }));
  el.dispatchEvent(new MouseEvent("mouseup", {
    bubbles: true, cancelable: true, button: 0,
    clientX: rect.left + 10, clientY: rect.top + rect.height / 2,
  }));
  el.dispatchEvent(new MouseEvent("click", {
    bubbles: true, cancelable: true, button: 0,
    clientX: rect.left + 10, clientY: rect.top + rect.height / 2,
  }));
  await new Promise(r => setTimeout(r, 150));

  for (let i = 0; i < lines.length; i++) {
    // Insert text
    el.dispatchEvent(new InputEvent("beforeinput", {
      inputType: "insertText",
      data: lines[i],
      bubbles: true,
      cancelable: true,
    }));
    await new Promise(r => setTimeout(r, 50));

    // Insert line break (except after last line)
    if (i < lines.length - 1) {
      el.dispatchEvent(new InputEvent("beforeinput", {
        inputType: "insertParagraph",
        bubbles: true,
        cancelable: true,
      }));
      await new Promise(r => setTimeout(r, 50));
    }
  }

  await new Promise(r => setTimeout(r, 150));
  return true;
}
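
The event sequence that typeMultilineSlate dispatches can also be planned as plain data, which makes it easy to inspect or log before anything touches the page. This is a sketch under assumptions: planSlateInput is a hypothetical name, and unlike the function above it skips insertText for empty lines and treats them as paragraph breaks only.

```javascript
// Hypothetical planner: turn a multi-line string into the ordered list of
// beforeinput descriptors to dispatch against the editor.
function planSlateInput(text) {
  const lines = text.split(/\r?\n/);
  const events = [];
  lines.forEach((line, i) => {
    // An empty line contributes only a paragraph break, no insertText.
    if (line.length > 0) {
      events.push({ inputType: "insertText", data: line });
    }
    // Paragraph break after every line except the last.
    if (i < lines.length - 1) {
      events.push({ inputType: "insertParagraph" });
    }
  });
  return events;
}
```

Each descriptor could then be spread into the event init, for example new InputEvent("beforeinput", { ...descriptor, bubbles: true, cancelable: true }), with the same short delay between dispatches.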

The Architecture: Extension as Step Executor

The typeSlate implementation exists within a larger architecture. Understanding the full picture helps you build your own automation systems.

The three-tier architecture

┌────────────────────┐   poll    ┌──────────────────┐
│   CF Worker        │◄──────────│  Chrome Extension│
│   (Job Queue)      │───step───►│  (Service Worker)│
│                    │           │                  │
│  - operators (DSL) │  report   │  ┌─────────────┐ │
│  - job state       │◄──────────│  │ Content.js  │ │
│  - step cursor     │           │  │ (Isolated)  │ │
│  - delivery        │           │  └──────┬──────┘ │
└────────────────────┘           │         │        │
                                 │  For typeSlate:  │
                                 │  ┌─────────────┐ │
                                 │  │ MAIN world  │ │
                                 │  │ (inline fn) │ │
                                 │  └─────────────┘ │
                                 └──────────────────┘
  1. CF Worker holds operators (JSON step programs) and jobs (step cursor + outputs). Stateless API.
  2. Service Worker polls for the next step, routes to the correct tab, dispatches the step.
  3. Content script (isolated world) executes most steps: click, type, wait, observe.
  4. MAIN world (inline function via executeScript) handles steps that need page JS access: typeSlate.
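
The poll, execute, report cycle in the list above could be sketched as a small loop. Everything here is illustrative: runJob and its three callbacks are hypothetical stand-ins for the extension's real networking and dispatch code, and stopping at the first failed step is an assumption about the desired behavior.

```javascript
// Sketch of the service worker's job loop. The callbacks are hypothetical:
//   nextStep()           polls the server; resolves to a step, or null when done
//   execute(step)        runs the step in the tab and resolves to its output
//   report(step, result) posts the outcome back to the server
async function runJob({ nextStep, execute, report }) {
  let executed = 0;
  for (;;) {
    const step = await nextStep();
    if (!step) return executed; // job finished

    let result;
    try {
      result = { ok: true, value: await execute(step) };
    } catch (err) {
      result = { ok: false, error: String(err) };
    }

    await report(step, result);
    executed++;
    if (!result.ok) return executed; // assumption: stop at the first failure
  }
}
```

Passing the three operations in as callbacks keeps the loop independent of how the server is reached and how a step is routed to the tab.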

Why the service worker dispatches MAIN world scripts

Content scripts can’t call chrome.scripting.executeScript. The chrome.scripting API is only available in extension contexts such as the service worker, so the service worker acts as a router: it inspects the step’s op field and decides whether to forward the step to the content script (via chrome.tabs.sendMessage) or execute it directly in the page (via chrome.scripting.executeScript).
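
A router of this kind might look like the following. It is a minimal sketch rather than the extension's actual code: routeStep, dispatchStep, the mainWorldFns map, and the { type: "step", step } message shape are all assumptions introduced for illustration. Only typeSlate is named in this article as a MAIN world op.

```javascript
// Ops that must run in the page's MAIN world. Only typeSlate is named in
// this article; the set is an assumption you would extend as needed.
const MAIN_WORLD_OPS = new Set(["typeSlate"]);

// Pure routing decision, easy to test in isolation.
function routeStep(step) {
  return MAIN_WORLD_OPS.has(step.op) ? "main" : "content";
}

// Service worker side. `mainWorldFns` is a hypothetical map from op name to
// a self-contained function that gets injected into the page.
async function dispatchStep(tabId, step, mainWorldFns) {
  if (routeStep(step) === "main") {
    const [injection] = await chrome.scripting.executeScript({
      target: { tabId },
      world: "MAIN",
      func: mainWorldFns[step.op],
      args: [step.selector, step.value], // must be JSON-serializable
    });
    return injection.result;
  }
  // Everything else goes to the content script in the isolated world.
  return chrome.tabs.sendMessage(tabId, { type: "step", step });
}
```

The injected function is serialized before it runs in the page, so it cannot close over service worker variables. Anything it needs has to arrive through args.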

Keeping the service worker alive

Chrome MV3 terminates idle service workers after ~30 seconds. Long-running steps (video generation, waiting for assets) would cause the service worker to die mid-execution. The solution is a heartbeat:

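// Heartbeat keeps the service worker alive during long content script steps.
// runStepWithHeartbeat is an illustrative wrapper name; sendStepToContentScript
// is the extension's own messaging helper, defined elsewhere.
async function runStepWithHeartbeat(tabId, step) {
  const heartbeat = setInterval(() => {
    chrome.storage.local.set({ __heartbeat: Date.now() });
  }, 5000);

  try {
    return await sendStepToContentScript(tabId, step);
  } finally {
    // Always stop the timer, even if the step throws.
    clearInterval(heartbeat);
  }
}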

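Each extension API call, including a write to chrome.storage.local, resets the service worker’s idle timer (in Chrome 110 and later), so a write every 5 seconds keeps the worker alive while the step runs.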


Debugging Checklist

When your Slate automation doesn’t work, check these in order:

  1. Is the event dispatched from MAIN world? If you’re in a content script (isolated world), events won’t reach React. Use chrome.scripting.executeScript({ world: "MAIN" }).

  2. Is Slate’s selection initialized? Check that you dispatched mousedown/mouseup/click with real coordinates before beforeinput. Without a Slate selection, text insertion is a no-op.

  3. Are the coordinates inside the content area? Use getBoundingClientRect() on the [data-slate-editor] element. Click at left + 10, top + height/2 — not the center, which might hit the placeholder overlay on some layouts.

  4. Is the timing sufficient? The 150ms delays between mouse events and beforeinput aren’t arbitrary. Slate’s event handlers are asynchronous (they schedule React state updates). Too fast and the selection isn’t established yet.

  5. Is the text in [data-slate-string] elements? Don’t check el.textContent. Query for [data-slate-string] nodes specifically.

  6. Does the host app’s state update? Some applications wrap Slate’s onChange in debounce or startTransition. If the submit happens within the debounce window, the app state may be stale. Add extra delay before clicking submit.

  7. Are you clicking the right button? Use icon text matching (clickIcon), not position-based selectors (clickLast).
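
Checklist items 2 and 3 can be folded into a pair of small helpers so that every dispatch uses the same coordinates. This is a sketch with hypothetical names (slateClickPoint, mouseSequence). The clamp for very narrow editors is an addition that the original code does not have.

```javascript
// Hypothetical helper: a click point near the left edge of the editor,
// vertically centered. The inset is clamped so the point stays inside
// editors narrower than 2 * inset.
function slateClickPoint(rect, inset = 10) {
  const clientX = rect.left + Math.min(inset, rect.width / 2);
  const clientY = rect.top + rect.height / 2;
  return { clientX, clientY };
}

// Event types and init objects for the mousedown, mouseup, click sequence
// that establishes Slate's selection.
function mouseSequence(rect) {
  const point = slateClickPoint(rect);
  return ["mousedown", "mouseup", "click"].map(type => ({
    type,
    init: { bubbles: true, cancelable: true, button: 0, ...point },
  }));
}
```

In the MAIN world function this would be used as: for (const { type, init } of mouseSequence(el.getBoundingClientRect())) el.dispatchEvent(new MouseEvent(type, init)); followed by the usual 150ms wait before beforeinput.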



