Automating Slate.js Editors from a Chrome Extension

Slate.js editors don’t respond to standard browser automation. execCommand, dispatchEvent, direct DOM writes — none of it works the way you’d expect. The text appears in the DOM but the host application reads empty state on submit. This article documents every approach that failed, why each one failed, and the one technique that reliably gets text into Slate’s internal model from a Chrome MV3 extension service worker.

What you’ll learn:

Why Slate.js editors are uniquely resistant to standard automation techniques
The difference between Chrome’s isolated world and MAIN world, and why it matters for React apps
Five specific approaches that fail, with detailed explanations of why each one fails
The one approach that works: beforeinput dispatched from MAIN world with proper cursor placement
How to verify text actually landed in Slate’s model (not just the DOM)
How to find the right button when multiple icon buttons share the same CSS structure
A reusable typeSlate operator pattern for Chrome extension automation frameworks

Table of Contents

The Problem
- What makes this worse in a Chrome extension
Background: Chrome Extension World Isolation
Background: Slate.js DOM Architecture
What Failed (and Why)
What Works
- Why the mouse events matter
- Why beforeinput and not input
The Full Implementation
Verification: Reading Slate’s Real Content
Finding the Right Submit Button
- The problem with CSS selectors
- The solution: match by icon text content
Patterns for Other Rich Text Editors
Anti-Patterns
Small Examples
The Architecture: Extension as Step Executor
Debugging Checklist
References

The Problem

You’re building a browser automation platform. Your Chrome MV3 extension receives JSON step programs from a remote server, executes them against web pages via content scripts, and reports results back. It works perfectly for standard inputs, standard buttons, standard forms.

Then you encounter a Slate.js rich text editor.

Slate is not a <textarea>. It’s not even a regular contenteditable div. It’s a React-managed DOM where the visible elements are a projection of an internal data model. The critical constraint: the host application reads Slate’s internal model on submit, not the DOM. If your text is in the DOM but not in the model, the application sees an empty editor.

This creates a uniquely hard automation problem. With a regular <input>, you can set .value and dispatch a change event. With a regular contenteditable, you can use execCommand('insertText'). With Slate, you need to go through Slate’s own event processing pipeline so that:

The editor’s internal model (a tree of Node objects) is updated
The onChange callback fires
The host application’s React state is updated in the same render cycle

Miss any one of these steps and the submit button reads stale state.

What makes this worse in a Chrome extension

Chrome MV3 extensions run content scripts in an isolated world — a separate JavaScript context from the page. You share the DOM, but you don’t share the window object, constructors, or event listeners. An InputEvent created in the isolated world is literally a different class than the InputEvent the page’s JavaScript knows about.

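React’s event delegation, where handlers are registered on the root container, runs in the page’s JavaScript context. Events dispatched from the isolated world are still delivered through the shared DOM, but in the tests described later in this article they never triggered Slate’s handlers.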

Background: Chrome Extension World Isolation

Chrome extensions have two JavaScript execution environments for interacting with page content. Understanding the distinction is the single most important concept in this article.

Isolated World (default for content scripts)

┌───────────────────────────────────────────────────┐
│                     Web Page                      │
│                                                   │
│  ┌──────────────┐        ┌──────────────────────┐ │
│  │ Page JS      │        │ Content Script       │ │
│  │              │        │ (Isolated World)     │ │
│  │ window.*     │        │                      │ │
│  │ React        │        │ window.* (different) │ │
│  │ Slate editor │        │ No React access      │ │
│  │              │        │ No Slate access      │ │
│  └──────┬───────┘        └──────────┬───────────┘ │
│         │                           │             │
│         └─────────┬─────────────────┘             │
│                   │                               │
│            ┌──────▼──────┐                        │
│            │   Shared    │                        │
│            │    DOM      │                        │
│            └─────────────┘                        │
└───────────────────────────────────────────────────┘

Your content script can read and write the DOM. It can call querySelector and click(), and it can read textContent. But:

window is a different object. window.someGlobal set by the page is invisible.
Event constructors are different. new InputEvent(...) in the isolated world creates an instance from the isolated world’s InputEvent class.
React’s event delegation runs in the page’s context. Events constructed in the isolated world are delivered through the shared DOM, but in testing they never triggered the React and Slate handlers listening there.

MAIN World (via `chrome.scripting.executeScript`)

// From the service worker (background.js):
chrome.scripting.executeScript({
  target: { tabId },
  world: "MAIN",       // <-- runs in page's JS context
  func: (value) => {
    // This code has full access to the page's JavaScript:
    // - window.* is the page's window
    // - React fiber trees are accessible
    // - Event constructors are the same ones React listens for
    // - Slate's editor instance is reachable
  },
  args: ["hello world"],
});

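MAIN world scripts run as if they were part of the page. Events created here are real page events, and React processes them normally. Content scripts cannot call chrome.scripting.executeScript, so a running content script cannot move itself into MAIN world at runtime; the injection is requested from the service worker (or another extension page). Recent Chrome versions also accept "world": "MAIN" on a content script declared in the manifest, but this article uses the on-demand executeScript route.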

Key insight: The isolated world vs MAIN world distinction is invisible when automating simple HTML forms. It only becomes critical when the target uses a JavaScript framework (React, Vue, Angular) that manages its own event system. Slate.js is the hardest case because it combines React’s synthetic events with a custom data model that diverges from the DOM.

Tradeoffs

Aspect	Isolated World	MAIN World
Access	DOM only	DOM + page JS
Chrome APIs	Limited set (`chrome.runtime`, `chrome.storage`, etc.)	None
Security	Extension code is protected	Page can see/interfere with your code
Event dispatch	Events did not trigger Slate’s handlers in testing	Events processed normally
Invocation	Content script runs automatically	Must use `chrome.scripting.executeScript` from the service worker
Communication	Direct `chrome.runtime.sendMessage`	Must use DOM events, `postMessage`, or the `executeScript` return value to talk to the extension

Background: Slate.js DOM Architecture

Before examining why each approach fails, you need to understand how Slate renders to the DOM.

The empty state

When a Slate editor is empty, the DOM looks like this:

<div data-slate-editor="true" contenteditable="true" role="textbox">
  <div data-slate-node="element">
    <span data-slate-node="text">
      <span data-slate-zero-width="z" data-slate-length="0">
        &#xFEFF;         <!-- zero-width no-break space -->
      </span>
    </span>
  </div>
  <div data-slate-placeholder="true"
       contenteditable="false"
       style="position: absolute; pointer-events: none; ...">
    What do you want to create?   <!-- placeholder text -->
  </div>
</div>

Key elements:

data-slate-editor="true" — The root editor element. This is your automation target.
data-slate-zero-width="z" — A zero-width character (\uFEFF) that Slate uses as a cursor anchor in empty nodes. This is the “nothing here yet” marker.
data-slate-placeholder="true" — The visible placeholder text. It’s a separate div positioned absolutely over the editor content. It is contenteditable="false" so it doesn’t interfere with editing.

After text is inserted

When text exists in the editor:

<div data-slate-editor="true" contenteditable="true" role="textbox">
  <div data-slate-node="element">
    <span data-slate-node="text">
      <span data-slate-string="true">
        A cat wearing a tiny hat   <!-- actual content -->
      </span>
    </span>
  </div>
  <!-- placeholder div is now hidden via style -->
</div>

The critical difference: data-slate-string="true" replaces data-slate-zero-width. The placeholder div is hidden but still present in the DOM.

The model behind the DOM

Slate’s actual data model is a tree of Node objects:

// Simplified Slate model
interface Editor {
  children: Node[];
  selection: Range | null;
  onChange: () => void;
}

interface Element {
  type: string;
  children: Node[];
}

interface Text {
  text: string;
  // ...marks (bold, italic, etc.)
}

The DOM is a projection of this model. When Slate processes a beforeinput event, it:

Reads the inputType and data from the event
Calls the appropriate transform on the model (e.g., Editor.insertText(editor, data))
Triggers onChange
React re-renders the DOM from the new model state

If you modify the DOM directly without going through this pipeline, the model and DOM diverge. Slate’s model still says “empty” while the DOM shows your text. The host application reads the model.
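The divergence can be reproduced with a toy editor in a few lines. This is a sketch, not Slate's implementation: `ToyEditor`, its fields, and `hostState` are invented for illustration, and the "DOM" is just a string. It only shows why a direct DOM write leaves the model (and therefore the submitted value) empty, while the event pipeline keeps all three in sync.

```javascript
// Toy stand-in for a model-driven editor. NOT Slate's real code.
class ToyEditor {
  constructor(onChange) {
    this.model = "";          // source of truth (in Slate: the node tree)
    this.dom = "";            // what the user sees (in Slate: rendered spans)
    this.onChange = onChange; // host application's callback
  }

  // The pipeline: event -> model transform -> onChange -> re-render.
  handleBeforeInput({ inputType, data }) {
    if (inputType !== "insertText") return;
    this.model += data;
    this.onChange(this.model);
    this.render();
  }

  render() {
    this.dom = this.model;
  }

  // What the host application reads on submit: the model, never the DOM.
  submit() {
    return this.model;
  }
}

let hostState = "";
const editor = new ToyEditor((text) => { hostState = text; });

// Direct DOM write: the visible text changes, model and host state do not.
editor.dom = "A cat wearing a tiny hat";
const afterDomWrite = { dom: editor.dom, submitted: editor.submit(), hostState };

// Going through the pipeline: DOM, model, and host state all agree.
editor.handleBeforeInput({ inputType: "insertText", data: "A cat wearing a tiny hat" });
const afterPipeline = { dom: editor.dom, submitted: editor.submit(), hostState };
```

After the direct write, `afterDomWrite.submitted` is still an empty string even though `afterDomWrite.dom` shows the text, which is the same symptom the real editor exhibits.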

What Failed (and Why)

Each of these approaches was tested against a production Slate.js editor (Google Flow, which uses Slate for its prompt input). Each one fails for a specific, instructive reason.

Attempt 1: `execCommand('insertText')` from Isolated World

// In content.js (isolated world)
function setSlateText(el, text) {
  el.focus();
  document.execCommand("selectAll", false);
  document.execCommand("delete", false);
  document.execCommand("insertText", false, text);
  el.dispatchEvent(new InputEvent("input", {
    bubbles: true,
    cancelable: true,
  }));
}

What happens: Text appears visually in the editor. But it renders inside the data-slate-zero-width span (the empty-state marker), not in a data-slate-string span. The Slate model is unchanged.

Why it fails: execCommand modifies the DOM directly through the browser’s built-in editing commands. These commands bypass Slate’s event pipeline entirely. Slate never sees a beforeinput event, never updates its model, never fires onChange. The host app reads the model on submit and sees an empty editor.

Key insight: execCommand is a built-in browser editing command that edits the DOM itself, not an event your code dispatches into the framework. In these tests it did not drive Slate’s model regardless of which world it ran in. Even in MAIN world, execCommand('insertText') does the wrong thing (see Attempt 4).

Attempt 2: `dispatchEvent(new InputEvent('beforeinput'))` from Isolated World

// In content.js (isolated world)
const el = document.querySelector("div[data-slate-editor='true']");
el.focus();
el.dispatchEvent(new InputEvent("beforeinput", {
  inputType: "insertText",
  data: "A cat wearing a tiny hat",
  bubbles: true,
  cancelable: true,
}));

What happens: Nothing. The editor state doesn’t change at all.

Why it fails: This is the Chrome extension world isolation problem at its core. The InputEvent constructor in the isolated world is a different class than the InputEvent in the page’s JavaScript context. When this event bubbles up the DOM, React’s event delegation (which runs in the page’s context) doesn’t recognize it as a real beforeinput event.

More precisely: the listener for beforeinput is registered with addEventListener in the page’s context. The browser’s event dispatch does deliver the event, because DOM events cross the world boundary. The likely explanation is that the handler inspects properties and behavior of the event object that the isolated-world event does not satisfy. Whatever the exact cause, Slate’s onDOMBeforeInput handler in the Editable component never fired.

Attempt 3: React Fiber Walk + `editor.insertText()` from MAIN World

// In chrome.scripting.executeScript({ world: "MAIN" })
const el = document.querySelector("div[data-slate-editor='true']");

// Walk the React fiber tree to find the Slate editor instance
const fiberKey = Object.keys(el).find(k => k.startsWith("__reactFiber$"));
let fiber = el[fiberKey];

// Walk up the fiber tree looking for the editor in memoizedProps or stateNode
while (fiber) {
  const editor = fiber.memoizedProps?.editor
    || fiber.stateNode?.editor
    || fiber.memoizedState?.memoizedState?.editor;
  if (editor?.insertText) {
    editor.insertText("A cat wearing a tiny hat");
    break;
  }
  fiber = fiber.return;
}

What happens: The text appears in the editor. The Slate model IS updated — editor.insertText() modifies the internal node tree correctly. Slate’s DOM rendering updates to show a data-slate-string span.

But on submit, the host application still reads the old (empty) value.

Why it fails: This is the subtlest failure. The issue is when onChange fires relative to the host app’s React state update cycle.

The host application (Google Flow, in this case) stores the prompt text in its own React state, updated via Slate’s onChange callback:

// Simplified host app code
const [prompt, setPrompt] = useState("");

<Slate
  editor={editor}
  onChange={() => {
    const text = Editor.string(editor, []);
    setPrompt(text);  // Host app tracks prompt in its own state
  }}
>
  <Editable />
</Slate>

When you call editor.insertText() directly, onChange fires. But the state update may be batched by React and not committed before the submit handler reads prompt. There’s also the possibility that onChange is wrapped in a useCallback with stale closure variables, or that the state update is inside a startTransition and treated as low-priority.

The result: Slate’s model has the text, the DOM shows the text, but the host app’s submit handler reads its own React state which hasn’t been updated yet.

Key insight: Calling Slate API methods directly bypasses the host application’s event handling, not just Slate’s. Even if Slate is happy, the app wrapping Slate may not be.

Attempt 4: `execCommand('insertText')` from MAIN World

// In chrome.scripting.executeScript({ world: "MAIN" })
const el = document.querySelector("div[data-slate-editor='true']");
el.focus();
document.execCommand("insertText", false, "A cat wearing a tiny hat");

What happens: The text is inserted, but into the wrong element. The cursor lands in the [data-slate-placeholder] div — the “What do you want to create?” overlay. Text appears inside the placeholder element, which is contenteditable="false", creating an impossible DOM state. The text can’t be deleted. The application rejects the submission with “prompt must be valid.”

Why it fails: When you call el.focus() on the Slate editor root, the browser places the cursor at the first focusable position. But Slate’s placeholder div ([data-slate-placeholder]) is absolutely positioned over the content area. Depending on the layout, focus() may place the cursor in the placeholder div rather than in the actual content area (the [data-slate-node="element"] div).

execCommand('insertText') inserts text at the current cursor position. If the cursor is in the placeholder, the text goes into the placeholder. Since the placeholder is contenteditable="false", the result is a corrupted DOM that Slate can’t reconcile with its model.

Attempt 5: Clicking the wrong submit button

This isn’t a Slate failure, but it’s part of the same debugging session and it’s a common automation trap.

After successfully typing text (using the working approach below), the automation needed to click a submit button:

// The operator step
{ "op": "clickLast", "selector": "button:has(i.google-symbols)" }

What happens: The text is cleared and no generation starts.

Why it fails: The Google Flow interface has multiple buttons with Material Symbols icons:

Settings button (gear icon)
Clear button (x icon) — appears after text is entered
Submit button (arrow_forward icon)

clickLast("button:has(i.google-symbols)") matches all of them and picks the last one in DOM order, which is the clear button (x). The input is erased.

What Works

InputEvent('beforeinput') dispatched from MAIN world, with proper cursor placement via simulated mouse events.

Here’s why this is the correct approach:

MAIN world ensures the InputEvent constructor is the page’s own InputEvent. React and Slate process it normally.
Mouse events before dispatch establish Slate’s internal selection. Slate tracks its own selection state (separate from window.getSelection()). Without a proper Slate selection, beforeinput has nowhere to insert text.
beforeinput with inputType: "insertText" follows Slate’s own input processing path. Slate handles this event in its Editable component’s onDOMBeforeInput handler, which calls Editor.insertText() internally. This updates the model, triggers onChange, and the host app’s state is updated in the same React batch.

// In chrome.scripting.executeScript({ world: "MAIN" })
async function typeIntoSlate(selector, value) {
  const el = document.querySelector(selector);
  if (!el) throw new Error(`Slate editor not found: ${selector}`);

  // Step 1: Establish Slate's selection with real mouse coordinates
  const rect = el.getBoundingClientRect();
  const cx = rect.left + 10;
  const cy = rect.top + rect.height / 2;

  el.focus();

  el.dispatchEvent(new MouseEvent("mousedown", {
    bubbles: true, cancelable: true, button: 0,
    clientX: cx, clientY: cy,
  }));
  el.dispatchEvent(new MouseEvent("mouseup", {
    bubbles: true, cancelable: true, button: 0,
    clientX: cx, clientY: cy,
  }));
  el.dispatchEvent(new MouseEvent("click", {
    bubbles: true, cancelable: true, button: 0,
    clientX: cx, clientY: cy,
  }));

  await new Promise(r => setTimeout(r, 150));

  // Step 2: Dispatch beforeinput — Slate's handler processes this natively
  el.dispatchEvent(new InputEvent("beforeinput", {
    inputType: "insertText",
    data: value,
    bubbles: true,
    cancelable: true,
  }));

  await new Promise(r => setTimeout(r, 150));

  // Step 3: Verify the text landed in Slate's model (not just the DOM)
  const slateStr = Array.from(el.querySelectorAll("[data-slate-string]"))
    .map(n => n.textContent)
    .join("");

  return slateStr.includes(value);
}

Why the mouse events matter

You might think el.focus() is enough. It isn’t. Here’s what focus() alone does:

The browser gives keyboard focus to the contenteditable div
window.getSelection() may or may not point inside the editor
Slate’s internal editor.selection may still be null

Slate updates its selection state in response to mouse events, not just focus. The Editable component has onMouseDown and onClick handlers that call ReactEditor.toSlateRange() to convert the browser selection into a Slate selection. Without this, editor.selection is null, and beforeinput processing with a null selection is a no-op.

The coordinates matter too. Using rect.left + 10 (slightly inside the left edge) and rect.top + rect.height / 2 (vertical center) ensures the click lands in the content area, not on the placeholder overlay.
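The coordinate arithmetic can be pulled into a small pure helper, which also makes it easy to guard against editors narrower than the fixed 10px inset. This is a minimal sketch; `clickPointFor` and its `inset` parameter are hypothetical names, not part of the original implementation.

```javascript
// Hypothetical helper: pick a click point inside a bounding rect.
// `rect` is anything shaped like the result of getBoundingClientRect().
// `inset` is how far from the left edge to aim (10px in the code above).
function clickPointFor(rect, inset = 10) {
  // Clamp the inset so a very narrow editor still gets a point inside it.
  const dx = Math.min(inset, rect.width / 2);
  return {
    clientX: rect.left + dx,
    clientY: rect.top + rect.height / 2, // vertical center
  };
}
```

In the page, the result could be spread into each event's init object, for example `new MouseEvent("mousedown", { bubbles: true, cancelable: true, button: 0, ...clickPointFor(el.getBoundingClientRect()) })`.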

Why `beforeinput` and not `input`

Slate processes beforeinput events, not input events. The beforeinput event is cancellable and fires before the browser modifies the DOM. Slate intercepts it, prevents the default browser behavior, and makes its own model changes. The input event fires after the browser has already modified the DOM — too late for Slate’s pipeline.

This is the modern approach for many contenteditable editors. The W3C Input Events specification defines beforeinput as an event that fires before the browser modifies the DOM and that is cancelable for most input types, which makes it the natural hook for rich text input processing. Slate has used it as its primary input mechanism since around 2020.
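Cancelability has a practical side effect: the return value of dispatchEvent hints at whether any listener claimed the event. The sketch below uses a bare `EventTarget` and a generic `Event` as stand-ins (both are standard globals in browsers and in Node.js 15+); the `model` variable and `makeEvent` helper are invented for illustration, and a real page would dispatch an `InputEvent` at the editor element.

```javascript
// Stand-in for an editor root element.
const root = new EventTarget();
let model = "";

// Editor-style handler: cancel the browser's default edit, update the model.
root.addEventListener("beforeinput", (evt) => {
  evt.preventDefault();
  model += evt.data;
});

function makeEvent(type) {
  const evt = new Event(type, { bubbles: true, cancelable: true });
  evt.data = "hi"; // plain property standing in for InputEvent.data
  return evt;
}

// dispatchEvent returns false when a listener canceled a cancelable event.
const handledResult = root.dispatchEvent(makeEvent("beforeinput"));

// No listener is registered for "input", so nothing cancels it: returns true.
const unhandledResult = root.dispatchEvent(makeEvent("input"));
```

A handler is not obliged to cancel every event it processes, so a `true` result from a real editor is a hint that nothing handled the dispatch rather than proof. The check on rendered `[data-slate-string]` content remains the actual verification.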

The Full Implementation

Here’s the production implementation from a Chrome MV3 extension service worker that dispatches typeSlate steps to Slate.js editors. This is the actual code running in a browser automation platform.

The service worker (background.js)

// In the service worker's step dispatcher
async function sendStep(tabId, step) {
  // typeSlate MUST run in MAIN world — isolated-world events
  // can't trigger React's synthetic event system
  if (step.op === "typeSlate") {
    try {
      const [{ result }] = await chrome.scripting.executeScript({
        target: { tabId },
        world: "MAIN",
        func: async (selector, value) => {
          const el = document.querySelector(selector);
          if (!el) {
            return {
              ok: false,
              error: `typeSlate: selector not found: ${selector}`,
            };
          }

          // Focus and click to establish Slate's selection.
          // Real coordinates place the cursor in the content area,
          // not on the placeholder overlay.
          const rect = el.getBoundingClientRect();
          el.focus();
          const cx = rect.left + 10;
          const cy = rect.top + rect.height / 2;

          el.dispatchEvent(new MouseEvent("mousedown", {
            bubbles: true, cancelable: true, button: 0,
            clientX: cx, clientY: cy,
          }));
          el.dispatchEvent(new MouseEvent("mouseup", {
            bubbles: true, cancelable: true, button: 0,
            clientX: cx, clientY: cy,
          }));
          el.dispatchEvent(new MouseEvent("click", {
            bubbles: true, cancelable: true, button: 0,
            clientX: cx, clientY: cy,
          }));

          await new Promise(r => setTimeout(r, 150));

          // Fire beforeinput — Slate's own handler processes this,
          // updating model AND triggering onChange so the host app's
          // React state is properly updated.
          el.dispatchEvent(new InputEvent("beforeinput", {
            inputType: "insertText",
            data: value,
            bubbles: true,
            cancelable: true,
          }));

          await new Promise(r => setTimeout(r, 150));

          // Verify via [data-slate-string] nodes, not el.textContent
          const slateStr = Array.from(
            el.querySelectorAll("[data-slate-string]")
          )
            .map(n => n.textContent)
            .join("");

          const inserted = slateStr.includes(value);

          return {
            ok: inserted,
            source: "beforeinput",
            textAfter: slateStr,
            error: inserted
              ? null
              : `typeSlate: text not in slate strings (got: "${slateStr.slice(0, 80)}")`,
          };
        },
        args: [step.selector, step.value],
      });

      if (!result) return { ok: false, error: "typeSlate: no result" };

      return {
        ok: result.ok,
        error: result.ok ? undefined : result.error,
        outputs: {
          typeSlate_source: result.source ?? null,
          typeSlate_text: result.textAfter ?? null,
        },
        logs: [
          `typeSlate: source=${result.source} inserted=${result.ok} text="${(result.textAfter ?? "").slice(0, 60)}"`,
        ],
      };
    } catch (e) {
      return { ok: false, error: "typeSlate: " + e.message };
    }
  }

  // Other ops dispatch to content script normally...
}

The operator step (JSON DSL)

{
  "op": "typeSlate",
  "selector": "div[data-slate-editor='true']",
  "value": "A photorealistic cat wearing a tiny top hat, studio lighting"
}
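Because step programs arrive from a remote server, it may be worth rejecting malformed steps before anything is injected into the page. The validator below is a hedged sketch: `validateTypeSlateStep` is a hypothetical helper, and the only fields it knows about are the three shown in the step above.

```javascript
// Hypothetical validator for the typeSlate step shape shown above.
function validateTypeSlateStep(step) {
  const errors = [];
  if (!step || step.op !== "typeSlate") {
    errors.push('op must be "typeSlate"');
  }
  if (typeof step?.selector !== "string" || step.selector.trim() === "") {
    errors.push("selector must be a non-empty string");
  }
  if (typeof step?.value !== "string") {
    errors.push("value must be a string");
  }
  return { ok: errors.length === 0, errors };
}
```

Running this in the service worker before calling executeScript keeps a missing `value` from surfacing later as a confusing failure inside the injected function.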

Why the content script throws on `typeSlate`

In the content script, typeSlate is defined as an immediate error:

const ops = {
  async typeSlate() {
    throw new Error(
      "typeSlate must be intercepted by background SW — " +
      "content script should never receive this op"
    );
  },
  // ... other ops
};

This is a safety guard. If the service worker’s step router fails to intercept typeSlate and forwards it to the content script, the content script throws rather than silently producing broken behavior. Fail loud, not wrong.
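One way to keep the router and the guard from drifting apart is to derive both from a single list of MAIN-world ops. The sketch below assumes such a list exists and is bundled into both the service worker and the content script; `MAIN_WORLD_OPS`, `routeStep`, and `assertContentScriptOp` are hypothetical names, and `typeSlate` is the only op the article actually places in MAIN world.

```javascript
// Ops that must run in the page's MAIN world (assumed list).
const MAIN_WORLD_OPS = new Set(["typeSlate"]);

// Service-worker side: decide where a step runs before dispatching it.
function routeStep(step) {
  return MAIN_WORLD_OPS.has(step.op) ? "main-world" : "content-script";
}

// Content-script side: refuse anything the router should have intercepted.
function assertContentScriptOp(step) {
  if (MAIN_WORLD_OPS.has(step.op)) {
    throw new Error(`${step.op} must be intercepted by the service worker`);
  }
  return step;
}
```

Adding a second MAIN-world op then means editing one set rather than remembering to update an `if` in the dispatcher and a throwing stub in the content script.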

Verification: Reading Slate’s Real Content

After dispatching beforeinput, you need to verify the text actually made it into Slate’s model. The obvious approach — reading el.textContent — is wrong.

Why `el.textContent` lies

// DON'T do this:
const text = el.textContent;
// Returns: "What do you want to create?A cat wearing a tiny hat"
//           ^-- placeholder text ----^^-- actual content --------^

el.textContent concatenates text from ALL child nodes, including the [data-slate-placeholder] div. Even when the placeholder is visually hidden (via CSS), it’s still in the DOM. You get placeholder text glued to actual content.

The correct approach: `[data-slate-string]` elements

// DO this:
const slateStr = Array.from(el.querySelectorAll("[data-slate-string]"))
  .map(n => n.textContent)
  .join("");

// Returns: "A cat wearing a tiny hat"

[data-slate-string="true"] elements only exist when Slate has real text content. They replace the [data-slate-zero-width] elements that mark empty nodes. If [data-slate-string] elements exist and contain your text, the Slate model has been updated.
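The `includes` check used in the implementation also passes when the editor already held other text and the new value was appended to it. If the editor is expected to contain exactly what was typed, a stricter comparison is possible. This is a sketch under that assumption; `checkInserted` is a hypothetical helper that takes the list of textContent values rather than touching the DOM.

```javascript
// Zero-width no-break space that Slate renders in empty text nodes.
const ZERO_WIDTH = /\uFEFF/g;

// Hypothetical helper: compare what the editor shows against what was typed.
// `strings` is the list of textContent values from [data-slate-string] nodes.
function checkInserted(strings, expected) {
  const actual = strings.join("").replace(ZERO_WIDTH, "");
  return {
    exact: actual === expected,          // editor holds exactly the typed text
    contains: actual.includes(expected), // looser check used in the article
    actual,
  };
}
```

In the page, `strings` would come from `Array.from(el.querySelectorAll("[data-slate-string]"), n => n.textContent)`. Whether to require `exact` or accept `contains` depends on whether the step is meant to replace or extend the editor's content.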

Checking for the empty state

function isSlateEmpty(editorEl) {
  // If zero-width markers exist and no string nodes exist, the editor is empty
  const hasZeroWidth = editorEl.querySelector("[data-slate-zero-width]") !== null;
  const hasString = editorEl.querySelector("[data-slate-string]") !== null;
  return hasZeroWidth && !hasString;
}

Finding the Right Submit Button

After typing text into a Slate editor, you typically need to click a submit button. When the application uses icon buttons (e.g., Google Material Symbols), CSS selectors are unreliable.

The problem with CSS selectors

// Matches ALL buttons with a Material Symbols icon
document.querySelectorAll("button:has(i.google-symbols)")
// Returns: [settings_btn, clear_btn, submit_btn]

In Google Flow, typing text into the editor causes a clear (x) button to appear. If your automation uses clickLast("button:has(i.google-symbols)"), it clicks the clear button and erases the input.

The solution: match by icon text content

Google Material Symbols renders icons using ligatures. The <i> element’s text content is the icon name:

<button>
  <i class="google-symbols">arrow_forward</i>  <!-- submit -->
</button>
<button>
  <i class="google-symbols">close</i>          <!-- clear -->
</button>
<button>
  <i class="google-symbols">settings</i>       <!-- settings -->
</button>

Match on the icon name:

function clickIcon(iconName) {
  const btn = Array.from(document.querySelectorAll("button"))
    .filter(el => {
      if (el.offsetParent === null) {
        const s = getComputedStyle(el);
        if (s.position !== "fixed") return false;
      }
      return true;
    })
    .find(b =>
      b.querySelector("i.google-symbols")?.textContent?.trim() === iconName
    );

  if (!btn) throw new Error(`clickIcon: icon "${iconName}" not found`);

  btn.scrollIntoView({ behavior: "smooth", block: "center" });
  btn.click();
}

// Usage:
clickIcon("arrow_forward");  // clicks only the submit button

The operator step in JSON DSL:

{ "op": "clickIcon", "icon": "arrow_forward" }

This approach is:

Precise: only buttons whose icon name matches exactly are candidates (if several share an icon, the first visible one is clicked)
Readable: the step clearly says which icon to click
Resilient: works regardless of DOM order or new buttons being added
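Where a page might render the same icon on more than one button, the lookup can be made to fail loudly instead of clicking the first match. The following is a sketch of that variant; `findUniqueIconButton` is a hypothetical helper, and it takes the candidate buttons as an argument so the visibility filter from `clickIcon` can be applied beforehand.

```javascript
// Hypothetical variant of clickIcon's lookup that refuses ambiguous matches.
// `buttons` is any iterable of elements (or element-like objects in tests).
function findUniqueIconButton(buttons, iconName) {
  const matches = Array.from(buttons).filter(
    (b) => b.querySelector("i.google-symbols")?.textContent?.trim() === iconName
  );
  if (matches.length !== 1) {
    throw new Error(
      `findUniqueIconButton: expected 1 "${iconName}" button, found ${matches.length}`
    );
  }
  return matches[0];
}
```

A typical call would be `findUniqueIconButton(document.querySelectorAll("button"), "arrow_forward").click()`. The error message reports how many matches were found, which tends to shorten debugging when the page layout changes.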

Patterns for Other Rich Text Editors

The Slate.js approach generalizes. Here’s how to adapt it for other rich text editor frameworks.

Tiptap / ProseMirror

Tiptap (built on ProseMirror) is simpler to automate because execCommand('insertText') works from the isolated world:

// In content.js (isolated world) — works fine
function setTiptapText(el, text) {
  el.focus();
  document.execCommand("selectAll", false);
  document.execCommand("delete", false);
  document.execCommand("insertText", false, text);
  el.dispatchEvent(new InputEvent("input", {
    bubbles: true,
    cancelable: true,
  }));
}

ProseMirror’s view layer observes DOM mutations (via a MutationObserver) and reads them back into its document, rather than relying on beforeinput alone. This means execCommand mutations are visible to ProseMirror even from the isolated world.

Draft.js

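Draft.js (Facebook’s editor, now in maintenance mode) handles text input through React’s synthetic onBeforeInput, which React derives from lower-level events rather than from the native beforeinput event. Treat the dispatch below as a starting point to verify against your target, not a guaranteed path. From MAIN world: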

// In chrome.scripting.executeScript({ world: "MAIN" })
const el = document.querySelector("[data-editor]");
el.focus();

// Dispatch a native beforeinput; confirm on your target that Draft.js reacts to it
el.dispatchEvent(new InputEvent("beforeinput", {
  inputType: "insertText",
  data: value,
  bubbles: true,
  cancelable: true,
}));

Draft.js in legacy mode may require the React fiber approach (walking __reactFiber$* keys), as it relied on onChange from contenteditable rather than beforeinput.

CodeMirror 6

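CodeMirror 6 also listens for beforeinput, although it reads most edits back from the DOM through its own observer, so confirm on your target that the dispatch below takes effect: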

// In chrome.scripting.executeScript({ world: "MAIN" })
const el = document.querySelector(".cm-content");
el.focus();

el.dispatchEvent(new InputEvent("beforeinput", {
  inputType: "insertText",
  data: value,
  bubbles: true,
  cancelable: true,
}));

Comparison table

Editor	Automation Approach	World Required	`execCommand` Works?	Notes
Slate.js	`beforeinput` + mouse events	MAIN	No	Must establish Slate selection first
Tiptap/ProseMirror	`execCommand('insertText')`	Isolated OK	Yes	Uses MutationObserver, not beforeinput
Draft.js (modern)	`beforeinput`	MAIN	No	Similar to Slate
Draft.js (legacy)	React fiber walk	MAIN	No	Need to find ContentState
CodeMirror 6	`beforeinput`	MAIN	Partial	Simpler — no React wrapper typically
Monaco	`editor.setValue()` on the editor instance	MAIN	No	Not contenteditable; uses its own rendering, and is not React-based, so there is no fiber to walk
Plain `contenteditable`	`execCommand('insertText')`	Isolated OK	Yes	Simplest case
Standard `<textarea>`	`.value =` + `dispatchEvent(new Event('input', { bubbles: true }))`	Isolated OK	N/A	No rich text

Key insight: The harder the editor works to maintain its own model (separate from the DOM), the harder it is to automate. Slate and Draft.js are the hardest because they’re React-managed AND model-driven. ProseMirror/Tiptap are easier because they observe DOM mutations. Plain contenteditable is trivial.
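Before picking a strategy from the table, an automation step needs to know which editor it is facing. The sketch below guesses from conventional root markers. `detectEditorKind` and `EDITOR_MARKERS` are hypothetical names, and the class selectors are the defaults these editors usually render; a host page can nest or restyle them, so the result is a first guess to confirm, not a guarantee.

```javascript
// Conventional root markers per editor (assumed defaults, checked in order).
const EDITOR_MARKERS = [
  ["slate", "[data-slate-editor='true']"],
  ["prosemirror", ".ProseMirror"],
  ["codemirror6", ".cm-content"],
  ["draftjs", ".public-DraftEditor-content"],
  ["monaco", ".monaco-editor"],
];

// Hypothetical helper: classify an element by the first marker it matches.
function detectEditorKind(el) {
  for (const [kind, selector] of EDITOR_MARKERS) {
    if (el.matches(selector)) return kind;
  }
  // Framework editors are checked first because most are contenteditable too.
  if (el.isContentEditable) return "contenteditable";
  return el.tagName === "TEXTAREA" ? "textarea" : "unknown";
}
```

The returned kind can then index into a strategy map built from the comparison table, with `"unknown"` treated as a reason to stop rather than to fall back to a default.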

Anti-Patterns

Don’t	Do Instead	Why
Use `execCommand('insertText')` on Slate editors	Dispatch `beforeinput` from MAIN world	`execCommand` bypasses Slate’s event pipeline; text enters DOM but not the model
Dispatch events from isolated world for React apps	Use `chrome.scripting.executeScript({ world: "MAIN" })`	Isolated-world events are invisible to React’s synthetic event system
Call `editor.insertText()` directly via fiber walk	Let `beforeinput` flow through Slate’s handler	Direct API calls may not trigger the host app’s `onChange` state update
Use `el.textContent` to verify Slate content	Query `[data-slate-string]` elements	`textContent` includes the hidden placeholder text
Use `el.focus()` alone before dispatching input	Simulate mousedown/mouseup/click with real coordinates	Slate needs mouse events to initialize its internal selection state
Use CSS selectors like `:has(i.google-symbols)` for icon buttons	Match by icon `textContent` (e.g., `"arrow_forward"`)	CSS matches all icon buttons; text content is unique per icon
Let content script handle MAIN-world operations	Throw an error in content script; handle in service worker	Fail loud rather than silently producing broken behavior
Use `setTimeout` as the only timing mechanism	Verify the expected DOM state after each step	Race conditions vary by machine; verification is deterministic
Assume `focus()` places cursor in content area	Use coordinates to click inside the editor bounds	Focus may land on placeholder div (absolutely positioned overlay)
Use `clickLast` for buttons that appear dynamically	Use semantic selectors (`clickIcon`, `clickText`)	New buttons (clear, settings) change DOM order unpredictably
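The timing row above can be made concrete with a small polling helper that replaces a fixed sleep with "wait until the expected state is observed, up to a limit". This is a minimal sketch; `waitFor` and its option names are hypothetical and not part of the original extension.

```javascript
// Hypothetical polling helper: resolve as soon as `check` returns a truthy
// value, or reject once `timeoutMs` has elapsed. `check` may be sync or async.
async function waitFor(check, { timeoutMs = 2000, intervalMs = 50 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const result = await check();
    if (result) return result;
    if (Date.now() >= deadline) {
      throw new Error(`waitFor: condition not met within ${timeoutMs}ms`);
    }
    await new Promise((r) => setTimeout(r, intervalMs));
  }
}
```

Inside the injected function, the fixed 150 ms pause after the beforeinput dispatch could then become something like `await waitFor(() => el.querySelector("[data-slate-string]"))`, which returns early on fast machines and gives slow ones more time.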

Small Examples

Detecting a Slate editor on the page

function findSlateEditor() {
  return document.querySelector("div[data-slate-editor='true']");
}

function isSlateEditor(el) {
  return el?.getAttribute("data-slate-editor") === "true";
}

Clearing a Slate editor before typing

// In MAIN world
async function clearSlate(selector) {
  const el = document.querySelector(selector);
  if (!el) return false;

  el.focus();

  // Select everything in the focused editor (the Ctrl+A equivalent).
  document.execCommand("selectAll", false);

  // Assumption: give Slate a moment to sync its selection from the DOM.
  await new Promise(r => setTimeout(r, 50));

  // With the whole document selected, one delete-type beforeinput clears it.
  el.dispatchEvent(new InputEvent("beforeinput", {
    inputType: "deleteContentBackward",
    bubbles: true,
    cancelable: true,
  }));

  await new Promise(r => setTimeout(r, 100));

  return isSlateEmpty(el);
}

function isSlateEmpty(el) {
  return el.querySelector("[data-slate-zero-width]") !== null
    && el.querySelector("[data-slate-string]") === null;
}

Reading current Slate content

function getSlateText(editorEl) {
  return Array.from(editorEl.querySelectorAll("[data-slate-string]"))
    .map(n => n.textContent)
    .join("");
}

function getSlateContent(editorEl) {
  // Returns structured content (paragraphs, not just flat text)
  const paragraphs = editorEl.querySelectorAll("[data-slate-node='element']");
  return Array.from(paragraphs).map(p => {
    const strings = p.querySelectorAll("[data-slate-string]");
    return Array.from(strings).map(s => s.textContent).join("");
  });
}

Waiting for Slate to be ready

async function waitForSlate(selector, timeoutMs = 10000) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const el = document.querySelector(selector);
    if (el && el.getAttribute("data-slate-editor") === "true") {
      // Check that Slate has initialized (has child nodes)
      if (el.querySelector("[data-slate-node]")) return el;
    }
    await new Promise(r => setTimeout(r, 200));
  }
  return null;
}

Type and submit as a single operation

// Operator DSL steps for typing into Slate and submitting
const steps = [
  { "op": "waitForSelector", "selector": "div[data-slate-editor='true']" },
  {
    "op": "typeSlate",
    "selector": "div[data-slate-editor='true']",
    "value": "A photorealistic cat in a tiny hat, studio lighting"
  },
  { "op": "sleep", "ms": 300 },
  { "op": "clickIcon", "icon": "arrow_forward" }
];

Handling multi-line input in Slate

// In MAIN world — insert text with line breaks
async function typeMultilineSlate(selector, lines) {
  const el = document.querySelector(selector);
  if (!el) return false;

  // Establish selection
  const rect = el.getBoundingClientRect();
  el.focus();
  el.dispatchEvent(new MouseEvent("mousedown", {
    bubbles: true, cancelable: true, button: 0,
    clientX: rect.left + 10, clientY: rect.top + rect.height / 2,
  }));
  el.dispatchEvent(new MouseEvent("mouseup", {
    bubbles: true, cancelable: true, button: 0,
    clientX: rect.left + 10, clientY: rect.top + rect.height / 2,
  }));
  el.dispatchEvent(new MouseEvent("click", {
    bubbles: true, cancelable: true, button: 0,
    clientX: rect.left + 10, clientY: rect.top + rect.height / 2,
  }));
  await new Promise(r => setTimeout(r, 150));

  for (let i = 0; i < lines.length; i++) {
    // Insert text
    el.dispatchEvent(new InputEvent("beforeinput", {
      inputType: "insertText",
      data: lines[i],
      bubbles: true,
      cancelable: true,
    }));
    await new Promise(r => setTimeout(r, 50));

    // Insert line break (except after last line)
    if (i < lines.length - 1) {
      el.dispatchEvent(new InputEvent("beforeinput", {
        inputType: "insertParagraph",
        bubbles: true,
        cancelable: true,
      }));
      await new Promise(r => setTimeout(r, 50));
    }
  }

  await new Promise(r => setTimeout(r, 150));
  return true;
}
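
typeMultilineSlate returns true without checking the result. One way to verify it, sketched here under the assumption that each line becomes one top-level paragraph element with no nested elements, is to compare the paragraphs read back from the DOM against the lines that were typed. multilineLanded is a hypothetical helper name; the selectors are the same ones getSlateContent uses.

```javascript
// Hypothetical helper: true only if the editor holds exactly the typed lines.
// Assumes a flat document where every [data-slate-node='element'] is one paragraph.
function multilineLanded(editorEl, lines) {
  const paragraphs = Array.from(
    editorEl.querySelectorAll("[data-slate-node='element']")
  ).map(p =>
    Array.from(p.querySelectorAll("[data-slate-string]"))
      .map(s => s.textContent)
      .join("")
  );
  return paragraphs.length === lines.length
    && paragraphs.every((text, i) => text === lines[i]);
}
```

A caller could return multilineLanded(el, lines) at the end of typeMultilineSlate instead of an unconditional true. Editors with nested elements (lists, block quotes) would need a stricter paragraph selector.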

The Architecture: Extension as Step Executor

The typeSlate implementation exists within a larger architecture. Understanding the full picture helps you build your own automation systems.

The three-tier architecture

┌────────────────────┐   poll    ┌──────────────────┐
│   CF Worker        │◄──────────│ Chrome Extension │
│   (Job Queue)      │───step───►│ (Service Worker) │
│                    │           │                  │
│  - operators (DSL) │  report   │  ┌─────────────┐ │
│  - job state       │◄──────────│  │ Content.js  │ │
│  - step cursor     │           │  │ (Isolated)  │ │
│  - delivery        │           │  └──────┬──────┘ │
└────────────────────┘           │         │        │
                                 │  For typeSlate:  │
                                 │  ┌─────────────┐ │
                                 │  │ MAIN world  │ │
                                 │  │ (inline fn) │ │
                                 │  └─────────────┘ │
                                 └──────────────────┘

CF Worker holds operators (JSON step programs) and jobs (step cursor + outputs). Stateless API.
Service Worker polls for the next step, routes to the correct tab, dispatches the step.
Content script (isolated world) executes most steps: click, type, wait, observe.
MAIN world (inline function via executeScript) handles steps that need page JS access: typeSlate.

Why the service worker dispatches MAIN world scripts

Content scripts can’t call chrome.scripting.executeScript. The chrome.scripting API is only exposed to privileged extension contexts, and in this architecture that means the service worker. As a result:

Content script handles: click, type (for non-React inputs), waitForSelector, navigate, observe*, extract*
Service worker handles: typeSlate, navigate (tab URL change), captureTab

The service worker acts as a router. It inspects the step’s op field and decides whether to forward to the content script (via chrome.tabs.sendMessage) or execute directly (via chrome.scripting.executeScript).
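
The routing decision can be sketched as follows. This is a minimal illustration under stated assumptions rather than the extension's actual code: routeStep, dispatchStep, mainWorldFns, and the { type: "runStep", step } message shape are hypothetical names, and only typeSlate is treated as a MAIN world op. The chrome.scripting.executeScript and chrome.tabs.sendMessage calls are the real APIs named above.

```javascript
// Ops that need page JS access and therefore run in the MAIN world.
// Assumption: only typeSlate, as in the lists above.
const MAIN_WORLD_OPS = new Set(["typeSlate"]);

// Pure routing decision, kept separate from the chrome.* calls.
function routeStep(step) {
  return MAIN_WORLD_OPS.has(step.op) ? "main-world" : "content-script";
}

// Hypothetical dispatcher for the service worker.
// mainWorldFns maps an op name to a self-contained function; executeScript
// serializes it, so it cannot close over service worker variables.
async function dispatchStep(tabId, step, mainWorldFns) {
  if (routeStep(step) === "main-world") {
    const [injection] = await chrome.scripting.executeScript({
      target: { tabId },
      world: "MAIN",
      func: mainWorldFns[step.op],
      args: [step.selector, step.value],
    });
    return injection.result;
  }
  // The message shape is an assumption; use whatever the content script expects.
  return chrome.tabs.sendMessage(tabId, { type: "runStep", step });
}
```

Tab-level ops such as captureTab are left out of this sketch; they would be a third branch handled inside the service worker itself. Keeping routeStep free of chrome.* calls makes the routing table easy to test outside the browser.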

Keeping the service worker alive

Chrome MV3 terminates idle service workers after ~30 seconds. Long-running steps (video generation, waiting for assets) would cause the service worker to die mid-execution. The solution is a heartbeat:

// Heartbeat keeps SW alive during long content script steps
async function runStepWithHeartbeat(tabId, step) {
  const heartbeat = setInterval(() => {
    chrome.storage.local.set({ __heartbeat: Date.now() });
  }, 5000);

  try {
    return await sendStepToContentScript(tabId, step);
  } finally {
    // Always stop the heartbeat, whether the step succeeded or threw
    clearInterval(heartbeat);
  }
}

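Each write to chrome.storage.local is an extension API call, which counts as activity, resets the service worker’s idle timer, and prevents termination while the step is running.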

Debugging Checklist

When your Slate automation doesn’t work, check these in order:

Is the event dispatched from MAIN world? If you’re in a content script (isolated world), events won’t reach React. Use chrome.scripting.executeScript({ world: "MAIN" }).
Is Slate’s selection initialized? Check that you dispatched mousedown/mouseup/click with real coordinates before beforeinput. Without a Slate selection, text insertion is a no-op.
Are the coordinates inside the content area? Use getBoundingClientRect() on the [data-slate-editor] element. Click at left + 10, top + height/2 — not the center, which might hit the placeholder overlay on some layouts.
Is the timing sufficient? The 150ms delays between mouse events and beforeinput aren’t arbitrary. Slate’s event handlers are asynchronous (they schedule React state updates). Too fast and the selection isn’t established yet.
Is the text in [data-slate-string] elements? Don’t check el.textContent. Query for [data-slate-string] nodes specifically.
Does the host app’s state update? Some applications wrap Slate’s onChange in debounce or startTransition. If the submit happens within the debounce window, the app state may be stale. Add extra delay before clicking submit.
Are you clicking the right button? Use icon text matching (clickIcon), not position-based selectors (clickLast).
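
The checks that can be read straight from the DOM can be bundled into one diagnostic, sketched below. diagnoseSlate is a hypothetical helper, not part of Slate or of the extension described here. It treats a DOM selection inside the editor as a proxy for Slate's own selection, which is an assumption: the two usually agree, but Slate's internal state is not directly visible from the DOM.

```javascript
// Hypothetical diagnostic: report what the DOM says about a Slate editor.
// Pass the editor element and the current Selection object.
function diagnoseSlate(editorEl, selection) {
  const isEditor = editorEl?.getAttribute("data-slate-editor") === "true";
  const strings = isEditor
    ? Array.from(editorEl.querySelectorAll("[data-slate-string]"))
    : [];
  const anchor = selection ? selection.anchorNode : null;
  return {
    // Is this element a Slate editor at all?
    isEditor,
    // Does the DOM selection sit inside the editor? (proxy for Slate's selection)
    selectionInside: Boolean(isEditor && anchor && editorEl.contains(anchor)),
    // Text from [data-slate-string] nodes only, which excludes the placeholder
    modelText: strings.map(n => n.textContent).join(""),
  };
}
```

From the page console it could be called as diagnoseSlate(document.querySelector("div[data-slate-editor='true']"), window.getSelection()). An empty modelText after typing points at the MAIN world or selection items; correct modelText with a failed submit points at the host app's state update or the button match.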

References

Chrome Extension APIs

Content Scripts — Chrome for Developers — Official documentation on isolated worlds and content script capabilities
chrome.scripting API — Chrome for Developers — executeScript with world: "MAIN" documentation
scripting.ExecutionWorld — MDN — Cross-browser specification for execution worlds
Event listeners in Content Scripts and communication between main and isolated CS worlds — w3c/webextensions#241 — W3C discussion on event propagation across worlds
Manifest V3 Content Scripts — Chrome for Developers — Manifest declaration for content scripts

Slate.js

Slate.js Documentation — Official documentation
Editable Component — Slate — Props including onDOMBeforeInput
Event Handling — Slate — How Slate processes DOM events
Migrating — Slate — The shift to beforeinput-based architecture
slate/packages/slate-react/src/components/editable.tsx — Source code for the Editable component’s event handlers
How to change Editor state from outside of react with DOM manipulation? — slate#5003 — Community discussion on external manipulation
Slate Examples — Custom Placeholder — Placeholder rendering behavior

Web Standards

Input Events Level 2 — W3C — The beforeinput event specification
ReactEditor — Slate — Bridge between React and Slate’s model

How to Force React to Fire Events Using Injected JavaScript in Chrome Extensions — javaspring.net — General techniques for React event dispatch
How to access React Props from Chrome Extension — TechKranti — React fiber tree walking from extensions
Update React State from Extension — chromium-extensions — Chromium team guidance on state manipulation

Extension Development Frameworks

Plasmo — Content Scripts — Framework-level handling of content script worlds
WXT — Content Scripts — Alternative framework with MAIN world support
CRXJS — MAIN world issues — Known issues with MAIN world in build tools

Other Rich Text Editors

ProseMirror Guide — Input Rules — How ProseMirror handles input (MutationObserver-based)
Building a rich text editor in React with SlateJS — Kitemaker — Practical Slate.js implementation details