Gary Wu

The Autonomous Entity Pattern

Every complex domain is a hierarchy of entities. Each entity needs a persistent agent with memory, decision-making, and the ability to delegate work down and escalate failures up. Successful fixes crystallize into skills that make the whole hierarchy more capable over time. This is not a software architecture — it is a universal pattern for autonomous operation at scale.


The Shape of Every Complex Domain

Look at any domain where work needs to happen continuously, at scale, without constant human oversight. You will find the same structure:

Software organization:

Company
  Repository × N
    File / Module
      Issue / PR

Brand and content:

Company
  Brand × N
    Platform (YouTube, Twitter, LinkedIn)
      Content piece
        Asset (thumbnail, hook, caption)

Book publishing:

Book
  Part / Section
    Chapter
      Section
        Paragraph

Healthcare:

Hospital
  Department
    Doctor / Care team
      Patient case
        Treatment plan

Business operations:

Company
  Department × N
    Team
      Task / Project
        Subtask

Every one of these is a hierarchy of entities. Every entity has:

The universal problem: how do you keep all these entities moving toward their goals, continuously, without a human starting every conversation?


The Three-Part Answer

1. Persistent Agents at Every Level

Each entity in the hierarchy gets a persistent agent — not a stateless function that runs and exits, but a long-lived process with its own memory, its own schedule, and the ability to act without being prompted.

Built on Cloudflare Durable Objects, each agent is:

OrgPrime (Durable Object)
  ↕ coordinates with
  RepoPrime × N (Durable Object per repo)
    ↕ delegates to
    Dispatcher (CF Worker)
      ↕ routes to
      Runners (CF, CI, Local)

The higher agents think. The lower agents execute. Neither does both.
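
As a rough illustration of "long-lived, with its own memory and schedule", here is a minimal in-memory sketch. It does not use the Durable Objects API; the class `PersistentAgent`, its fields, and the `tick` method are hypothetical names chosen for this example, and a real agent would persist its memory to storage and be woken by an alarm rather than by a caller passing in the time.

```typescript
// Hypothetical sketch: none of these names come from the Durable Objects API.
interface Job { type: string; target: string }

interface AgentOptions {
  name: string;
  intervalMs: number;
  detect: () => Job[];        // "think": decide what work exists
  submit: (job: Job) => void; // hand work to the layer below
  startAt: number;
}

class PersistentAgent {
  // Memory survives across wake-ups (in-process only, in this sketch).
  private attempts: string[] = [];
  private nextWakeAt: number;
  private opts: AgentOptions;

  constructor(opts: AgentOptions) {
    this.opts = opts;
    this.nextWakeAt = opts.startAt;
  }

  // Called by a scheduler with the current time; no human prompt involved.
  tick(now: number): boolean {
    if (now < this.nextWakeAt) return false; // not due yet
    for (const job of this.opts.detect()) {
      this.attempts.push(`${this.opts.name}:${job.type}:${job.target}`);
      this.opts.submit(job);
    }
    this.nextWakeAt = now + this.opts.intervalMs; // schedule its own next wake-up
    return true;
  }

  history(): readonly string[] {
    return this.attempts;
  }
}

const queue: Job[] = [];
const repoAgent = new PersistentAgent({
  name: "RepoPrime",
  intervalMs: 60_000,
  detect: () => [{ type: "add-biome", target: "garywu/example" }],
  submit: (job) => { queue.push(job); },
  startAt: 0,
});
repoAgent.tick(0);      // due: detects work and submits one job
repoAgent.tick(30_000); // not due yet: does nothing
```

The agent only decides and delegates; the job lands in a queue for something else to execute, which mirrors the think/execute split described above.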

2. Escalation with Accumulated Context

Work starts at the lowest possible level — the cheapest, fastest executor. When it fails, it escalates. But escalation is not “retry with a better model.” It is a structured protocol:

Executor Level 0 fails
  → Advisor Level 1 diagnoses: "Here's what went wrong, here's what to try"
  → Re-queue Level 0 with new instructions
  → Level 0 retries with enriched context
  → Still fails?
  → Advisor Level 2 diagnoses: reads Level 0 failures + Level 1 advice
  → Produces better instructions, accounting for why Level 1's advice was insufficient
  → ...
  → Human (terminal: Telegram alert, full context attached)

The accumulated failure context travels with the work item:

{
  escalation_history: [
    { level: 0, label: "Static template", error: "..." },
    { level: 0, label: "Level 0 retry with L1 advice", error: "..." }
  ],
  escalation_advice: [
    { from_level: 1, instructions: "...", reasoning: "..." },
    { from_level: 2, instructions: "...", reasoning: "..." }
  ]
}

Higher advisors see the complete failure narrative, not just the last error. By the time a human sees it, they have everything they need to understand what was tried and why it failed.
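
The protocol above can be sketched as a small loop. This is an assumption-laden illustration, not the dispatcher's actual code: `runWithEscalation`, `Executor`, `Advisor`, and `Outcome` are hypothetical names, and the functions are synchronous for brevity where a real system would be asynchronous. Only the field names `escalation_history` and `escalation_advice` are taken from the work item shown above.

```typescript
// Illustrative types; only the two escalation_* field names come from the text.
interface Failure { level: number; label: string; error: string }
interface Advice { from_level: number; instructions: string; reasoning: string }
interface WorkItem {
  job_type: string;
  escalation_history: Failure[];
  escalation_advice: Advice[];
}

type ExecResult = { ok: true } | { ok: false; error: string };
// Level 0: the cheap executor. It sees all advice gathered so far.
type Executor = (item: WorkItem) => ExecResult;
// Level N advisor: reads the full narrative and returns instructions only.
type Advisor = (item: WorkItem) => { instructions: string; reasoning: string };
type Outcome = { status: "done" | "needs-human"; item: WorkItem };

function runWithEscalation(item: WorkItem, execute: Executor, advisors: Advisor[]): Outcome {
  for (let level = 0; level <= advisors.length; level++) {
    const result = execute(item);
    if (result.ok) return { status: "done", item };

    // Record the failure so the next advisor sees the whole story.
    const label = level === 0 ? "Static template" : `Level 0 retry with L${level} advice`;
    item.escalation_history.push({ level: 0, label, error: result.error });

    const advisor = advisors[level];
    if (advisor === undefined) break; // advisors exhausted
    item.escalation_advice.push({ from_level: level + 1, ...advisor(item) });
  }
  // Terminal step: hand the item, with full context, to a person.
  return { status: "needs-human", item };
}

// Toy executor: succeeds only once some advice mentions "packages/*".
const execute: Executor = (w) =>
  w.escalation_advice.some((a) => a.instructions.includes("packages/*"))
    ? { ok: true }
    : { ok: false, error: "biome.json extends path wrong for monorepo" };

const level1: Advisor = () => ({
  instructions: "Add packages/* override to biome.json extends",
  reasoning: "Repo is a monorepo",
});

const outcome = runWithEscalation(
  { job_type: "add-biome", escalation_history: [], escalation_advice: [] },
  execute,
  [level1],
);
```

Note that the advisor never executes anything. It only appends instructions, and the same cheap executor runs again with a richer work item.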

The key insight: higher-tier models justify their cost through diagnosis and reasoning, not through better mechanical execution. Use them for diagnosis and cheap models for execution. The expensive model’s output is instructions, not implementation.

3. Skill Crystallization

When an advisor’s instructions lead to a successful fix, that knowledge should not be discarded. It gets written to a skill registry:

await writeSkill({
  pattern: {
    entity_type: "repo",
    signals: ["monorepo", "typescript", "packages/*"],
    job_type: "add-biome"
  },
  instructions: "Use root biome.json with packages/* overrides...",
  source: "escalation-l1",
  confidence: 1.0
})

The next time an executor handles the same job type on a similar entity, it loads matching skills and injects them into its prompt. The executor succeeds on the first attempt, without escalation — because the advisor’s insight has become permanent operational knowledge.
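
One way the lookup and injection might work is sketched below. The matching rule is an assumption made for this example (same entity type, same job type, and every signal named by the skill present on the entity), and `matchSkills` and `buildPrompt` are hypothetical helpers, not functions from the registry itself.

```typescript
interface SkillPattern { entity_type: string; signals: string[]; job_type: string }
interface Skill { pattern: SkillPattern; instructions: string; source: string; confidence: number }
interface Job { entity_type: string; job_type: string; signals: string[] }

// Assumed rule: a skill applies when its entity type and job type match
// and every signal it names is present on the job's entity.
function matchSkills(registry: Skill[], job: Job): Skill[] {
  return registry
    .filter((s) =>
      s.pattern.entity_type === job.entity_type &&
      s.pattern.job_type === job.job_type &&
      s.pattern.signals.every((sig) => job.signals.includes(sig)))
    .sort((a, b) => b.confidence - a.confidence); // highest confidence first
}

// Append matching skills to the executor's prompt as plain instructions.
function buildPrompt(basePrompt: string, skills: Skill[]): string {
  if (skills.length === 0) return basePrompt;
  const lines = skills.map((s) => `- ${s.instructions}`);
  return `${basePrompt}\n\nKnown fixes for similar entities:\n${lines.join("\n")}`;
}

const registry: Skill[] = [
  {
    pattern: { entity_type: "repo", signals: ["monorepo", "typescript", "packages/*"], job_type: "add-biome" },
    instructions: "Use root biome.json with packages/* overrides...",
    source: "escalation-l1",
    confidence: 1.0,
  },
];

const job: Job = {
  entity_type: "repo",
  job_type: "add-biome",
  signals: ["typescript", "monorepo", "packages/*", "pnpm"],
};
const prompt = buildPrompt("Add a Biome config to this repo.", matchSkills(registry, job));
```

An entity may carry extra signals (here `pnpm`) without blocking a match; only the signals the skill names have to be present.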

The compound effect:

Week   L0 success rate   Escalation cost   Throughput
1      40%               Baseline
4      65%               −35%              1.8×
12     85%               −70%              3.2×
52     95%+              −90%              10×+

The system gets cheaper and more capable simultaneously — not because models improved, but because accumulated expertise is being reused.


The Skill Inheritance Tree

Skills are not flat. They inherit across the hierarchy, from universal to specific:

Universal skills        (100% reuse — applies everywhere)
  Vertical skills       (80% reuse — e.g. SaaS vs e-commerce)
    Category skills     (70% reuse — e.g. developer tools vs B2B)
      Niche skills      (60% reuse — specific audience patterns)
        Entity skills   (20% unique — crystallized from this entity's own history)

When an executor loads skills for a task, it loads from all levels of this tree:

A skill that works for all TypeScript monorepos lives at the “category” level. A skill discovered for one specific repo’s quirky config lives at the “entity” level. Both are loaded, ranked by confidence, injected into the executor’s prompt.

The reuse percentages are not fixed — they emerge from the data. If a skill written at the “niche” level gets applied successfully to 50 entities across niches, it migrates up to the “category” level. Skills earn their place in the hierarchy through demonstrated utility.
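
A possible shape for loading and promotion is sketched here. Everything in it is an assumption for illustration: the `TreeSkill` fields, the `Lineage` lookup, and the `maybePromote` rule. In particular, the promotion check only counts distinct entities and does not verify that they span several niches, which a real implementation would likely need to do.

```typescript
const LEVELS = ["universal", "vertical", "category", "niche", "entity"] as const;
type Level = (typeof LEVELS)[number];

interface TreeSkill {
  id: string;
  level: Level;
  scope: string;         // which vertical/category/niche/entity owns it; "*" for universal
  instructions: string;
  confidence: number;
  appliedTo: Set<string>; // ids of entities where the skill was applied successfully
}

// An entity's position in the tree: one scope key per level.
type Lineage = Record<Level, string>;

// Load every skill on the entity's path from root to leaf, ranked by confidence.
function loadSkills(all: TreeSkill[], lineage: Lineage): TreeSkill[] {
  return all
    .filter((s) => lineage[s.level] === s.scope)
    .sort((a, b) => b.confidence - a.confidence);
}

const PROMOTE_AFTER = 50; // threshold taken from the example in the text

// Move a skill one level up once it has proven itself on enough entities.
function maybePromote(skill: TreeSkill, parentScope: string): TreeSkill {
  const idx = LEVELS.indexOf(skill.level);
  const parent = idx > 0 ? LEVELS[idx - 1] : undefined;
  if (parent === undefined || skill.appliedTo.size < PROMOTE_AFTER) return skill;
  return { ...skill, level: parent, scope: parentScope };
}

const lineage: Lineage = {
  universal: "*",
  vertical: "saas",
  category: "ts-monorepo",
  niche: "dev-tools",
  entity: "garywu/example",
};

const mk = (id: string, level: Level, scope: string, confidence: number, uses = 0): TreeSkill => ({
  id, level, scope, confidence, instructions: `skill ${id}`,
  appliedTo: new Set(Array.from({ length: uses }, (_, i) => `entity-${i}`)),
});

const allSkills = [
  mk("u", "universal", "*", 0.9),
  mk("c", "category", "ts-monorepo", 1.0),
  mk("n", "niche", "some-other-niche", 1.0), // off this entity's path: not loaded
  mk("e", "entity", "garywu/example", 0.5),
];
const loadedIds = loadSkills(allSkills, lineage).map((s) => s.id);
const promoted = maybePromote(mk("p", "niche", "dev-tools", 0.8, 50), "ts-monorepo");
```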


Applied to Brand and Content

The brand system is a direct instance of this pattern. Every brand is an entity with its own Durable Object (DO), its own state, and its own skill registry:

CompanyPrimeDO          ← brand strategy, voice, overall goals
  BrandPrimeDO × N      ← ICP, positioning, competitors, brand-specific skills
    PlatformPrimeDO     ← per platform (YouTube, Twitter, LinkedIn, Instagram)
      ContentPrimeDO    ← per content piece: draft → publish → performance
        AssetPrimeDO    ← thumbnail, hook, caption variants

Content creation as a job:

ContentPrimeDO wakes on schedule
  → State: "weekly video not yet drafted"
  → Submits job: type="draft-script", platform="youtube", brand="example-brand"

Dispatcher routes to executor:
  Level 0: load brand skills + platform skills → generate script from template
  → Performance: 40k views (target: 100k)
  → Escalate for advice

  Level 1 advisor reads: script template + performance data
  → "Hook was question-format; this audience responds to statement hooks.
     Also, intro is 45s — cut to 20s for retention."
  → Re-queue Level 0 with new instructions

  Level 0 retries: statement hook, 20s intro
  → Performance: 95k views ✓

  Skill crystallized:
    { pattern: { brand: "example-brand", platform: "youtube", audience: "finance" },
      instructions: "Use statement hooks. Keep intro under 20s.",
      source: "escalation-l1", confidence: 1.0 }

The brand’s content quality improves automatically. The skill registry fills with crystallized knowledge about what works for each brand’s specific audience. Over time, Level 0 executes correctly on the first attempt because it has months of accumulated expertise in its prompt.

The brand skill hierarchy:

Universal content skills    (hooks, CTAs, pacing that work everywhere)
  Platform skills           (YouTube: retention curves, thumbnail CTR patterns)
    Vertical skills         (finance content: authority signals, compliance language)
      Brand skills          (this brand: statement hooks, 20s intros, blue thumbnails)

When a competitor succeeds with a new format, the “vertical” skill is updated. The brand’s executor inherits it immediately. When this brand’s audience develops a preference for longer deep-dives, a brand-specific skill captures it without overriding the universal guidance.


Applied to Books

A book is a hierarchy of entities where quality at each level depends on quality at the levels below:

BookPrimeDO             ← thesis, audience, voice, overall arc
  PartPrimeDO           ← narrative arc, dependencies on other parts
    ChapterPrimeDO      ← chapter goal, argument, word target
      SectionPrimeDO    ← claim, evidence, draft state

Chapter drafting as a job:

ChapterPrimeDO: "Chapter 3 not yet drafted"
  → Job: type="draft-chapter", context={book_thesis, chapter_goal, prior_chapters}

  Level 0: outline template → section-by-section draft
  → Human review: "Chapter lacks concrete examples; too abstract"
  → Escalate

  Level 1 advisor: "Abstract argument needs 2-3 concrete case studies before the framework
    is introduced. Reader needs to feel the problem before accepting the solution."
  → Re-queue Level 0 with new instructions

  Level 0 retries: opens with case studies, then introduces framework
  → Human review: approved ✓

  Skill crystallized:
    { pattern: { book_type: "business-framework", chapter_type: "framework-intro" },
      instructions: "Lead with 2-3 case studies. Introduce framework after reader feels the problem.",
      source: "escalation-l1" }

The next chapter that introduces a new framework automatically loads this skill. The author’s structural insight — earned from one difficult chapter — propagates to every subsequent chapter and every future book with the same pattern.

The author’s voice guide is not written by the author — it is crystallized from every edit the editor (advisor) ever made:

Book-level skills:      author's voice, sentence rhythm, transition patterns
Chapter-level skills:   chapter structure patterns that worked for this book
Section-level skills:   argument patterns this audience responds to

By book three, Level-0 drafts chapters that read like the author. The editor’s role shifts from structural correction to fine-tuning.


Applied to Software Organizations

This is where the architecture originated. The mulan dispatcher system implements exactly this pattern:

OrgPrimeDO (garywu/brain)
  RepoPrimeDO × 41 repos
    Dispatcher (CF Worker)
      CF Runner (github-api jobs)
      CI Runner (shell jobs via GitHub Actions)
      Local Runner (Claude SDK jobs)

Job lifecycle:

RepoPrimeDO scans repo → detects "missing biome.json"
  → submits job: type="add-biome", repo="garywu/example"

Dispatcher routes to CF Runner:
  Level 0: static template → creates biome.json
  → Next CI run fails: "biome.json extends path wrong for monorepo"
  → Escalate

  Level 1 (CF Workers AI): diagnoses → "monorepo needs packages/* in extends"
  → Retry Level 0 with new instructions → succeeds

  Skill crystallized:
    { pattern: { job_type: "add-biome", signals: ["monorepo"] },
      instructions: "Add packages/* override to biome.json extends",
      source: "escalation-l1" }

Next monorepo: Level 0 loads skill → first attempt succeeds

The skill registry is the accumulation of every quirk, every edge case, every non-standard configuration encountered across 41 repos. OpenClaw, a leading agent framework, has 5,400+ skills in its registry. This architecture produces a registry of that kind automatically, from real operations.


The Universal Interface

Every entity in the hierarchy, regardless of domain, exposes the same interface:

interface EntityPrime {
  // Perception: what is the current state of this entity?
  scan(): Promise<EntityState>

  // Decision: what needs to change?
  evaluate(state: EntityState): Promise<WorkItem[]>

  // Delegation: submit work to the layer below
  delegate(items: WorkItem[]): Promise<void>

  // Memory: what has been tried, what worked
  history(): Promise<Attempt[]>
  skills(): Promise<Skill[]>

  // Escalation: receive results, learn, crystallize
  onComplete(result: WorkResult): Promise<void>
}

The domain changes (repos, brands, chapters, patients). The interface does not.

This is why the architecture is worth abstracting into a framework. You build it once — the DO hierarchy, the escalation protocol, the skill registry, the inheritance tree. Then you instantiate it for any domain by providing:

  1. The entity definition (what fields describe this entity)
  2. The signal definitions (what conditions indicate work is needed)
  3. The job types (what kinds of work can be done)
  4. The executor prompts (how to describe the work to an LLM)

The learning loop, escalation protocol, skill crystallization, and DO infrastructure are shared across all instances.
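
To make the four inputs concrete, here is one hedged way a domain could be declared. `DomainDefinition`, `SignalDef`, and the free function `evaluate` are hypothetical names for this sketch, and the code is synchronous for brevity, whereas the `EntityPrime` interface above returns promises. The point is only that the shared code never changes between domains.

```typescript
interface WorkItem { job_type: string; prompt: string }

interface SignalDef<E> {
  job_type: string;               // 3. the kind of work this signal triggers
  detect: (entity: E) => boolean; // 2. the condition indicating work is needed
  prompt: (entity: E) => string;  // 4. how to describe the work to an LLM
}

interface DomainDefinition<E> {
  name: string;
  signals: SignalDef<E>[];
}

// Shared across domains: turn an entity's current state into work items.
function evaluate<E>(domain: DomainDefinition<E>, entity: E): WorkItem[] {
  return domain.signals
    .filter((s) => s.detect(entity))
    .map((s) => ({ job_type: s.job_type, prompt: s.prompt(entity) }));
}

// 1. Entity definitions for two unrelated domains.
interface Repo { name: string; files: string[] }
interface Chapter { number: number; drafted: boolean }

const repoDomain: DomainDefinition<Repo> = {
  name: "software",
  signals: [{
    job_type: "add-biome",
    detect: (r) => !r.files.includes("biome.json"),
    prompt: (r) => `Add a biome.json to ${r.name}.`,
  }],
};

const bookDomain: DomainDefinition<Chapter> = {
  name: "book",
  signals: [{
    job_type: "draft-chapter",
    detect: (c) => !c.drafted,
    prompt: (c) => `Draft chapter ${c.number}.`,
  }],
};

const repoWork = evaluate(repoDomain, { name: "garywu/example", files: ["package.json"] });
const bookWork = evaluate(bookDomain, { number: 3, drafted: false });
```

Both calls go through the same `evaluate` function; only the declarations differ.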


The Implementation Stack

Layer               Technology                                 Purpose
Persistent agents   Cloudflare Durable Objects                 One DO per entity, SQLite memory, self-scheduling alarms
Skill registry      D1 database                                Shared across all DOs, queryable by pattern
Model routing       API MOM                                    Centralized escalation, billing, model selection
Executor runtime    CF Workers, GitHub Actions, Local runner   Domain-specific execution
Event propagation   Webhooks, DO alarms                        Entities wake on state change, not polling

API MOM is the critical centralization layer. Agents do not know which model executes their jobs — they declare an escalation level and let API MOM route to the right model/provider (CF Workers AI, OpenRouter, Claude, GPT-4, local subscription). API MOM tracks cost per job, per level, per skill — so you can see exactly what the learning loop is worth financially.
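
The routing idea can be illustrated with a small sketch. This is not API MOM's interface: `ModelRouter`, `Route`, the provider and model names, and the cost units are all placeholders invented for the example. It shows only the contract described above, where the caller names an escalation level and the router picks the model and records the cost.

```typescript
// Placeholder types and values; not API MOM's actual interface.
interface Route { provider: string; model: string; costPerCall: number }
interface LedgerEntry { jobId: string; level: number; skillId: string | null; cost: number }

class ModelRouter {
  private ledger: LedgerEntry[] = [];
  private routes: Map<number, Route>;

  constructor(routes: Map<number, Route>) {
    this.routes = routes;
  }

  // The agent declares a level; the router chooses the model and logs the cost.
  route(jobId: string, level: number, skillId: string | null = null): Route {
    const chosen = this.routes.get(level);
    if (chosen === undefined) throw new Error(`no route for escalation level ${level}`);
    this.ledger.push({ jobId, level, skillId, cost: chosen.costPerCall });
    return chosen;
  }

  // Total spend per escalation level, for judging what the learning loop saves.
  costByLevel(): Map<number, number> {
    const totals = new Map<number, number>();
    for (const entry of this.ledger) {
      totals.set(entry.level, (totals.get(entry.level) ?? 0) + entry.cost);
    }
    return totals;
  }
}

const router = new ModelRouter(new Map([
  [0, { provider: "provider-a", model: "small-model", costPerCall: 1 }],
  [1, { provider: "provider-b", model: "large-model", costPerCall: 20 }],
]));
router.route("job-1", 0);             // first attempt on the cheap model
router.route("job-1", 1);             // escalated for diagnosis
router.route("job-2", 0, "skill-42"); // later job, helped by a crystallized skill
const totals = router.costByLevel();
```

Because agents only ever pass a level, swapping the model behind level 1 is a change to the route table and nothing else.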


What Changes When You Apply This

Without this pattern:

With this pattern:

The core shift: Failures are not costs. They are the mechanism by which the system learns. A system that has never failed has never learned anything. A system that fails regularly and crystallizes every fix becomes, over time, more capable than any individual expert — because it has encountered and solved more edge cases than any individual could accumulate in a career.



Reference Implementation

