Prime: A Conversational Control Plane

The human talks to the business. The business talks to the code. Every conversation makes the system smarter.

Most autonomous agent frameworks solve the wrong problem. They make a single agent that can code faster. The actual problem is different: a human runs a business with 40 repos, dozens of brands, continuous drift, and no time to start individual coding sessions for every repo that needs work. The human needs to talk at the business level — “Income Coach isn’t converting” — and have that translate into concrete code changes across multiple repos, automatically, with memory of what was tried and why.

This article describes Prime — a hierarchical control plane where persistent AI agents manage an organization autonomously. The human talks to the org-level agent about business priorities. The org agent decomposes intent into repo-level tasks. Per-repo agents execute, learn, and report back. The system gets smarter with every cycle.

Built on Cloudflare Agents SDK, Durable Objects, and a Napkin-inspired memory system that uses the best model for retrieval — not a separate embedding model.

Why This Exists
The Vision: Business Language In, Code Changes Out
Architecture Overview
- The Three Separations
The Agent Hierarchy
The Conversation Interface
Memory: Napkin-Inspired Progressive Disclosure
The Wake Cycle
- The Single LLM Call Principle
Skill System: SKILL.md
- Skill Lifecycle
- Skill Storage
Auto-Learning: Knowledge Distillation
The Run Sheet: Control Plane to Data Plane
- Run Sheet vs Current Policy Rules
The Dispatcher: Dumb Resource Manager
Competitive Position
Implementation: Current State and Migration
- What’s Built (garywu/mulan)
- Migration Path
Schema and Data Model
- D1 (Shared, Org-Wide) — Unchanged
- DO SQLite (Private, Per-Agent) — New
References

Why This Exists

The Conversation Gap

Every agent framework in the 18-framework survey solves some version of “human tells agent what to code.” OpenClaw connects to 20+ chat platforms. Claude Agent SDK gives you terminal access. CrewAI orchestrates teams. LangGraph checkpoints multi-step workflows.

None of them solve: human talks about business → system produces code changes across multiple repos.

The gap is not in execution capability. Modern agents can write code, fix CI, add configurations, create PRs. The gap is in decomposition — translating business intent into concrete technical work across a portfolio of repos, with memory of what was tried, awareness of what’s blocked, and intelligence to know when NOT to act.

The Architecture Gap

OpenClaw has 302K stars and 5,400+ skills. It is also a single-agent system that requires a running machine, forgets everything between sessions (unless you manually maintain MEMORY.md), and has no concept of organizational hierarchy.

The Cloudflare Agents SDK provides the missing primitive: Durable Objects as persistent agents. A DO has its own SQLite database, keeps its state indefinitely, incurs no compute charges while hibernating (storage is still billed), wakes on events such as alarms, HTTP requests, and WebSocket messages, and is always addressable by name. This is the container that makes always-on agents economically viable.

But the Agents SDK is infrastructure, not architecture. It gives you the building blocks. This article describes what you build with them.

What Changes

Before (current)	After (Prime)
Human starts a Claude Code session per repo	Human talks to OrgPrime about business priorities
Agent forgets when session ends	Agents have persistent memory across all sessions
One repo at a time	40 repos managed simultaneously
Manual task decomposition	Business intent auto-decomposes into repo tasks
Dispatcher does everything (scan, evaluate, dispatch)	Control plane thinks, data plane executes
No conversation history	Every decision recorded, auditable, learnable
System is idle unless human is talking	System is always on, always aware

The Vision: Business Language In, Code Changes Out

Human: "Income Coach isn't converting. The onboarding flow is
        too long and the value prop isn't clear on the landing page."

OrgPrime reasons:
  - This is a business-level concern spanning multiple repos
  - income-coach repo: onboarding flow is code
  - brand-systems repo: value prop is brand positioning
  - frontasy repo: landing page renders the positioning

OrgPrime decomposes:
  1. garywu/brand-systems → create issue: "sharpen Income Coach
     value prop — current messaging doesn't communicate immediate
     value to first-time visitors"
  2. garywu/income-coach → create issue: "simplify onboarding —
     reduce steps from 5 to 2, defer profile completion to after
     first value delivery"
  3. garywu/frontasy → create issue: "update Income Coach landing
     page copy to match new positioning" (blocked on #1)

Each RepoPrime:
  - Reads its CLAUDE.md for repo context
  - Reads the issue OrgPrime created
  - Reasons about implementation approach
  - Submits jobs to the dispatcher
  - Reports outcomes back to OrgPrime

OrgPrime tracks:
  - All three repos progressing toward the same business goal
  - Dependencies (frontasy blocked on brand-systems)
  - Whether the business intent was actually addressed

This is not a hypothetical. Every piece of this — DO hierarchy, job dispatch, GitHub issues as memory, signal scanning, CI-aware merge — is either built or partially built in garywu/mulan. What’s missing is the intelligence layer: the LLM reasoning that converts business language into technical decomposition.

Architecture Overview

Human (WebSocket / Telegram / CLI)
  │
  ▼
OrgPrime DO (one per GitHub org)
  │  ├── Persistent SQLite memory
  │  ├── Conversation history
  │  ├── Business context + priorities
  │  ├── Cross-repo awareness
  │  └── Decomposes intent → repo tasks
  │
  ├── RepoPrime DO × N (one per repo)
  │     ├── Persistent SQLite memory
  │     ├── CLAUDE.md identity
  │     ├── Signal awareness (CI, standards, PRs)
  │     ├── Attempt history
  │     ├── Skill registry
  │     └── Reasons about repo-specific work
  │
  ▼
Dispatcher CF Worker (one per org)
  │  ├── Reads run sheet from OrgPrime
  │  ├── Manages runner capacity
  │  ├── Deduplicates jobs
  │  ├── Records every attempt
  │  └── Reports outcomes to RepoPrimes
  │
  ├── CF Runner (inline, free)
  ├── CI Runner (GitHub Actions, deterministic)
  └── Local Runner (Claude Agent SDK, AI-driven)

The Three Separations

1. Control Plane vs Data Plane

The control plane (OrgPrime + RepoPrimes) thinks. The data plane (Dispatcher + Runners) executes. Neither does both. This separation means:

A bug in dispatch logic doesn’t affect reasoning
A bug in reasoning doesn’t break job execution
Each layer can be improved independently

2. Business Level vs Code Level

OrgPrime speaks business language. RepoPrimes speak code language. The translation happens at the OrgPrime → RepoPrime boundary via GitHub issues. Issues are the API between business intent and technical execution.

3. Memory vs Execution

Memory lives in DO SQLite (private to each agent) and GitHub issues (shared, auditable). Execution lives in the dispatcher and runners. Memory survives indefinitely. Execution is ephemeral — jobs start, run, complete, and their outcomes feed back into memory.

The Agent Hierarchy

OrgPrime DO — The Business Agent

One per GitHub organization. The human’s primary interface.

Knows:

All repos in the portfolio and their current state
Business priorities and strategic context
Cross-repo dependencies and patterns
Conversation history with the human
Which RepoPrimes are active, blocked, or idle

Does:

Receives business-level input from human
Decomposes into repo-level tasks (via GitHub issues)
Produces the run sheet (prioritized work backlog)
Coordinates cross-repo work
Escalates to human when something needs judgment
Tracks whether business intent was actually delivered

Does NOT:

Execute any code
Manage runners or capacity
Make repo-specific technical decisions (delegated to RepoPrimes)

Wake triggers:

Human conversation (WebSocket)
RepoPrime escalation (DO RPC)
Scheduled alarm (daily review)
Dispatcher flag (job failed 3x)

RepoPrime DO — The Repo Agent

One per managed repository. Autonomous within its scope.

Knows:

This repo’s CLAUDE.md (identity, rules, constraints)
Current signal states (CI, biome, commitlint, etc.)
Open issues and PRs
Attempt history (what was tried, what failed, why)
Skills applicable to this repo

Does:

Reads issues created by OrgPrime or humans
Reasons about implementation approach (one LLM call)
Submits jobs to dispatcher
Comments on GitHub issues (records decisions and outcomes)
Flags OrgPrime when something needs cross-repo attention
Distills knowledge from completed jobs into memory

Does NOT:

Execute code directly (delegates to runners via dispatcher)
Make business-level decisions (defers to OrgPrime)
Communicate with other RepoPrimes (goes through OrgPrime)

Wake triggers:

GitHub webhook (issue labeled, CI failed, PR merged)
OrgPrime delegation (new issue created)
Dispatcher notification (job completed/failed)
Scheduled alarm (1h/6h/24h adaptive)

Dispatcher — The Resource Manager

One per organization. Pure mechanics, zero intelligence.

Does:

Reads run sheet from OrgPrime
Checks runner availability and capacity
Deduplicates against pending/running jobs
Dispatches top N jobs within capacity
Records every attempt with outcome
Reports outcomes to RepoPrimes
Scans repo signals via GitHub API (proxied through API Mom)
Generates README dashboard

Does NOT:

Decide what to work on (reads run sheet)
Evaluate policy (OrgPrime decides priorities)
Create jobs (RepoPrimes create jobs)
Reason about failures (RepoPrimes reason)

The Conversation Interface

WebSocket via Agents SDK

The Cloudflare Agents SDK provides native WebSocket support with hibernation. OrgPrime maintains a persistent WebSocket connection to the human’s client:

// Assumed imports: the Agents SDK package and the AI SDK's generateObject
import { Agent, type Connection } from 'agents'
import { generateObject } from 'ai'

export class OrgPrime extends Agent<Env, OrgState> {
  // WebSocket message from human
  async onMessage(connection: Connection, message: string) {
    // 1. Store message in conversation history (DO SQLite).
    //    Columns are named because the table also has related_repos and related_issues.
    this.sql`INSERT INTO conversations (id, role, content, created_at) VALUES (
      ${crypto.randomUUID()}, 'human', ${message}, ${Date.now()}
    )`

    // 2. Build context (progressive disclosure from memory)
    const context = await this.buildContext(message)

    // 3. One LLM call: reason about business intent.
    //    The parsed decision is on the result's `object` field.
    const { object: decision } = await generateObject({
      model: this.getModel(),
      schema: OrgDecisionSchema,
      system: context.pinnedContext,
      prompt: this.buildPrompt(context, message),
    })

    // 4. Execute decisions (create issues, delegate to RepoPrimes)
    await this.executeDecisions(decision.actions)

    // 5. Respond to human
    connection.send(JSON.stringify({
      type: 'response',
      summary: decision.summary,
      actions: decision.actions.map(a => a.reason),
    }))

    // 6. Distill new knowledge from this conversation turn
    await this.distillConversation(message, decision)
  }

  // WebSocket hibernation: zero cost when human isn't talking
  async onClose(connection: Connection) {
    // Connection state persists in DO SQLite
    // Next connection resumes with full history
  }
}
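The handler above depends on an OrgDecisionSchema that the article does not show. As a rough sketch of what the decision object might contain, here it is written as plain TypeScript types with a hand-written guard rather than a schema library. Only `summary`, `actions`, and `reason` come from the handler; the `kind` variants, the other fields, and the `parseOrgDecision` name are assumptions made for illustration. The cap of five actions is the one the wake cycle states.

```typescript
// Hypothetical shape of the structured decision returned by the LLM call.
// Only `summary`, `actions`, and `reason` appear in the handler; the rest is assumed.
type OrgAction =
  | { kind: "create_issue"; repo: string; title: string; reason: string; blockedBy?: string[] }
  | { kind: "escalate"; question: string; reason: string };

interface OrgDecision {
  summary: string;
  actions: OrgAction[];
}

const MAX_ACTIONS = 5; // the wake cycle caps each decision at 5 actions

// Runtime guard: LLM output is untrusted JSON, so check it before acting on it.
function parseOrgDecision(raw: unknown): OrgDecision {
  if (typeof raw !== "object" || raw === null) throw new Error("decision must be an object");
  const { summary, actions } = raw as { summary?: unknown; actions?: unknown };
  if (typeof summary !== "string") throw new Error("summary must be a string");
  if (!Array.isArray(actions)) throw new Error("actions must be an array");
  if (actions.length > MAX_ACTIONS) throw new Error(`at most ${MAX_ACTIONS} actions per wake`);
  for (const a of actions) {
    if (typeof a !== "object" || a === null) throw new Error("action must be an object");
    const { kind, reason } = a as { kind?: unknown; reason?: unknown };
    if (kind !== "create_issue" && kind !== "escalate") throw new Error("unknown action kind");
    if (typeof reason !== "string") throw new Error("every action needs a reason");
  }
  return { summary, actions: actions as OrgAction[] };
}
```

Rejecting a malformed decision before step 4 keeps a bad LLM response from turning into GitHub issues.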

Multi-Platform Support

OrgPrime’s WebSocket is the canonical interface. Platform adapters translate:

Platform	Adapter	Status
CLI (terminal)	Direct WebSocket	Priority 1
Telegram	Bot webhook → OrgPrime HTTP → WebSocket	Bot live (@brewdbot, send-only); inbound adapter planned
Web dashboard	WebSocket from browser	Future
Slack	Slack Events API → OrgPrime HTTP	Future

The adapter is thin — it translates platform message format to OrgPrime’s WebSocket protocol. All intelligence lives in OrgPrime.
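To make "thin" concrete, here is a minimal sketch of the translation step for Telegram. The `TelegramUpdate` fields are a subset of Telegram's Bot API Update object. The `OrgPrimeInbound` wire format and the `fromTelegram` name are hypothetical, since the article does not define OrgPrime's protocol.

```typescript
// Subset of Telegram's Bot API Update object that a text adapter needs.
interface TelegramUpdate {
  update_id: number;
  message?: { message_id: number; chat: { id: number }; text?: string };
}

// Hypothetical OrgPrime wire format; the article does not define one.
interface OrgPrimeInbound {
  type: "message";
  platform: "telegram";
  conversationId: string; // lets replies be routed back to the right chat
  text: string;
}

// Returns null for updates the adapter should ignore (edits, stickers, joins, ...).
function fromTelegram(update: TelegramUpdate): OrgPrimeInbound | null {
  const text = update.message?.text?.trim();
  if (!update.message || !text) return null;
  return {
    type: "message",
    platform: "telegram",
    conversationId: String(update.message.chat.id),
    text,
  };
}
```

The adapter does no interpretation of the text. It only reshapes the envelope, which is what keeps each new platform cheap to add.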

Conversation Memory

Every conversation turn is stored in OrgPrime’s DO SQLite:

CREATE TABLE conversations (
  id TEXT PRIMARY KEY,
  role TEXT NOT NULL,        -- 'human' | 'org-prime' | 'system'
  content TEXT NOT NULL,
  related_repos TEXT,        -- JSON array of repos mentioned
  related_issues TEXT,       -- JSON array of issues referenced
  created_at INTEGER NOT NULL
);

This gives OrgPrime:

Full conversation history across sessions (survives hibernation)
Ability to reference prior conversations (“remember when we discussed…”)
Context for understanding follow-up messages
Audit trail of every human ↔ system interaction

Memory: Napkin-Inspired Progressive Disclosure

Traditional RAG bolts a smaller, dumber embedding model onto a capable LLM to pre-filter information. This inverts the decision hierarchy — the least capable model makes the most important decision (what context to retrieve).

Inspired by the Napkin memory system, Prime uses progressive disclosure — the LLM itself navigates a structured knowledge base using its full reasoning capability.

The Four Levels

Level 0: Pinned Context (always loaded, <500 tokens)

Loaded on every wake cycle. Contains only what the agent MUST know:

For OrgPrime:

# Org Context
- 40 repos across garywu org
- 3-tier execution: CF (free) → CI (deterministic) → Local (AI)
- Budget: $X/day across all repos
- Current focus: [from last human conversation]
- Blocked: [repos waiting on human input]

For RepoPrime:

# [repo-name] Context
- Stack: TypeScript, Cloudflare Workers, D1
- Priority: P1 (revenue-generating)
- Current state: CI passing, 3 open automated issues
- Last human direction: [from OrgPrime delegation]

This is the equivalent of CLAUDE.md but distilled to what matters RIGHT NOW.

Level 1: Keyword Map (loaded on wake, ~200 tokens)

A TF-IDF weighted taxonomy of the agent’s memory, generated from DO SQLite:

decisions/
  keywords: onboarding, conversion, brand, positioning
  notes: 12
attempts/
  keywords: fix-ci, biome, commitlint, tsc-errors
  notes: 45
patterns/
  keywords: rate-limit, timeout, dependency, cascade
  notes: 8
skills/
  keywords: add-biome, fix-commitlint, fix-husky, diagnose-ci
  notes: 10

The LLM reads this map and decides which folders to search. No embedding model involved — the best model makes the retrieval decision.

TF-IDF weighting:

Headings: 3x weight
Note titles: 2x weight
Body text: 1x weight
Terms appearing across all folders: suppressed (not distinctive)
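The weighting rules above can be sketched as a small generator. This is an illustration under stated assumptions, not the project's implementation: the `MemoryNote` shape, the tokenizer, and the use of ln(folders / folderFrequency) as the inverse-frequency term are all choices made here. With that formula a term present in every folder scores zero, which is one way to get the suppression the last rule describes.

```typescript
interface MemoryNote { folder: string; title: string; headings: string[]; body: string }

const tokenize = (s: string): string[] => s.toLowerCase().match(/[a-z0-9][a-z0-9-]*/g) ?? [];

// Weighted term frequency per folder: headings 3x, titles 2x, body 1x.
function keywordMap(notes: MemoryNote[], topK = 4): Map<string, string[]> {
  const tf = new Map<string, Map<string, number>>();
  for (const n of notes) {
    const terms = tf.get(n.folder) ?? new Map<string, number>();
    tf.set(n.folder, terms);
    const add = (text: string, weight: number): void => {
      for (const t of tokenize(text)) terms.set(t, (terms.get(t) ?? 0) + weight);
    };
    n.headings.forEach((h) => add(h, 3));
    add(n.title, 2);
    add(n.body, 1);
  }

  // Folder frequency: in how many folders does each term occur?
  const ff = new Map<string, number>();
  for (const terms of tf.values()) {
    for (const t of terms.keys()) ff.set(t, (ff.get(t) ?? 0) + 1);
  }

  const out = new Map<string, string[]>();
  for (const [folder, terms] of tf) {
    const scored = [...terms]
      // idf = ln(folders / folderFreq); a term present in every folder scores 0 and is dropped.
      .map(([t, w]) => [t, w * Math.log(tf.size / (ff.get(t) ?? 1))] as const)
      .filter(([, score]) => score > 0)
      .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]));
    out.set(folder, scored.slice(0, topK).map(([t]) => t));
  }
  return out;
}
```

One consequence of this particular formula: an agent with a single folder gets an empty map, so a real generator would need a fallback for that case.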

Level 2: Search (on-demand, BM25)

When the LLM needs more context, it searches memory:

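-- BM25 search over memory notes
-- recency_weight and backlink_score stand for per-note factors computed elsewhere.
-- FTS5 rank is negative (more negative = better), so ascending order puts the best match first.
SELECT m.title,
       snippet(memory_fts, 1, '', '', '...', 16) AS snippet,
       m.folder, m.updated_at
FROM memory_fts
JOIN memory AS m ON m.rowid = memory_fts.rowid
WHERE memory_fts MATCH ?
ORDER BY rank * recency_weight * backlink_score
LIMIT 10;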

Key design decisions (from Napkin):

Return matching lines only — no surrounding context (saves tokens)
Hide numeric scores — prevents the LLM from anchoring on numbers instead of reasoning about semantic fit
Recency weighting — newer notes rank higher (temporal decay without explicit pruning)
Backlink scoring — notes referenced by other notes rank higher (markdown PageRank)
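The first two decisions shape what the LLM actually sees. A minimal sketch of that presentation step follows, assuming results arrive already ordered by the ranking query. The `SearchHit` shape and both function names are hypothetical, and the simple substring match stands in for whatever matching the real system uses.

```typescript
interface SearchHit { title: string; folder: string; lines: string[] }

// Keep only the lines of a note that contain a query term, and expose no score.
function toHit(note: { title: string; folder: string; content: string }, query: string): SearchHit {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  const lines = note.content
    .split("\n")
    .map((l) => l.trim())
    .filter((l) => terms.some((t) => l.toLowerCase().includes(t)));
  return { title: note.title, folder: note.folder, lines };
}

// Results are already in rank order; position is the only ranking signal the LLM sees.
function renderHits(hits: SearchHit[]): string {
  return hits
    .map((h) => `[${h.folder}] ${h.title}\n${h.lines.map((l) => `  ${l}`).join("\n")}`)
    .join("\n");
}
```

Because the rendered text carries no numbers, the model has to judge relevance from the matched lines themselves.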

Level 3: Full Read (on-demand)

Complete note content, only when Level 2 points to something relevant:

SELECT content FROM memory WHERE id = ?;

This is the equivalent of reading a full GitHub issue or a complete SKILL.md file. The LLM navigates here deliberately, not by accident.

Why Not Vector Search?

At our scale (40 repos, ~50 memory notes per repo = ~2,000 total notes), vector search adds complexity without benefit:

Dimension	Vector Search	Progressive Disclosure
Retrieval intelligence	Embedding model (smaller, dumber)	The LLM itself (best model available)
Infrastructure	Embedding pipeline + vector DB	DO SQLite FTS5 (built-in, free)
Debugging	Opaque cosine similarities	Readable keyword maps + BM25
Update cost	Re-embed on every change	No embedding pipeline; FTS5 index updated on write via triggers
Cold start	Needs embeddings computed	Works immediately from text

For corpora of 100K+ documents, vector search earns its complexity. For organizational memory at our scale, it is overhead that hands the retrieval decision to a weaker model.

DO SQLite Schema

Each agent (OrgPrime and every RepoPrime) has its own SQLite database:

-- Memory notes (the knowledge base)
CREATE TABLE memory (
  id TEXT PRIMARY KEY,
  folder TEXT NOT NULL,      -- 'decisions', 'attempts', 'patterns', 'skills'
  title TEXT NOT NULL,
  content TEXT NOT NULL,
  backlinks TEXT DEFAULT '[]',  -- JSON array of note IDs this links to
  source TEXT,               -- 'distillation' | 'human' | 'job-outcome'
  created_at INTEGER NOT NULL,
  updated_at INTEGER NOT NULL
);

-- FTS5 index for BM25 search (external-content table: triggers on memory must keep it in sync)
CREATE VIRTUAL TABLE memory_fts USING fts5(
  title, content, folder,
  content=memory, content_rowid=rowid
);

-- Keyword map cache (regenerated on write)
CREATE TABLE keyword_map (
  folder TEXT PRIMARY KEY,
  keywords TEXT NOT NULL,    -- JSON array of {term, weight} sorted by weight
  note_count INTEGER NOT NULL,
  updated_at INTEGER NOT NULL
);

-- Conversation history (OrgPrime only)
CREATE TABLE conversations (
  id TEXT PRIMARY KEY,
  role TEXT NOT NULL,
  content TEXT NOT NULL,
  related_repos TEXT,
  related_issues TEXT,
  created_at INTEGER NOT NULL
);

-- Working state
CREATE TABLE working (
  key TEXT PRIMARY KEY,
  value TEXT NOT NULL,
  updated_at INTEGER NOT NULL
);

-- Decision log (audit trail)
CREATE TABLE decisions (
  id TEXT PRIMARY KEY,
  reasoning TEXT NOT NULL,
  actions TEXT NOT NULL,      -- JSON array
  wake_reason TEXT NOT NULL,
  decided_at INTEGER NOT NULL
);

The Wake Cycle

When an agent wakes (from alarm, webhook, conversation, or delegation):

1. Load Pinned Context (Level 0)
   → Read from working memory: current plan, in-flight jobs, last decision
   → For RepoPrime: read CLAUDE.md (cached in SQLite, refresh if >24h)

2. Load Keyword Map (Level 1)
   → Generated from memory table, cached in keyword_map table
   → ~200 tokens of navigational context

3. Understand the Trigger
   → What happened? (alarm, webhook event, human message, delegation)
   → Search memory (Level 2) for relevant history
   → Read specific notes (Level 3) if needed

4. Reason (ONE LLM call)
   → Input: pinned context + keyword map + trigger + relevant memory
   → Output: structured decision (actions + reasoning + next wake time)
   → Constraint: max 5 actions per wake cycle

5. Act
   → Submit jobs to dispatcher
   → Create/comment on GitHub issues
   → Delegate to RepoPrimes (OrgPrime only)
   → Flag OrgPrime (RepoPrime only)
   → Update working memory

6. Distill
   → Extract knowledge from this cycle into memory notes
   → Update keyword map

7. Schedule Next Wake
   → Work pending: 1h alarm
   → Nothing pending: 6h alarm
   → Blocked on human: 24h alarm
   → Just had conversation: 10min alarm (responsiveness)
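Step 7 is simple enough to sketch directly. The intervals below are the ones listed in the cycle. The `AgentStatus` names and the precedence in `classify` (a recent conversation wins, then pending work, then a human block) are assumptions, since the article does not say which rule applies when several conditions hold at once.

```typescript
type AgentStatus = "conversation" | "work_pending" | "blocked_on_human" | "idle";

const MINUTE = 60 * 1000;
const HOUR = 60 * MINUTE;

// Delay until the next alarm, using the intervals from the wake cycle.
function nextWakeDelayMs(status: AgentStatus): number {
  switch (status) {
    case "conversation": return 10 * MINUTE;
    case "work_pending": return 1 * HOUR;
    case "idle": return 6 * HOUR;
    case "blocked_on_human": return 24 * HOUR;
  }
}

// Assumed precedence when several conditions hold at once.
function classify(s: { justTalked: boolean; blockedOnHuman: boolean; pendingJobs: number }): AgentStatus {
  if (s.justTalked) return "conversation";
  if (s.pendingJobs > 0) return "work_pending";
  if (s.blockedOnHuman) return "blocked_on_human";
  return "idle";
}
```

The result would be handed to whatever scheduling call the agent uses to set its next alarm.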

The Single LLM Call Principle

Each wake cycle makes exactly ONE LLM call for reasoning. Not per event, not per repo, not per job. One call with full context, producing a structured decision.

This is critical for cost control and coherence:

Cost: At 40 repos with 6h default alarms, that’s ~160 LLM calls/day across all agents. At Haiku pricing, that’s cents per day.
Coherence: One call sees the full picture. Multiple calls per wake risk contradictory decisions.

The model selection follows the Wallet layer pattern (API Mom):

OrgPrime: Sonnet (business reasoning requires strong capability)
RepoPrime: Haiku for routine wakes, Sonnet for complex situations
Workers AI (free) for simple signal evaluation
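The ~160 calls/day figure follows from one call per wake. A short sketch of the arithmetic, counting alarm-driven wakes only; webhook and conversation wakes would add to the total, and `dailyCalls` is a name introduced here for illustration.

```typescript
// Alarm-driven wakes per day for one agent, given its alarm interval in hours.
const wakesPerDay = (intervalHours: number): number => 24 / intervalHours;

// One LLM call per wake, so calls per day is the sum of wakes across agents.
function dailyCalls(agents: { count: number; intervalHours: number }[]): number {
  return agents.reduce((sum, a) => sum + a.count * wakesPerDay(a.intervalHours), 0);
}

// 40 RepoPrimes on the 6h default alarm: 40 * 4 = 160 calls/day.
const repoCalls = dailyCalls([{ count: 40, intervalHours: 6 }]);

// Adding OrgPrime's daily review brings the total to 161.
const withOrg = dailyCalls([
  { count: 40, intervalHours: 6 },
  { count: 1, intervalHours: 24 },
]);
```

If every RepoPrime had work pending and woke hourly instead, the same function gives 960 calls/day, which is the upper end to budget for.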

Skill System: SKILL.md

Adopted from OpenClaw’s proven pattern: skills are markdown files, not code.

# skills/fix-commitlint/SKILL.md
---
name: fix-commitlint
description: Add conventional commit linting to a repo
runner: ci
signals: [commitlint]
requires:
  files: [package.json]
confidence: 0.95
last_success: 2026-03-20
success_rate: 47/50
---

## Instructions

1. Install @commitlint/cli and @commitlint/config-conventional
2. Create commitlint.config.cjs extending config-conventional
3. Add commitlint to the husky commit-msg hook
4. Verify: echo "fix: test" | npx commitlint

## Error Patterns

- If husky is not installed, run fix-husky skill first
- If package.json has no "prepare" script, add "prepare": "husky"
- If commitlint.config.cjs conflicts with existing config, check
  for .commitlintrc.json and remove it

## Learned From

- garywu/frontasy#12 (2026-03-15): initial implementation
- garywu/niche-fi#8 (2026-03-18): husky dependency discovered
- garywu/svg-generators#30 (2026-03-21): config format conflict

Skill Lifecycle

1. Manual Creation
   Human or agent writes SKILL.md for a known procedure

2. Selective Injection
   RepoPrime's Level 1 keyword map includes skill names
   LLM reads relevant skills before reasoning about a job

3. Crystallization (Auto-Learning)
   Job succeeds → distillation extracts the procedure
   → Creates or updates SKILL.md with new error patterns
   → Updates success_rate and last_success

4. Skill Inheritance
   Universal skills (add-biome) → apply to all repos
   Vertical skills (fix-cloudflare-worker) → apply to CF Worker repos
   Repo-specific skills → apply to one repo only
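The bookkeeping part of crystallization (step 3) can be sketched as follows. It reads the `success_rate: 47/50` form from the SKILL.md frontmatter and folds in one job outcome. The `SkillStats` shape and function names are hypothetical, and counting failed jobs as attempts is an assumption; the lifecycle above only describes the success path.

```typescript
interface SkillStats { successes: number; attempts: number; lastSuccess: string | null }

// Parse the "47/50" form used in the SKILL.md frontmatter.
function parseRate(rate: string): SkillStats {
  const m = /^(\d+)\/(\d+)$/.exec(rate.trim());
  if (!m) throw new Error(`bad success_rate: ${rate}`);
  return { successes: Number(m[1]), attempts: Number(m[2]), lastSuccess: null };
}

// Fold one job outcome into the stats; `day` is an ISO date such as "2026-03-21".
function recordOutcome(stats: SkillStats, ok: boolean, day: string): SkillStats {
  return {
    successes: stats.successes + (ok ? 1 : 0),
    attempts: stats.attempts + 1,
    lastSuccess: ok ? day : stats.lastSuccess,
  };
}

const formatRate = (s: SkillStats): string => `${s.successes}/${s.attempts}`;
```

For example, recording a success against `47/50` yields `48/51` and moves `last_success` to the day of the job.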

Skill Storage

Skills live in two places:

Org-level skills: In OrgPrime’s DO SQLite (shared across repos)
Repo-level skills: In each RepoPrime’s DO SQLite (repo-specific)

RepoPrime inherits org-level skills and can override them with repo-specific versions.
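One way to read "inherits and can override" is a merge keyed by skill name, where the repo-level copy wins. A minimal sketch, with a hypothetical `Skill` shape and `resolveSkills` name:

```typescript
interface Skill { name: string; scope: "org" | "repo"; instructions: string }

// Repo-level skills shadow org-level skills of the same name.
function resolveSkills(orgSkills: Skill[], repoSkills: Skill[]): Skill[] {
  const byName = new Map<string, Skill>();
  for (const s of orgSkills) byName.set(s.name, s);
  for (const s of repoSkills) byName.set(s.name, s); // same name: repo version replaces org version
  return [...byName.values()].sort((a, b) => a.name.localeCompare(b.name));
}
```

Matching on name alone means an override replaces the org skill wholesale; merging error patterns from both versions would need a richer rule.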

Auto-Learning: Knowledge Distillation

Inspired by Agent Zero’s auto-learning pattern, adapted for structured markdown and DO SQLite.

When Distillation Happens

Trigger	What Gets Distilled
Job completed (success)	Procedure → skill, outcome → attempt note
Job completed (failure)	Error pattern → skill update, blocker → pattern note
Human conversation	Business context → decision note, priority change → working memory
Signal change detected	State transition → pattern note
RepoPrime escalation	Cross-repo pattern → OrgPrime pattern note

The Distillation Prompt

You are a knowledge distiller for the {agent_name} agent.

Given this event:
{event_type}: {event_summary}
{event_details}

And the current memory structure:
{keyword_map}

Extract knowledge into one of these categories:
- decisions/  — why something was decided, with context
- attempts/   — what was tried, outcome, what was learned
- patterns/   — recurring patterns (errors, dependencies, signals)
- skills/     — repeatable procedures (SKILL.md format)

Rules:
- Link to existing notes using [[note-title]] when relevant
- Use YAML frontmatter with: title, folder, source, related_repos
- If updating an existing note, return the note ID and the update
- Be concise — memory notes should be <200 words
- Include ONLY information not obvious from the code itself
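Filling the `{placeholders}` in this prompt is a plain template substitution. A small sketch that fails loudly on a missing value, so a half-filled prompt never reaches the model; the `render` helper and the example values are illustrative, not taken from the project.

```typescript
// Fill {placeholders} in a template; throw if a value is missing.
function render(template: string, vars: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (_match: string, key: string): string => {
    const value: string | undefined = vars[key];
    if (value === undefined) throw new Error(`missing template variable: ${key}`);
    return value;
  });
}

const header =
  "You are a knowledge distiller for the {agent_name} agent.\n\n" +
  "Given this event:\n{event_type}: {event_summary}";

const distillPrompt = render(header, {
  agent_name: "RepoPrime(garywu/frontasy)", // example values, not from a real run
  event_type: "job_completed",
  event_summary: "fix-commitlint succeeded",
});
```

The same helper covers `{event_details}` and `{keyword_map}`; only the first lines of the prompt are shown to keep the example short.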

Temporal Decay

Memory notes are not deleted. They decay naturally through BM25 recency weighting:

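-- Recency weight: notes updated recently rank higher.
-- The decaying half of the weight has a 30-day half-life and the weight never
-- drops below 0.5, so a 30-day-old note scores 75% of a fresh one.
-- FTS5 rank is negative (more negative = better), so sort ascending.
-- Assumes updated_at is stored in seconds since the epoch.
SELECT m.*,
  rank * (0.5 + 0.5 * EXP(-0.693 * (unixepoch('now') - m.updated_at) / 2592000.0))
  AS weighted_rank
FROM memory_fts
JOIN memory AS m ON m.rowid = memory_fts.rowid
WHERE memory_fts MATCH ?
ORDER BY weighted_rank;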

This means:

Recent knowledge surfaces first
Old knowledge is still searchable but deprioritized
No pruning logic needed — the ranking handles it
A note that gets updated (referenced, linked) resets its decay
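Evaluating the weighting expression from the query for a few ages shows how gently it decays. This sketch uses 2^(-age/30) in place of the query's exp(-0.693 * age/30), which is the same curve to three decimal places; the constant and function names are introduced here.

```typescript
const HALF_LIFE_DAYS = 30;
const FLOOR = 0.5; // the weight never drops below this

// weight(age) = FLOOR + (1 - FLOOR) * 2^(-age / halfLife)
function recencyWeight(ageDays: number): number {
  return FLOOR + (1 - FLOOR) * Math.pow(2, -ageDays / HALF_LIFE_DAYS);
}

console.log(recencyWeight(0));  // 1
console.log(recencyWeight(30)); // 0.75
console.log(recencyWeight(60)); // 0.625
```

Because of the 0.5 floor, even a very old note keeps half of its BM25 score, which is why old knowledge stays searchable rather than disappearing.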

The Run Sheet: Control Plane to Data Plane

The run sheet is OrgPrime’s output — a prioritized list of work for the dispatcher.

interface RunSheetItem {
  rank: number
  repo: string
  signal: string
  jobType: string
  runner: 'cf' | 'ci' | 'local'
  reason: string           // why this matters (business context)
  approach: string         // how to do it (technical guidance)
  issueNumber?: number     // tracking issue
  cooldownHours: number    // don't retry before this
  blockedBy?: string[]     // other run sheet items that must complete first
  businessGoal?: string    // which human conversation spawned this
}
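A sketch of how a dispatcher might consume this structure follows: take items in rank order, skip anything finished, in flight, or blocked, and stop at runner capacity. It uses a trimmed-down `Item` type. The `repo:jobType` key is an assumption, since the article does not say what `blockedBy` entries contain, and the job type names in the usage note are made up.

```typescript
interface Item { rank: number; repo: string; jobType: string; blockedBy?: string[] }

// Assumed identifier for a run sheet item; the article does not define one.
const keyOf = (i: Item): string => `${i.repo}:${i.jobType}`;

// Pick up to `capacity` items in rank order, skipping anything done, in flight, or blocked.
function selectDispatchable(
  sheet: Item[],
  done: Set<string>,
  inFlight: Set<string>,
  capacity: number,
): Item[] {
  return [...sheet]
    .sort((a, b) => a.rank - b.rank)
    .filter((i) => !done.has(keyOf(i)) && !inFlight.has(keyOf(i)))
    .filter((i) => (i.blockedBy ?? []).every((dep) => done.has(dep)))
    .slice(0, capacity);
}
```

With the Income Coach example, the frontasy item would list `garywu/brand-systems:sharpen-value-prop` in `blockedBy` and stay off the dispatch list until that key appears in `done`.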

Run Sheet vs Current Policy Rules

Current (POLICY_RULES)	Prime (Run Sheet)
Static if/else in code	Dynamic, regenerated each OrgPrime wake
Signal → job type mapping	Business intent → prioritized work list
Same rules for all repos	Per-repo reasoning by RepoPrime
No business context	Links work to business goals
No dependency tracking	Explicit blockedBy relationships
No cooldown intelligence	Cooldown based on attempt history

The Dispatcher: Dumb Resource Manager

The dispatcher’s role shrinks significantly in the Prime architecture. It becomes a pure execution engine:

What Stays

Scan repo signals via GitHub API (proxied through API Mom)
Drain pending jobs to runners (CF, CI, Local)
Record attempt outcomes in D1
Batch-commit README dashboard
Dispatch via repository_dispatch to CI runner
Report job outcomes to RepoPrime DOs

What Moves to Prime

Policy evaluation → OrgPrime produces run sheet
Job creation → RepoPrimes submit jobs
Priority logic → OrgPrime reasons about priorities
Failure analysis → RepoPrimes reason about failures

What’s New

Read run sheet from OrgPrime (replaces policy evaluation)
Notify RepoPrime on job completion (via DO RPC)
Respect blockedBy dependencies in run sheet

Competitive Position

vs OpenClaw (302K stars)

Dimension	OpenClaw	Prime
Scope	Single agent, single repo	Hierarchical: org + N repos
Persistence	Local process, dies when stopped	DO hibernation, zero cost when idle
Memory	File-based + SQLite + vector search	DO SQLite + progressive disclosure (no embeddings)
Skills	5,400+ in registry (SKILL.md)	Adopts SKILL.md pattern, starts with ~10
Conversation	20+ platform adapters	WebSocket + Telegram (extensible)
Multi-repo	No	Native — OrgPrime coordinates 40 repos
Cost control	None	API Mom intelligent router + daily budgets
Business decomposition	No — human must specify repo and task	OrgPrime decomposes business intent
Auto-learning	No (manual MEMORY.md)	Agent Zero-style distillation
Always-on	Requires running machine	Cloudflare edge, $0 when idle

Prime’s advantage: Hierarchical decomposition + always-on + cost control. OpenClaw’s advantage: Ecosystem size + platform coverage.

vs Agent Zero (12K stars)

Dimension	Agent Zero	Prime
Auto-learning	FAISS embeddings, auto-extract	BM25 + progressive disclosure, auto-distill
Hierarchy	Superior/subordinate chain	OrgPrime/RepoPrime with DO persistence
Persistence	FAISS files on disk	DO SQLite (survives indefinitely)
Execution	Docker containers	3-tier (CF/CI/Local)

Prime’s advantage: Persistent state without infrastructure, edge deployment. Agent Zero’s advantage: More mature auto-learning implementation.

vs LangGraph (10K stars)

Dimension	LangGraph	Prime
State	PostgresSaver checkpoints	DO SQLite (no external DB needed)
Graph	Arbitrary node graphs	Two-level hierarchy (simpler, sufficient)
Human-in-loop	Interrupt nodes	WebSocket conversation
Persistence	Requires PostgreSQL	Built into DO (zero config)

Prime’s advantage: No infrastructure requirements, always-on without a server. LangGraph’s advantage: More flexible graph patterns, larger community.

The Unique Position

No framework in the survey combines:

Always-on persistence (DO hibernation)
Hierarchical agent coordination (org → repo)
Business-to-code decomposition (conversation → issues → jobs)
Cost-controlled execution (API Mom routing)
Auto-learning memory (distillation into DO SQLite)

This is not a general-purpose agent framework. It is an organizational control plane — purpose-built for the specific problem of managing a portfolio of software repos from business-level conversations.

Implementation: Current State and Migration

What’s Built (garywu/mulan)

Component	Status	Notes
Dispatcher CF Worker	Live	Scans, dispatches, commits README
BrainDO (→ OrgPrime)	Partial	Alarm, run sheet, flag escalation. No LLM, no WebSocket.
RepoPrimeDO	Partial	Boot/wake/plan-result routes. No Agents SDK, no SQLite, no LLM.
D1 schema	Live	jobs, repos, signal_history, runners, pr_outcomes
CF Runner	Live	Inline job execution in worker
CI Runner	Live	GitHub Actions deterministic handlers
Local Runner	Partial	Claude Agent SDK, claims jobs
GitHub API proxy	Live	Centralized via github-fetch.ts + API Mom
Telegram bot	Live	@brewdbot, send-only
Signal scanning	Live	12 checks per repo, smart scan optimization
Batch commit	Live	Git Trees API, single commit per tick
PAUSED kill switch	Live	Emergency brake for dispatcher

Migration Path

Phase 1: Foundation (fix current implementation)

Migrate BrainDO and RepoPrimeDO to Cloudflare Agents SDK (extends Agent)
Add DO SQLite memory tables (replace KV storage)
Wire Agents SDK alarm scheduling (replace manual alarm)
Move policy evaluation out of dispatcher into OrgPrime

Phase 2: Memory (Napkin-inspired)

Implement 4-level progressive disclosure in DO SQLite
Add FTS5 index for BM25 search
Build keyword map generator
Implement temporal decay weighting
Add distillation on job completion

Phase 3: Conversation

Add WebSocket to OrgPrime via Agents SDK
Implement conversation history in DO SQLite
Build business intent → repo task decomposition (LLM call)
Wire Telegram adapter to OrgPrime WebSocket
Add one LLM call per wake cycle to RepoPrime

Phase 4: Skills and Learning

Implement SKILL.md format in DO SQLite
Build skill injection into RepoPrime’s LLM context
Add crystallization: successful job → skill extraction
Implement skill inheritance (org → repo)

Phase 5: Run Sheet

OrgPrime generates run sheet from LLM reasoning
Dispatcher reads run sheet (replaces POLICY_RULES)
Add blockedBy dependency tracking
Link run sheet items to business goals from conversations

Schema and Data Model

D1 (Shared, Org-Wide) — Unchanged

-- Signal states for all repos (dispatcher writes, Primes read)
CREATE TABLE repos (remote TEXT PRIMARY KEY, state TEXT NOT NULL);

-- Job queue (RepoPrimes submit, dispatcher executes)
CREATE TABLE jobs (
  id TEXT PRIMARY KEY, type TEXT, repo TEXT, status TEXT,
  needs TEXT, payload TEXT, result TEXT,
  priority INTEGER, created_at INTEGER, updated_at INTEGER,
  completed_at INTEGER, claimed_by TEXT
);

-- Signal transition history (dispatcher writes, Primes read)
CREATE TABLE signal_history (
  id TEXT PRIMARY KEY, repo TEXT, signal TEXT,
  prev TEXT, curr TEXT, changed_at INTEGER
);

-- PR outcomes (dispatcher writes, Primes read for regression detection)
CREATE TABLE pr_outcomes (
  pr_url TEXT PRIMARY KEY, job_type TEXT, repo TEXT,
  baseline_signals TEXT, outcome TEXT, merged_at INTEGER
);

-- Run sheet (OrgPrime writes, dispatcher reads)
CREATE TABLE run_sheet (
  rank INTEGER NOT NULL, repo TEXT NOT NULL,
  signal TEXT NOT NULL, job_type TEXT NOT NULL,
  runner TEXT NOT NULL, reason TEXT NOT NULL,
  approach TEXT NOT NULL, issue_number INTEGER,
  cooldown_hours INTEGER DEFAULT 24,
  blocked_by TEXT, business_goal TEXT,
  updated_at INTEGER NOT NULL
);

DO SQLite (Private, Per-Agent) — New

See the Memory section for the complete schema.

References

This Project

garywu/mulan — Implementation repository
mulan#131 — RFC: Prime as Durable Object
mulan#123 — Epic: Autonomous Org Maintenance
atlas#404 — Brain as org control plane DO

Architecture Articles

Autonomous Agent Frameworks Compared — 18-framework survey
The Autonomous Entity Pattern — The meta-framework
Prime: Persistent Org-Level AI Agents on Cloudflare — Prior article (superseded by this one)
The Three-Layer AI Agent Architecture — Container / Brain / Wallet
Never Fail Twice: Escalation Ladder — Skill crystallization

External

Napkin Memory System — Progressive disclosure memory for agents
Cloudflare Agents SDK — DO-based agent runtime
OpenClaw — 302K-star agent framework (benchmark)
Agent Zero — Auto-learning hierarchical agents