Prime: Persistent Org-Level AI Agents on Cloudflare

A control plane architecture for autonomous multi-repo organizations — built on Cloudflare Agents SDK, Durable Objects, and GitHub as long-term memory.

Most AI coding assistants are reactive: you talk to them, they act, the session ends and everything is forgotten. That model works for one-off tasks but breaks down when you have 40 repos, continuous drift, and no human available to start every conversation.

This article describes a different model: Prime — a hierarchy of persistent AI agents, one per repo and one per org, that run autonomously, remember everything, and coordinate without human intervention. Built on the Cloudflare Agents SDK, each Prime is a Durable Object with its own SQLite database, self-scheduling alarms, and the intelligence to decide when to act and when to wait.

The Problem
The Mental Model: Control Plane vs Data Plane
The Agent Hierarchy
What Prime Is — and Is Not
Memory: Four Layers
Eyes and Ears: Perception
The Wake Cycle
Implementation: Three Layers
Code: RepoPrime Implementation
Wrangler Configuration
The Run Sheet: Brain to Dispatcher Handoff
The Dispatcher: Dumb Resource Manager
Attempt History: Never Retry Blindly
Priority: User-Driven, Brain-Enforced
What Goes Away
References
Implementation Sequence

The Problem

You have 40 repos. Across them:

5 have CI failing
11 are missing basic standards (biome, commitlint, dependabot)
3 have open issues labeled automated that nobody has touched
2 have Mulan PRs open that need merging

A dashboard tells you this. Nothing fixes it.

You start a conversation with Claude Code. It fixes some things. You end the conversation. Nothing runs until you start the next one.

This is the fundamental problem: the machine only works because a human is talking to it. That is not autonomy. That is a very smart assistant waiting for instructions.

The fix is not “make the assistant smarter.” The fix is architectural: give the system persistent agency — agents that are always present, always aware, and capable of deciding on their own when to act.

The Mental Model: Control Plane vs Data Plane

Every production system with complex state needs this separation:

Control Plane — makes decisions. Knows the desired state. Detects drift. Decides what to do about it. Runs infrequently. Produces a plan.

Data Plane — executes decisions. Moves data, runs jobs, applies fixes. Runs constantly. Follows the plan.

Kubernetes has this: the controller manager (control plane) reconciles desired vs actual state. Kubelets (data plane) execute on each node.

Your org needs the same separation:

Control Plane (Prime agents)
  Org Prime DO — knows the whole portfolio, produces run sheet
  Repo Prime DO × N — knows one repo, decides what needs doing

Data Plane (Dispatcher + Runners)
  Dispatcher CF Worker — reads run sheet, dispatches jobs within capacity
  CF Runner — handles simple file ops (add-biome, add-dependabot)
  CI Runner — handles shell jobs via GitHub Actions
  Local Runner — handles Claude SDK jobs (complex code changes)

The control plane thinks. The data plane executes. Neither does both.

The Agent Hierarchy

Org Prime DO (one per GitHub org/account)
  ├── Knows: all repos, cross-repo patterns, org-wide priorities
  ├── Produces: run sheet — ordered, prioritized work backlog
  ├── Wakes: daily alarm, or when repo Prime flags something
  └── Delegates to: N Repo Prime DOs

Repo Prime DO (one per repo)
  ├── Knows: this repo's signals, issue history, CLAUDE.md rules
  ├── Decides: what to fix, what to skip, what to flag up
  ├── Wakes: self-alarm (6h/1h/24h), GitHub webhook, org Prime
  └── Delegates to: Dispatcher job queue

Dispatcher CF Worker (one per org)
  ├── Reads: run sheet from Org Prime
  ├── Manages: runner capacity, job deduplication
  ├── Dispatches: top N jobs within capacity constraints
  └── Reports: outcomes back to Repo Primes

Each Repo Prime is the autonomous agent for its repo. The same intelligence as a Claude Code session — but always-on, always aware, not dependent on a human starting a conversation.

What Prime Is — and Is Not

What Prime is

A presence — always addressable at a stable ID. Any part of the system can wake it.
Stateful — its SQLite database survives indefinitely. Prime never forgets.
Self-scheduling — sets its own alarms. No external cron needed.
Intelligent — makes decisions via LLM, not hardcoded rules.
Selective — receives notifications but chooses whether to act.

What Prime is NOT

Not continuously running — Prime hibernates between wakes. Zero CPU, zero cost when idle. This is not a failure mode; it is the design.
Not a task queue consumer — Prime does not act on every event. It reasons about events and decides.
Not a cron job — Crons fire on schedule regardless of context. Prime wakes, reads the world, and decides if action is needed.

The Hibernation Model

A common misconception: “how does it keep running if it’s on a Worker?”

Durable Objects hibernate. Between events (alarms, HTTP requests), the DO is not consuming CPU. Its SQLite state persists in Cloudflare’s storage. When an event arrives — an alarm fires, a webhook arrives, the dispatcher flags a problem — the DO instantiates in milliseconds, reads its state, handles the event, and hibernates again.

This is identical to how a human expert works: available, responsive, knowledgeable — but not sitting in a spinning loop burning energy.

Memory: Four Layers

Prime needs memory at multiple timescales and visibility levels.

Layer 1: Working Memory — DO SQLite (private to this Prime)

Fast and private to one Prime. Operationally it is disposable, because it can be rebuilt from the other layers, but it does persist across DO hibernation/wake cycles.

-- Current plan and working context
CREATE TABLE working_memory (
  key TEXT PRIMARY KEY,
  value TEXT,
  updated_at INTEGER
);

-- Recent decisions (last 48h)
CREATE TABLE decisions (
  id TEXT PRIMARY KEY,
  reasoning TEXT NOT NULL,  -- why Prime decided what it decided
  actions TEXT NOT NULL,    -- JSON array of actions taken
  decided_at INTEGER NOT NULL
);

-- Loaded identity (CLAUDE.md cache)
CREATE TABLE repo_context (
  claude_md TEXT NOT NULL,
  last_loaded INTEGER NOT NULL
);

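This is Prime’s short-term memory. Eviction from memory does not erase it, since DO SQLite is durable storage. If the storage itself is ever wiped, or the DO is recreated from scratch, Prime reconstructs its understanding from GitHub issues (episodic memory) and CLAUDE.md (semantic memory).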

Layer 2: Episodic Memory — GitHub Issues

This is the primary long-term memory. Human-readable, auditable, permanent.

Every problem Prime is aware of has a GitHub issue:

Title: the problem
Body: evolving understanding (Prime updates it as it learns more)
Comments: every attempt — what was tried, what happened, what was learned
Labels: state (automated, in-progress, blocked, needs-human)
Open/closed: whether the problem is active or resolved

Why GitHub issues and not D1? Because:

They survive everything — DO eviction, conversation end, machine reboot
They are human-readable — a human can understand what Prime tried and why
They are the existing communication channel between Prime and humans
If Prime is reset, it reads the issue history and reconstructs its understanding

GitHub issues are not Prime’s inbox. They are Prime’s memory.
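
One way to let issue comments double as machine-readable memory is to embed a small structured marker in each attempt comment. The sketch below is an assumption, not a format this article defines: the `prime-attempt` marker, the `AttemptRecord` shape, and all three function names are hypothetical.

```typescript
// Hypothetical record of one attempt, loosely mirroring the D1 attempts columns.
interface AttemptRecord {
  signal: string
  runner: 'cf' | 'ci' | 'local'
  status: 'success' | 'failed' | 'blocked'
  reason: string
}

// Hypothetical marker that identifies comments written by Prime.
const MARKER = 'prime-attempt'

// Render a human-readable comment with the record hidden in an HTML comment.
function formatAttemptComment(a: AttemptRecord): string {
  const human = `Attempt on ${a.signal} via ${a.runner} runner: ${a.status}. ${a.reason}`
  return `${human}\n\n<!-- ${MARKER} ${JSON.stringify(a)} -->`
}

// Recover the record from a comment body; returns null for ordinary comments.
function parseAttemptComment(body: string): AttemptRecord | null {
  const match = body.match(new RegExp(`<!-- ${MARKER} (\\{.*\\}) -->`))
  if (!match) return null
  try {
    return JSON.parse(match[1] ?? '') as AttemptRecord
  } catch {
    return null
  }
}

// Rebuild attempt history from a list of comment bodies (oldest first).
function rebuildHistory(comments: string[]): AttemptRecord[] {
  return comments
    .map(parseAttemptComment)
    .filter((a): a is AttemptRecord => a !== null)
}
```

With a convention like this, a Prime that has lost its working memory could page through an issue's comments and call `rebuildHistory` on the bodies, while humans still see a plain sentence in the rendered comment.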

Layer 3: Semantic Memory — CLAUDE.md and Repo Files

Static knowledge that defines Prime’s identity for this specific repo.

What this repo is and what it’s for
Rules Prime must follow
Domain knowledge (stack, conventions, constraints)
Standards (what the org’s signals mean for this repo)

Prime reads CLAUDE.md on its first wake, caches it in DO SQLite, and refreshes the cached copy once it is more than 24 hours old. It is the foundation everything else builds on. Without CLAUDE.md, Prime has no identity for this repo.

Layer 4: Shared Memory — D1 (org-wide)

Machine-readable org-wide index owned by the dispatcher.

-- Current signal states for all repos
CREATE TABLE repos (
  remote TEXT PRIMARY KEY,
  state TEXT NOT NULL  -- JSON: all signal values
);

-- Attempt history
CREATE TABLE attempts (
  id TEXT PRIMARY KEY,
  repo TEXT NOT NULL,
  signal TEXT NOT NULL,
  runner_type TEXT NOT NULL,
  status TEXT NOT NULL,  -- "success" | "failed" | "blocked"
  reason TEXT,
  attempted_at INTEGER NOT NULL
);

-- Run sheet from Org Prime
CREATE TABLE run_sheet (
  rank INTEGER NOT NULL,
  repo TEXT NOT NULL,
  signal TEXT NOT NULL,
  runner TEXT NOT NULL,
  reason TEXT NOT NULL,
  approach TEXT NOT NULL,
  cooldown_hours INTEGER NOT NULL DEFAULT 24,
  updated_at INTEGER NOT NULL
);

D1 is not Prime’s memory — it is the data plane’s state that Prime queries. Prime reads from it (to understand current signal states and attempt history) and writes to it as a side effect (via the dispatcher reporting outcomes).

Eyes and Ears: Perception

Prime needs two modes of perception.

Reactive (Ears) — Push Model

Events that wake Prime immediately:

Source	Event	What Prime does
GitHub webhook	Issue labeled `automated`	Read issue, decide if action needed
GitHub webhook	CI failed on main	Read CI log, assess severity
GitHub webhook	PR merged/closed	Update working memory, close tracking issue
GitHub webhook	Push to main	Check if any signals changed
Dispatcher	Job completed (success/fail/blocked)	Update episodic memory (comment on issue)
Dispatcher	Job failed 3x	Flag Org Prime, update issue to `needs-human`
Org Prime	Delegation (cross-repo task)	Read context, incorporate into plan
User	Conversation message	Highest priority — update priorities, re-plan
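
The webhook rows above imply a translation step between raw GitHub deliveries and the event names Prime reasons about. A minimal sketch of that mapping follows, assuming the webhook receiver has already read the `X-GitHub-Event` header and parsed the JSON payload. The names `ci-failed-main` and `issue-labeled-automated` come from the RepoPrime code in this article; `pr-merged`, `pr-closed`, `push-main`, and the function name `toPrimeEvent` are hypothetical.

```typescript
// Subset of GitHub webhook payload fields used below.
interface WebhookPayload {
  action?: string
  ref?: string
  label?: { name: string }
  workflow_run?: { conclusion: string | null; head_branch: string | null }
  pull_request?: { merged: boolean }
}

// Map the X-GitHub-Event header plus payload to a Prime event name.
// Returns null for deliveries Prime does not care about.
function toPrimeEvent(githubEvent: string, p: WebhookPayload): string | null {
  switch (githubEvent) {
    case 'issues':
      return p.action === 'labeled' && p.label?.name === 'automated'
        ? 'issue-labeled-automated'
        : null
    case 'workflow_run':
      return p.action === 'completed' &&
        p.workflow_run?.conclusion === 'failure' &&
        p.workflow_run?.head_branch === 'main'
        ? 'ci-failed-main'
        : null
    case 'pull_request':
      if (p.action !== 'closed') return null
      return p.pull_request?.merged ? 'pr-merged' : 'pr-closed'
    case 'push':
      return p.ref === 'refs/heads/main' ? 'push-main' : null
    default:
      return null
  }
}
```

Returning `null` early keeps irrelevant deliveries from waking a Durable Object at all, which fits the goal of waking Prime only when something meaningful happened.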

Polling (Eyes) — Pull Model

Prime actively reads when it wakes on alarm:

GitHub API: current signal states (CI, biome, commitlint, etc.)
Open issues: what problems exist, what’s labeled automated
Open PRs: what’s in flight (Mulan PRs, Dependabot PRs)
Recent git log: what changed since last wake
D1 attempts table: what has been tried, what failed
Dispatcher runner status: what capacity exists

The distinction matters. Ears tell Prime something happened. Eyes tell Prime what the world looks like now. Prime needs both — events alone miss drift; polling alone misses urgency.

The Notification Decision

Prime receives notifications but is not obligated to act. This is the fundamental difference between an agent and a task queue consumer.

A dumb system: webhook fires → dispatch a job.

Prime: webhook fires → Prime wakes → reads current state → checks working memory → decides.

Examples:

CI failed on a branch where Mulan already has an open PR → Prime knows this, already tracking it. No new action.
CI failed on main after a human push → Prime investigates, creates tracking issue.
New issue labeled automated → Prime reads it, checks if already in its plan, decides whether to act now or queue it.
Dependabot PR opened → Prime checks if it conflicts with anything in flight.

Intelligence means knowing when NOT to act.
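
Some of these judgments are cheap enough to make before any LLM call. The sketch below is a hypothetical deterministic pre-filter covering the first three examples; the final decision in this architecture still belongs to the LLM. The types `Snapshot`, `PrimeEvent`, and `Verdict`, and the choice to ignore CI failures on untracked non-main branches, are assumptions for illustration.

```typescript
type Verdict = 'ignore' | 'investigate' | 'queue'

// What Prime already knows from working memory.
interface Snapshot {
  trackedBranches: string[] // branches with an open automation PR already tracked
  plannedIssues: number[]   // issue numbers already in the current plan
}

type PrimeEvent =
  | { kind: 'ci-failed'; branch: string }
  | { kind: 'issue-labeled-automated'; issueNumber: number }

// Decide whether an event deserves a full reasoning cycle.
function preFilter(event: PrimeEvent, snap: Snapshot): Verdict {
  switch (event.kind) {
    case 'ci-failed':
      // Already tracking this branch: no new action.
      if (snap.trackedBranches.includes(event.branch)) return 'ignore'
      // Assumption: only failures on main are worth investigating.
      return event.branch === 'main' ? 'investigate' : 'ignore'
    case 'issue-labeled-automated':
      return snap.plannedIssues.includes(event.issueNumber) ? 'ignore' : 'queue'
  }
}
```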

The Wake Cycle

When Prime wakes (from alarm, webhook, or external call):

1. Load Identity
   → Read CLAUDE.md from repo (cached in DO SQLite, refresh if >24h old)
   → Read working memory (what was I doing? what's in flight?)

2. Read the World (eyes — only on alarm wake, not on every webhook)
   → Query D1 for current signal states
   → Fetch open issues labeled automated or in-progress
   → Check open Mulan PRs
   → Query D1 attempts for recent history

3. Understand the Event (if reactive wake)
   → What happened?
   → Is this already in my current plan?
   → Does it change my assessment?

4. Reason (one LLM call)
   → Input: identity + world state + event + history + current plan
   → Output: list of decisions with reasoning
   → Constraint: max 5 actions per wake cycle

5. Act
   → Submit jobs to dispatcher
   → Comment on GitHub issues (write episodic memory)
   → Update DO working memory
   → Flag Org Prime if needed

6. Schedule Next Wake
   → Work pending → 1h alarm
   → Nothing pending → 6h alarm
   → Blocked waiting on human → 24h alarm

The LLM call happens once per wake cycle. Not per event, not per job. Prime reasons holistically about everything it knows, then acts.
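
Step 6 and the five-action constraint from step 4 can also be enforced in plain code as a safety net around the LLM output. This is a minimal sketch under two assumptions: that pending work takes precedence over a human block when both apply, and that surplus actions are simply dropped. `WakeOutcome`, `nextWakeHours`, and `capActions` are hypothetical names.

```typescript
interface WakeOutcome {
  pendingJobs: number     // jobs submitted this cycle or still in flight
  blockedOnHuman: boolean // at least one issue is waiting on a human
}

const MAX_ACTIONS_PER_WAKE = 5

// Alarm interval in hours, following the three rules of step 6.
function nextWakeHours(o: WakeOutcome): number {
  if (o.pendingJobs > 0) return 1   // work pending
  if (o.blockedOnHuman) return 24   // blocked waiting on human
  return 6                          // nothing pending
}

// Keep at most MAX_ACTIONS_PER_WAKE actions, preserving their order.
function capActions<T>(actions: T[]): T[] {
  return actions.slice(0, MAX_ACTIONS_PER_WAKE)
}
```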

Implementation: Three Layers

This maps directly to the three-layer architecture from garywu/three-layer-ai-agent-architecture:

Layer 1: Container (Cloudflare Agents SDK)
  pnpm add agents  ← https://www.npmjs.com/package/agents
  → Durable Object runtime
  → Built-in SQLite (DO-private memory)
  → Alarm API (self-scheduling)
  → WebSocket (real-time connection to user sessions)
  → Hibernation (zero cost when idle)

Layer 2: Brain (Vercel AI SDK → API Mom intelligent router)
  pnpm add ai @ai-sdk/anthropic  ← https://sdk.vercel.ai/docs
  → generateObject() with Zod schema + structured decision output
  → One call per wake cycle
  → Prime passes capability hint only — never specifies a model
  → API Mom routes: Workers AI (free) → OpenRouter free → Haiku → Sonnet
  → See: garywu/api-mom-intelligent-router

Layer 3: Wallet (API Mom proxy + intelligent router)
  → All LLM calls routed through centralized proxy
  → Per-Prime cost attribution
  → Daily spend limits enforced
  → Prevents runaway spend across N repo Primes
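
The Wallet layer enforces limits at the proxy, but a Prime can also check its own counters before reasoning. The following is a sketch of such a local guard, assuming a daily reset at UTC midnight. The `day` field and the function names are hypothetical; `costToday` and `costBudget` mirror the fields in `PrimeState`.

```typescript
interface Budget {
  costToday: number
  costBudget: number
  day: string // hypothetical: UTC date the counter belongs to, "YYYY-MM-DD"
}

function utcDay(now: Date): string {
  return now.toISOString().slice(0, 10)
}

// Reset the counter when the UTC date has changed.
function rollover(b: Budget, now: Date): Budget {
  const today = utcDay(now)
  return b.day === today ? b : { ...b, costToday: 0, day: today }
}

// True if an estimated cost still fits within today's budget.
function canSpend(b: Budget, estimatedCost: number, now: Date): boolean {
  const current = rollover(b, now)
  return current.costToday + estimatedCost <= current.costBudget
}

// Return a new Budget with the actual cost added to today's counter.
function recordSpend(b: Budget, actualCost: number, now: Date): Budget {
  const current = rollover(b, now)
  return { ...current, costToday: current.costToday + actualCost }
}
```

A wake cycle could call `canSpend` before the LLM call and skip straight to scheduling the next alarm when it returns false.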

Code: RepoPrime Implementation

Base State

// src/types.ts
export interface PrimeState {
  repoSlug: string           // "garywu/frontasy"
  status: 'idle' | 'reasoning' | 'acting' | 'blocked' | 'waiting-human'
  currentPlan: PlanItem[]    // what Prime intends to do
  inFlight: string[]         // job IDs currently running
  lastWakeReason: string     // why it woke last
  lastDecision: string       // summary of last reasoning
  costToday: number
  costBudget: number
  wakeCount: number
}

export interface PlanItem {
  signal: string
  action: string
  runner: 'cf' | 'ci' | 'local'
  issueNumber?: number       // tracking issue in GitHub
  reason: string
}

export interface Decision {
  actions: Array<{
    type: 'submit-job' | 'create-issue' | 'comment-issue' | 'flag-org' | 'wait'
    signal?: string
    runner?: string
    issueNumber?: number
    content?: string
    reason: string
  }>
  nextWakeHours: number       // how long until next alarm
  summary: string             // one-line summary of reasoning
}
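
The agent code imports an `Env` type from this file, which the listing above does not define. The sketch below is an assumed shape based only on the `this.env` fields the RepoPrime code reads; the Durable Object and D1 bindings (`REPO_PRIME`, `ORG_PRIME`, `DB`) are left out because their types come from the Workers type definitions. `PrimeEnvVars` and `missingEnvVars` are hypothetical names.

```typescript
// Assumed string bindings that RepoPrime reads via this.env.
interface PrimeEnvVars {
  DISPATCHER_URL: string
  DISPATCHER_SECRET: string
  GITHUB_TOKEN: string
  ORG_PRIME_URL: string
}

const REQUIRED_VARS = [
  'DISPATCHER_URL',
  'DISPATCHER_SECRET',
  'GITHUB_TOKEN',
  'ORG_PRIME_URL',
] as const

// Return the names of missing or empty variables so a wake can fail fast
// with a clear message instead of sending requests to "undefined/jobs".
function missingEnvVars(env: Partial<PrimeEnvVars>): string[] {
  return REQUIRED_VARS.filter((name) => !env[name])
}
```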

The Agent Class

// src/agents/repo-prime.ts
import { Agent } from 'agents'
import { generateObject } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'
import { z } from 'zod'
import type { Env, PrimeState, Decision } from '../types'

const DecisionSchema = z.object({
  actions: z.array(z.object({
    type: z.enum(['submit-job', 'create-issue', 'comment-issue', 'flag-org', 'wait']),
    signal: z.string().optional(),
    runner: z.enum(['cf', 'ci', 'local']).optional(),
    issueNumber: z.number().optional(),
    content: z.string().optional(),
    reason: z.string(),
  })),
  nextWakeHours: z.number(),
  summary: z.string(),
})

export class RepoPrime extends Agent<Env, PrimeState> {
  initialState: PrimeState = {
    repoSlug: '',
    status: 'idle',
    currentPlan: [],
    inFlight: [],
    lastWakeReason: 'init',
    lastDecision: '',
    costToday: 0,
    costBudget: 2.0,  // $2/day max per repo
    wakeCount: 0,
  }

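  async onStart() {
    // Initialize SQLite tables (idempotent, safe to run on every start)
    this.sql`CREATE TABLE IF NOT EXISTS working_memory (
      key TEXT PRIMARY KEY, value TEXT, updated_at INTEGER
    )`
    this.sql`CREATE TABLE IF NOT EXISTS decisions (
      id TEXT PRIMARY KEY, reasoning TEXT, actions TEXT, decided_at INTEGER
    )`
    this.sql`CREATE TABLE IF NOT EXISTS repo_context (
      claude_md TEXT, last_loaded INTEGER
    )`

    // Assumption: the instance name used to address this DO is the repo slug.
    // Without this, repoSlug stays '' and every GitHub URL below is malformed.
    if (!this.state.repoSlug) {
      this.setState({ ...this.state, repoSlug: this.name })
    }

    // onStart runs every time the DO is instantiated, not just once.
    // Only schedule a wake if none is pending, otherwise alarms pile up.
    const pending = await this.getSchedules()
    if (!pending.some((s) => s.callback === 'wakeAndReason')) {
      await this.schedule(3600, 'wakeAndReason', { reason: 'scheduled' })
    }
  }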

  // --- Scheduled wake ---

  async wakeAndReason({ reason }: { reason: string }) {
    this.setState({ ...this.state, status: 'reasoning', lastWakeReason: reason, wakeCount: this.state.wakeCount + 1 })

    try {
      const context = await this.buildContext()
      const { object: decision } = await generateObject({
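        // Model ID is hardcoded here for clarity. Per the Brain layer above,
        // production calls would go through the API Mom router instead.
        model: anthropic('claude-sonnet-4-6'),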
        schema: DecisionSchema,
        system: context.claudeMd,
        prompt: this.buildPrompt(context),
      })

      // Record decision
      this.sql`INSERT INTO decisions VALUES (
        ${crypto.randomUUID()}, ${decision.summary}, ${JSON.stringify(decision.actions)}, ${Date.now()}
      )`

      this.setState({ ...this.state, status: 'acting', lastDecision: decision.summary })

      // Execute decisions
      await this.executeDecisions(decision.actions)

      this.setState({ ...this.state, status: 'idle' })

      // Schedule next wake
      await this.schedule(decision.nextWakeHours * 3600, 'wakeAndReason', { reason: 'scheduled' })

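    } catch (err) {
      // Log the failure so it shows up in Worker logs, then retry in 1h
      console.error(`[${this.state.repoSlug}] wake failed:`, err)
      this.setState({ ...this.state, status: 'idle' })
      await this.schedule(3600, 'wakeAndReason', { reason: 'retry-after-error' })
    }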
  }

  // --- Event-driven wake (from webhook/dispatcher) ---

  async onRequest(request: Request): Promise<Response> {
    const url = new URL(request.url)

    if (url.pathname.endsWith('/wake') && request.method === 'POST') {
      const body = await request.json() as { event: string; data?: unknown }
      // Don't re-reason on every webhook — only if it changes our picture
      if (this.shouldActOnEvent(body.event)) {
        await this.wakeAndReason({ reason: body.event })
      }
      return Response.json({ ok: true })
    }

    if (url.pathname.endsWith('/status')) {
      return Response.json(this.state)
    }

    return Response.json({ error: 'not found' }, { status: 404 })
  }

  // --- Context building ---

  private async buildContext() {
    // Load CLAUDE.md (cached, refresh if >24h old)
    const [ctxRow] = [...this.sql`SELECT * FROM repo_context LIMIT 1`]
    let claudeMd = ctxRow?.claude_md as string ?? ''
    if (!ctxRow || (Date.now() - Number(ctxRow.last_loaded)) > 86_400_000) {
      claudeMd = await this.fetchClaudeMd()
      this.sql`DELETE FROM repo_context`
      this.sql`INSERT INTO repo_context VALUES (${claudeMd}, ${Date.now()})`
    }

    // Read world state
    const [repoState, openIssues, attempts] = await Promise.all([
      this.fetchRepoState(),
      this.fetchOpenIssues(),
      this.fetchAttemptHistory(),
    ])

    return { claudeMd, repoState, openIssues, attempts }
  }

  private buildPrompt(context: Awaited<ReturnType<typeof this.buildContext>>): string {
    const failing = Object.entries(context.repoState)
      .filter(([, v]) => v === false || String(v).includes('failing'))
      .map(([k]) => k)

    return `
You are Prime for ${this.state.repoSlug}.
Current problems: ${failing.join(', ') || 'none'}
Open automated issues: ${context.openIssues.length}
Recent attempts: ${JSON.stringify(context.attempts.slice(0, 10))}
In flight: ${this.state.inFlight.join(', ') || 'none'}
Cost today: $${this.state.costToday.toFixed(4)} / $${this.state.costBudget}

Decide what to do. Be selective — don't act on everything at once.
Prefer CF runner (free, immediate) over local runner (costs tokens).
Don't re-attempt anything that recently failed without new information.
`.trim()
  }

  private shouldActOnEvent(event: string): boolean {
    // Don't reason on every single event — only meaningful changes
    const actOnEvents = ['ci-failed-main', 'issue-labeled-automated', 'job-failed', 'job-blocked']
    return actOnEvents.some(e => event.includes(e))
  }

  // --- Action execution ---

  private async executeDecisions(actions: Decision['actions']) {
    for (const action of actions) {
      switch (action.type) {
        case 'submit-job':
          await this.submitDispatcherJob(action)
          break
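        case 'comment-issue':
          // issueNumber and content are optional in the schema, so check them
          if (action.issueNumber !== undefined && action.content) {
            await this.commentOnIssue(action.issueNumber, action.content)
          }
          break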
        case 'create-issue':
          await this.createTrackingIssue(action)
          break
        case 'flag-org':
          await this.flagOrgPrime(action.reason)
          break
        case 'wait':
          // Intentionally do nothing — Prime decided to wait
          break
      }
    }
  }

  private async submitDispatcherJob(action: Decision['actions'][0]) {
    const res = await fetch(`${this.env.DISPATCHER_URL}/jobs`, {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${this.env.DISPATCHER_SECRET}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        type: `fix-${action.signal}`,
        repo: this.state.repoSlug,
        needs: action.runner === 'cf' ? ['github-api'] : ['claude'],
        priority: 2,
        payload: { source: 'prime', reason: action.reason },
      }),
    })
    if (res.ok) {
      const job = await res.json() as { id: string }
      this.setState({ ...this.state, inFlight: [...this.state.inFlight, job.id] })
    }
  }

  // --- Fetch helpers (GitHub API, D1 queries) ---

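  private async fetchClaudeMd(): Promise<string> {
    const res = await fetch(
      `https://api.github.com/repos/${this.state.repoSlug}/contents/CLAUDE.md`,
      { headers: { Authorization: `Bearer ${this.env.GITHUB_TOKEN}`, 'User-Agent': 'prime-agent' } }
    )
    if (!res.ok) return '(no CLAUDE.md)'
    const data = await res.json() as { content: string }
    // The API returns base64 with embedded newlines. atob() alone yields a
    // byte string, so decode the bytes as UTF-8 to keep non-ASCII text intact.
    const bytes = Uint8Array.from(atob(data.content.replace(/\n/g, '')), (c) => c.charCodeAt(0))
    return new TextDecoder().decode(bytes)
  }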

  private async fetchRepoState(): Promise<Record<string, unknown>> {
    const res = await fetch(
      `${this.env.DISPATCHER_URL}/repos/${encodeURIComponent(this.state.repoSlug)}`,
      { headers: { Authorization: `Bearer ${this.env.DISPATCHER_SECRET}` } }
    )
    return res.ok ? res.json() : {}
  }

  private async fetchOpenIssues(): Promise<unknown[]> {
    const res = await fetch(
      `https://api.github.com/repos/${this.state.repoSlug}/issues?labels=automated&state=open&per_page=20`,
      { headers: { Authorization: `Bearer ${this.env.GITHUB_TOKEN}`, 'User-Agent': 'prime-agent' } }
    )
    return res.ok ? res.json() : []
  }

  private async fetchAttemptHistory(): Promise<unknown[]> {
    const res = await fetch(
      `${this.env.DISPATCHER_URL}/attempts/${encodeURIComponent(this.state.repoSlug)}`,
      { headers: { Authorization: `Bearer ${this.env.DISPATCHER_SECRET}` } }
    )
    return res.ok ? res.json() : []
  }

  private async commentOnIssue(number: number, body: string): Promise<void> {
    await fetch(
      `https://api.github.com/repos/${this.state.repoSlug}/issues/${number}/comments`,
      {
        method: 'POST',
        headers: {
          Authorization: `Bearer ${this.env.GITHUB_TOKEN}`,
          'Content-Type': 'application/json',
          'User-Agent': 'prime-agent',
        },
        body: JSON.stringify({ body }),
      }
    )
  }

  private async createTrackingIssue(action: Decision['actions'][0]): Promise<void> {
    await fetch(
      `https://api.github.com/repos/${this.state.repoSlug}/issues`,
      {
        method: 'POST',
        headers: {
          Authorization: `Bearer ${this.env.GITHUB_TOKEN}`,
          'Content-Type': 'application/json',
          'User-Agent': 'prime-agent',
        },
        body: JSON.stringify({
          title: `fix: ${action.signal} — ${action.reason}`,
          labels: ['automated'],
          body: `Prime created this issue to track: ${action.reason}\n\n_Source: Prime wake cycle ${this.state.wakeCount}_`,
        }),
      }
    )
  }

  private async flagOrgPrime(reason: string): Promise<void> {
    await fetch(`${this.env.ORG_PRIME_URL}/flag`, {
      method: 'POST',
      headers: { Authorization: `Bearer ${this.env.DISPATCHER_SECRET}`, 'Content-Type': 'application/json' },
      body: JSON.stringify({ repo: this.state.repoSlug, reason }),
    })
  }
}
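
The `Decision` schema marks `signal`, `runner`, `issueNumber`, and `content` as optional, so the model can return an action that lacks the fields its type needs. One hedge is to validate each action before `executeDecisions` runs it. The sketch below assumes which fields each action type requires, based on how the class uses them; `validateAction` and `runnable` are hypothetical helpers, not part of the class above.

```typescript
type ActionType = 'submit-job' | 'create-issue' | 'comment-issue' | 'flag-org' | 'wait'

interface Action {
  type: ActionType
  signal?: string
  runner?: 'cf' | 'ci' | 'local'
  issueNumber?: number
  content?: string
  reason: string
}

// Return a list of problems; an empty list means the action is safe to run.
function validateAction(a: Action): string[] {
  const problems: string[] = []
  if (a.type === 'submit-job') {
    if (!a.signal) problems.push('submit-job requires signal')
    if (!a.runner) problems.push('submit-job requires runner')
  }
  if (a.type === 'create-issue' && !a.signal) {
    problems.push('create-issue requires signal')
  }
  if (a.type === 'comment-issue') {
    if (a.issueNumber === undefined) problems.push('comment-issue requires issueNumber')
    if (!a.content) problems.push('comment-issue requires content')
  }
  return problems
}

// Keep only the actions that pass validation.
function runnable(actions: Action[]): Action[] {
  return actions.filter((a) => validateAction(a).length === 0)
}
```

Filtering this way avoids jobs typed `fix-undefined` and comments posted with an empty body, at the cost of silently dropping malformed actions unless the caller also logs the problems.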

Worker Entry Point

// src/index.ts
import { routeAgentRequest } from 'agents'
import type { Env } from './types'

export { RepoPrime } from './agents/repo-prime'
export { OrgPrime } from './agents/org-prime'

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Health check
    if (new URL(request.url).pathname === '/health') {
      return Response.json({ ok: true })
    }
    // All /agents/* routes handled by SDK
    const agentResponse = await routeAgentRequest(request, env)
    if (agentResponse) return agentResponse
    return Response.json({ error: 'not found' }, { status: 404 })
  },
} satisfies ExportedHandler<Env>

Wrangler Configuration

// wrangler.jsonc
{
  "name": "org-prime",
  "main": "src/index.ts",
  "compatibility_date": "2025-12-01",
  "compatibility_flags": ["nodejs_compat"],

  "durable_objects": {
    "bindings": [
      { "name": "REPO_PRIME", "class_name": "RepoPrime" },
      { "name": "ORG_PRIME", "class_name": "OrgPrime" }
    ]
  },
  "migrations": [
    { "tag": "v1", "new_sqlite_classes": ["RepoPrime", "OrgPrime"] }
  ],

  "d1_databases": [
    {
      "binding": "DB",
      "database_name": "org-prime-shared",
      "database_id": "..."
    }
  ]
}

Addressing one Prime instance from another:

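// Get (or create) a Prime for a specific repo.
// getAgentByName (from the agents package) sets the instance name the SDK
// expects; a stub from a bare idFromName() + get() does not carry it.
import { getAgentByName } from 'agents'

const prime = await getAgentByName(env.REPO_PRIME, 'garywu/frontasy')
await prime.fetch('https://prime/wake', {
  method: 'POST',
  body: JSON.stringify({ event: 'ci-failed-main' })
})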

The Run Sheet: Brain to Dispatcher Handoff

The Org Prime produces a run sheet — the interface between the control plane (intelligence) and the data plane (execution). The dispatcher reads this; it does not produce it.

[
  {
    "rank": 1,
    "repo": "garywu/niche-fi",
    "signal": "ci",
    "runner": "local",
    "reason": "CI failing on main blocks all other automation on this repo",
    "approach": "Investigate TypeScript errors, fix type issues, verify CI passes",
    "cooldown_hours": 72,
    "added_at": "2026-03-20T07:00:00Z"
  },
  {
    "rank": 2,
    "repo": "garywu/frontasy",
    "signal": "biome",
    "runner": "cf",
    "reason": "Quick win — idempotent file add, CF runner handles it in seconds",
    "approach": "Add standard biome.json",
    "cooldown_hours": 24,
    "added_at": "2026-03-20T07:00:00Z"
  }
]

The run sheet is owned by Org Prime. The dispatcher is not allowed to modify it — only read it and report outcomes.

The Dispatcher: Dumb Resource Manager

The dispatcher has no intelligence. It does not decide what to do. It only decides whether it can do it right now:

For each item in run sheet (by rank):
  1. Job already pending/running for (repo, signal)? → skip
  2. Last attempt blocked, nothing changed? → skip
  3. Cooldown not elapsed? → skip
  4. Required runner available and under capacity? → dispatch
  5. Record attempt in D1

Capacity limits:

CF runner: unlimited (stateless, free)
CI runner: 3 concurrent (GitHub Actions minutes)
Local runner: 2 concurrent (Claude API cost)
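
The loop and capacity limits above can be sketched as a pure selection function. This is an illustration under stated assumptions, not the dispatcher's actual code: timestamps are taken to be milliseconds, "nothing changed" detection for blocked attempts is left out (any item whose last attempt was blocked is skipped), and step 5, recording the attempt in D1, is left to the caller. All type and function names are hypothetical.

```typescript
type Runner = 'cf' | 'ci' | 'local'

interface RunSheetItem { rank: number; repo: string; signal: string; runner: Runner; cooldown_hours: number }
interface Attempt { repo: string; signal: string; status: 'success' | 'failed' | 'blocked'; attempted_at: number }
interface ActiveJob { repo: string; signal: string; runner: Runner }

// Concurrency limits from the list above.
const CAPACITY: Record<Runner, number> = { cf: Infinity, ci: 3, local: 2 }

function selectJobs(sheet: RunSheetItem[], attempts: Attempt[], active: ActiveJob[], now: number): RunSheetItem[] {
  const inUse: Record<Runner, number> = { cf: 0, ci: 0, local: 0 }
  for (const job of active) inUse[job.runner]++

  const picked: RunSheetItem[] = []
  for (const item of [...sheet].sort((a, b) => a.rank - b.rank)) {
    const same = (x: { repo: string; signal: string }) =>
      x.repo === item.repo && x.signal === item.signal

    // 1. Job already pending/running for (repo, signal)
    if (active.some(same) || picked.some(same)) continue

    const last = attempts.filter(same).sort((a, b) => b.attempted_at - a.attempted_at)[0]
    // 2. Last attempt blocked (simplified: always skip)
    if (last?.status === 'blocked') continue
    // 3. Cooldown not elapsed
    if (last && now - last.attempted_at < item.cooldown_hours * 3_600_000) continue
    // 4. Runner under capacity
    if (inUse[item.runner] >= CAPACITY[item.runner]) continue

    inUse[item.runner]++
    picked.push(item)
  }
  return picked
}
```

Keeping the function pure means the skip rules can be unit tested without a D1 database or live runners.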

The dispatcher reports outcomes to the relevant Repo Prime DO. The Repo Prime updates its tracking issues (episodic memory) and flags the Org Prime if needed.

Attempt History: Never Retry Blindly

CREATE TABLE attempts (
  id TEXT PRIMARY KEY,
  repo TEXT NOT NULL,
  signal TEXT NOT NULL,
  runner_type TEXT NOT NULL,
  job_id TEXT,
  status TEXT NOT NULL,   -- "success" | "failed" | "blocked"
  reason TEXT,            -- what went wrong, or what was blocking
  attempted_at INTEGER NOT NULL
);

The dispatcher checks this before every dispatch. Three failed attempts with the same reason → the dispatcher flags the Repo Prime → the Repo Prime updates the GitHub issue to needs-human and sets its next wake to 24h (waiting for human).

This prevents the system from burning runner budget on problems it cannot solve.
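
The escalation rule can be expressed as a small predicate. The sketch below reads "three failed attempts with the same reason" as the three most recent attempts for one (repo, signal) pair, which is an interpretation rather than something the article specifies. `shouldEscalate` and `ESCALATE_AFTER` are hypothetical names.

```typescript
interface Attempt {
  status: 'success' | 'failed' | 'blocked'
  reason: string | null
  attempted_at: number
}

const ESCALATE_AFTER = 3

// True when the most recent ESCALATE_AFTER attempts all failed with one reason.
// `attempts` is expected to hold the history for a single (repo, signal) pair.
function shouldEscalate(attempts: Attempt[]): boolean {
  const recent = [...attempts]
    .sort((a, b) => b.attempted_at - a.attempted_at)
    .slice(0, ESCALATE_AFTER)
  const first = recent[0]
  if (recent.length < ESCALATE_AFTER || !first) return false
  return recent.every((a) => a.status === 'failed' && a.reason === first.reason)
}
```

Requiring the same reason matters: three failures with different causes suggest the system is still learning something on each try, while three identical ones suggest it is stuck.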

Priority: User-Driven, Brain-Enforced

Priority is not hardcoded. It comes from conversation with the user and is held by the brain.

CREATE TABLE priorities (
  key TEXT PRIMARY KEY,   -- "repo:garywu/frontasy" or "signal:ci"
  weight INTEGER NOT NULL,
  note TEXT,              -- reason (from conversation)
  set_at INTEGER NOT NULL
);

Examples of user-driven priority:

“CI failing → stop everything else until green” — Org Prime weights signal:ci at 100
“Focus on garywu/frontasy this week” — Org Prime weights repo:garywu/frontasy at 80
“Don’t touch seo-edge right now, it’s in a release freeze” — Org Prime removes it from run sheet

These decisions live in Org Prime’s DO SQLite. The dispatcher reads the run sheet and sees the results — it never reads the priorities table directly.
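
To make the weighting concrete, here is one way the priorities table could feed into run sheet ranks. In the architecture described here the ordering comes from Org Prime's reasoning, so treat this as a deterministic baseline only. The additive scoring, the `frozenRepos` parameter, and the function names are all assumptions.

```typescript
interface Priority { key: string; weight: number } // key: "repo:<slug>" or "signal:<name>"
interface Candidate { repo: string; signal: string }

// Sum the repo weight and the signal weight; unknown keys count as 0.
function score(c: Candidate, priorities: Priority[]): number {
  const weights = new Map(priorities.map((p) => [p.key, p.weight] as const))
  return (weights.get(`repo:${c.repo}`) ?? 0) + (weights.get(`signal:${c.signal}`) ?? 0)
}

// Drop frozen repos, sort by score (highest first), and assign 1-based ranks.
function rank(
  candidates: Candidate[],
  priorities: Priority[],
  frozenRepos: string[] = [],
): Array<Candidate & { rank: number }> {
  return candidates
    .filter((c) => !frozenRepos.includes(c.repo))
    .map((c) => ({ c, s: score(c, priorities) }))
    .sort((a, b) => b.s - a.s)
    .map(({ c }, i) => ({ ...c, rank: i + 1 }))
}
```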

What Goes Away

Once Prime is running:

Deleted from garywu/scram-jet:

scripts/janitor-*.ts — replaced by dispatcher CF runner + Mulan executor
scripts/gen-org-index.ts — replaced by dispatcher README generation
scripts/checks.ts — signal checks move to dispatcher scanner
.github/workflows/rescan.yml in garywu/_readme — replaced by dispatcher cron

Deleted from garywu/_readme:

data/repos.jsonl — replaced by D1 repos table
data/changelog.jsonl — replaced by D1 signal_history table

No longer needed:

Manual conversation to start the daemon
Manual gh issue create to trigger Mulan
Manual scanning via wrangler dev

The machine runs itself.

References

Cloudflare

Cloudflare Agents SDK — The agents npm package. Durable Object abstraction with built-in SQLite, alarm API, WebSocket hibernation, and routeAgentRequest().
Durable Objects overview — Persistent, always-addressable stateful compute. Each instance has a unique stable ID and a private SQLite database.
Durable Objects: SQLite storage — this.ctx.storage.sql API. DO-private, survives hibernation/eviction.
Durable Objects: Alarms — storage.setAlarm(). Self-scheduling without external cron. The mechanism that makes Prime wake on its own.
Durable Objects: Hibernation — Zero CPU between events. The DO is not “sleeping” — it is not instantiated. State persists in storage.
Cloudflare D1 — Serverless SQLite at the edge. Shared org state (repos, attempts, run_sheet). Priorities stay in Org Prime’s DO SQLite.
Workers Cron Triggers — wrangler.jsonc triggers.crons for 30-minute dispatcher scheduling.
Cloudflare Workers compatibility flags: nodejs_compat — Enables Node.js built-in modules in Workers. crypto.randomUUID() itself is part of the Web Crypto API and works without the flag.
agents npm package — pnpm add agents. The Cloudflare Agents SDK. Agent<Env, State> base class, routeAgentRequest(), this.sql.

AI / LLM

Vercel AI SDK — pnpm add ai. generateObject() with Zod schema for structured LLM output. Used for Prime’s single-call-per-wake reasoning.
@ai-sdk/anthropic — pnpm add @ai-sdk/anthropic. Anthropic provider for Vercel AI SDK. anthropic('claude-sonnet-4-6').
Anthropic API: Models overview — Current model IDs. claude-sonnet-4-6 for production agents.
Anthropic API: Tool use — Structured tool calling. Used indirectly via generateObject() schema enforcement.

GitHub

GitHub REST API: Issues — Prime’s primary external memory interface. Create, comment, label, close issues. Every problem = one issue; every attempt = one comment.
GitHub REST API: Contents — GET /repos/{owner}/{repo}/contents/{path}. How Prime fetches CLAUDE.md. Response includes base64-encoded content.
GitHub Webhooks — Push events, CI status, PR events, issue label events. The “ears” of the reactive wake model.
GitHub Actions: Workflow triggers — CI status events that trigger Prime wakes.

Architecture Concepts

Kubernetes: Control plane components — The canonical implementation of control plane / data plane separation. Controller manager (desired state) vs kubelets (execution). Same pattern applied here.
The Twelve-Factor App: Processes — Stateless processes + backing stores. DO SQLite is the backing store; Prime instances are stateless between wakes.

garywu/cloudflare-durable-objects-patterns — Control Plane / Data Plane meta-pattern, hibernation model, four DO patterns. Origin of the architectural split applied here.
garywu/three-layer-ai-agent-architecture — Container / Brain / Wallet separation. Cost tracking, the $47 surprise bill. The three-layer model Prime implements.
garywu/cloudflare-autonomous-pipeline — Trigger matching (cron vs DO alarm vs Queue), D1 at scale, deploy readiness. Dispatcher scheduling patterns.
garywu/autonomous-agent-frameworks — 18-framework comparison (OpenClaw, Devin, etc.) — where Prime fits in the landscape.
garywu/agent-swarm — Working implementation of CF Agents SDK + OpenDash control plane. Prime builds directly on this foundation.
garywu/api-mom-intelligent-router — Four-tier routing: Workers AI (free) → OpenRouter free → paid API → subscription quota via runners. The router layer Prime calls — model selection is fully abstracted away from agent code.

Implementation Sequence

Phase 1 — Persistent local runner (WSL systemd service). The machine must run without conversation.
Phase 2 — Dispatcher owns scanning + README generation. Replaces rescan.yml.
Phase 3 — Attempt history. Dispatcher stops retrying blocked problems.
Phase 4 — Repo Prime DO. The core. One agent per repo, always-on.
Phase 5 — Org Prime DO. Aggregates repo Primes, produces run sheet.
Phase 6 — GitHub webhooks. Event-driven wakes supplement alarm-driven cycles.
Phase 7 — Scram-jet cleanup. Remove janitors and rescan.yml.

Each phase delivers value independently. The system gets progressively more autonomous with each phase.