A control plane architecture for autonomous multi-repo organizations — built on Cloudflare Agents SDK, Durable Objects, and GitHub as long-term memory.
Most AI coding assistants are reactive: you talk to them, they act, the session ends and everything is forgotten. That model works for one-off tasks but breaks down when you have 40 repos, continuous drift, and no human available to start every conversation.
This article describes a different model: Prime — a hierarchy of persistent AI agents, one per repo and one per org, that run autonomously, remember everything, and coordinate without human intervention. Built on the Cloudflare Agents SDK, each Prime is a Durable Object with its own SQLite database, self-scheduling alarms, and the intelligence to decide when to act and when to wait.
Table of Contents
- The Problem
- The Mental Model: Control Plane vs Data Plane
- The Agent Hierarchy
- What Prime Is — and Is Not
- Memory: Four Layers
- Eyes and Ears: Perception
- The Wake Cycle
- Implementation: Three Layers
- Code: RepoPrime Implementation
- Wrangler Configuration
- The Run Sheet: Brain to Dispatcher Handoff
- The Dispatcher: Dumb Resource Manager
- Attempt History: Never Retry Blindly
- Priority: User-Driven, Brain-Enforced
- What Goes Away
- References
- Implementation Sequence
The Problem
You have 40 repos. Across them:
- 5 have CI failing
- 11 are missing basic standards (biome, commitlint, dependabot)
- 3 have open issues labeled automated that nobody has touched
- 2 have Mulan PRs open that need merging
A dashboard tells you this. Nothing fixes it.
You start a conversation with Claude Code. It fixes some things. You end the conversation. Nothing runs until you start the next one.
This is the fundamental problem: the machine only works because a human is talking to it. That is not autonomy. That is a very smart assistant waiting for instructions.
The fix is not “make the assistant smarter.” The fix is architectural: give the system persistent agency — agents that are always present, always aware, and capable of deciding on their own when to act.
The Mental Model: Control Plane vs Data Plane
Every production system with complex state needs this separation:
Control Plane — makes decisions. Knows the desired state. Detects drift. Decides what to do about it. Runs infrequently. Produces a plan.
Data Plane — executes decisions. Moves data, runs jobs, applies fixes. Runs constantly. Follows the plan.
Kubernetes has this: the controller manager (control plane) reconciles desired vs actual state. Kubelets (data plane) execute on each node.
Your org needs the same separation:
Control Plane (Prime agents)
Org Prime DO — knows the whole portfolio, produces run sheet
Repo Prime DO × N — knows one repo, decides what needs doing
Data Plane (Dispatcher + Runners)
Dispatcher CF Worker — reads run sheet, dispatches jobs within capacity
CF Runner — handles simple file ops (add-biome, add-dependabot)
CI Runner — handles shell jobs via GitHub Actions
Local Runner — handles Claude SDK jobs (complex code changes)
The control plane thinks. The data plane executes. Neither does both.
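As a rough sketch of this split, the snippet below keeps planning and execution in separate functions. The names planFixes and executePlan and the shape of the signal map are assumptions made for illustration; they are not part of the Prime design itself.

```typescript
// Hypothetical shape: signal name -> whether the repo currently meets it.
type Signals = Record<string, boolean>;

interface PlannedFix {
  repo: string;
  signal: string;
}

// Control plane: compare actual state against the desired signals and
// produce a plan. It never executes anything itself.
function planFixes(
  desired: string[],
  actual: Record<string, Signals>,
): PlannedFix[] {
  const plan: PlannedFix[] = [];
  for (const [repo, signals] of Object.entries(actual)) {
    for (const signal of desired) {
      if (signals[signal] !== true) plan.push({ repo, signal });
    }
  }
  return plan;
}

// Data plane: run the plan item by item. It never decides what belongs in it.
function executePlan(
  plan: PlannedFix[],
  run: (fix: PlannedFix) => string,
): string[] {
  return plan.map(run);
}

const driftPlan = planFixes(["ci", "biome"], {
  "garywu/frontasy": { ci: true, biome: false },
  "garywu/niche-fi": { ci: false, biome: true },
});
const executionLog = executePlan(
  driftPlan,
  (fix) => `fix-${fix.signal} on ${fix.repo}`,
);
```

In the real system the plan would be the run sheet and the run callback would be a runner, but the boundary is the same: one side only decides, the other only executes.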
The Agent Hierarchy
Org Prime DO (one per GitHub org/account)
├── Knows: all repos, cross-repo patterns, org-wide priorities
├── Produces: run sheet — ordered, prioritized work backlog
├── Wakes: daily alarm, or when repo Prime flags something
└── Delegates to: N Repo Prime DOs
Repo Prime DO (one per repo)
├── Knows: this repo's signals, issue history, CLAUDE.md rules
├── Decides: what to fix, what to skip, what to flag up
├── Wakes: self-alarm (6h/1h/24h), GitHub webhook, org Prime
└── Delegates to: Dispatcher job queue
Dispatcher CF Worker (one per org)
├── Reads: run sheet from Org Prime
├── Manages: runner capacity, job deduplication
├── Dispatches: top N jobs within capacity constraints
└── Reports: outcomes back to Repo Primes
Each Repo Prime is the autonomous agent for its repo. The same intelligence as a Claude Code session — but always-on, always aware, not dependent on a human starting a conversation.
What Prime Is — and Is Not
What Prime is
- A presence — always addressable at a stable ID. Any part of the system can wake it.
- Stateful — its SQLite database survives indefinitely. Prime never forgets.
- Self-scheduling — sets its own alarms. No external cron needed.
- Intelligent — makes decisions via LLM, not hardcoded rules.
- Selective — receives notifications but chooses whether to act.
What Prime is NOT
- Not continuously running — Prime hibernates between wakes. Zero CPU, zero cost when idle. This is not a failure mode; it is the design.
- Not a task queue consumer — Prime does not act on every event. It reasons about events and decides.
- Not a cron job — Crons fire on schedule regardless of context. Prime wakes, reads the world, and decides if action is needed.
The Hibernation Model
A common misconception: “how does it keep running if it’s on a Worker?”
Durable Objects hibernate. Between events (alarms, HTTP requests), the DO is not consuming CPU. Its SQLite state persists in Cloudflare’s storage. When an event arrives — an alarm fires, a webhook arrives, the dispatcher flags a problem — the DO instantiates in milliseconds, reads its state, handles the event, and hibernates again.
This is much like how a human expert works: available, responsive, and knowledgeable, but not sitting in a spinning loop burning energy.
Memory: Four Layers
Prime needs memory at multiple timescales and visibility levels.
Layer 1: Working Memory — DO SQLite (private to this Prime)
Fast and private. Operationally it is treated as scratch space, but it persists across DO hibernation/wake cycles.
-- Current plan and working context
CREATE TABLE working_memory (
key TEXT PRIMARY KEY,
value TEXT,
updated_at INTEGER
);
-- Recent decisions (last 48h)
CREATE TABLE decisions (
id TEXT PRIMARY KEY,
reasoning TEXT NOT NULL, -- why Prime decided what it decided
actions TEXT NOT NULL, -- JSON array of actions taken
decided_at INTEGER NOT NULL
);
-- Loaded identity (CLAUDE.md cache)
CREATE TABLE repo_context (
claude_md TEXT NOT NULL,
last_loaded INTEGER NOT NULL
);
This is Prime’s short-term memory. DO storage survives hibernation and eviction, but if it is ever wiped or the DO is recreated from scratch, Prime reconstructs its picture from GitHub issues (episodic memory) and CLAUDE.md (semantic memory).
Layer 2: Episodic Memory — GitHub Issues
This is the primary long-term memory. Human-readable, auditable, permanent.
Every problem Prime is aware of has a GitHub issue:
- Title: the problem
- Body: evolving understanding (Prime updates it as it learns more)
- Comments: every attempt — what was tried, what happened, what was learned
- Labels: state (automated, in-progress, blocked, needs-human)
- Open/closed: whether the problem is active or resolved
Why GitHub issues and not D1? Because:
- They survive everything — DO eviction, conversation end, machine reboot
- They are human-readable — a human can understand what Prime tried and why
- They are the existing communication channel between Prime and humans
- If Prime is reset, it reads the issue history and reconstructs its understanding
GitHub issues are not Prime’s inbox. They are Prime’s memory.
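For issue comments to work as memory, Prime has to be able to read its own notes back. One possible approach, sketched below, is to append a machine-readable payload inside an HTML comment, which GitHub hides when rendering Markdown. The marker string, the AttemptNote shape, and the function names are all assumptions for illustration.

```typescript
interface AttemptNote {
  signal: string;
  runner: "cf" | "ci" | "local";
  status: "success" | "failed" | "blocked";
  learned: string;
}

// Hypothetical marker; any string unlikely to appear in human comments works.
const ATTEMPT_MARKER = "<!-- prime-attempt:";

function formatAttemptComment(note: AttemptNote): string {
  // Escape ">" so the payload can never contain "-->" and close the comment early.
  const payload = JSON.stringify(note).replace(/>/g, "\\u003e");
  return [
    `Attempted fix for ${note.signal} via ${note.runner} runner: ${note.status}.`,
    "",
    note.learned,
    "",
    `${ATTEMPT_MARKER}${payload} -->`,
  ].join("\n");
}

function parseAttemptComment(body: string): AttemptNote | null {
  const start = body.lastIndexOf(ATTEMPT_MARKER);
  if (start === -1) return null; // a human comment, or one without a payload
  const end = body.indexOf(" -->", start);
  if (end === -1) return null;
  try {
    return JSON.parse(body.slice(start + ATTEMPT_MARKER.length, end)) as AttemptNote;
  } catch {
    return null;
  }
}

// Rebuild the attempt history from an issue's comment bodies, oldest first.
function rebuildHistory(commentBodies: string[]): AttemptNote[] {
  return commentBodies
    .map(parseAttemptComment)
    .filter((note): note is AttemptNote => note !== null);
}
```

Humans still see a plain-language summary in each comment, while a reset Prime can call rebuildHistory on the comments it fetches and recover what was tried.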
Layer 3: Semantic Memory — CLAUDE.md and Repo Files
Static knowledge that defines Prime’s identity for this specific repo.
- What this repo is and what it’s for
- Rules Prime must follow
- Domain knowledge (stack, conventions, constraints)
- Standards (what the org’s signals mean for this repo)
Prime loads CLAUDE.md on its first wake and caches it in DO SQLite, refreshing the copy when it is more than 24 hours old. It is the foundation everything else builds on. Without CLAUDE.md, Prime has no identity for this repo.
Layer 4: Shared Memory — D1 (org-wide)
Machine-readable org-wide index owned by the dispatcher.
-- Current signal states for all repos
CREATE TABLE repos (
remote TEXT PRIMARY KEY,
state TEXT NOT NULL -- JSON: all signal values
);
-- Attempt history
CREATE TABLE attempts (
id TEXT PRIMARY KEY,
repo TEXT NOT NULL,
signal TEXT NOT NULL,
runner_type TEXT NOT NULL,
status TEXT NOT NULL, -- "success" | "failed" | "blocked"
reason TEXT,
attempted_at INTEGER NOT NULL
);
-- Run sheet from Org Prime
CREATE TABLE run_sheet (
rank INTEGER NOT NULL,
repo TEXT NOT NULL,
signal TEXT NOT NULL,
runner TEXT NOT NULL,
reason TEXT NOT NULL,
approach TEXT NOT NULL,
cooldown_hours INTEGER NOT NULL DEFAULT 24,
updated_at INTEGER NOT NULL
);
D1 is not Prime’s memory — it is the data plane’s state that Prime queries. Prime reads from it (to understand current signal states and attempt history) and writes to it as a side effect (via the dispatcher reporting outcomes).
Eyes and Ears: Perception
Prime needs two modes of perception.
Reactive (Ears) — Push Model
Events that wake Prime immediately:
| Source | Event | What Prime does |
|---|---|---|
| GitHub webhook | Issue labeled automated | Read issue, decide if action needed |
| GitHub webhook | CI failed on main | Read CI log, assess severity |
| GitHub webhook | PR merged/closed | Update working memory, close tracking issue |
| GitHub webhook | Push to main | Check if any signals changed |
| Dispatcher | Job completed (success/fail/blocked) | Update episodic memory (comment on issue) |
| Dispatcher | Job failed 3x | Flag Org Prime, update issue to needs-human |
| Org Prime | Delegation (cross-repo task) | Read context, incorporate into plan |
| User | Conversation message | Highest priority — update priorities, re-plan |
Polling (Eyes) — Pull Model
Prime actively reads when it wakes on alarm:
- GitHub API: current signal states (CI, biome, commitlint, etc.)
- Open issues: what problems exist, what’s labeled automated
- Open PRs: what’s in flight (Mulan PRs, Dependabot PRs)
- Recent git log: what changed since last wake
- D1 attempts table: what has been tried, what failed
- Dispatcher runner status: what capacity exists
The distinction matters. Ears tell Prime something happened. Eyes tell Prime what the world looks like now. Prime needs both — events alone miss drift; polling alone misses urgency.
The Notification Decision
Prime receives notifications but is not obligated to act. This is the fundamental difference between an agent and a task queue consumer.
A dumb system: webhook fires → dispatch a job.
Prime: webhook fires → Prime wakes → reads current state → checks working memory → decides.
Examples:
- CI failed on a branch where Mulan already has an open PR → Prime knows this, already tracking it. No new action.
- CI failed on main after a human push → Prime investigates, creates tracking issue.
- New issue labeled automated → Prime reads it, checks if already in its plan, decides whether to act now or queue it.
- Dependabot PR opened → Prime checks if it conflicts with anything in flight.
Intelligence means knowing when NOT to act.
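The examples above can be partly handled by a cheap pre-filter that runs before any LLM call. The sketch below is one way to express it; the event shapes, the TrackingState fields, and the two outcomes are assumptions for illustration, and the final decision about what to do would still belong to the reasoning step.

```typescript
type WakeEvent =
  | { kind: "ci-failed"; branch: string }
  | { kind: "issue-labeled-automated"; issueNumber: number }
  | { kind: "dependabot-pr-opened"; prNumber: number };

// Hypothetical summary of what Prime already tracks in working memory.
interface TrackingState {
  branchesWithOpenFixPr: string[]; // branches where a fix PR is already open
  plannedIssueNumbers: number[];   // issues already in the current plan
  inFlightJobIds: string[];        // jobs currently running
}

type Triage = "skip" | "reason";

function triageEvent(event: WakeEvent, tracking: TrackingState): Triage {
  switch (event.kind) {
    case "ci-failed":
      // Already tracked if a fix PR is open on that branch.
      return tracking.branchesWithOpenFixPr.includes(event.branch) ? "skip" : "reason";
    case "issue-labeled-automated":
      // Nothing new to learn if the issue is already in the plan.
      return tracking.plannedIssueNumbers.includes(event.issueNumber) ? "skip" : "reason";
    case "dependabot-pr-opened":
      // A conflict is only possible when something is in flight.
      return tracking.inFlightJobIds.length === 0 ? "skip" : "reason";
  }
}
```

A "skip" here means the event adds nothing to Prime's picture, so no reasoning cycle is spent on it; anything ambiguous falls through to "reason".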
The Wake Cycle
When Prime wakes (from alarm, webhook, or external call):
1. Load Identity
→ Read CLAUDE.md from repo (cached in DO SQLite, refresh if >24h old)
→ Read working memory (what was I doing? what's in flight?)
2. Read the World (eyes — only on alarm wake, not on every webhook)
→ Query D1 for current signal states
→ Fetch open issues labeled automated or in-progress
→ Check open Mulan PRs
→ Query D1 attempts for recent history
3. Understand the Event (if reactive wake)
→ What happened?
→ Is this already in my current plan?
→ Does it change my assessment?
4. Reason (one LLM call)
→ Input: identity + world state + event + history + current plan
→ Output: list of decisions with reasoning
→ Constraint: max 5 actions per wake cycle
5. Act
→ Submit jobs to dispatcher
→ Comment on GitHub issues (write episodic memory)
→ Update DO working memory
→ Flag Org Prime if needed
6. Schedule Next Wake
→ Work pending → 1h alarm
→ Nothing pending → 6h alarm
→ Blocked waiting on human → 24h alarm
The LLM call happens once per wake cycle. Not per event, not per job. Prime reasons holistically about everything it knows, then acts.
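Step 6 is simple enough to express as a pure function. The sketch below assumes a small WakeOutcome summary (a hypothetical type) and assumes that pending work takes precedence over a human block when both apply.

```typescript
// Hypothetical summary of where Prime stands at the end of a wake cycle.
interface WakeOutcome {
  pendingWork: number;     // planned or in-flight items Prime can still progress
  waitingOnHuman: boolean; // at least one issue is marked needs-human
}

const SECONDS_PER_HOUR = 3600;

// Returns the delay, in seconds, until the next alarm.
function nextWakeDelaySeconds(outcome: WakeOutcome): number {
  if (outcome.pendingWork > 0) return 1 * SECONDS_PER_HOUR;   // work pending
  if (outcome.waitingOnHuman) return 24 * SECONDS_PER_HOUR;   // blocked on a human
  return 6 * SECONDS_PER_HOUR;                                // nothing pending
}
```

The result is in seconds because that is the unit the RepoPrime code in this article passes to this.schedule().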
Implementation: Three Layers
This maps directly to the three-layer architecture from garywu/three-layer-ai-agent-architecture:
Layer 1: Container (Cloudflare Agents SDK)
pnpm add agents ← https://www.npmjs.com/package/agents
→ Durable Object runtime
→ Built-in SQLite (DO-private memory)
→ Alarm API (self-scheduling)
→ WebSocket (real-time connection to user sessions)
→ Hibernation (zero cost when idle)
Layer 2: Brain (Vercel AI SDK → API Mom intelligent router)
pnpm add ai @ai-sdk/anthropic ← https://sdk.vercel.ai/docs
→ generateObject() with Zod schema + structured decision output
→ One call per wake cycle
→ Prime passes capability hint only — never specifies a model
→ API Mom routes: Workers AI (free) → OpenRouter free → Haiku → Sonnet
→ See: garywu/api-mom-intelligent-router
Layer 3: Wallet (API Mom proxy + intelligent router)
→ All LLM calls routed through centralized proxy
→ Per-Prime cost attribution
→ Daily spend limits enforced
→ Prevents runaway spend across N repo Primes
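The proxy enforces spend limits centrally, but a Prime can also check its own budget before reasoning. The sketch below shows one way a daily limit with a UTC rollover might work; the SpendLedger shape and the function names are assumptions for illustration, not the API Mom interface.

```typescript
// Hypothetical per-Prime ledger; "day" is a UTC date such as "2026-03-20".
interface SpendLedger {
  day: string;
  spentUsd: number;
}

function utcDay(nowMs: number): string {
  return new Date(nowMs).toISOString().slice(0, 10);
}

// Roll the ledger over to zero when the UTC day has changed.
function currentLedger(ledger: SpendLedger, nowMs: number): SpendLedger {
  const today = utcDay(nowMs);
  return ledger.day === today ? ledger : { day: today, spentUsd: 0 };
}

// True if an estimated call cost still fits within today's budget.
function canAfford(
  ledger: SpendLedger,
  budgetUsd: number,
  estimatedUsd: number,
  nowMs: number,
): boolean {
  return currentLedger(ledger, nowMs).spentUsd + estimatedUsd <= budgetUsd;
}

// Returns a new ledger with the actual cost added to today's total.
function recordSpend(ledger: SpendLedger, costUsd: number, nowMs: number): SpendLedger {
  const current = currentLedger(ledger, nowMs);
  return { day: current.day, spentUsd: current.spentUsd + costUsd };
}
```

If canAfford returns false, a Prime could skip the LLM call and simply schedule its next wake, which keeps a misbehaving repo from consuming the whole org's budget.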
Code: RepoPrime Implementation
Base State
// src/types.ts
export interface PrimeState {
repoSlug: string // "garywu/frontasy"
status: 'idle' | 'reasoning' | 'acting' | 'blocked' | 'waiting-human'
currentPlan: PlanItem[] // what Prime intends to do
inFlight: string[] // job IDs currently running
lastWakeReason: string // why it woke last
lastDecision: string // summary of last reasoning
costToday: number
costBudget: number
wakeCount: number
}
export interface PlanItem {
signal: string
action: string
runner: 'cf' | 'ci' | 'local'
issueNumber?: number // tracking issue in GitHub
reason: string
}
export interface Decision {
actions: Array<{
type: 'submit-job' | 'create-issue' | 'comment-issue' | 'flag-org' | 'wait'
signal?: string
runner?: string
issueNumber?: number
content?: string
reason: string
}>
nextWakeHours: number // how long until next alarm
summary: string // one-line summary of reasoning
}
The Agent Class
// src/agents/repo-prime.ts
import { Agent } from 'agents'
import { generateObject } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'
import { z } from 'zod'
import type { Env, PrimeState, Decision } from '../types'
const DecisionSchema = z.object({
actions: z.array(z.object({
type: z.enum(['submit-job', 'create-issue', 'comment-issue', 'flag-org', 'wait']),
signal: z.string().optional(),
runner: z.enum(['cf', 'ci', 'local']).optional(),
issueNumber: z.number().optional(),
content: z.string().optional(),
reason: z.string(),
})),
nextWakeHours: z.number(),
summary: z.string(),
})
export class RepoPrime extends Agent<Env, PrimeState> {
initialState: PrimeState = {
repoSlug: '',
status: 'idle',
currentPlan: [],
inFlight: [],
lastWakeReason: 'init',
lastDecision: '',
costToday: 0,
costBudget: 2.0, // $2/day max per repo
wakeCount: 0,
}
async onStart() {
// Initialize SQLite tables
this.sql`CREATE TABLE IF NOT EXISTS working_memory (
key TEXT PRIMARY KEY, value TEXT, updated_at INTEGER
)`
this.sql`CREATE TABLE IF NOT EXISTS decisions (
id TEXT PRIMARY KEY, reasoning TEXT, actions TEXT, decided_at INTEGER
)`
this.sql`CREATE TABLE IF NOT EXISTS repo_context (
claude_md TEXT, last_loaded INTEGER
)`
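// Schedule the first wake only if none is pending: onStart runs on every
// instantiation, so scheduling unconditionally would stack duplicate wakes
if (this.getSchedules().length === 0) {
await this.schedule(3600, 'wakeAndReason', { reason: 'scheduled' })
}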
}
// --- Scheduled wake ---
async wakeAndReason({ reason }: { reason: string }) {
this.setState({ ...this.state, status: 'reasoning', lastWakeReason: reason, wakeCount: this.state.wakeCount + 1 })
try {
const context = await this.buildContext()
const { object: decision } = await generateObject({
model: anthropic('claude-sonnet-4-6'),
schema: DecisionSchema,
system: context.claudeMd,
prompt: this.buildPrompt(context),
})
// Record decision
this.sql`INSERT INTO decisions VALUES (
${crypto.randomUUID()}, ${decision.summary}, ${JSON.stringify(decision.actions)}, ${Date.now()}
)`
this.setState({ ...this.state, status: 'acting', lastDecision: decision.summary })
// Execute decisions
await this.executeDecisions(decision.actions)
this.setState({ ...this.state, status: 'idle' })
// Schedule next wake
await this.schedule(decision.nextWakeHours * 3600, 'wakeAndReason', { reason: 'scheduled' })
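} catch (err) {
// Log the failure so it shows up in Worker logs, then retry in an hour
console.error('wakeAndReason failed', err)
this.setState({ ...this.state, status: 'idle' })
await this.schedule(3600, 'wakeAndReason', { reason: 'retry-after-error' })
}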
}
// --- Event-driven wake (from webhook/dispatcher) ---
async onRequest(request: Request): Promise<Response> {
const url = new URL(request.url)
if (url.pathname.endsWith('/wake') && request.method === 'POST') {
const body = await request.json() as { event: string; data?: unknown }
// Don't re-reason on every webhook — only if it changes our picture
if (this.shouldActOnEvent(body.event)) {
await this.wakeAndReason({ reason: body.event })
}
return Response.json({ ok: true })
}
if (url.pathname.endsWith('/status')) {
return Response.json(this.state)
}
return Response.json({ error: 'not found' }, { status: 404 })
}
// --- Context building ---
private async buildContext() {
// Load CLAUDE.md (cached, refresh if >24h old)
const [ctxRow] = [...this.sql`SELECT * FROM repo_context LIMIT 1`]
let claudeMd = ctxRow?.claude_md as string ?? ''
if (!ctxRow || (Date.now() - Number(ctxRow.last_loaded)) > 86_400_000) {
claudeMd = await this.fetchClaudeMd()
this.sql`DELETE FROM repo_context`
this.sql`INSERT INTO repo_context VALUES (${claudeMd}, ${Date.now()})`
}
// Read world state
const [repoState, openIssues, attempts] = await Promise.all([
this.fetchRepoState(),
this.fetchOpenIssues(),
this.fetchAttemptHistory(),
])
return { claudeMd, repoState, openIssues, attempts }
}
private buildPrompt(context: Awaited<ReturnType<typeof this.buildContext>>): string {
const failing = Object.entries(context.repoState)
.filter(([, v]) => v === false || String(v).includes('failing'))
.map(([k]) => k)
return `
You are Prime for ${this.state.repoSlug}.
Current problems: ${failing.join(', ') || 'none'}
Open automated issues: ${context.openIssues.length}
Recent attempts: ${JSON.stringify(context.attempts.slice(0, 10))}
In flight: ${this.state.inFlight.join(', ') || 'none'}
Cost today: $${this.state.costToday.toFixed(4)} / $${this.state.costBudget}
Decide what to do. Be selective — don't act on everything at once.
Prefer CF runner (free, immediate) over local runner (costs tokens).
Don't re-attempt anything that recently failed without new information.
`.trim()
}
private shouldActOnEvent(event: string): boolean {
// Don't reason on every single event — only meaningful changes
const actOnEvents = ['ci-failed-main', 'issue-labeled-automated', 'job-failed', 'job-blocked']
return actOnEvents.some(e => event.includes(e))
}
// --- Action execution ---
private async executeDecisions(actions: Decision['actions']) {
for (const action of actions) {
switch (action.type) {
case 'submit-job':
await this.submitDispatcherJob(action)
break
case 'comment-issue':
await this.commentOnIssue(action.issueNumber!, action.content!)
break
case 'create-issue':
await this.createTrackingIssue(action)
break
case 'flag-org':
await this.flagOrgPrime(action.reason)
break
case 'wait':
// Intentionally do nothing — Prime decided to wait
break
}
}
}
private async submitDispatcherJob(action: Decision['actions'][0]) {
const res = await fetch(`${this.env.DISPATCHER_URL}/jobs`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${this.env.DISPATCHER_SECRET}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
type: `fix-${action.signal}`,
repo: this.state.repoSlug,
needs: action.runner === 'cf' ? ['github-api'] : ['claude'],
priority: 2,
payload: { source: 'prime', reason: action.reason },
}),
})
if (res.ok) {
const job = await res.json() as { id: string }
this.setState({ ...this.state, inFlight: [...this.state.inFlight, job.id] })
}
}
// --- Fetch helpers (GitHub API, D1 queries) ---
private async fetchClaudeMd(): Promise<string> {
const res = await fetch(
`https://api.github.com/repos/${this.state.repoSlug}/contents/CLAUDE.md`,
{ headers: { Authorization: `Bearer ${this.env.GITHUB_TOKEN}`, 'User-Agent': 'prime-agent' } }
)
if (!res.ok) return '(no CLAUDE.md)'
const data = await res.json() as { content: string }
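// GitHub returns base64 with embedded newlines. atob() alone yields a binary
// string, so decode the bytes as UTF-8 to keep non-ASCII characters intact
const bytes = Uint8Array.from(atob(data.content.replace(/\n/g, '')), c => c.charCodeAt(0))
return new TextDecoder().decode(bytes)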
}
private async fetchRepoState(): Promise<Record<string, unknown>> {
const res = await fetch(
`${this.env.DISPATCHER_URL}/repos/${encodeURIComponent(this.state.repoSlug)}`,
{ headers: { Authorization: `Bearer ${this.env.DISPATCHER_SECRET}` } }
)
return res.ok ? res.json() : {}
}
private async fetchOpenIssues(): Promise<unknown[]> {
const res = await fetch(
`https://api.github.com/repos/${this.state.repoSlug}/issues?labels=automated&state=open&per_page=20`,
{ headers: { Authorization: `Bearer ${this.env.GITHUB_TOKEN}`, 'User-Agent': 'prime-agent' } }
)
return res.ok ? res.json() : []
}
private async fetchAttemptHistory(): Promise<unknown[]> {
const res = await fetch(
`${this.env.DISPATCHER_URL}/attempts/${encodeURIComponent(this.state.repoSlug)}`,
{ headers: { Authorization: `Bearer ${this.env.DISPATCHER_SECRET}` } }
)
return res.ok ? res.json() : []
}
private async commentOnIssue(number: number, body: string): Promise<void> {
await fetch(
`https://api.github.com/repos/${this.state.repoSlug}/issues/${number}/comments`,
{
method: 'POST',
headers: {
Authorization: `Bearer ${this.env.GITHUB_TOKEN}`,
'Content-Type': 'application/json',
'User-Agent': 'prime-agent',
},
body: JSON.stringify({ body }),
}
)
}
private async createTrackingIssue(action: Decision['actions'][0]): Promise<void> {
await fetch(
`https://api.github.com/repos/${this.state.repoSlug}/issues`,
{
method: 'POST',
headers: {
Authorization: `Bearer ${this.env.GITHUB_TOKEN}`,
'Content-Type': 'application/json',
'User-Agent': 'prime-agent',
},
body: JSON.stringify({
title: `fix: ${action.signal} — ${action.reason}`,
labels: ['automated'],
body: `Prime created this issue to track: ${action.reason}\n\n_Source: Prime wake cycle ${this.state.wakeCount}_`,
}),
}
)
}
private async flagOrgPrime(reason: string): Promise<void> {
await fetch(`${this.env.ORG_PRIME_URL}/flag`, {
method: 'POST',
headers: { Authorization: `Bearer ${this.env.DISPATCHER_SECRET}`, 'Content-Type': 'application/json' },
body: JSON.stringify({ repo: this.state.repoSlug, reason }),
})
}
}
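The wake cycle calls for at most 5 actions per wake and alarms between 1h and 24h, but DecisionSchema as written accepts any array length and any number for nextWakeHours. A small guard applied to the model output before executing it is one way to close that gap. The sketch below is an assumption about how that could look; the constant names and the 6h fallback are illustrative choices.

```typescript
// Minimal structural type so the sketch stands alone; it mirrors the
// fields of the Decision type that matter here.
interface DecisionLike<A> {
  actions: A[];
  nextWakeHours: number;
  summary: string;
}

const MAX_ACTIONS_PER_WAKE = 5;
const MIN_WAKE_HOURS = 1;
const MAX_WAKE_HOURS = 24;
const DEFAULT_WAKE_HOURS = 6; // assumed fallback when the value is not a finite number

function boundDecision<A>(decision: DecisionLike<A>): DecisionLike<A> {
  const hours = Number.isFinite(decision.nextWakeHours)
    ? decision.nextWakeHours
    : DEFAULT_WAKE_HOURS;
  return {
    ...decision,
    // Keep only the first five actions; the rest wait for the next wake.
    actions: decision.actions.slice(0, MAX_ACTIONS_PER_WAKE),
    // Clamp the alarm into the 1h to 24h window.
    nextWakeHours: Math.min(MAX_WAKE_HOURS, Math.max(MIN_WAKE_HOURS, hours)),
  };
}
```

The same bounds could instead be declared on the Zod schema with .max(5) on the array and .min(1).max(24) on the number, at the cost of a validation failure rather than a silent clamp when the model oversteps.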
Worker Entry Point
// src/index.ts
import { routeAgentRequest } from 'agents'
import type { Env } from './types'
export { RepoPrime } from './agents/repo-prime'
export { OrgPrime } from './agents/org-prime'
export default {
async fetch(request: Request, env: Env): Promise<Response> {
// Health check
if (new URL(request.url).pathname === '/health') {
return Response.json({ ok: true })
}
// All /agents/* routes handled by SDK
const agentResponse = await routeAgentRequest(request, env)
if (agentResponse) return agentResponse
return Response.json({ error: 'not found' }, { status: 404 })
},
} satisfies ExportedHandler<Env>
Wrangler Configuration
// wrangler.jsonc
{
"name": "org-prime",
"main": "src/index.ts",
"compatibility_date": "2025-12-01",
"compatibility_flags": ["nodejs_compat"],
"durable_objects": {
"bindings": [
{ "name": "REPO_PRIME", "class_name": "RepoPrime" },
{ "name": "ORG_PRIME", "class_name": "OrgPrime" }
]
},
"migrations": [
{ "tag": "v1", "new_sqlite_classes": ["RepoPrime", "OrgPrime"] }
],
"d1_databases": [
{
"binding": "DB",
"database_name": "org-prime-shared",
"database_id": "..."
}
]
}
Addressing one Prime instance from another:
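// Get (or create) a Prime for a specific repo.
// getAgentByName names the instance before returning the stub, which the
// SDK expects before the agent will serve requests
import { getAgentByName } from 'agents'
const prime = await getAgentByName(env.REPO_PRIME, 'garywu/frontasy')
await prime.fetch('https://prime/wake', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ event: 'ci-failed-main' })
})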
The Run Sheet: Brain to Dispatcher Handoff
The Org Prime produces a run sheet — the interface between the control plane (intelligence) and the data plane (execution). The dispatcher reads this; it does not produce it.
[
{
"rank": 1,
"repo": "garywu/niche-fi",
"signal": "ci",
"runner": "local",
"reason": "CI failing on main blocks all other automation on this repo",
"approach": "Investigate TypeScript errors, fix type issues, verify CI passes",
"cooldown_hours": 72,
"added_at": "2026-03-20T07:00:00Z"
},
{
"rank": 2,
"repo": "garywu/frontasy",
"signal": "biome",
"runner": "cf",
"reason": "Quick win — idempotent file add, CF runner handles it in seconds",
"approach": "Add standard biome.json",
"cooldown_hours": 24,
"added_at": "2026-03-20T07:00:00Z"
}
]
The run sheet is owned by Org Prime. The dispatcher is not allowed to modify it — only read it and report outcomes.
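Because the run sheet is the only contract between the two planes, the dispatcher benefits from validating it before use. The sketch below is one possible reader, written without external libraries; the type and function names are assumptions, and the field list follows the JSON example above.

```typescript
type RunnerKind = "cf" | "ci" | "local";

interface RunSheetEntry {
  rank: number;
  repo: string;
  signal: string;
  runner: RunnerKind;
  reason: string;
  approach: string;
  cooldown_hours: number;
  added_at: string;
}

function isRunSheetEntry(value: unknown): value is RunSheetEntry {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.rank === "number" && Number.isInteger(v.rank) && v.rank > 0 &&
    typeof v.repo === "string" &&
    typeof v.signal === "string" &&
    (v.runner === "cf" || v.runner === "ci" || v.runner === "local") &&
    typeof v.reason === "string" &&
    typeof v.approach === "string" &&
    typeof v.cooldown_hours === "number" && v.cooldown_hours >= 0 &&
    typeof v.added_at === "string"
  );
}

// Parse, validate, and return entries ordered by rank. Throws on bad input
// instead of silently dropping entries the control plane asked for.
function parseRunSheet(json: string): RunSheetEntry[] {
  const data: unknown = JSON.parse(json);
  if (!Array.isArray(data)) throw new Error("run sheet must be a JSON array");
  const entries = data.filter(isRunSheetEntry);
  if (entries.length !== data.length) throw new Error("run sheet contains malformed entries");
  if (new Set(entries.map((e) => e.rank)).size !== entries.length) {
    throw new Error("run sheet contains duplicate ranks");
  }
  return [...entries].sort((a, b) => a.rank - b.rank);
}
```

Rejecting the whole sheet on a malformed entry is a deliberate choice in this sketch: the dispatcher is not supposed to reinterpret the plan, only to execute it or report that it cannot.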
The Dispatcher: Dumb Resource Manager
The dispatcher has no intelligence. It does not decide what to do. It only decides whether it can do it right now:
For each item in run sheet (by rank):
1. Job already pending/running for (repo, signal)? → skip
2. Last attempt blocked, nothing changed? → skip
3. Cooldown not elapsed? → skip
4. Required runner available and under capacity? → dispatch
5. Record attempt in D1
Capacity limits:
- CF runner: unlimited (stateless, free)
- CI runner: 3 concurrent (GitHub Actions minutes)
- Local runner: 2 concurrent (Claude API cost)
The dispatcher reports outcomes to the relevant Repo Prime DO. The Repo Prime updates its tracking issues (episodic memory) and flags the Org Prime if needed.
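The selection loop above can be sketched as a pure function over the run sheet and a snapshot of dispatcher state. The DispatcherView shape, the "repo|signal" key format, and the changedSince flag are assumptions for illustration (how "nothing changed" is detected is left open here), and the cooldown is assumed to apply after any attempt regardless of outcome.

```typescript
type Runner = "cf" | "ci" | "local";

interface RunSheetItem {
  rank: number;
  repo: string;
  signal: string;
  runner: Runner;
  cooldown_hours: number;
}

interface LastAttempt {
  status: "success" | "failed" | "blocked";
  attempted_at: number;  // epoch milliseconds
  changedSince: boolean; // hypothetical: has the repo changed since this attempt?
}

interface DispatcherView {
  activeJobs: Set<string>;               // "repo|signal" keys pending or running
  lastAttempt: Map<string, LastAttempt>; // most recent attempt per key
  running: Record<Runner, number>;       // jobs currently running per runner
}

const CAPACITY: Record<Runner, number> = { cf: Infinity, ci: 3, local: 2 };
const MS_PER_HOUR = 3_600_000;

function selectDispatchable(
  sheet: RunSheetItem[],
  view: DispatcherView,
  nowMs: number,
): RunSheetItem[] {
  const running = { ...view.running }; // local copy so the view is not mutated
  const picked: RunSheetItem[] = [];
  for (const item of [...sheet].sort((a, b) => a.rank - b.rank)) {
    const key = `${item.repo}|${item.signal}`;
    if (view.activeJobs.has(key)) continue;                          // 1. already in flight
    const last = view.lastAttempt.get(key);
    if (last?.status === "blocked" && !last.changedSince) continue;  // 2. still blocked
    if (last && nowMs - last.attempted_at < item.cooldown_hours * MS_PER_HOUR) continue; // 3. cooling down
    if (running[item.runner] >= CAPACITY[item.runner]) continue;     // 4. runner at capacity
    running[item.runner] += 1;
    picked.push(item); // 5. the caller dispatches and records the attempt in D1
  }
  return picked;
}
```

Note that a lower-ranked item can still be dispatched when a higher-ranked one is skipped, which is what lets free CF runner jobs proceed while the local runner is saturated.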
Attempt History: Never Retry Blindly
CREATE TABLE attempts (
id TEXT PRIMARY KEY,
repo TEXT NOT NULL,
signal TEXT NOT NULL,
runner_type TEXT NOT NULL,
job_id TEXT,
status TEXT NOT NULL, -- "success" | "failed" | "blocked"
reason TEXT, -- what went wrong, or what was blocking
attempted_at INTEGER NOT NULL
);
The dispatcher checks this before every dispatch. Three failed attempts with the same reason → the dispatcher flags the Repo Prime → the Repo Prime updates the GitHub issue to needs-human and sets its next wake to 24h (waiting for human).
This prevents the system from burning runner budget on problems it cannot solve.
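The escalation rule can be sketched as a check over rows from the attempts table. The version below interprets "three failed attempts with the same reason" as the three most recent attempts for a (repo, signal) pair all failing with an identical, non-empty reason; that interpretation and the function name are assumptions.

```typescript
interface AttemptRecord {
  repo: string;
  signal: string;
  status: "success" | "failed" | "blocked";
  reason: string | null;
  attempted_at: number; // epoch milliseconds
}

function shouldEscalate(
  attempts: AttemptRecord[],
  repo: string,
  signal: string,
  threshold = 3,
): boolean {
  // filter() returns a new array, so sorting it does not reorder the input.
  const recent = attempts
    .filter((a) => a.repo === repo && a.signal === signal)
    .sort((a, b) => b.attempted_at - a.attempted_at)
    .slice(0, threshold);
  if (recent.length < threshold) return false;
  const firstReason = recent[0]?.reason ?? null;
  // A missing reason never counts as "the same reason".
  if (firstReason === null) return false;
  return recent.every((a) => a.status === "failed" && a.reason === firstReason);
}
```

Requiring the same reason matters: three failures for three different reasons suggest the system is still learning something on each try, while three identical failures suggest it is stuck.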
Priority: User-Driven, Brain-Enforced
Priority is not hardcoded. It comes from conversation with the user and is held by the brain.
CREATE TABLE priorities (
key TEXT PRIMARY KEY, -- "repo:garywu/frontasy" or "signal:ci"
weight INTEGER NOT NULL,
note TEXT, -- reason (from conversation)
set_at INTEGER NOT NULL
);
Examples of user-driven priority:
- “CI failing → stop everything else until green” — Org Prime weights signal:ci at 100
- “Focus on garywu/frontasy this week” — Org Prime weights repo:garywu/frontasy at 80
- “Don’t touch seo-edge right now, it’s in a release freeze” — Org Prime removes it from run sheet
These decisions live in Org Prime’s DO SQLite. The dispatcher reads the run sheet and sees the results — it never reads the priorities table directly.
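In Prime the ranking is ultimately an LLM judgment, but the way weights feed into it can be illustrated with a deterministic fallback. In the sketch below, an item's score is the sum of its repo weight and its signal weight, and frozen repos are dropped entirely; the additive scoring, the frozenRepos parameter, and the function name are all assumptions.

```typescript
interface PriorityRow {
  key: string; // "repo:<slug>" or "signal:<name>", as in the priorities table
  weight: number;
}

interface Candidate {
  repo: string;
  signal: string;
}

interface RankedCandidate extends Candidate {
  score: number;
  rank: number;
}

function rankCandidates(
  candidates: Candidate[],
  priorities: PriorityRow[],
  frozenRepos: string[] = [],
): RankedCandidate[] {
  const weights = new Map(priorities.map((p) => [p.key, p.weight] as const));
  return candidates
    .filter((c) => !frozenRepos.includes(c.repo)) // release freeze: leave off the sheet
    .map((c) => ({
      ...c,
      score:
        (weights.get(`repo:${c.repo}`) ?? 0) +
        (weights.get(`signal:${c.signal}`) ?? 0),
    }))
    .sort((a, b) => b.score - a.score) // stable sort keeps input order on ties
    .map((c, index) => ({ ...c, rank: index + 1 }));
}
```

With signal:ci at 100 and repo:garywu/frontasy at 80, a failing CI on frontasy scores 180 and lands at rank 1, ahead of CI failures elsewhere and ahead of non-CI work on frontasy.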
What Goes Away
Once Prime is running:
Deleted from garywu/scram-jet:
- scripts/janitor-*.ts — replaced by dispatcher CF runner + Mulan executor
- scripts/gen-org-index.ts — replaced by dispatcher README generation
- scripts/checks.ts — signal checks move to dispatcher scanner
- .github/workflows/rescan.yml in garywu/_readme — replaced by dispatcher cron
Deleted from garywu/_readme:
- data/repos.jsonl — replaced by D1 repos table
- data/changelog.jsonl — replaced by D1 signal_history table
No longer needed:
- Manual conversation to start the daemon
- Manual gh issue create to trigger Mulan
- Manual scanning via wrangler dev
The machine runs itself.
References
Cloudflare
- Cloudflare Agents SDK — The agents npm package. Durable Object abstraction with built-in SQLite, alarm API, WebSocket hibernation, and routeAgentRequest().
- Durable Objects overview — Persistent, always-addressable stateful compute. Each instance has a unique stable ID and a private SQLite database.
- Durable Objects: SQLite storage — this.ctx.storage.sql API. DO-private, survives hibernation/eviction.
- Durable Objects: Alarms — storage.setAlarm(). Self-scheduling without external cron. The mechanism that makes Prime wake on its own.
- Durable Objects: Hibernation — Zero CPU between events. The DO is not “sleeping” — it is not instantiated. State persists in storage.
- Cloudflare D1 — Serverless SQLite at the edge. Shared org state (repos, attempts, run_sheet, priorities).
- Workers Cron Triggers — wrangler.jsonc triggers.crons for 30-minute dispatcher scheduling.
- Cloudflare Workers compatibility flags: nodejs_compat — Required for crypto and Node.js built-ins in Workers.
- agents npm package — pnpm add agents. The Cloudflare Agents SDK. Agent<Env, State> base class, routeAgentRequest(), this.sql.
AI / LLM
- Vercel AI SDK — pnpm add ai. generateObject() with Zod schema for structured LLM output. Used for Prime’s single-call-per-wake reasoning.
- @ai-sdk/anthropic — pnpm add @ai-sdk/anthropic. Anthropic provider for Vercel AI SDK. anthropic('claude-sonnet-4-6').
- Anthropic API: Models overview — Current model IDs. claude-sonnet-4-6 for production agents.
- Anthropic API: Tool use — Structured tool calling. Used indirectly via generateObject() schema enforcement.
GitHub
- GitHub REST API: Issues — Prime’s primary external memory interface. Create, comment, label, close issues. Every problem = one issue; every attempt = one comment.
- GitHub REST API: Contents — GET /repos/{owner}/{repo}/contents/{path}. How Prime fetches CLAUDE.md. Response includes base64-encoded content.
- GitHub Webhooks — Push events, CI status, PR events, issue label events. The “ears” of the reactive wake model.
- GitHub Actions: Workflow triggers — CI status events that trigger Prime wakes.
Architecture Concepts
- Kubernetes: Control plane components — The canonical implementation of control plane / data plane separation. Controller manager (desired state) vs kubelets (execution). Same pattern applied here.
- The Twelve-Factor App: Processes — Stateless processes + backing stores. DO SQLite is the backing store; Prime instances are stateless between wakes.
Related Articles (garywu)
- garywu/cloudflare-durable-objects-patterns — Control Plane / Data Plane meta-pattern, hibernation model, four DO patterns. Origin of the architectural split applied here.
- garywu/three-layer-ai-agent-architecture — Container / Brain / Wallet separation. Cost tracking, the $47 surprise bill. The three-layer model Prime implements.
- garywu/cloudflare-autonomous-pipeline — Trigger matching (cron vs DO alarm vs Queue), D1 at scale, deploy readiness. Dispatcher scheduling patterns.
- garywu/autonomous-agent-frameworks — 18-framework comparison (OpenClaw, Devin, etc.) — where Prime fits in the landscape.
- garywu/agent-swarm — Working implementation of CF Agents SDK + OpenDash control plane. Prime builds directly on this foundation.
- garywu/api-mom-intelligent-router — Four-tier routing: Workers AI (free) → OpenRouter free → paid API → subscription quota via runners. The router layer Prime calls — model selection is fully abstracted away from agent code.
Implementation Sequence
- Phase 1 — Persistent local runner (WSL systemd service). The machine must run without conversation.
- Phase 2 — Dispatcher owns scanning + README generation. Replaces rescan.yml.
- Phase 3 — Attempt history. Dispatcher stops retrying blocked problems.
- Phase 4 — Repo Prime DO. The core. One agent per repo, always-on.
- Phase 5 — Org Prime DO. Aggregates repo Primes, produces run sheet.
- Phase 6 — GitHub webhooks. Event-driven wakes supplement alarm-driven cycles.
- Phase 7 — Scram-jet cleanup. Remove janitors and rescan.yml.
Each phase delivers value independently. The system gets progressively more autonomous with each phase.