Gary Wu
A Repo Is Context

A project is nothing but context. Git is the right medium for context — content-addressable, versioned, permanent. Harnesses compete on compiling context into prompts; the shape of context itself is still wild. The next standardization layer for autonomous agents sits between git and the harness: a common topology for what a repo contains, generated from one canonical source, readable by any harness.


1. Every System Is a Compiler; Every Compiler Needs Input

Every LLM system is secretly a prompt compiler. Given accumulating state H and a goal G, it produces a prompt P; the model runs the prompt. Formally: P = f(H, G). The argument, with receipts, is made in the-harness-is-a-prompt-compiler. Models are vendor-fungible; harnesses, the f in that equation, are where engineering matters.
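
To make P = f(H, G) concrete, here is a minimal sketch of a prompt compiler. It is not how any particular harness works; the ContextItem type, the priority convention, and the character budget are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextItem:
    """One piece of accumulated state H, e.g. a file read from the repo."""
    source: str    # hypothetical label or path for where the text came from
    text: str      # the content itself
    priority: int  # illustrative convention: lower number = more important


def compile_prompt(h: list[ContextItem], g: str, budget_chars: int = 4000) -> str:
    """P = f(H, G): take context in priority order until the budget is spent, then append the goal."""
    sections: list[str] = []
    used = 0
    for item in sorted(h, key=lambda i: i.priority):
        block = f"## {item.source}\n{item.text.strip()}\n"
        if used + len(block) > budget_chars:
            continue  # skip what does not fit; a real harness might summarize instead
        sections.append(block)
        used += len(block)
    sections.append(f"## Goal\n{g.strip()}\n")
    return "\n".join(sections)
```

Everything interesting about a real harness hides inside the loop: what gets loaded, in what order, and what gets dropped when the budget runs out. The quality of the result still depends on what is in h to begin with.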

The harness thesis has a corollary that nobody has named yet: H is the input to f, and nobody has standardized H.

Every serious harness reads some pinned files, loads some memory, retrieves some documents, walks some directory. Claude Code reads CLAUDE.md and scans the working tree. OpenCode reads AGENTS.md. Cursor reads .cursor/rules/. Gemini CLI reads GEMINI.md. Copilot reads .github/copilot-instructions.md. OpenClaw reads MEMORY.md + daily logs + skills directory. Each harness makes its own bets about what’s worth loading and from where.
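
The per-harness entry points listed above can be written down as data, which makes the portability problem easy to see. This is a sketch under assumptions: the mapping copies only the paths named in the paragraph, and harness_coverage is a hypothetical helper, not part of any of these tools.

```python
from pathlib import Path

# Instruction entry points as named in the paragraph above (not an exhaustive list).
HARNESS_ENTRY_POINTS: dict[str, list[str]] = {
    "Claude Code": ["CLAUDE.md"],
    "OpenCode": ["AGENTS.md"],
    "Cursor": [".cursor/rules"],
    "Gemini CLI": ["GEMINI.md"],
    "Copilot": [".github/copilot-instructions.md"],
}


def harness_coverage(repo_root: Path) -> dict[str, bool]:
    """Report, per harness, whether at least one of its entry points exists and is non-empty."""

    def present(rel: str) -> bool:
        p = repo_root / rel
        if p.is_dir():
            return any(p.iterdir())  # an empty rules directory counts as missing
        return p.is_file() and p.stat().st_size > 0

    return {
        name: any(present(rel) for rel in rels)
        for name, rels in HARNESS_ENTRY_POINTS.items()
    }
```

Run against a repo that only has CLAUDE.md, this reports one harness covered and four not, which is the "switch harnesses and the context breaks" situation in miniature.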

What every harness assumes but nobody enforces: there is a place called a repo, and a shape to that place, and the quality of f depends on the shape being coherent. Switch harnesses and most of the context breaks, because each harness expected different files in different shapes. The field has implicitly standardized on “read markdown from a git repo” and explicitly standardized on nothing else.

This is the gap. Harnesses compete on f. Models compete on pretraining. Nobody competes on H, because nobody thinks of H as a designed artifact. But a repo is exactly that: a shaped body of context that a harness walks and compiles. If H is messy, no amount of f sophistication recovers it. If H is coherent, even a mediocre f produces good output.

2. The Five-Layer Stack

Once you see H as first-class, the full stack of an autonomous-agent system separates cleanly:

Layer 5 — Model:     LLM(P) → output                             (vendor-fungible)
Layer 4 — Harness:   f(H, G) → P                                 (the compiler; where harnesses compete)
Layer 3 — Index:     Mnemosi / gbrain: graph + FTS + vector      (queryable view over H)
Layer 2 — Standard:  canonical topology of H                     (the open lane)
Layer 1 — Git:       content-addressable, versioned, permanent   (the substrate)

Read bottom-up. Layer 1 is where H physically lives — the filesystem, versioned. Layer 2 is the shape of that filesystem — which files, in which directories, with which schemas. Layer 3 is the query layer on top of the shape — semantic search, graph, full-text — so harnesses can retrieve slices of H without re-walking the tree. Layer 4 is the compiler itself; it reads Layer 3’s index and Layer 2’s canonical paths, then produces P. Layer 5 is the model call.

Most of the industry’s discourse lives at Layers 4 and 5. Most of the industry’s engineering effort lives at Layer 4. Meanwhile the layer that determines whether Layer 4 has anything coherent to compile — Layer 2 — is undefined territory.

3. Why Git Is the Substrate

The substrate claim is sharper than “use a filesystem.” It’s specifically: git.

Git is a content-addressable, versioned, permanent, auditable, distributed object store. Every blob is keyed by its hash (SHA-1 by default; SHA-256 is available as an opt-in object format). Every commit a durable externalization. Every checkout a restore. Every merge a consolidation. Every clone a full durable copy. Most agent infrastructure reinvents one or two of these properties inside a database. Git already has all five, has had them for twenty years, and every developer on earth already speaks it.

| Substrate | Content-addressable | Versioned | Permanent | Auditable | Distributed | Queryable |
|---|---|---|---|---|---|---|
| Postgres (Letta, LangGraph Store) | No | Weak (app-level only) | Depends on backups | Hard (no native log) | No (single master) | Yes |
| Neo4j + Postgres (Zep, Graphiti) | No | No | Depends | Limited | No | Yes |
| Vector DB (Pinecone, Weaviate, Qdrant) | By embedding hash only | No | Depends | No | No | Yes (vector) |
| Local FS + SQLite (OpenClaw MEMORY.md, MoltWorker) | Only if git-backed | Only if git-backed | Only if git-backed | Only if git-backed | Only if git-backed | Partial |
| Git | Yes (SHA object DB) | Yes | Yes (every clone is a copy) | Yes (git log) | Yes (federated) | No (index separately) |

Git loses on one axis: you cannot SELECT * FROM context WHERE topic = 'x'. That is fine. Index into git; do not replace it. The index lives at Layer 3 — Mnemosi (UberMesh’s hybrid graph + semantic search + FTS layer) or gbrain (PGLite + pgvector + hybrid search) sits on top of git and provides the queryable view without becoming the system of record.
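
"Index into git; do not replace it" can be sketched as a derived index keyed by blob ID, so that it can be thrown away and rebuilt from the repo at any time. This is not how Mnemosi or gbrain work; it is a toy in-memory inverted index, and the DerivedIndex class and its methods are hypothetical names.

```python
import hashlib
import re
from collections import defaultdict


def blob_id(content: bytes) -> str:
    """Object ID git assigns to a blob under the default SHA-1 object format."""
    return hashlib.sha1(b"blob %d\0" % len(content) + content).hexdigest()


class DerivedIndex:
    """Toy full-text index over repo files; git stays the system of record."""

    def __init__(self) -> None:
        self.postings: dict[str, set[str]] = defaultdict(set)  # token -> paths
        self.indexed_blob: dict[str, str] = {}                 # path -> blob ID last indexed

    def update(self, files: dict[str, bytes]) -> int:
        """Re-index only files whose blob ID changed; return how many were re-indexed."""
        changed = 0
        for path, content in files.items():
            oid = blob_id(content)
            if self.indexed_blob.get(path) == oid:
                continue  # same content, same key: nothing to do
            for paths in self.postings.values():
                paths.discard(path)  # drop stale postings for this path
            text = content.decode("utf-8", "replace").lower()
            for token in set(re.findall(r"[a-z0-9]+", text)):
                self.postings[token].add(path)
            self.indexed_blob[path] = oid
            changed += 1
        return changed

    def search(self, query: str) -> list[str]:
        """Return paths containing every token in the query."""
        tokens = re.findall(r"[a-z0-9]+", query.lower())
        if not tokens:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        return sorted(hits)
```

Because the index only remembers blob IDs it has already seen, a second update over unchanged files does no work, and losing the index loses nothing that is not still in git.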

This matters because every other agent-memory product is making the opposite architectural bet: they move H into their database and lose version history, distribution, and auditability in exchange for query speed. Letta, MemGPT, Zep, Graphiti, LangGraph Store — all make this trade. Git+index keeps all six properties. The query speed gap closes with a decent index. The versioning and audit advantages compound forever.

Content-addressing is the principle that keeps us honest. The UberMesh invariants already bake this in: “same input + same pipeline = same output key.” That is literally how git blobs work. Running the same generator against the same H should produce the same P. Running the same P against the same model should produce (approximately) the same output. Content-addressing at every layer turns the agent pipeline from a pile of side-effects into a reproducible function.
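
The "same input = same key" property is small enough to show directly. Under git's default SHA-1 object format, a blob's ID is the hash of a short header plus the file bytes; the sketch below reimplements that with the standard library. The function name git_blob_id is an assumption for illustration.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID git assigns to a blob under the default SHA-1 object format."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()
```

For the empty file, git_blob_id(b"") returns e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, the same ID git itself reports for an empty blob. The key depends only on the bytes, never on the path, the time, or the machine, which is what lets the rest of the pipeline be treated as a function of its inputs.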

4. What’s Already Standardized, What Isn’t

Two things landed in the last twelve months that change the terrain.

AGENTS.md became the standard instruction file. The spec at agents.md is deliberately minimal: markdown at repo root, any headings, treat it as “README for agents.” The steward is the Linux Foundation’s Agentic AI Foundation. Adopters: OpenAI (Codex), Google (Gemini CLI, Jules), Sourcegraph (originated it), Factory (Droid), Cursor, Windsurf, GitHub Copilot coding agent, Aider, Kilo Code, opencode, Devin. Claude Code reads AGENTS.md when present but still prefers CLAUDE.md — dual-loading is the current pattern. Effectively, the “where should agent instructions live” war is over. The answer is AGENTS.md, with tool-specific overlays.

Rulesync became the canonical-source → tool-specific-file fanout. rulesync takes a canonical source in .rulesync/*.md and generates CLAUDE.md, AGENTS.md, .cursor/rules/*.mdc, GEMINI.md, .github/instructions/*.instructions.md, .clinerules/, .windsurfrules, Copilot directives, Roo Code files: twenty-plus tools. It has an import command for the reverse direction. There is no lock-in: if you abandon rulesync, the generated files remain in place as ordinary files. Coverage extends to a unified SKILL.md for skills across harnesses. This is an impressive piece of work; the “instruction file fanout” is effectively solved.

What has not been standardized is the rest of the repo: hq/, wiki/, skills/, memory/, subagents/, and the harness definition itself.

The gap is not instruction files. The gap is repo topology — the directory shape and file schemas that every harness walks, every agent writes to, every fleet coordinates through.

5. The Shape of Context

A well-formed repo in this framing has a defined topology. Minimum layout:

repo-root/
├── AGENTS.md                    — the landed standard: generated or canonical
├── CLAUDE.md                    — Claude Code overlay (may be generated from AGENTS.md)
├── GEMINI.md                    — Gemini overlay (optional)
├── .cursor/rules/               — Cursor overlay (generated)
├── .github/copilot-instructions.md  — Copilot overlay (generated)

├── HARNESS.md                   — the compiler definition: inputs, f-steps, budget, cadence
├── MEMORY.md                    — durable working memory (three-moves externalize target)

├── hq/                          — operations (what the agent DOES)
│   ├── agent.yaml               — identity, schedule, infrastructure, goals
│   ├── status.json              — runtime status
│   ├── sops/                    — standard operating procedures
│   ├── runbooks/                — incident/deploy playbooks
│   ├── rfcs/                    — decisions under discussion
│   └── decisions/               — decisions made (append-only)

├── wiki/                        — knowledge (what the agent KNOWS)
│   ├── SCHEMA.md                — wiki-local rules
│   ├── RESOLVER.md              — decision tree for filing knowledge
│   ├── index.md                 — L1 cache: page catalog
│   ├── log.md                   — chronological record
│   ├── entities/                — one page per entity
│   ├── concepts/                — patterns and principles
│   └── decisions/               — knowledge about past decisions

├── skills/                      — tools the agent can call
│   └── <skill-name>/SKILL.md    — per rulesync SKILL.md pattern

├── subagents/                   — delegated personas (Claude Code subagent shape)

└── src/                         — the code (itself a form of context)

Each directory has a schema. agent.yaml is zod-validated. Wiki pages have frontmatter with type, status, tier, tags. RFCs have header tables. Skills have manifests. The point is not that every repo needs every directory — a pure library with no operational cycles has no hq/ — but that when a directory exists, its shape is defined and machine-readable.
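
As an illustration of "defined and machine-readable", here is a minimal sketch of a wiki-page frontmatter check for the four keys named above. The allowed type values are an assumption inferred from the wiki/ directory names, and the parser handles only flat key: value lines; a real implementation would use a YAML parser and the repo's own schema.

```python
REQUIRED_KEYS = ("type", "status", "tier", "tags")
ALLOWED_TYPES = {"entity", "concept", "decision"}  # assumption, inferred from wiki/ directory names


def parse_frontmatter(page: str) -> dict[str, str]:
    """Parse a leading '---' delimited block of flat 'key: value' lines."""
    lines = page.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # an unterminated block is treated as missing frontmatter


def lint_wiki_page(page: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means the page passes."""
    fields = parse_frontmatter(page)
    if not fields:
        return ["missing frontmatter block"]
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if not fields.get(key)]
    page_type = fields.get("type")
    if page_type and page_type not in ALLOWED_TYPES:
        problems.append(f"unknown type: {page_type}")
    return problems
```

The same pattern applies to the other schemas: parse, compare against a declared shape, and report findings as data that a lint or an agent can act on.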

The canonical source for tool-specific instruction files (CLAUDE.md, .cursor/rules/, GEMINI.md, etc.) can be either:

  1. AGENTS.md itself — write it, generate the rest with rulesync.
  2. A higher-level source — e.g., hq/agent.yaml + a template — that generates AGENTS.md and the tool-specific variants.

Either works. The key is that no single tool-specific file is the source of truth; humans and agents edit one canonical artifact and regenerate.
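
A minimal sketch of option 1, assuming the simplest possible fanout: every overlay is the canonical AGENTS.md text under a "generated" banner. This is not rulesync's behavior or output format; the banner text, the overlay list, and both function names are hypothetical.

```python
GENERATED_BANNER = "<!-- generated from AGENTS.md; edit AGENTS.md and regenerate -->"

# Illustrative subset of single-file overlays from the layout above.
OVERLAY_PATHS = ("CLAUDE.md", "GEMINI.md", ".github/copilot-instructions.md")


def render_overlays(canonical: str) -> dict[str, str]:
    """Map each overlay path to the text it should contain, derived from the canonical source."""
    body = canonical.rstrip() + "\n"
    return {path: f"{GENERATED_BANNER}\n\n{body}" for path in OVERLAY_PATHS}


def stale_overlays(canonical: str, on_disk: dict[str, str]) -> list[str]:
    """List overlays that are missing or differ from what the canonical source would generate."""
    expected = render_overlays(canonical)
    return sorted(path for path, text in expected.items() if on_disk.get(path) != text)
```

The second function is the part that enforces "no tool-specific file is the source of truth": a hand edit to CLAUDE.md shows up as a stale overlay rather than silently becoming the only place a rule lives.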

6. The Three Moves, as Git Operations

The previous post, Autonomy Is Substrate Discipline, claims that every working autonomous agent implements the same three moves: externalize before collapse, restore on wake, consolidate periodically. If git is the substrate, each move maps to a git primitive:

| Move | Git primitive | What it looks like in practice |
|---|---|---|
| Externalize | git commit | Stop-hook or pre-compact trigger writes current state to MEMORY.md / wiki/ / hq/status.json, then commits. The commit is the durable externalize. |
| Restore | git checkout (implicit: repo clone at wake) | Boot-time: agent harness loads HARNESS.md + AGENTS.md + MEMORY.md + relevant wiki pages. The filesystem at HEAD is the restored state. |
| Consolidate | git rebase / squash + wiki-page update | Periodic (cron / alarm): agent reads raw history.jsonl / log.md, synthesizes into wiki-page compiled-truth sections, writes a consolidated commit. Noisy history gets squashed. |
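
The mapping above can be sketched as three functions that each return the git commands for one move. The function names and commit-message prefixes are assumptions, and the consolidate step uses a soft reset followed by a commit as one non-interactive way to squash; an interactive rebase would be another.

```python
import subprocess
from pathlib import Path


def externalize_cmds(paths: list[str], reason: str) -> list[list[str]]:
    """Externalize: stage the state files, then commit them with the reason."""
    return [
        ["git", "add", "--", *paths],
        ["git", "commit", "-m", f"externalize: {reason}"],
    ]


def restore_cmds(ref: str = "HEAD") -> list[list[str]]:
    """Restore: reset tracked files in the working tree to their state at ref."""
    return [["git", "checkout", ref, "--", "."]]


def consolidate_cmds(base_ref: str, summary: str) -> list[list[str]]:
    """Consolidate: squash everything after base_ref into a single commit."""
    return [
        ["git", "reset", "--soft", base_ref],
        ["git", "commit", "-m", f"consolidate: {summary}"],
    ]


def run_all(cmds: list[list[str]], repo: Path) -> None:
    """Run the commands in order inside repo; raises if any git command fails."""
    for argv in cmds:
        # Note: `git commit` exits non-zero when there is nothing to commit.
        subprocess.run(argv, cwd=repo, check=True)
```

A stop-hook could call run_all(externalize_cmds(["MEMORY.md", "hq/status.json"], "pre-compact"), repo). Keeping the command lists separate from the runner makes each move easy to inspect and test without touching a repository.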

Three consequences fall out of this mapping.

First: time travel is free. Check out an older commit, get the older harness and older H. Reproduce the old inputs exactly, and the old behavior as closely as the model allows. Debug a past failure by walking the commit log. This is hard to get from DB-backed agent-memory systems that keep no native version history; in git it is the default.

Second: fleet coherence is a merge strategy. When two agents work on related context — Jane writes to wiki/ for an analysis, Mulan writes to hq/rfcs/ for the same initiative — their commits merge in git the way any developer’s commits would. Conflicts surface as conflicts. The coordination layer is the same one humans already use.

Third: substrate discipline becomes auditable. git log -- MEMORY.md shows every externalize. The commit message shows the reason. A lint can verify that the agent committed before context would have overflowed (stop-hook present), that boot-time restore reads the right files (HARNESS.md references MEMORY.md), that consolidation actually happens on schedule (commits to wiki/ at expected cadence). The discipline is enforceable because it’s visible.
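
The cadence check can be sketched as a small function over commit timestamps. The sketch assumes the timestamps come from git log --format=%ct -- <path>, which prints one committer Unix timestamp per commit; the function names and the gap threshold are hypothetical.

```python
import subprocess


def parse_commit_times(git_log_output: str) -> list[int]:
    """Turn the output of `git log --format=%ct` into ascending Unix timestamps."""
    return sorted(int(line) for line in git_log_output.split())


def cadence_violations(times: list[int], max_gap_seconds: int) -> list[tuple[int, int]]:
    """Return consecutive commit pairs that are further apart than the allowed gap."""
    return [(a, b) for a, b in zip(times, times[1:]) if b - a > max_gap_seconds]


def commit_times(repo: str, path: str) -> list[int]:
    """Read commit timestamps for one path from a real repository (requires git on PATH)."""
    result = subprocess.run(
        ["git", "log", "--format=%ct", "--", path],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return parse_commit_times(result.stdout)
```

Calling cadence_violations(commit_times(".", "MEMORY.md"), 6 * 3600) would list every stretch longer than six hours with no externalize commit. The same two pure functions work for wiki/ consolidation with a longer threshold.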

7. Claude Code Has the Best Harness Today; H Is Still Wild

Claude Code is currently the strongest shipped harness. Not because its f is the most sophisticated in the literature — Hermes + Honcho and Prime have richer per-wake compilation — but because Claude Code has operationalized harness concerns that most research frameworks haven’t: subagents with persistent memory, skills as a first-class primitive, hooks (stop-hook, pre-compact-hook, session-start-hook) that let the harness extend itself, auto-memory, MCP integration, session resumption. Subagents with their own CLAUDE.md files create a nested harness structure that works. The tooling is tight.

And yet. Every Claude Code repo shapes H differently. The CLAUDE.md in one repo says one thing; the next says another. Skills are in skills/ in one repo, .claude/skills/ in another, loose markdown in a third. Subagents are configured inconsistently. The wiki, if one exists, is wherever the team put it. Subtree references across repos don’t follow a pattern.

A great harness compiling messy H still produces messy output. Claude Code’s maturity on f doesn’t fix the fact that H is bespoke per repo. When the team adds OpenCode alongside Claude Code, the OpenCode harness reads AGENTS.md but misses all the context that lives only in CLAUDE.md. When an engineer opens Cursor on the same repo, .cursor/rules/ is empty. The harness investment doesn’t transfer because the context shape isn’t portable.

This is the fleet problem. Forty repos, each with its own H shape, consumed by five harnesses, each with its own f. The Cartesian product is chaos. The fix is not to pick one harness — the harnesses compete on real axes and will continue to. The fix is to standardize H so any harness walks it coherently.

8. The Open Lane: Repo-Topology Standardization

The gap is now named precisely:

  1. AGENTS.md standardizes the top-level instruction file.
  2. rulesync standardizes the fanout of instruction files to tool-specific locations.
  3. Nothing standardizes the rest of the repo: hq/, wiki/, skills/, memory/, subagents/, the harness definition itself.

The tool we want to build sits in that third slot. At a minimum it does six things:

  1. Scaffolds new repos with the canonical topology (hq/ + wiki/ + skills/ + HARNESS.md + AGENTS.md + everything above). Multiple project-type templates: BusinessUnit, Worker, Library, Agent, ClawPlugin. All git-native from birth.
  2. Lints existing repos against the topology. Missing directories flagged. Schema violations caught. Wiki iron-laws (back-links bidirectional, citations present, staleness detected) enforced. Three-moves check: does this repo externalize, restore, consolidate? Does it do so via git primitives?
  3. Migrates existing repos into the topology. Detects pre-standard shape, generates synthetic answers from existing files, writes manifests, applies migrations, files the resulting PR. Drift from older schema versions upgraded via versioned codemods.
  4. Generates tool-specific instruction files from one canonical source — either delegating to rulesync (probably the right call) or implementing the subset we need. The canonical source is whatever the repo declares; the generated overlays update automatically.
  5. Keeps the fleet coherent. Cross-repo drift detection. When a schema version bumps, file PRs across every affected repo. Scorecard output for compliance. Compatibility matrix showing which agents can dispatch to which, based on shared topology contracts.
  6. Exposes itself as MCP tools so any agent (Claude Code, OpenCode, Cursor, Claude Agent SDK, custom) can call validate, lint, scaffold, migrate programmatically. The tool becomes part of every agent’s toolbelt, not a separate workflow.
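
The lint in item 2 can be sketched as a walk over the layout from section 5. The rules below are assumptions for illustration: three root files are treated as always required, hq/ and wiki/ are checked only when present, and every skill directory must contain SKILL.md. A real tool would read these rules from a versioned manifest.

```python
from pathlib import Path

REQUIRED_FILES = ("AGENTS.md", "HARNESS.md", "MEMORY.md")  # assumed always required

# Checked only when the directory exists: "when a directory exists, its shape is defined".
WHEN_PRESENT = {
    "hq": ("agent.yaml", "status.json"),
    "wiki": ("SCHEMA.md", "RESOLVER.md", "index.md", "log.md"),
}


def lint_topology(root: Path) -> list[str]:
    """Return findings for a repo root; an empty list means the topology check passes."""
    findings = [
        f"missing file: {name}" for name in REQUIRED_FILES if not (root / name).is_file()
    ]
    for directory, children in WHEN_PRESENT.items():
        if not (root / directory).is_dir():
            continue  # optional directory: absence is not a violation
        for child in children:
            if not (root / directory / child).is_file():
                findings.append(f"missing file: {directory}/{child}")
    skills = root / "skills"
    if skills.is_dir():
        for skill_dir in sorted(p for p in skills.iterdir() if p.is_dir()):
            if not (skill_dir / "SKILL.md").is_file():
                findings.append(f"missing file: skills/{skill_dir.name}/SKILL.md")
    return findings
```

Returning findings as plain strings keeps the same function usable from a CLI, a CI job, or an MCP tool wrapper.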

Everything is git-native. No datastore. The tool’s own manifest (hq/.standard/manifest.json) is a git-tracked file. Drift detection is git diff. Migration is PRs. There is no “sync server” — git is the sync.

The positioning relative to prior art is clean:

9. Implications for Builders

For anyone building autonomous agents or running agent fleets, the implications cascade:

  1. Stop building DB-backed agent memory. You are reinventing a worse git. Commit the agent’s working memory. Use git history as the episodic log. Index with gbrain or Mnemosi when query speed matters. Keep the substrate under version control.

  2. Make the harness an artifact, not a deployment detail. Write HARNESS.md. State what H is, what G is, what f does, what the token budget is, what the consolidation cadence is. Test harness changes via git branches. Let agents time-travel by checking out old commits.

  3. AGENTS.md is the canonical instruction entry point. Write it, version it, ship it. Let tool-specific overlays (CLAUDE.md, .cursor/rules/, GEMINI.md) be generated or at worst dual-loaded. Do not privilege any one harness’s instruction file as the source of truth.

  4. Standardize the repo topology. If the repo is context, the shape of context matters as much as the content. Pick a topology — hq/ + wiki/ + skills/ + memory/ + HARNESS.md + AGENTS.md, or equivalent — and stick to it across every repo. Cross-repo coherence compounds the more repos share shape.

  5. Index into git; do not supplant it. Semantic search, graph, FTS all belong on top of git, not replacing it. Mnemosi and gbrain are the index layer; git is the store of record. When you need SELECT-style retrieval, you add an index, not a new substrate.

  6. Lint the three moves. Every agent repo should pass a lint that checks externalize (commit on exit / pre-compact), restore (boot-time load of HARNESS.md, AGENTS.md, MEMORY.md, recent wiki pages), and consolidate (scheduled rebase + wiki compile). If your agent is missing any of the three, it will not stay coherent past a few hours.

  7. Agent commits look like human commits. Every durable state change — wiki page update, memory compaction, decision record — is a commit with a structured message. Agent identity goes in the commit author field. git log is auditable substrate discipline.
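
For item 7, a structured agent commit might look like the sketch below. It uses git's real --author option and the common "Key: value" trailer convention at the end of the message; the trailer names, the agent email, and the function name are hypothetical.

```python
def agent_commit_argv(agent: str, email: str, subject: str, trailers: dict[str, str]) -> list[str]:
    """Build a `git commit` command with the agent as author and trailers in the message body."""
    trailer_block = "\n".join(f"{key}: {value}" for key, value in trailers.items())
    message = f"{subject}\n\n{trailer_block}" if trailer_block else subject
    return ["git", "commit", f"--author={agent} <{email}>", "-m", message]


# Example: a memory compaction recorded as an ordinary commit.
argv = agent_commit_argv(
    "Jane",
    "jane@agents.example",  # hypothetical address
    "memory: compact working notes",
    {"Agent-Move": "consolidate", "Agent-Trigger": "cron"},  # hypothetical trailer names
)
```

Because the author field and trailers are standard git data, the audit trail can be filtered with ordinary tools such as git log --author=Jane.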

10. Open Questions

  1. What’s the name of the tool? “the standard” is a working placeholder, not a final name. Candidates worth considering:

    • Canon — authoritative corpus; fits the mythological/literary naming the org already uses (Mulan, Atlas, Jane, Hermes, Mnemosi, Prime). Conflict with Canon the camera company, different space.
    • Keep — stronghold of a castle; durable; git-metaphor-compatible.
    • Armature — the internal skeleton of a sculpture; literally “what holds the shape.” Distinctive; no obvious conflicts.
    • Primer — the foundation coat / the charge that triggers; short.
    • Rosetta — for the “one canonical source, many tool-specific files” angle; Apple owns significant mindshare here.
    • Helix — structural DNA metaphor.

     The naming evaluation process in org memory (feedback_naming_evaluation) applies to every candidate: namelix, TLD check, S-F rank, multilingual audit.

  2. Canonical source format. Is the source AGENTS.md itself (and we just generate overlays), or a higher-level artifact (hq/agent.yaml + templates) from which AGENTS.md and everything else is derived? The latter is more powerful but more invasive.

  3. Non-git workflows. Is anyone actually running agents outside git? If yes, does the tool degrade gracefully (filesystem-only mode) or hard-require git? Probably hard-require, but worth confirming.

  4. Linux Foundation AGENTS.md spec trajectory. If the Foundation extends AGENTS.md to cover directory shape, do we become the reference implementation or contribute the spec upstream? Monitor.

  5. Relationship to rulesync. Depend on it for instruction-file fanout, or reimplement? If we depend, we inherit rulesync’s trajectory; if we reimplement, we duplicate solid work. Probably: depend initially, reimplement only if rulesync’s roadmap diverges.

  6. Relationship to projen. Projen’s typed-project model is the right implementation technique. But projen’s “generated files are read-only” rule is wrong for our world. Can we borrow the model without the rigidity? Yes — use typed project classes, but apply copier-style 3-way merge on update so files stay hand-editable.

  7. Scope of the first release. Internal-only (our 40 repos) vs. public OSS from day one? The internal path lets us iterate on shape. The OSS path accelerates feedback and adoption. Probably: internal-only for three months, then open-source if the shape holds.


A repo is nothing but context. Git is the medium. The shape is the work.

