A design document that lives in the owning repo, cross-references its dependencies, and turns accepted proposals into buildable issues across every affected project. Not bureaucracy — a coordination layer for 40 repos that can’t afford to lose context.
Table of Contents
- 1. The Problem
- 2. Why Not Just GitHub Issues
- 3. The RFC: Request for Comments
- 4. The HQ Convention
- 5. RFC Format
- 6. Real Example: Four RFCs, One Capability Architecture
- 7. The Research Phase
- 8. From RFC to Issues
- 9. AI Agents as RFC Authors
- 10. What This Is Not
- 11. Project Prefix Registry
- 12. Related Articles
- Summary
1. The Problem
You have 40 repos. A new feature — capability architecture — touches five of them: Scalable Media (the storage layer), API Mom (the registry), Video Factory (the first service to decompose), Scram-Jet (the invoke operator), and Mulan (the orchestrator that drives everything).
Where does the design live?
Not in any one repo’s issues. Not in a Notion page that will be out of date in two weeks. Not in a Slack thread that nobody will ever find. If you start building without a shared design document, here is what happens:
- Video Factory decomposes into capabilities, but the capability schema doesn’t match what API Mom expects.
- Scram-Jet builds the invoke operator against a registry interface that API Mom changes a week later.
- Mulan tries to drive both and discovers that nobody agreed on how a capability signals completion.
- Three weeks of work, two breaking changes, and one cross-repo debugging session later, you converge on an interface that should have been written down on day one.
The problem is not technical. It is coordinative. Features that span repos need design documents that span repos.
2. Why Not Just GitHub Issues
GitHub Issues are scoped to a single repository. For single-repo features, they work well. For cross-repo features, they break down in predictable ways.
An issue in video-factory that says “decompose into capabilities” is useful for tracking work in video-factory. It does not explain:
- What the capability registry in api-mom looks like
- What the invoke operator in scram-jet expects from a registered capability
- What Media Store path format scalable-media uses for capability I/O
- How these four pieces interlock
You can add cross-repo links in the issue body. But a chain of links is not a design. It is a scavenger hunt. An engineer coming in fresh has to open five tabs, read five different issues, and assemble the design in their head. If any piece changes, the others are not automatically updated.
The deeper problem: issues are work tracking, not design capture. An issue says “do this.” A design document says “here is why, here is how it fits together, here is what we decided and what we rejected.” Those are different documents. Conflating them produces issues that are either bloated with design detail or divorced from the reasoning that produced them.
3. The RFC: Request for Comments
An RFC is a formal design document. It proposes a specific change, explains its motivation, specifies the implementation, and identifies its dependencies. It lives in the owning repo’s hq/rfcs/ folder, where “owning repo” means the repo responsible for the primary design.
Every RFC has a status lifecycle:
Draft → Proposed → Accepted → Implementing → Done
Draft: Author is writing. Not ready for review. May change substantially.
Proposed: Author considers the design ready. Others comment. The owner decides — this is not consensus, it is consultation.
Accepted: Design is finalized. Issues are created in all affected repos. Implementation begins.
Implementing: Work is in progress across the affected repos. RFC is the authoritative reference.
Done: All implementation issues are closed. RFC is archived as the record of the decision.
A sixth status exists for proposals that were reviewed and rejected:
Rejected: The design was considered and declined. The RFC remains as a record of what was tried and why it did not proceed. This is a valid, valuable outcome — a rejected RFC prevents the same proposal from being re-litigated without new information.
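The lifecycle can be written down as a small state machine. This is a minimal sketch, not part of the process definition: the `Status` and `advance` names are hypothetical, and the assumption that rejection happens only from Proposed is inferred from "reviewed and rejected" rather than stated outright.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "Draft"
    PROPOSED = "Proposed"
    ACCEPTED = "Accepted"
    IMPLEMENTING = "Implementing"
    DONE = "Done"
    REJECTED = "Rejected"


# Allowed moves between statuses. The Proposed -> Rejected edge is an
# assumption: the article only says rejection follows review.
ALLOWED = {
    Status.DRAFT: {Status.PROPOSED},
    Status.PROPOSED: {Status.ACCEPTED, Status.REJECTED},
    Status.ACCEPTED: {Status.IMPLEMENTING},
    Status.IMPLEMENTING: {Status.DONE},
    Status.DONE: set(),
    Status.REJECTED: set(),
}


def advance(current: Status, target: Status) -> Status:
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

A check like this could run in CI whenever the Status line of an RFC changes, so that a document never jumps from Draft straight to Done.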
4. The HQ Convention
Every repo in the org gets an hq/ folder. This is the org’s footprint in each repo — a standard location for identity, research, and design documents regardless of which language or framework the repo uses.
repo/
  hq/
    manifest.json   ← identity: prefix, type, org membership
    rfcs/           ← design proposals authored from this repo
    research/       ← supporting research referenced by RFCs
The manifest.json establishes the repo’s place in the org:
{
  "prefix": "VF",
  "name": "video-factory",
  "type": "service",
  "org": "garywu",
  "description": "Video production pipeline and render coordination"
}
The prefix is the org-wide unique identifier used in RFC numbering (RFC-VF-001), issue references, and the project prefix registry. It is short, stable, and unambiguous — you never wonder which repo “VF” refers to.
The hq/ convention solves a discoverability problem: anyone who clones any repo in the org immediately knows where to look for design documents and org identity. There is no “where does this stuff live?” question. It lives in hq/.
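A manifest is only useful if every repo fills it in the same way, so a small validator is a natural companion. The sketch below is illustrative: `validate_manifest` is a hypothetical helper, treating all five fields as required is an assumption, and the prefix pattern assumes "two to three characters, uppercase" means two to three uppercase letters.

```python
import json
import re

# Assumption: every field shown in the example manifest is required.
REQUIRED_FIELDS = ("prefix", "name", "type", "org", "description")
# Assumption: a prefix is two to three uppercase ASCII letters.
PREFIX_PATTERN = re.compile(r"[A-Z]{2,3}")


def validate_manifest(text: str) -> dict:
    """Parse manifest JSON and check the fields shown in the example."""
    manifest = json.loads(text)
    missing = [field for field in REQUIRED_FIELDS if field not in manifest]
    if missing:
        raise ValueError(f"manifest is missing fields: {', '.join(missing)}")
    if not PREFIX_PATTERN.fullmatch(manifest["prefix"]):
        raise ValueError(f"invalid prefix: {manifest['prefix']!r}")
    return manifest
```

Passing the contents of `hq/manifest.json` returns the parsed dictionary; a missing field or a lowercase prefix raises `ValueError`.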
5. RFC Format
Every RFC follows the same template. Consistency matters because RFCs are read across repos and teams — a familiar structure reduces the cognitive load of consuming an unfamiliar proposal.
# RFC-{PREFIX}-{NUMBER}: {Title}
**Status:** Draft
**Created:** {date}
**Author:** {name or agent}
**Related RFCs:** RFC-XX-NNN (dependency), RFC-YY-NNN (cross-reference)
## Summary
One paragraph. What does this RFC propose and why does it matter?
## Motivation
What problem is this solving? What breaks or is impossible without this change?
Be specific. Reference existing issues or prior design attempts if relevant.
## Specification
The technical design. Schemas, interfaces, API shapes, data flows.
This is the substantive section — it should be detailed enough that an
engineer could begin implementation from it without further clarification.
## Dependencies
Which other RFCs must be accepted and at least partially implemented
before this RFC can proceed? List them with RFC IDs and what specifically
this RFC depends on.
## Affected Projects
Which repos will require implementation work when this RFC is accepted?
For each: what is the work, approximately how large, and which issues
will be created?
## Alternatives Considered
What other approaches were evaluated? Why were they rejected?
This section prevents re-litigation and documents the reasoning.
## Open Questions
What is still undecided? This is not a sign of an incomplete RFC —
it is a record of what will be decided during implementation.
The template is intentionally concise. An RFC is not a thesis. The Specification section should be long enough to specify the design and short enough to be read in one sitting. If you find yourself writing 20 pages, you probably have two RFCs.
6. Real Example: Four RFCs, One Capability Architecture
The capability architecture feature requires four RFCs. Each lives in the owning repo, cross-references the others, and produces a distinct set of implementation issues.
The RFCs
RFC-SM-001: Content-Addressable Media Store
- Repo: scalable-media
- Status: Accepted
- Dependencies: none
- Proposes a content-addressable storage layer for all capability I/O, backed by R2 with NFS fallback. Defines the media:// URL scheme used by all capabilities.
- Affected projects: scalable-media (primary implementation), api-mom (registers SM as a platform dependency), mulan (update job dispatch to pass SM paths)
RFC-AM-001: Capability Registry
- Repo: api-mom
- Status: Accepted
- Dependencies: RFC-SM-001 (capabilities use SM for I/O)
- Proposes the capability registration protocol: POST /v1/capabilities on startup, heartbeat every 60s, DELETE on graceful shutdown. Defines the cost model schema and the routing algorithm that selects among registered instances.
- Affected projects: api-mom (primary implementation), scram-jet (invoke operator must route through AM), video-factory (capabilities must register with AM)
RFC-VF-001: Video Factory Decomposition
- Repo: video-factory
- Status: Implementing
- Dependencies: RFC-SM-001, RFC-AM-001
- Proposes decomposing render-server/server.ts (770 lines) into seven self-registering capability scripts. Defines the capability contract (POST /exec, GET /health, GET /spec) and the bootstrap lifecycle.
- Affected projects: video-factory (decompose server, write capability scripts), scram-jet (pipeline definition for render-video), api-mom (validate routing with real capabilities)
RFC-SJ-001: Pipeline Invoke Operator
- Repo: scram-jet
- Status: Draft
- Dependencies: RFC-AM-001, RFC-SM-001
- Proposes the invoke operator for Scram-Jet pipelines: the mechanism by which a pipeline step calls a capability via API Mom and receives the result. Defines the dependency graph evaluation, fan-out/fan-in semantics, and retry policy.
- Affected projects: scram-jet (primary implementation), mulan (update job creation to use pipeline YAML format)
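To make the registration protocol summarized under RFC-AM-001 concrete, here is an in-memory sketch of register, heartbeat, and TTL-based expiry. It is not the API Mom implementation: the `CapabilityRegistry` class is hypothetical, and the TTL of three missed heartbeats is an assumption, since only the 60-second heartbeat interval is given above.

```python
import time
from typing import Callable

HEARTBEAT_INTERVAL_S = 60          # heartbeat interval named in RFC-AM-001
TTL_S = 3 * HEARTBEAT_INTERVAL_S   # assumption: expire after three missed beats


class CapabilityRegistry:
    """Tracks live capability instances by the time they were last seen."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._last_seen: dict[str, float] = {}

    def register(self, instance_id: str) -> None:
        """Startup registration (the POST /v1/capabilities step)."""
        self._last_seen[instance_id] = self._clock()

    def heartbeat(self, instance_id: str) -> bool:
        """Refresh an instance; return False if it is unknown or expired."""
        self._expire()
        if instance_id not in self._last_seen:
            return False
        self._last_seen[instance_id] = self._clock()
        return True

    def deregister(self, instance_id: str) -> None:
        """Graceful shutdown (the DELETE step)."""
        self._last_seen.pop(instance_id, None)

    def live_instances(self) -> list[str]:
        """Return the IDs of instances that have not exceeded the TTL."""
        self._expire()
        return sorted(self._last_seen)

    def _expire(self) -> None:
        """Drop every instance whose last heartbeat is older than the TTL."""
        now = self._clock()
        stale = [i for i, seen in self._last_seen.items() if now - seen > TTL_S]
        for instance_id in stale:
            del self._last_seen[instance_id]
```

The clock is injected so that expiry can be tested without sleeping. An instance that misses its heartbeats must register again, which is one plausible reading of "TTL-based expiry" rather than a confirmed design decision.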
The Dependency Graph
RFC-SM-001 (Media Store)
│
├──► RFC-AM-001 (Capability Registry)
│    │
│    ├──► RFC-VF-001 (VF Decomposition)
│    │
│    └──► RFC-SJ-001 (Invoke Operator)
│
├──► RFC-VF-001 (direct dep on SM paths)
│
└──► RFC-SJ-001 (direct dep on SM paths)
RFC-SM-001 has no dependencies. Everything depends on it. You build it first. RFC-AM-001 depends only on SM and can begin once SM’s path schema is finalized — it does not need SM to be deployed. RFC-VF-001 and RFC-SJ-001 both depend on AM-001 and can be built in parallel once the registry interface is accepted.
This graph is visible in every RFC’s Dependencies section. Any engineer reading RFC-SJ-001 knows immediately that they need to understand RFC-AM-001 and RFC-SM-001 first. No Slack archaeology required.
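The build order described above can be derived mechanically from the Dependencies fields. The sketch below uses the standard library's `graphlib` (Python 3.9+); the `build_waves` helper is a hypothetical name, and the dependency map is copied from the four RFCs listed earlier.

```python
from graphlib import TopologicalSorter

# Each RFC maps to the RFCs it depends on, copied from the Dependencies fields.
DEPENDENCIES = {
    "RFC-SM-001": set(),
    "RFC-AM-001": {"RFC-SM-001"},
    "RFC-VF-001": {"RFC-SM-001", "RFC-AM-001"},
    "RFC-SJ-001": {"RFC-AM-001", "RFC-SM-001"},
}


def build_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group RFCs into waves; every RFC in a wave can be built in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

`build_waves(DEPENDENCIES)` returns `[['RFC-SM-001'], ['RFC-AM-001'], ['RFC-SJ-001', 'RFC-VF-001']]`: the Media Store first, then the registry, then the two RFCs that can proceed in parallel. A circular dependency between RFCs surfaces as a `CycleError` instead of a stalled project.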
How Each RFC Creates Issues
When RFC-AM-001 is accepted, the following issues are created:
- api-mom#47 (Epic): Implement capability registration endpoint (POST /v1/capabilities)
- api-mom#48: Implement heartbeat receiver and TTL-based expiry
- api-mom#49: Implement cost-aware routing algorithm
- scram-jet#31: Update invoke operator to route through AM registry (references RFC-AM-001, api-mom#47)
- video-factory#22: Add self-registration to capability bootstrap scripts (references RFC-AM-001, api-mom#47)
- mulan#88: Pass AM_URL to all dispatched jobs
Each issue includes a link to the RFC. Each issue in a dependent repo links back to the epic in the owning repo. An engineer working on scram-jet#31 can trace back to the full design in one click.
7. The Research Phase
Before an RFC is written, research is done. The Research Phase is not optional — it is what separates a proposal that survives review from one that gets rejected for missing prior art.
For the capability registry RFC, the research phase produced four reports:
hq/research/capability-registry-prior-art.md — How Kubernetes service registry, Consul, and Eureka handle registration, heartbeat, and TTL expiry. What works. What is over-engineered for our scale.
hq/research/capability-cost-models.md — How Vast.ai, RunPod, and Replicate expose cost metadata. What fields matter for routing decisions. What the minimum viable cost schema looks like.
hq/research/capability-protocol-options.md — Comparison of gRPC reflection, OpenAPI /spec endpoint, and JSON Schema for capability self-description. Tradeoffs on tooling, complexity, and portability.
hq/research/capability-registry-github-projects.md — Five open-source capability registry implementations surveyed. Two are relevant; three are abandoned. Links to specific issues where the maintainers discuss the hard problems.
Research agents can be launched in parallel. While one agent surveys GitHub projects, another reads vendor documentation, another reviews blog posts and design articles. The agent running the RFC process collects their outputs into hq/research/ before writing the first word of the RFC.
The RFC’s Alternatives Considered section draws directly from the research reports. When reviewers ask “why not Consul?”, the answer is in capability-registry-prior-art.md and the RFC can cite it. Research reports are not embedded in the RFC — they are referenced. The RFC stays readable; the depth is available for anyone who wants it.
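The parallel research step can be sketched with a thread pool. Everything here is illustrative: `run_research` is a hypothetical helper, and `agent` stands in for whatever call actually launches a research agent and returns its report as text.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable


def run_research(
    topics: dict[str, str],
    agent: Callable[[str], str],
    repo_root: Path,
) -> list[Path]:
    """Run one agent call per topic in parallel and save each report.

    topics maps a report filename to the question handed to the agent.
    Reports are written to hq/research/ under repo_root.
    """
    research_dir = repo_root / "hq" / "research"
    research_dir.mkdir(parents=True, exist_ok=True)
    names = list(topics)
    with ThreadPoolExecutor(max_workers=len(names) or 1) as pool:
        # pool.map preserves input order, so reports line up with names.
        reports = list(pool.map(agent, (topics[name] for name in names)))
    written = []
    for name, report in zip(names, reports):
        path = research_dir / name
        path.write_text(report, encoding="utf-8")
        written.append(path)
    return written
```

All reports are collected before anything is written, which mirrors the rule that research lands in hq/research/ before the RFC itself is drafted.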
8. From RFC to Issues
When an RFC moves from Proposed to Accepted, a single coordinated action creates the implementation work:
Step 1: Create the epic issue in the owning repo.
The epic title matches the RFC title. The body contains a one-paragraph summary and a checklist of child issues to be created across affected repos. The epic is the tracking artifact — anyone watching it sees the full cross-repo progress.
Step 2: Create child issues in all affected repos.
Each child issue is buildable — scoped to one repo, describing one unit of work, with acceptance criteria that can be verified without reading the full RFC. The issue body includes:
See: RFC-AM-001 (hq/rfcs/RFC-AM-001.md in api-mom)
Epic: api-mom#47
Step 3: Link the epic checklist to the child issues.
The epic issue body is updated with links to each child issue. GitHub renders this as a progress bar. The epic is done when all children are done.
Step 4: Close the RFC as Done when all issues are closed.
The RFC gains a final entry:
**Status:** Done
**Closed:** 2026-04-15
**Implementation:** api-mom#47-49, scram-jet#31, video-factory#22, mulan#88
The RFC is now a permanent record: here is the problem we were solving, here is the design we chose, here is the work that executed it. Three months from now, when someone asks “why does the registry TTL expiry work this way?”, the answer is in RFC-AM-001, not in someone’s memory.
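The issue bodies from Steps 1 to 3 are plain text, so they can be generated rather than typed. The sketch below only builds strings and makes no GitHub API calls; `ChildIssue`, `child_body`, and `epic_body` are hypothetical names, and the `repo#number` shorthand follows the article's notation rather than any particular GitHub linking rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChildIssue:
    repo: str
    number: int
    title: str


def child_body(rfc_id: str, owning_repo: str, epic_number: int) -> str:
    """Cross-reference lines added to every child issue (Step 2)."""
    return (
        f"See: {rfc_id} (hq/rfcs/{rfc_id}.md in {owning_repo})\n"
        f"Epic: {owning_repo}#{epic_number}"
    )


def epic_body(summary: str, children: list[ChildIssue]) -> str:
    """Epic body: summary paragraph plus one checklist line per child (Step 3)."""
    checklist = "\n".join(
        f"- [ ] {child.repo}#{child.number}: {child.title}" for child in children
    )
    return f"{summary}\n\n{checklist}"
```

For example, `child_body("RFC-AM-001", "api-mom", 47)` reproduces the two reference lines shown in Step 2.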
9. AI Agents as RFC Authors
In this system, AI agents — Mulan, Claude Code — are first-class participants in the RFC process. They do not just implement what RFC authors specify. They write RFCs.
An agent can:
- Draft an RFC from a conversation. The user describes a problem. The agent produces a complete RFC draft in the appropriate hq/rfcs/ file, fills in the template, identifies dependencies, and lists affected projects.
- Launch parallel research agents. Before drafting, the agent spawns sub-agents: one for prior art, one for vendor APIs, one for GitHub projects, one for blog posts. Research reports land in hq/research/. The RFC references them.
- Create issues across repos automatically. When an RFC is accepted, the agent calls the GitHub API to create the epic and all child issues, filling in the cross-references and checklist links.
- Build the implementation. With issues created and the RFC as specification, the agent begins implementation, one issue at a time, one repo at a time.
The RFC process gives agents a structured interface for proposing and building features. Without it, an agent proposing a multi-repo feature has nowhere to put the design. It either embeds the design in a conversation (lost when the session ends), writes it in an issue (wrong abstraction), or skips design and starts coding (the most expensive kind of wrong).
With the RFC process, an agent’s proposal is durable, reviewable, and traceable. The human reviews the RFC, not a wall of code. Feedback goes into the RFC’s Open Questions section, which the agent resolves before implementation. The feedback loop is tight and documented.
10. What This Is Not
Not bureaucracy. An RFC is one to two pages, not fifty. The template has seven sections plus a short header, and most of them are short. Writing an RFC for a small cross-repo change takes thirty minutes. Writing an RFC for a large architectural change takes a few hours. Both are faster than debugging the three weeks of misaligned assumptions that occur without one.
Not waterfall. You do not need all questions answered before you start. The Open Questions section exists precisely to name what you do not know yet. An RFC with three open questions is a normal RFC. Implementation often resolves those questions. When it does, the RFC is updated with the resolution — it is a living document through the Implementing phase.
Not consensus. Comments and questions during the Proposed phase are welcome and useful. But the owner of the owning repo decides. One person has the final call. The RFC process is not a vote. If you comment and your comment is not incorporated, that is a valid outcome. Rejected concerns are noted in Alternatives Considered or Open Questions with an explanation.
Not permanent. An RFC can be superseded. If RFC-AM-001 turns out to be wrong six months in, RFC-AM-002 can propose a revised design. AM-002 references AM-001 and explains what changed and why. The old RFC is not deleted — it is a record of where you were. The new RFC is a record of where you went.
Not only for big features. An RFC is appropriate any time a design decision touches more than one repo and the decision will not be obvious from reading the code. That threshold is lower than it sounds. If you would write more than two paragraphs explaining the change to a new team member, it belongs in an RFC.
11. Project Prefix Registry
Prefixes are assigned at repo creation and registered in the org manifest. They are short (two to three characters), uppercase, and unique across the org.
| Prefix | Repo | Type | Description |
|---|---|---|---|
| SM | scalable-media | platform | Content-addressable media storage |
| VF | video-factory | service | Video production pipeline |
| SJ | scram-jet | platform | Pipeline execution and invoke operator |
| AM | api-mom | platform | Capability registry and intelligent router |
| MU | mulan | platform | Org-level AI orchestrator |
| AT | atlas | platform | Mapping and spatial data services |
| OM | omni-extension | service | Browser extension for org integration |
The prefix registry lives in the org-level manifest, not in any individual repo. Adding a new repo to the org means: claim a prefix, add a row to this table, create the hq/manifest.json. The prefix is then available for RFC numbering immediately.
RFC numbers are sequential within a prefix: RFC-AM-001, RFC-AM-002, RFC-AM-003. They never reset. A rejected RFC occupies its number and remains as a record. There are no gaps; gaps are confusing.
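The numbering rule is simple enough to automate. This is a minimal sketch, assuming the existing IDs for a repo can be listed (for instance from the filenames in hq/rfcs/); `next_rfc_id` is a hypothetical helper, and the three-digit zero padding follows the examples in this article.

```python
import re

RFC_ID = re.compile(r"RFC-([A-Z]{2,3})-(\d+)")


def next_rfc_id(prefix: str, existing_ids: list[str]) -> str:
    """Return the next sequential ID for a prefix.

    Rejected RFCs must be included in existing_ids, because a rejected
    RFC keeps its number.
    """
    numbers = []
    for rfc_id in existing_ids:
        match = RFC_ID.fullmatch(rfc_id)
        if match and match.group(1) == prefix:
            numbers.append(int(match.group(2)))
    next_number = max(numbers, default=0) + 1
    return f"RFC-{prefix}-{next_number:03d}"
```

Taking the maximum rather than the count means the next number is always one past the highest ID already in use for that prefix.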
12. Related Articles
- Capability Primitive: Decomposing Monoliths. The technical content behind RFC-VF-001. Shows exactly what the decomposition of Video Factory produces and why the capability contract looks the way it does.
- API Mom: Intelligent Router. The technical content behind RFC-AM-001. The capability registry design, cost-aware routing, and heartbeat protocol.
- Recurring Automation Governance. Governance for autonomous systems. The RFC process and the recurring run review serve the same purpose at different scales: structured review before committing to a course of action.
- Python PEP Process. A long-running formulation of the “propose, comment, decide, implement” workflow, which Python has used to evolve the language since 2000. The RFC process described here is a lightweight adaptation for multi-repo engineering orgs.
- Rust RFC Process. The Rust community’s RFC process, which produces some of the most carefully reasoned language design documents available. Notable for how it handles alternatives: major Rust RFCs document the paths not taken with the same rigor as the path that was.
Summary
The RFC process for multi-repo ecosystems is a coordination layer, not a gate. It solves a specific problem: design for features that touch multiple repos has no natural home, so it either lives nowhere (causing misalignment) or lives in every repo separately (causing drift).
The solution is a standard: one design document per feature, living in the owning repo, cross-referencing its dependencies, turning into buildable issues when accepted. The hq/rfcs/ convention gives the design a permanent home. The status lifecycle makes progress visible. The issue creation step makes implementation traceable.
| Element | What it solves |
|---|---|
| hq/rfcs/ folder | Design has a standard home in every repo |
| RFC-{PREFIX}-{NUMBER} | Unambiguous cross-repo reference format |
| Status lifecycle | Progress is visible without reading the document |
| Related RFCs field | Dependency graph is explicit, not implicit |
| Research phase | Alternatives Considered is evidence-based, not guesswork |
| Issue creation on accept | Design connects to buildable work automatically |
The cost of writing an RFC: thirty minutes to a few hours.
The cost of not writing one: three weeks of misaligned implementation, two breaking interface changes, and a cross-repo debugging session that could have been a paragraph in the Specification section.
This article is part of the garywu engineering knowledge base. See also: garywu/_readme/articles/recurring-automation-governance (governance for autonomous systems).
Last updated: 2026-03-27.