Org Status: 🟡 Dormant Cloudflare: N/A Last Audited: 2026-04-28
The autonomous agent ecosystem exploded in early 2026. OpenClaw became the fastest-growing open-source project in GitHub history, crossing 302K stars in 60 days. A wave of lightweight alternatives followed — NanoClaw, PicoClaw, NullClaw, ZeroClaw, NanoBot, TinyClaw — each making different tradeoffs around memory, skill systems, communication patterns, and execution models. Meanwhile, enterprise frameworks like LangGraph, CrewAI, AutoGen, and the OpenAI Agents SDK matured toward production readiness.
This article systematically compares 18 agent frameworks across every dimension that matters: architecture, memory, skills, communication, execution, state management, and ecosystem. The goal is not to crown a winner — it is to map the design space so you can pick the right patterns for your system.
What you will learn:
- How each framework implements memory (short-term, long-term, vector search, auto-learning)
- How skills and tools are defined, composed, and shared across agents
- What communication patterns exist between agents and with humans
- How execution models differ (local CLI, cloud, serverless, edge)
- Which architectural patterns emerge as consensus across frameworks
- Recommendations by use case with honest tradeoffs
- The Problem
- Framework Overview
- Deep Dives
- Master Feature Matrix
- Memory Systems Compared
- Skills and Tools Compared
- Communication Patterns Compared
- Execution Models Compared
- State Management Compared
- Ecosystem Compared
- Architecture Patterns That Emerge
- Implementation Deep Dives
- Anti-Patterns
- Recommendations by Use Case
- References
Building an autonomous agent that works reliably requires solving at least six hard problems simultaneously:
- Memory — How does the agent remember what happened last session? Last week? How does it decide what to forget?
- Skills — How do you teach an agent a repeatable procedure? Can skills be shared, versioned, composed?
- Communication — How do agents talk to each other? How does a human interrupt or redirect?
- Execution — Where does the agent run? How do you sandbox it? How do you control cost?
- State — Can an agent resume from a checkpoint? Can it survive a crash?
- Coordination — In multi-agent systems, who decides what runs next?
No single framework solves all six perfectly. Most are strong in one or two areas and weak in the rest. The frameworks that emerged in 2025-2026 represent the first generation of serious attempts at production-grade autonomous agents, and they make radically different design choices.
What changes if you get this right
A well-chosen agent architecture lets you:
- Run autonomous workflows that span hours or days without human intervention
- Route tasks to the cheapest capable model (Haiku for simple, Opus for complex)
- Resume from failures without losing progress
- Share learned procedures across agents without copy-pasting prompts
- Scale from a single local agent to a distributed multi-agent system
What happens if you get it wrong
- Agents that forget everything between sessions
- Skills that are brittle, non-transferable prompt hacks
- Communication that is either missing (agents can’t coordinate) or chaotic (message storms)
- Execution that blows through token budgets with no circuit breaker
- State that vanishes on crash, forcing full restarts
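The missing circuit breaker in the list above can be as small as a counter that refuses further model calls once a run would exceed its budget. The sketch below is a minimal illustration; `TokenBudget`, `BudgetExceeded`, and the token costs are hypothetical and not taken from any framework compared here.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a run would spend more tokens than allowed."""


class TokenBudget:
    """Hypothetical per-run circuit breaker for token spend."""

    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.used = 0

    def charge(self, tokens: int) -> None:
        """Record usage, tripping the breaker before the budget is exceeded."""
        if self.used + tokens > self.max_tokens:
            raise BudgetExceeded(
                f"{self.used + tokens} tokens requested, budget is {self.max_tokens}"
            )
        self.used += tokens


def run_steps(step_costs: list[int], budget: TokenBudget) -> int:
    """Run steps until done or the breaker trips; return the number completed."""
    completed = 0
    for cost in step_costs:
        try:
            budget.charge(cost)
        except BudgetExceeded:
            break  # stop the loop instead of silently overspending
        completed += 1
    return completed


budget = TokenBudget(max_tokens=1000)
steps_done = run_steps([400, 400, 400], budget)  # third step would exceed the budget
```

Checking the budget before each call, rather than after, means the run stops at the last affordable step instead of one step too late.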
The Claw Family (OpenClaw-derived)
The single biggest event in the agent framework space was OpenClaw’s explosive growth. Peter Steinberger’s personal AI assistant went from obscure side project to 302K GitHub stars in 60 days, surpassing React’s 10-year record. Steinberger joined OpenAI in February 2026 and moved the project to an open-source foundation.
This spawned an entire family of lightweight alternatives:
| Framework | Language | Binary Size | RAM | Startup | Stars | Focus |
|---|---|---|---|---|---|---|
| OpenClaw | TypeScript | ~200MB (Node) | ~1GB | ~5s | 302K | Full-featured personal AI assistant |
| NanoClaw | TypeScript | ~50MB (Node) | ~200MB | ~3s | 22K | Container-isolated, Claude-native |
| NanoBot | Python | N/A | ~100MB | ~2s | 27K | Ultra-light, knowledge graph memory |
| PicoClaw | Go | <10MB | <10MB | <1s | 18K | Edge/IoT, $10 hardware |
| ZeroClaw | Rust | ~16MB | ~5MB | <50ms | 17K | Trait-driven, pluggable everything |
| NullClaw | Zig | 678KB | ~1MB | <2ms | 12K | Smallest possible, hardware peripherals |
| TinyClaw | TypeScript | ~60MB (Node) | ~300MB | ~3s | 8K | Multi-agent teams, collaboration |
Enterprise / Research Frameworks
| Framework | Language | Stars | Focus |
|---|---|---|---|
| LangGraph | Python/TS | 10K | Stateful agent graphs, checkpointing |
| CrewAI | Python | 46K | Multi-agent orchestration, role-based |
| AutoGen / Microsoft Agent Framework | Python/.NET | 38K | Enterprise multi-agent, async messaging |
| OpenAI Agents SDK | Python/TS | 15K | Production evolution of Swarm |
| MetaGPT | Python | 42K | SOP-driven software company simulation |
| Claude Agent SDK | Python/TS | N/A | Claude Code tooling as a framework |
| Pydantic AI | Python | 15K | Type-safe agents, structured output |
| Agent Zero | Python | 12K | Auto-learning, hierarchical subordinates |
| Cloudflare Agents SDK | TypeScript | N/A | Durable Objects as agents, edge-native |
Legacy / Educational
| Framework | Language | Stars | Status |
|---|---|---|---|
| AutoGPT | Python/TS | 170K | Pivoted to low-code platform |
| BabyAGI | Python | 20K | Experimental, self-building |
| SuperAGI | Python | 15K | Stalled since Jan 2024 |
| Swarm (OpenAI) | Python | 18K | Replaced by Agents SDK |
OpenClaw
What it is: The most popular autonomous AI agent framework. Runs locally, connects to 20+ messaging platforms, and provides a hub-and-spoke architecture with a local WebSocket gateway.
Strengths:
- Massive ecosystem: 5,400+ skills in the official registry, 103 production-ready agent templates
- File-based memory system that is easy to inspect, debug, and version control
- Hybrid search (vector + BM25) over per-agent SQLite with temporal decay
- Multi-provider LLM support with fallback chains
- Skills are just markdown files (SKILL.md) — no SDK, no compilation
Weaknesses:
- Heavy: ~1GB RAM, 5-second startup. Not suitable for edge or constrained environments
- Local-only architecture. Cloud deployment requires MoltWorker or similar wrappers
- The 302K stars created a gold rush of low-quality forks and plugins
- Agent-to-agent communication is opt-in and somewhat clunky
Architecture:
User (WhatsApp/Telegram/Slack/...)
|
Local Gateway (ws://127.0.0.1:18789)
|
Agent Runtime Sessions
|-- SOUL.md (personality)
|-- AGENTS.md (behavior rules, injected every turn)
|-- MEMORY.md (long-term curated facts)
|-- memory/ (daily logs, temporal decay)
|-- skills/ (SKILL.md playbooks, selectively injected)
|-- SQLite (sessions, search index)
Skill definition example:
---
name: deploy-to-cloudflare
description: Deploy a Cloudflare Workers project using Wrangler
requires:
  binaries: ["wrangler", "node"]
  env: ["CLOUDFLARE_ACCOUNT_ID", "CLOUDFLARE_API_TOKEN"]
tags: ["deployment", "cloudflare", "infrastructure"]
---
1. Verify wrangler.jsonc exists in the project root
2. Run `wrangler whoami` to verify authentication
3. Run `wrangler deploy` and capture output
4. Verify deployment by checking the output URL
5. If deployment fails, read error output and attempt fix
- If CLOUDFLARE_API_TOKEN is missing, tell the user
- If wrangler.jsonc is missing, check for wrangler.toml and convert
- If deployment fails with route conflict, suggest manual resolution
Key insight: OpenClaw proved that skills-as-markdown is a viable pattern. No SDK, no compilation, no dependency management. Just a folder with a SKILL.md file. This pattern has been adopted by nearly every framework in the Claw family.
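Because a skill is plain text, a runtime can decide whether a skill is usable before injecting it into the prompt. The sketch below shows one way to read the `requires` block from the frontmatter and report what is missing. It assumes the inline list style used in the example above and hand-parses only those two keys; `parse_requirements` and `missing_requirements` are hypothetical helpers, not OpenClaw functions.

```python
import json
import shutil

SKILL_MD = """---
name: deploy-to-cloudflare
requires:
  binaries: ["wrangler", "node"]
  env: ["CLOUDFLARE_ACCOUNT_ID", "CLOUDFLARE_API_TOKEN"]
---
1. Verify wrangler.jsonc exists in the project root
"""


def parse_requirements(skill_text: str) -> dict[str, list[str]]:
    """Extract the inline-list entries under `requires:` from the frontmatter."""
    parts = skill_text.split("---", 2)
    # parts[0] is the text before the opening ---, parts[1] is the frontmatter
    frontmatter = parts[1] if len(parts) == 3 else ""
    requirements: dict[str, list[str]] = {}
    for line in frontmatter.splitlines():
        key, _, value = line.strip().partition(":")
        if key in ("binaries", "env") and value.strip().startswith("["):
            # The inline lists in the example are valid JSON arrays
            requirements[key] = json.loads(value.strip())
    return requirements


def missing_requirements(
    requirements: dict[str, list[str]], environ: dict[str, str]
) -> list[str]:
    """List binaries not found on PATH and environment variables that are unset."""
    missing = [
        f"binary:{name}"
        for name in requirements.get("binaries", [])
        if shutil.which(name) is None
    ]
    missing += [f"env:{name}" for name in requirements.get("env", []) if name not in environ]
    return missing


reqs = parse_requirements(SKILL_MD)
problems = missing_requirements(reqs, {"CLOUDFLARE_ACCOUNT_ID": "abc"})
```

A real loader would use a YAML parser; the hand-rolled version only keeps the example dependency-free.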
NanoClaw
What it is: A container-isolated, Claude-native alternative to OpenClaw. ~500 lines of core TypeScript. Each agent runs in its own Linux container with filesystem isolation.
Strengths:
- Security through container isolation (Docker or Apple Container)
- Built directly on the Claude Agent SDK
- Dead simple filesystem IPC (JSON files polled by host every second)
- “Skills over Features” philosophy — users add capabilities by having Claude modify the codebase
Weaknesses:
- Claude-only. No multi-provider support
- Container overhead means slightly higher latency per request
- Smaller ecosystem than OpenClaw
Architecture:
Platform -> Channel.onMessage() -> storeMessage(SQLite)
-> MessageLoop polls -> GroupQueue enqueues
-> runContainerAgent() spawns container
-> Claude Agent SDK processes
-> IPC files written -> Host polls & routes
-> Channel delivers to platform
IPC structure:
data/ipc/{group}/
|-- messages/ # Outbound message JSON files
|-- tasks/ # Schedule/pause/cancel task JSONs
|-- current_tasks.json # Host -> container snapshot
|-- available_groups.json
Tool definition example (MCP tools inside container):
// NanoClaw exposes these as MCP tools to Claude inside each container
import { readFile, writeFile } from "node:fs/promises";

// Assumed to be supplied by the container runtime; hard-coded here for illustration
const currentGroup = "main";

const tools = {
  send_message: async (params: { group: string; text: string }) => {
    // Write JSON to data/ipc/{group}/messages/
    await writeFile(
      `data/ipc/${params.group}/messages/${Date.now()}.json`,
      JSON.stringify({ text: params.text, timestamp: new Date().toISOString() })
    );
  },
  schedule_task: async (params: { name: string; cron: string; prompt: string }) => {
    // One JSON file per task; the host picks it up on its next poll
    await writeFile(
      `data/ipc/${currentGroup}/tasks/${params.name}.json`,
      JSON.stringify({ action: "schedule", cron: params.cron, prompt: params.prompt })
    );
  },
  list_tasks: async () => {
    // Read the host -> container snapshot as UTF-8 text before parsing
    const tasks = await readFile(`data/ipc/${currentGroup}/current_tasks.json`, "utf8");
    return JSON.parse(tasks);
  },
};
Key insight: NanoClaw proves you can build a production-capable agent system in ~500 lines by standing on the Claude Agent SDK. The “skills over features” model — where customization means code changes, not configuration — avoids the configuration sprawl that plagues larger frameworks.
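The host side of this filesystem IPC is a loop that reads whatever JSON files have appeared and removes them once handled. The sketch below shows a single poll cycle in Python. The write-then-rename step and the sequence-numbered filenames are assumptions added to avoid reading half-written files; they are not described in the source as NanoClaw behavior.

```python
import json
import tempfile
from pathlib import Path


def write_message(ipc_dir: Path, text: str, seq: int) -> Path:
    """Container side: write to a temp name, then rename so the host never sees a partial file."""
    ipc_dir.mkdir(parents=True, exist_ok=True)
    tmp = ipc_dir / f"{seq:06d}.json.tmp"
    tmp.write_text(json.dumps({"text": text}))
    final = ipc_dir / f"{seq:06d}.json"
    tmp.rename(final)
    return final


def poll_once(ipc_dir: Path) -> list[dict]:
    """Host side: read pending messages in filename order, deleting each after reading."""
    messages = []
    for path in sorted(ipc_dir.glob("*.json")):  # *.json.tmp files are skipped
        messages.append(json.loads(path.read_text()))
        path.unlink()
    return messages


with tempfile.TemporaryDirectory() as root:
    outbox = Path(root) / "ipc" / "main" / "messages"
    write_message(outbox, "hello", 1)
    write_message(outbox, "world", 2)
    first_poll = poll_once(outbox)   # both messages, oldest first
    second_poll = poll_once(outbox)  # nothing left to deliver
```

A long-running host would call `poll_once` in a loop with a one-second sleep, matching the polling interval described above.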
NanoBot (HKUDS)
What it is: An ultra-lightweight Python alternative to OpenClaw. ~4,000 lines. Delivers core agent functionality with 99% less code than OpenClaw.
Strengths:
- Stateful knowledge graph memory — the agent builds a local graph of user history and context
- Model-agnostic: works with OpenAI, Anthropic, local models
- MCP support for external tool integration
- ClawHub skill — search and install public agent skills
- Clean Python codebase that is easy to read and extend
Weaknesses:
- Python means slower startup than Go/Rust/Zig alternatives
- Knowledge graph memory is more complex to debug than flat file memory
- Smaller community than OpenClaw (27K vs 302K stars)
Memory example:
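A minimal sketch of what graph-shaped memory can look like, assuming a hypothetical `GraphMemory` class that stores subject-relation-object triples. It illustrates the idea only and is not NanoBot's implementation or API.

```python
from collections import defaultdict


class GraphMemory:
    """Hypothetical triple store: (subject) -[relation]-> (object)."""

    def __init__(self) -> None:
        self._edges: dict[str, set[tuple[str, str]]] = defaultdict(set)

    def remember(self, subject: str, relation: str, obj: str) -> None:
        """Add one edge to the graph; duplicates are ignored."""
        self._edges[subject].add((relation, obj))

    def facts_about(self, subject: str) -> list[tuple[str, str]]:
        """Direct (relation, object) pairs for a subject, sorted for stable output."""
        return sorted(self._edges.get(subject, set()))

    def related(self, start: str, max_hops: int = 2) -> set[str]:
        """Entities reachable from `start` within `max_hops` edges."""
        seen = {start}
        frontier = {start}
        for _ in range(max_hops):
            frontier = {
                obj
                for node in frontier
                for _relation, obj in self._edges.get(node, set())
            } - seen
            seen |= frontier
        return seen - {start}


memory = GraphMemory()
memory.remember("user", "works_on", "billing-service")
memory.remember("billing-service", "written_in", "Go")
memory.remember("user", "prefers", "short answers")
context = memory.related("user")  # follows user -> billing-service -> Go
```

The multi-hop lookup is what flat-file memory lacks: a question about the user can surface "Go" even though no single fact links the two directly.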
PicoClaw
What it is: An ultra-lightweight Go-based agent. <10MB RAM, boots in under 1 second, runs on $10 RISC-V hardware. 95% of core code is AI-generated through a self-bootstrapping process.
Strengths:
- Single static binary across RISC-V, ARM, MIPS, and x86
- 400x faster startup than OpenClaw
- Gateway command for multi-platform messaging (Telegram, Discord, QQ, DingTalk)
- Multi-provider LLM support (OpenRouter, Anthropic, OpenAI, DeepSeek, Groq)
Weaknesses:
- Multi-agent collaboration is still in progress (Issue #294)
- Smaller skill ecosystem
- Memory system less sophisticated than OpenClaw’s hybrid search
Use case: Edge computing, IoT devices, Raspberry Pi, self-hosted agents on cheap hardware.
NullClaw
What it is: The smallest possible autonomous agent. 678KB binary, ~1MB RAM, <2ms startup. Written in raw Zig with zero dependencies — no Python, no JVM, no Go runtime.
Strengths:
- 23+ LLM providers, 18 channels, 18+ tools in a 678KB binary
- Hybrid vector + FTS5 memory search in self-contained SQLite
- Hardware peripheral support (MaixCam, sensors)
- Multi-layer sandbox for security
- Vtable-driven architecture — every subsystem is pluggable
Weaknesses:
- Zig ecosystem is small — fewer contributors, harder to find developers
- Documentation is sparser than TypeScript/Python alternatives
- Community and plugin ecosystem much smaller
Architecture pattern:
// NullClaw's vtable-driven extension model
// Every subsystem implements a simple interface
const ProviderVTable = struct {
    init: *const fn (config: *const Config) anyerror!void,
    complete: *const fn (messages: []const Message) anyerror!Response,
    embed: *const fn (text: []const u8) anyerror![]f32,
    deinit: *const fn () void,
};

const ChannelVTable = struct {
    init: *const fn (config: *const Config) anyerror!void,
    receive: *const fn () anyerror!?InboundMessage,
    send: *const fn (message: OutboundMessage) anyerror!void,
    deinit: *const fn () void,
};

// Register a new provider. `provider_registry` is assumed to be a
// std.StringHashMap(ProviderVTable) initialized at startup; `put` can
// fail on allocation, so the error is propagated to the caller.
pub fn registerProvider(name: []const u8, vtable: ProviderVTable) !void {
    try provider_registry.put(name, vtable);
}
Key insight: NullClaw proves that a full-featured agent runtime (providers, channels, tools, memory, sandbox) can fit in under 1MB. The vtable pattern helps make this possible: each subsystem is a plain struct of function pointers, so swapping an implementation costs one indirect call per operation and pulls in no runtime or framework code.
ZeroClaw
What it is: A Rust-based agent runtime. ~16MB binary, ~5MB RAM. Trait-driven architecture where every subsystem is swappable.
Strengths:
- Hybrid memory search: 70% vector (cosine similarity) + 30% FTS5 (BM25), tunable weights
- Authentication pairing, workspace isolation, explicit tool allowlists
- TOML-based skill manifests with community skill packs
- Auto-recall: context automatically retrieved based on task
- Embedding cache (LRU, 10K entries) for performance
Weaknesses:
- Rust compile times slow down development iteration
- Smaller community than Go or TypeScript alternatives
- Multiple unofficial forks create confusion about which organization is the official one
Memory configuration:
[memory]
backend = "sqlite"
vector_weight = 0.7
keyword_weight = 0.3
embedding_model = "nomic-embed-text"
embedding_cache_size = 10000
auto_recall = true
[memory.retention]
short_term_hours = 24
long_term_days = 365
decay_half_life_days = 30
[skills]
paths = ["./skills", "~/.zeroclaw/skills"]
community_registry = "https://registry.zeroclawlabs.ai"
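The weights and half-life in this configuration suggest a simple ranking formula. The sketch below blends the two search signals and then applies exponential decay by age. It assumes both scores are already normalized to the range 0 to 1 and that decay multiplies the blended score; the source gives the parameters but not the exact formula, so treat `hybrid_score` as a hypothetical illustration.

```python
def hybrid_score(
    vector_sim: float,
    keyword_score: float,
    age_days: float,
    vector_weight: float = 0.7,
    keyword_weight: float = 0.3,
    half_life_days: float = 30.0,
) -> float:
    """Weighted blend of the two signals, halved every `half_life_days`."""
    blended = vector_weight * vector_sim + keyword_weight * keyword_score
    decay = 0.5 ** (age_days / half_life_days)
    return blended * decay


# Same relevance, different ages: the month-old memory ranks at half the score
fresh = hybrid_score(vector_sim=0.8, keyword_score=0.5, age_days=0)
month_old = hybrid_score(vector_sim=0.8, keyword_score=0.5, age_days=30)
```

With the defaults, a memory that scores 0.71 when new drops to 0.355 after one half-life, so older entries need a stronger match to outrank recent ones.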
TinyClaw
What it is: A multi-agent team collaboration framework. Agents work in teams, communicate through persistent chat rooms, and hand off work via chain execution and fan-out patterns.
Strengths:
- First-class multi-agent teams: coder, reviewer, writer, researcher collaborate autonomously
- TinyOffice web portal for monitoring agents, teams, queues, and event feeds
- Actor model with simple message queue for agent-to-agent communication
- Cross-channel context sharing (Discord, WhatsApp, Telegram)
- Fan-out: one agent mentions a teammate, work distributes in parallel
Weaknesses:
- Higher memory footprint than single-agent alternatives (~300MB)
- Team coordination adds latency vs single-agent execution
- Newer framework — API still evolving
Multi-agent team definition:
// TinyClaw team configuration
const team = {
  id: "content-team",
  agents: [
    {
      id: "researcher",
      model: "claude-sonnet-4-5-20250929",
      systemPrompt: "You are a research specialist...",
      tools: ["web_search", "read_file", "write_file"],
    },
    {
      id: "writer",
      model: "claude-sonnet-4-5-20250929",
      systemPrompt: "You are a content writer...",
      tools: ["read_file", "write_file", "send_message"],
    },
    {
      id: "reviewer",
      model: "claude-3-5-haiku-20241022",
      systemPrompt: "You are a content reviewer...",
      tools: ["read_file", "send_message"],
    },
  ],
  // Persistent team chat room -- all agents see all messages
  chatRoom: {
    persistence: "sqlite",
    broadcastAll: true,
  },
};
// Fan-out: researcher mentions @writer and @reviewer
// Both receive the message and process in parallel
// Responses flow back to the team chat room
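The fan-out step described in the comments above amounts to extracting mentions from a message and running the mentioned agents concurrently. A minimal Python sketch follows, assuming a hypothetical `run_agent` stand-in for the model call and a simple `@name` mention syntax; it is not TinyClaw's dispatcher.

```python
import asyncio
import re

MENTION = re.compile(r"@([a-z][a-z0-9_-]*)")


async def run_agent(agent_id: str, message: str) -> str:
    """Stand-in for a model call; a real agent would invoke an LLM here."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return f"{agent_id} handled: {message}"


async def fan_out(message: str, team: set[str]) -> list[str]:
    """Dispatch the message to every mentioned teammate concurrently."""
    # dict.fromkeys removes repeated mentions while keeping first-seen order
    mentioned = [name for name in dict.fromkeys(MENTION.findall(message)) if name in team]
    return await asyncio.gather(*(run_agent(name, message) for name in mentioned))


team = {"researcher", "writer", "reviewer"}
replies = asyncio.run(fan_out("Notes are ready @writer @reviewer @nobody", team))
```

Mentions of names outside the team are dropped, and `asyncio.gather` returns replies in mention order even though the agents run concurrently.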
Agent Zero
What it is: A general-purpose AI agent with the best auto-learning system in the open-source space. Hierarchical superior-subordinate model with Docker-isolated execution.
Strengths:
- Auto-learning is the killer feature: FAISS vector memory with automatic fact extraction after every agent turn
- Auto-consolidation: related memories merge and summarize over time
- Model-agnostic with 3 slots (chat, utility, embedding) that can mix providers
- Subordinate spawning: any agent can create sub-agents for specialized tasks
- The agent genuinely gets better over time without explicit training
Weaknesses:
- Smaller community (12K stars)
- Docker dependency for execution isolation
- Python-only
- No built-in messaging platform integration (it is a framework, not a personal assistant)
Auto-learning memory flow:
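# Simplified pseudocode of the flow; the model and index objects are stand-ins

# 1. After each turn, the utility model extracts items worth remembering
extracted = utility_model.extract(
    conversation_turn,
    categories=["facts", "solutions", "preferences", "errors"],
)

# 2. Each item is embedded and stored with its metadata
for item in extracted:
    embedding = embedding_model.embed(item.text)
    faiss_index.add(embedding, metadata={
        "category": item.category,
        "timestamp": now(),
        "source_conversation": conversation_id,
    })

# 3. On later turns, the query is embedded and the nearest memories are recalled
query_embedding = embedding_model.embed(current_query)
memories = faiss_index.search(query_embedding, k=10)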
Key insight: Agent Zero’s auto-learning is its most differentiating feature. It automatically extracts, categorizes, embeds, and consolidates knowledge from conversations. Most other frameworks compared here require the user or developer to manage memory explicitly; NanoBot’s graph-based memory is the main exception.
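The consolidation step can be pictured as grouping memories whose embeddings are close together, then summarizing each group. The sketch below covers only the grouping, using hand-written two-dimensional vectors in place of real embeddings and a greedy threshold rule. Both are assumptions for illustration and not Agent Zero's algorithm.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def consolidate(
    memories: list[tuple[str, list[float]]], threshold: float = 0.9
) -> list[list[str]]:
    """Greedy grouping: each memory joins the first group whose seed vector is similar enough."""
    groups: list[tuple[list[float], list[str]]] = []
    for text, vector in memories:
        for seed, texts in groups:
            if cosine(seed, vector) >= threshold:
                texts.append(text)
                break
        else:
            # No existing group was close enough, so this memory seeds a new one
            groups.append((vector, [text]))
    return [texts for _seed, texts in groups]


memories = [
    ("User deploys with wrangler", [1.0, 0.0]),
    ("User prefers wrangler for deploys", [0.98, 0.2]),
    ("User dislikes long answers", [0.0, 1.0]),
]
merged = consolidate(memories)
```

Each resulting group would then be handed to the utility model to produce one summarized memory in place of several overlapping ones.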
LangGraph
What it is: LangChain’s agent graph framework. Models agent workflows as state machines with nodes, edges, and reducers. Reached 1.0 GA in October 2025.
Strengths:
- Strongest checkpointing and persistence story: MemorySaver, SqliteSaver, PostgresSaver
- Thread-based state management with time-travel debugging
- Human-in-the-loop built into the graph model (interrupt nodes)
- Reducer-driven state schemas prevent data loss in multi-agent systems
- Production-proven at scale
Weaknesses:
- LangChain dependency (large, complex)
- Graph DSL has a learning curve
- Python-centric (TypeScript support exists but lags)
- Overhead for simple use cases
State and checkpointing example:
from operator import add
from typing import Annotated, TypedDict

from langgraph.checkpoint.postgres import PostgresSaver
from langgraph.graph import END, START, StateGraph

# `llm` (a LangChain chat model) and `search_api` (a helper returning list[dict])
# are assumed to be defined elsewhere.

class AgentState(TypedDict):
    messages: Annotated[list, add]  # Reducer: append new messages
    task_status: str
    research_results: list[dict]
    draft: str
    review_feedback: str

def research_node(state: AgentState) -> dict:
    """Research node - fetches data and updates state."""
    results = search_api(state["messages"][-1])
    return {"research_results": results, "task_status": "researched"}

def write_node(state: AgentState) -> dict:
    """Write node - drafts content from research."""
    draft = llm.invoke(f"Write based on: {state['research_results']}").content
    return {"draft": draft, "task_status": "drafted"}

def review_node(state: AgentState) -> dict:
    """Review node - human can interrupt here."""
    feedback = llm.invoke(f"Review (reply APPROVED if ready): {state['draft']}").content
    # Set "approved" explicitly so the conditional edge below can terminate
    status = "approved" if "APPROVED" in feedback else "needs_revision"
    return {"review_feedback": feedback, "task_status": status}

def route_after_review(state: AgentState) -> str:
    """Finish when approved, otherwise loop back for another draft."""
    return END if state["task_status"] == "approved" else "write"

graph = StateGraph(AgentState)
graph.add_node("research", research_node)
graph.add_node("write", write_node)
graph.add_node("review", review_node)
graph.add_edge(START, "research")
graph.add_edge("research", "write")
graph.add_edge("write", "review")
graph.add_conditional_edges("review", route_after_review)

# from_conn_string is a context manager that owns the database connection
with PostgresSaver.from_conn_string("postgresql://...") as checkpointer:
    checkpointer.setup()  # create the checkpoint tables on first run
    app = graph.compile(checkpointer=checkpointer)

    # The thread_id ties invocations to the same persisted state
    config = {"configurable": {"thread_id": "article-draft-42"}}
    result = app.invoke({"messages": ["Write about agent frameworks"]}, config)
    resumed = app.invoke({"messages": ["Add more code examples"]}, config)
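The reducer idea behind `Annotated[list, add]` can be shown without the framework: when a node returns a partial update, keys with a reducer are combined with the existing value, and keys without one are overwritten. The sketch below is a simplified model of that merge rule, with `apply_update` as a hypothetical helper; it is not LangGraph's internal implementation.

```python
from operator import add
from typing import Any, Callable

Reducer = Callable[[Any, Any], Any]


def apply_update(state: dict, update: dict, reducers: dict[str, Reducer]) -> dict:
    """Merge a node's partial update: reduce where a reducer exists, else overwrite."""
    merged = dict(state)  # copy so the previous state stays intact
    for key, value in update.items():
        if key in reducers and key in merged:
            merged[key] = reducers[key](merged[key], value)
        else:
            merged[key] = value
    return merged


reducers = {"messages": add}  # list + list appends instead of replacing
state = {"messages": ["start"], "task_status": "new"}
state = apply_update(
    state, {"messages": ["researcher: done"], "task_status": "researched"}, reducers
)
state = apply_update(state, {"messages": ["writer: drafted"]}, reducers)
```

Without the reducer, the second update would replace the message list and the researcher's entry would be lost, which is the data-loss case reducers are meant to prevent.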
CrewAI
What it is: Multi-agent orchestration with role-based agent definitions. Dual architecture: Crews (autonomous teams) and Flows (event-driven workflows). 45.9K GitHub stars.
Strengths:
- Intuitive role-based agent model (researcher, writer, editor)
- 600+ platform integrations, 7,000+ tools
- Advanced memory: weighted strategies (recency, semantic, importance), configurable half-life
- Adaptive-depth recall with composite scoring
- Flows for enterprise event-driven workflows
Weaknesses:
- Python-only
- Can be slow for simple tasks (agent overhead)
- Memory system complexity can be hard to debug
- Commercial features locked behind enterprise tier
Agent and crew definition:
from crewai import Agent, Task, Crew, Process

# `web_search`, `file_reader`, `file_writer`, and `markdown_formatter` are
# assumed to be tool objects defined elsewhere.

researcher = Agent(
    role="Senior Research Analyst",
    goal="Find comprehensive information about autonomous agent frameworks",
    backstory="You are an experienced tech researcher...",
    tools=[web_search, file_reader],
    memory=True,
    verbose=True,
    llm="anthropic/claude-sonnet-4-5-20250929",
)

writer = Agent(
    role="Technical Writer",
    goal="Produce clear, detailed technical articles",
    backstory="You are a staff engineer who writes...",
    tools=[file_writer, markdown_formatter],
    memory=True,
    llm="anthropic/claude-sonnet-4-5-20250929",
)

research_task = Task(
    description="Research {topic} and compile findings",
    expected_output="Structured research notes with citations",
    agent=researcher,
)

writing_task = Task(
    description="Write a comprehensive article based on research",
    expected_output="A 2000+ word technical article in Markdown",
    agent=writer,
    context=[research_task],  # Writer gets researcher's output
)

crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, writing_task],
    process=Process.sequential,
    memory=True,
    memory_config={
        "provider": "mem0",
        "config": {"vector_store": {"provider": "chroma"}},
    },
)

# {topic} in the task description is filled from `inputs`
result = crew.kickoff(inputs={"topic": "agent memory systems"})
AutoGen / Microsoft Agent Framework
What it is: Microsoft’s multi-agent framework. AutoGen v0.4 was a complete rewrite with async event-driven architecture. Now merging with Semantic Kernel into the Microsoft Agent Framework, targeting GA by end of Q1 2026.
Strengths:
- Enterprise-ready: OpenTelemetry, GDPR, Azure integration
- Cross-language: Python and .NET with more planned
- Distributed agent networks across organizational boundaries
- Graph-based workflows for explicit multi-agent orchestration
- Session-based state management
Weaknesses:
- Complexity: merging two large frameworks creates a steep learning curve
- Microsoft-ecosystem bias (Azure, Semantic Kernel)
- Migration path from AutoGen to Agent Framework is non-trivial
- Heavyweight for simple use cases
AutoGen v0.4 agent example:
import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_ext.models.openai import OpenAIChatCompletionClient

async def main() -> None:
    model = OpenAIChatCompletionClient(model="gpt-4o")

    coder = AssistantAgent(
        name="coder",
        model_client=model,
        system_message="You write Python code to solve tasks.",
    )
    reviewer = AssistantAgent(
        name="reviewer",
        model_client=model,
        # The reviewer must emit the keyword the termination condition watches for
        system_message="You review code for bugs and improvements. Reply APPROVED when the code is ready.",
    )

    termination = TextMentionTermination("APPROVED")
    team = RoundRobinGroupChat(
        participants=[coder, reviewer],
        termination_condition=termination,
        max_turns=10,  # Upper bound in case approval never arrives
    )

    result = await team.run(task="Write a function to calculate Fibonacci numbers")
    print(result.stop_reason)

# team.run is a coroutine, so it needs an event loop
asyncio.run(main())
OpenAI Agents SDK
What it is: The production evolution of Swarm. Lightweight, provider-agnostic multi-agent framework with handoffs, guardrails, and tracing.
Strengths:
- Clean, minimal API (agents, handoffs, tools, guardrails)
- Built-in tracing for debugging agent runs
- Sessions for automatic conversation history management
- Provider-agnostic: supports 100+ LLMs
- Realtime voice agents
Weaknesses:
- No built-in persistence/checkpointing (unlike LangGraph)
- Limited memory system (session-based only)
- Newer framework — less battle-tested at scale
Agent with handoff:
from agents import Agent, Runner

# `web_search_tool` and `code_execution_tool` are assumed to be defined elsewhere
# (for example, as function tools or hosted tools).

research_agent = Agent(
    name="Research",
    instructions="Search for information and summarize findings.",
    tools=[web_search_tool],
)

coding_agent = Agent(
    name="Coding",
    instructions="Write and debug code.",
    tools=[code_execution_tool],
)

# Handoffs take Agent objects rather than names, so the specialists are defined first
triage_agent = Agent(
    name="Triage",
    instructions="Route the user to the right specialist.",
    handoffs=[research_agent, coding_agent],
)

result = Runner.run_sync(triage_agent, "I need to find the best API for geocoding")
print(result.final_output)
MetaGPT
What it is: A multi-agent framework that simulates a software company. Agents have roles (Product Manager, Architect, Engineer, QA) and collaborate through a shared message pool following Standard Operating Procedures (SOPs).
Strengths:
- SOP-driven workflows produce structured deliverables (PRDs, designs, code, tests)
- Shared message pool with pub/sub — agents subscribe to relevant messages by role
- Enforces structured outputs (not just chat), improving downstream quality
- Academic backing with published research
Weaknesses:
- Opinionated toward software development workflows
- Heavy Python dependency tree
- Less flexible for non-software-development use cases
- SOPs are rigid — hard to adapt mid-execution
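The shared message pool with role-based subscriptions can be sketched in a few lines. The example below assumes a hypothetical `MessagePool` in which roles subscribe to message kinds such as `prd` or `design`. The names are illustrative and do not mirror MetaGPT's classes.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sender: str   # role that produced the message
    kind: str     # deliverable type, e.g. "prd", "design", "code"
    content: str


class MessagePool:
    """Hypothetical pub/sub pool: roles only receive the kinds they subscribed to."""

    def __init__(self) -> None:
        self._subscriptions: dict[str, set[str]] = defaultdict(set)  # kind -> roles
        self._inboxes: dict[str, list[Message]] = defaultdict(list)  # role -> messages

    def subscribe(self, role: str, kind: str) -> None:
        self._subscriptions[kind].add(role)

    def publish(self, message: Message) -> None:
        """Deliver to every subscriber of this kind except the sender."""
        for role in self._subscriptions.get(message.kind, set()):
            if role != message.sender:
                self._inboxes[role].append(message)

    def inbox(self, role: str) -> list[Message]:
        return list(self._inboxes.get(role, []))


pool = MessagePool()
pool.subscribe("architect", "prd")    # the architect acts on product requirements
pool.subscribe("engineer", "design")  # the engineer acts on designs
pool.publish(Message("product_manager", "prd", "Build a URL shortener"))
architect_inbox = pool.inbox("architect")
engineer_inbox = pool.inbox("engineer")
```

Filtering by subscription keeps each role's context small: the engineer never sees the PRD until the architect publishes a design derived from it.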
Claude Agent SDK
What it is: Anthropic’s framework for building agents on top of Claude Code. Provides the same tools, agent loop, and context management that power Claude Code, programmable in Python and TypeScript.
Strengths:
- Full Claude Code toolset (Read, Write, Edit, Bash, etc.) out of the box
- Custom tools as Python/TypeScript functions
- MCP integration for external services (Slack, GitHub, Google Drive, Asana)
- Hook system for intercepting and modifying agent behavior
- First-class code review capabilities (multi-agent)
Weaknesses:
- Claude-only (no multi-provider support)
- Newer — ecosystem still developing
- Requires Anthropic API key (no subscription arbitrage)
Cloudflare Agents SDK
What it is: Each agent is a Durable Object with built-in state persistence, scheduling, WebSocket communication, and queue integration. Edge-native.
Strengths:
- Every agent is a Durable Object — state auto-persists to SQLite
- `this.schedule()` for cron, `this.queue()` for async work
- Agent-to-agent communication via DO RPC (zero HTTP overhead)
- WebSocket state sync with React via the `useAgent()` hook
- MCPAgent class for building MCP servers
- AgentWorkflow for bidirectional Workflow-Agent RPC
Weaknesses:
- Cloudflare platform lock-in
- No local development story without Miniflare
- Durable Object limitations (memory, CPU time)
- Smaller community
import { AIChatAgent } from "agents/ai-chat-agent";
import { generateText } from "ai";
import { anthropic } from "@ai-sdk/anthropic";

// Simplified sketch: Env, AgentState, StreamCallbacks, this.system, this.tools, and
// runDailyResearch() are assumed to be defined elsewhere; hook names vary by SDK version.
class ResearchAgent extends AIChatAgent<Env, AgentState> {
  async onChatMessage(onFinish: StreamCallbacks["onFinish"]) {
    const response = await generateText({
      model: anthropic("claude-sonnet-4-5-20250929"),
      system: this.system,
      messages: this.messages,
      tools: this.tools,
      onFinish: (result) => {
        // State auto-persists to Durable Object SQLite
        this.setState({
          ...this.state,
          lastResearch: result.text,
          researchCount: (this.state.researchCount || 0) + 1,
        });
        onFinish(result);
      },
    });
  }

  // Built-in scheduling
  async onAlarm() {
    // Runs on configured schedule
    await this.runDailyResearch();
  }
}
Pydantic AI
What it is: A type-safe agent framework built on Pydantic. Catches agent logic errors at development time through Python’s type system.
Strengths:
- Type safety: IDE autocompletion and static analysis for agent code
- Structured output with continuous streaming validation
- Durable execution: preserves progress across failures and restarts
- Model-agnostic: OpenAI, Anthropic, Gemini, DeepSeek, Grok, Cohere, Mistral, Bedrock, Vertex
- MCP and Agent2Agent protocol support
Weaknesses:
- Python-only
- Newer framework (15K stars, still growing)
- Less opinionated about multi-agent patterns
BabyAGI
What it is: A task-driven autonomous agent that runs a create-execute-reprioritize loop. The OG autonomous agent from 2023.
Current status: Evolved into an experimental self-building agent framework. The newest version is the agent that builds itself.
Historical significance: Proved the task loop pattern (create tasks -> execute -> reprioritize -> repeat) that influenced nearly every subsequent framework.
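The task loop is compact enough to write out in full. The sketch below follows the create, execute, reprioritize cycle with plain callables standing in for the LLM calls. The function names and the iteration cap are assumptions for illustration rather than BabyAGI's code.

```python
from collections import deque
from typing import Callable


def task_loop(
    objective: str,
    execute: Callable[[str], str],
    create: Callable[[str, str], list[str]],
    prioritize: Callable[[list[str]], list[str]],
    max_iterations: int = 10,
) -> list[tuple[str, str]]:
    """Run tasks until the queue is empty or the iteration cap is reached."""
    queue = deque([objective])
    results: list[tuple[str, str]] = []
    for _ in range(max_iterations):  # hard cap so the loop always terminates
        if not queue:
            break
        task = queue.popleft()
        result = execute(task)
        results.append((task, result))
        queue.extend(create(task, result))       # new tasks derived from the result
        queue = deque(prioritize(list(queue)))   # reorder everything still pending
    return results


def create_subtasks(task: str, result: str) -> list[str]:
    """Toy task creator: only the top-level objective spawns follow-up work."""
    return ["book hotel", "book flight"] if task == "plan trip" else []


history = task_loop(
    "plan trip",
    execute=lambda task: f"done: {task}",
    create=create_subtasks,
    prioritize=sorted,  # alphabetical order stands in for LLM-based ranking
)
completed_tasks = [task for task, _result in history]
```

The iteration cap matters in practice: with a model generating new tasks on every pass, an uncapped loop has no guarantee of ever emptying its queue.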
AutoGPT
What it is: The original viral autonomous agent (170K stars). Has pivoted from fully autonomous execution to a low-code platform with a block-based agent builder.
Current status: Now a platform rather than a framework. Users build agents using modular blocks through a Next.js UI with FastAPI backend and PostgreSQL storage.
SuperAGI
What it is: A dev-first autonomous agent framework with a GUI, toolkit marketplace, and APM dashboard.
Current status: Stalled. No releases since January 2024. Issues go unanswered. The company has pivoted. Security vulnerabilities remain unaddressed. Do not use for new projects.
This is the comprehensive comparison across all frameworks and all dimensions.
Architecture
| Framework | Type | Language | Async | LLM Support | Binary/Runtime |
|---|---|---|---|---|---|
| OpenClaw | Single agent | TypeScript | Yes | Multi-provider (20+) | Node.js |
| NanoClaw | Single agent | TypeScript | Yes | Claude only | Node.js + Container |
| NanoBot | Single agent | Python | Yes | Multi-provider | Python |
| PicoClaw | Single agent | Go | Yes | Multi-provider (7+) | Static binary |
| ZeroClaw | Single agent | Rust | Yes | Multi-provider | Static binary |
| NullClaw | Single agent | Zig | Yes | Multi-provider (23+) | Static binary |
| TinyClaw | Multi-agent | TypeScript | Yes | Multi-provider | Node.js |
| Agent Zero | Hierarchical | Python | Yes | Multi-provider (3 slots) | Python + Docker |
| LangGraph | Graph-based | Python/TS | Yes | Multi-provider | Python/Node.js |
| CrewAI | Multi-agent | Python | Yes | Multi-provider | Python |
| AutoGen | Multi-agent | Python/.NET | Yes | Multi-provider | Python/.NET |
| OpenAI SDK | Multi-agent | Python/TS | Yes | Multi-provider (100+) | Python/Node.js |
| MetaGPT | Multi-agent | Python | Yes | Multi-provider | Python |
| Claude SDK | Single agent | Python/TS | Yes | Claude only | Python/Node.js |
| CF Agents | Single/Multi | TypeScript | Yes | Multi-provider | Cloudflare Workers |
| Pydantic AI | Single agent | Python | Yes | Multi-provider (10+) | Python |
| BabyAGI | Single agent | Python | No | OpenAI primary | Python |
| AutoGPT | Platform | Python/TS | Yes | Multi-provider | Docker |
Memory Systems
| Framework | Short-Term | Long-Term | Vector Search | Keyword Search | Auto-Learning | Shared Memory | Temporal Decay |
|---|---|---|---|---|---|---|---|
| OpenClaw | Session JSONL | MEMORY.md + daily logs | sqlite-vec | BM25 | No | No (opt-in IPC) | 30-day half-life |
| NanoClaw | SQLite | Per-group CLAUDE.md | No | No | No | No | No |
| NanoBot | In-memory | Knowledge graph | Planned | Planned | Graph-based | No | No |
| PicoClaw | In-memory | File-based | Planned | Planned | No | No | No |
| ZeroClaw | In-memory | SQLite/Markdown | Cosine sim (0.7) | FTS5 BM25 (0.3) | No | No | Configurable |
| NullClaw | In-memory | SQLite | Cosine sim | FTS5 BM25 | No | No | No |
| TinyClaw | Per-agent | Team chat history | No | No | No | Team chat rooms | No |
| Agent Zero | Context window | FAISS vectors | FAISS IndexFlatIP | No | Yes (auto) | Subordinate inheritance | No (consolidation instead) |
| LangGraph | Graph state | Checkpointer (PG/SQLite) | Via LangChain | Via LangChain | No | Shared graph state | No |
| CrewAI | Short-term memory | Long-term w/ vector DB | Chroma/custom | Hybrid | No | Crew-level shared | Configurable half-life |
| AutoGen | Session-based | Configurable | Via plugins | Via plugins | No | Group chat history | No |
| OpenAI SDK | Session | Session (auto-managed) | No built-in | No built-in | No | Via handoff context | No |
| MetaGPT | Role context | Shared message pool | No built-in | No built-in | No | Shared message pool | No |
| Claude SDK | Conversation | File-based (CLAUDE.md) | No built-in | No built-in | No | No | No |
| CF Agents | DO SQLite state | DO SQLite + Vectorize | Vectorize | No built-in | No | DO RPC | No |
| Pydantic AI | Conversation | Durable execution | Via integration | Via integration | No | No | No |
| BabyAGI | Task list | Pinecone vectors | Pinecone | No | No | No | No |
| AutoGPT | Block state | PostgreSQL | Via blocks | Via blocks | No | No | No |
Skills and Tools
| Framework | Skill Format | Composable | Shared Registry | Built-in Tools | Custom Skills |
|---|---|---|---|---|---|
| OpenClaw | SKILL.md (Markdown + YAML) | Yes | 5,400+ skills | Shell, browser, file, memory | Drop-in folder |
| NanoClaw | Code changes | Yes | No | MCP tools (send, schedule, list) | Fork and modify |
| NanoBot | ClawHub skills | Yes | ClawHub registry | Shell, file, MCP | MCP + ClawHub |
| PicoClaw | Config-based | Limited | No | Shell, file, messaging | Config |
| ZeroClaw | TOML manifests | Yes | Community registry | Shell, file, memory, git, browser, cron | TOML + folder |
| NullClaw | Vtable interface | Yes | No | 18+ (file, shell, memory, browser, hw) | Zig interface impl |
| TinyClaw | Agent config | Yes | No | Read, write, search, message | TypeScript functions |
| Agent Zero | Python functions | Yes | No | Shell, browser, code exec, delegate | Python functions |
| LangGraph | Python functions | Yes | LangChain hub | LangChain tools ecosystem | Python/TS functions |
| CrewAI | Python decorators | Yes | 7,000+ tools | 600+ integrations | Python decorators |
| AutoGen | Python/.NET | Yes | No | Code execution, web | Functions + plugins |
| OpenAI SDK | Functions/MCP | Yes | No | Functions, MCP, hosted tools | Functions |
| MetaGPT | Role actions | Yes | No | Code, write, design, review | Python classes |
| Claude SDK | Python/TS functions + MCP | Yes | MCP ecosystem | Read, Write, Edit, Bash, + more | Functions + MCP |
| CF Agents | TypeScript + MCP | Yes | MCP ecosystem | DO state, schedule, queue, SQL | TS functions + MCP |
| Pydantic AI | Typed Python functions | Yes | MCP/A2A | MCP ecosystem | Typed functions |
| BabyAGI | Python functions | Limited | No | Search, code execution | Python functions |
| AutoGPT | Blocks (low-code) | Yes | Marketplace | Web, file, code, AI | Block builder |
Communication Patterns
| Framework | Agent-to-Agent | Human-in-Loop | Message Passing | Event System |
|---|---|---|---|---|
| OpenClaw | Opt-in IPC (sessions_send/spawn) | Chat interface | WebSocket gateway | No |
| NanoClaw | Filesystem IPC (JSON polling) | Chat interface | JSON files, 1s poll | No |
| NanoBot | No | Chat interface | Messaging platforms | No |
| PicoClaw | Planned | Chat interface | Gateway command | No |
| ZeroClaw | No | Chat interface | Channels | No |
| NullClaw | No | Chat interface | 18 channels | Webhooks |
| TinyClaw | Team chat rooms (broadcast) | Chat interface | Actor model + message queue | Event feed |
| Agent Zero | Subordinate spawning | Superior chain (human at top) | Direct invocation | No |
| LangGraph | Graph edges | Interrupt nodes | State reducers | No |
| CrewAI | Task delegation | Callback system | Task context passing | Flows (event-driven) |
| AutoGen | Async messaging | Chat interface | Pub/sub, request/response | OpenTelemetry |
| OpenAI SDK | Handoffs | Human-in-loop API | Handoff functions | Tracing |
| MetaGPT | Shared message pool (pub/sub) | No | Publish/subscribe by role | No |
| Claude SDK | No (single agent) | Hooks | MCP | No |
| CF Agents | DO RPC | WebSocket | DO RPC, Queues, Workflows | Queue events |
| Pydantic AI | No (single agent) | No | MCP, A2A | No |
| BabyAGI | No | No | Task list | No |
| AutoGPT | Block connections | UI approval | Block graph | Webhooks |
Execution Model
| Framework | Runs On | Sandboxing | Cost Control | Parallel Exec | Error Handling |
|---|---|---|---|---|---|
| OpenClaw | Local machine | None (runs as user) | Model fallback chains | No | Retry + fallback |
| NanoClaw | Local + containers | Linux containers | Claude only | Per-group parallel | Container restart |
| NanoBot | Local machine | None | Model selection | No | Retry |
| PicoClaw | Any ($10 hw) | None | Cheap models | No | Retry |
| ZeroClaw | Any (self-hosted) | Workspace isolation | Tool allowlists | No | Retry |
| NullClaw | Any (embedded) | Multi-layer sandbox | Minimal by design | No | Retry |
| TinyClaw | Local + Docker | Docker (via tinyclaw-infra) | Model per agent | Fan-out parallel | Actor supervision |
| Agent Zero | Docker containers | Docker per agent | 3-slot model mixing | Subordinate parallel | Subordinate retry |
| LangGraph | Local/cloud | None built-in | Token tracking | Parallel branches | Checkpoint recovery |
| CrewAI | Local/cloud | None built-in | Token tracking | Parallel tasks | Task retry |
| AutoGen | Local/cloud/Azure | Code execution sandbox | Token tracking | Async agents | Retry + fallback |
| OpenAI SDK | Local/cloud | None built-in | Token tracking | Async agents | Guardrails |
| MetaGPT | Local | Docker for code exec | Token tracking | Role parallel | SOP recovery |
| Claude SDK | Local | Tool permissions | API pricing | No | Retry |
| CF Agents | Cloudflare edge | DO isolation | DO CPU/memory limits | Queue workers | Alarm retry + DLQ |
| Pydantic AI | Local/cloud | None built-in | Token tracking | No | Durable execution |
| BabyAGI | Local | None | Token tracking | No | Loop retry |
| AutoGPT | Docker | Docker | Credit system | Block parallel | Block retry |
Memory is the most differentiated axis across frameworks. Here is a detailed breakdown.
Memory Architecture Patterns
Four distinct patterns have emerged:
Pattern 1: File-Based Memory (OpenClaw, NanoClaw, Claude SDK)
workspace/
  MEMORY.md          # Curated long-term facts (append-only)
  SOUL.md            # Agent personality (rarely changes)
  AGENTS.md          # Behavior rules (injected every prompt)
  memory/
    2026-03-16.md    # Daily activity log
    2026-03-15.md    # Yesterday (also loaded at startup)
Pros: Human-readable, version-controllable, easy to debug
Cons: No semantic search without additional index, linear scan
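The startup behavior implied by this layout can be sketched in a few lines of Python. This is a minimal illustration, not OpenClaw's actual loader: the function name `load_file_memory` and the choice to join files under `##` headers are assumptions, and only the file names come from the layout above.

```python
from datetime import date, timedelta
from pathlib import Path


def load_file_memory(workspace: Path, today: date) -> str:
    """Concatenate curated memory and the two most recent daily logs."""
    yesterday = today - timedelta(days=1)
    candidates = [
        workspace / "MEMORY.md",
        workspace / "memory" / f"{today.isoformat()}.md",
        workspace / "memory" / f"{yesterday.isoformat()}.md",
    ]
    sections = []
    for path in candidates:
        if path.exists():  # missing logs are simply skipped
            body = path.read_text(encoding="utf-8").strip()
            sections.append(f"## {path.name}\n{body}")
    return "\n\n".join(sections)
```

Because everything is plain text, the resulting context block can be diffed, reviewed, and committed like any other file.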
Pattern 2: Vector Database Memory (Agent Zero, BabyAGI, CrewAI)
memory_store = FAISSMemory(
    index_type="IndexFlatIP",  # Inner product; equals cosine similarity only for L2-normalized vectors
    dimension=1536,
    auto_extract=True,         # LLM extracts facts after each turn
    auto_consolidate=True,     # Merge related memories over time
    categories=["facts", "solutions", "preferences", "errors"],
)
Pros: Semantic retrieval, scales to millions of memories
Cons: Opaque (hard to inspect), embedding model dependency, cost
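To make the retrieval step concrete, here is a dependency-free sketch of what a flat inner-product index does. It is not FAISS and not Agent Zero's code: the class name `FlatInnerProductIndex` is hypothetical, and the vectors are normalized on insert so that the inner product behaves as cosine similarity.

```python
import math


def normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


class FlatInnerProductIndex:
    """Brute-force index: stores unit vectors and ranks by inner product."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def add(self, text: str, embedding: list[float]) -> None:
        self._items.append((text, normalize(embedding)))

    def search(self, query: list[float], k: int = 3) -> list[tuple[str, float]]:
        q = normalize(query)
        scored = [
            (text, sum(a * b for a, b in zip(q, vec)))
            for text, vec in self._items
        ]
        return sorted(scored, key=lambda item: item[1], reverse=True)[:k]
```

A brute-force scan like this is exact but linear in the number of memories, which is why larger deployments move to approximate indexes.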
Pattern 3: Hybrid Search (OpenClaw, ZeroClaw, NullClaw)
-- Hybrid search: vector similarity + BM25 keyword matching
-- ZeroClaw's SQLite implementation
-- (cosine_similarity is not a SQLite built-in; the application or an extension must register it)
-- Vector search (70% weight)
SELECT id, content,
       cosine_similarity(embedding, ?) AS vec_score
FROM memories;
-- Keyword search (30% weight)
-- FTS5's bm25() gives better matches numerically lower scores, so negate it
SELECT id, content,
       -bm25(memories_fts) AS kw_score
FROM memories_fts
WHERE memories_fts MATCH ?;
-- Combined score
SELECT id, content,
       (0.7 * vec_score + 0.3 * kw_score) AS combined_score
FROM (
  -- join vector and keyword results
) ORDER BY combined_score DESC LIMIT 10;
Pros: Best of both worlds: catches semantic similarity AND exact matches
Cons: More complex, requires both embedding model and FTS index
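The elided join step is where most of the subtlety sits, because cosine similarity and BM25 live on different scales. The sketch below shows one way to combine them, assuming min-max normalization per result set. That normalization choice and the function names are assumptions, not something either framework is documented here as doing.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1]; a single-valued set maps to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 1.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def hybrid_rank(
    cosine: dict[str, float],
    fts5_bm25: dict[str, float],
    vec_weight: float = 0.7,
    kw_weight: float = 0.3,
) -> list[tuple[str, float]]:
    """Combine vector and keyword scores into one ranked list."""
    # FTS5's bm25() is lower-is-better, so negate before normalizing.
    kw = min_max({doc_id: -score for doc_id, score in fts5_bm25.items()})
    vec = min_max(cosine)
    combined = {
        doc_id: vec_weight * vec.get(doc_id, 0.0) + kw_weight * kw.get(doc_id, 0.0)
        for doc_id in set(vec) | set(kw)
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)
```

A document found by only one retriever still gets ranked; it simply contributes zero for the side that missed it.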
Pattern 4: Graph Memory (NanoBot)
User
|-- works_on --> data-pipeline
| |-- uses --> pandas
| |-- deployed_on --> AWS Lambda
|-- prefers --> type-hints
|-- asked_about --> agent-frameworks (recent)
Pros: Relationship-aware retrieval, good for ongoing projects
Cons: Graph construction is imprecise, harder to debug than files
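Relationship-aware retrieval amounts to walking outward from an entity and collecting the facts along the way. A minimal sketch follows, assuming memories are stored as (subject, relation, object) triples. The `GraphMemory` class is hypothetical and says nothing about how NanoBot stores its graph internally.

```python
from collections import defaultdict


class GraphMemory:
    """Stores (subject, relation, object) triples as an adjacency list."""

    def __init__(self) -> None:
        self._edges: dict[str, list[tuple[str, str]]] = defaultdict(list)

    def add(self, subject: str, relation: str, obj: str) -> None:
        self._edges[subject].append((relation, obj))

    def neighborhood(self, start: str, max_hops: int = 2) -> list[tuple[str, str, str]]:
        """Return every triple reachable from `start` within `max_hops`."""
        triples: list[tuple[str, str, str]] = []
        frontier, seen = [start], {start}
        for _ in range(max_hops):
            next_frontier = []
            for node in frontier:
                for relation, obj in self._edges.get(node, []):
                    triples.append((node, relation, obj))
                    if obj not in seen:  # avoid revisiting nodes in cyclic graphs
                        seen.add(obj)
                        next_frontier.append(obj)
            frontier = next_frontier
        return triples
```

Capping the hop count keeps the retrieved context small; two hops from the user node is enough to recover the example graph above.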
Memory Retrieval Strategies
| Strategy | Used By | How It Works |
|---|---|---|
| Temporal decay | OpenClaw, ZeroClaw | Recent memories weighted higher. 30-day half-life. Evergreen files skip decay |
| Auto-extraction | Agent Zero | Utility LLM extracts facts/solutions/errors after every turn |
| Auto-consolidation | Agent Zero | Related memories merged and summarized periodically |
| Adaptive-depth recall | CrewAI | Retrieval depth adjusts based on query complexity |
| Composite scoring | CrewAI | Weighted combination of recency, semantic similarity, and importance |
| Compaction | OpenClaw | When context fills, important info flushed to MEMORY.md, older conversation summarized |
| Embedding cache | ZeroClaw | LRU cache of 10K embeddings to avoid recomputation |
| Thread checkpointing | LangGraph | Full state snapshot at every graph node, recoverable by thread ID |
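Temporal decay is the easiest of these strategies to pin down numerically: with a half-life of h days, a memory that is t days old keeps a fraction 0.5^(t/h) of its relevance. The sketch below assumes the 30-day half-life from the table. The function names and the `evergreen` flag are illustrative, not any framework's API.

```python
def temporal_decay(
    age_days: float,
    half_life_days: float = 30.0,
    evergreen: bool = False,
) -> float:
    """Multiplier in (0, 1] that halves every `half_life_days`."""
    if evergreen or age_days <= 0:
        return 1.0  # evergreen memories skip decay entirely
    return 0.5 ** (age_days / half_life_days)


def decayed_score(relevance: float, age_days: float, evergreen: bool = False) -> float:
    """Apply temporal decay to a retrieval score."""
    return relevance * temporal_decay(age_days, evergreen=evergreen)
```

With these defaults a 30-day-old memory keeps half its score and a 60-day-old memory a quarter, while anything flagged evergreen keeps its full score.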
What the Best Memory Systems Have in Common
- Hybrid search is converging as the standard. Pure vector search misses exact terms. Pure keyword search misses meaning. The 70/30 vector/keyword split in ZeroClaw and OpenClaw appears to be the sweet spot.
- File-based memory for human-inspectable state. Most Claw-family frameworks keep long-term memory in Markdown files (MEMORY.md in OpenClaw, per-group CLAUDE.md in NanoClaw). It is not technically optimal, but it is debuggable, versionable, and portable.
- Auto-learning is the biggest gap. Only Agent Zero automatically extracts and consolidates knowledge. Every other framework requires explicit memory management.
Skill Definition Patterns
Pattern 1: Markdown Playbooks (OpenClaw, ZeroClaw)
The most popular pattern. A skill is a directory with a SKILL.md file containing YAML frontmatter (requirements, tags, environment) and markdown instructions.
---
name: keyword-research
description: Research keywords using DataForSEO API
requires:
  env: ["DATAFORSEO_LOGIN", "DATAFORSEO_PASSWORD"]
tags: ["seo", "research", "marketing"]
platforms: ["all"]
---
OpenClaw injects a compact XML list of available skills into the system prompt. Skills are selectively loaded based on the current context.
Pattern 2: Typed Functions (Pydantic AI, LangGraph, OpenAI SDK)
Skills are Python/TypeScript functions with type annotations.
from pydantic_ai import Agent
from pydantic import BaseModel

class SearchResult(BaseModel):
    title: str
    url: str
    snippet: str
    relevance_score: float

agent = Agent(
    "anthropic:claude-sonnet-4-5",
    system_prompt="You are a research assistant.",
)

# tool_plain registers a tool that does not need the run context;
# @agent.tool expects a RunContext as the first parameter.
@agent.tool_plain
async def search_web(query: str, max_results: int = 5) -> list[SearchResult]:
    """Search the web for information."""
    # search_api is a placeholder for whichever search client you use
    results = await search_api.search(query, limit=max_results)
    return [
        SearchResult(
            title=r["title"],
            url=r["url"],
            snippet=r["snippet"],
            relevance_score=r["score"],
        )
        for r in results
    ]
Pattern 3: Role Actions (MetaGPT, CrewAI)
Skills are embedded in agent roles as expected behaviors.
class Architect(Role):
    name: str = "Architect"
    profile: str = "Software Architect"

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.set_actions([WriteDesign, WriteAPIDesign])
        self._watch({WritePRD})  # Subscribe to Product Manager output
Pattern 4: MCP Tools (Claude SDK, NanoClaw, Pydantic AI)
Skills are exposed as MCP (Model Context Protocol) servers.
// Claude SDK: MCP server as skill provider
const agent = new ClaudeAgent({
mcpServers: [
{
name: "seo-tools",
transport: "stdio",
command: "npx",
args: ["-y", "@my-org/seo-mcp-server"],
},
{
name: "database",
transport: "sse",
url: "https://api.example.com/mcp",
},
],
});
Pattern 5: Vtable Interfaces (NullClaw)
Skills are compiled interface implementations.
const ToolVTable = struct {
name: []const u8,
description: []const u8,
parameters: []const ParameterDef,
execute: *const fn (params: ParameterMap) anyerror!ToolResult,
};
// Adding a new tool means implementing this interface
pub const git_commit_tool = ToolVTable{
.name = "git_commit",
.description = "Create a git commit with the given message",
.parameters = &[_]ParameterDef{
.{ .name = "message", .type = .string, .required = true },
.{ .name = "files", .type = .string_array, .required = false },
},
.execute = &gitCommitImpl,
};
Skill Ecosystem Size
| Framework | Built-in | Community | Registry |
|---|---|---|---|
| OpenClaw | ~50 bundled | 5,400+ in registry | openclaw/skills |
| CrewAI | 600+ integrations | 7,000+ tools | CrewAI Tools |
| LangGraph | LangChain ecosystem | LangChain Hub | LangChain Hub |
| NanoBot | ~10 core | ClawHub | ClawHub |
| ZeroClaw | ~20 built-in | Community packs | TOML registry |
| NullClaw | 18+ built-in | Small | None |
| Claude SDK | Claude Code toolset + MCP | MCP ecosystem | MCP servers |
| Others | 5-15 built-in | Small | None |
Pattern 1: No Agent-to-Agent Communication
Used by: NanoBot, PicoClaw, ZeroClaw, NullClaw, Claude SDK, Pydantic AI, BabyAGI
These frameworks are single-agent systems. Communication is always agent-to-human via messaging platforms or CLI.
Pattern 2: Filesystem IPC (Polling)
Used by: NanoClaw
Host process polls data/ipc/{group}/ every 1 second
-> Reads JSON files from messages/, tasks/
-> Processes and deletes after handling
-> Container agents write to IPC directory
-> Host routes messages to platforms
Simple, debuggable, but adds 1-second latency per message hop.
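One pass of the polling loop described above might look like the following sketch. It assumes one JSON file per message in a single directory. The function name `poll_once` and the skip-on-parse-error policy are assumptions, not NanoClaw's implementation.

```python
import json
from pathlib import Path
from typing import Callable


def poll_once(ipc_dir: Path, handle: Callable[[dict], None]) -> int:
    """Process and delete every readable JSON message in `ipc_dir`."""
    processed = 0
    # Sorting by name gives a stable order when file names carry a sequence or timestamp.
    for path in sorted(ipc_dir.glob("*.json")):
        try:
            message = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            continue  # probably a half-written file; pick it up on the next poll
        handle(message)
        path.unlink()  # delete only after the handler returns
        processed += 1
    return processed
```

A host process would call `poll_once` in a loop with a one-second sleep between passes. Writers can avoid the half-written-file case altogether by writing to a temporary name and renaming it into place.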
Pattern 3: Handoff Functions
Used by: OpenAI Agents SDK, Swarm (legacy)
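from agents import Agent

billing_agent = Agent(name="Billing", instructions="Handle billing questions.")
tech_support_agent = Agent(name="Tech Support", instructions="Handle technical issues.")
# handoffs takes Agent (or Handoff) objects rather than name strings
triage = Agent(
    name="Triage",
    handoffs=[billing_agent, tech_support_agent],
    instructions="Route user to the right specialist based on their question.",
)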
Clean and minimal. Works well for customer support routing patterns. Limited for complex multi-step coordination.
Pattern 4: Shared Message Pool (Pub/Sub)
Used by: MetaGPT
# Simplified sketch of the pattern; Message, Agent, and Subscription are placeholder types
class SharedMessagePool:
    def __init__(self):
        self.subscribers = []

    def publish(self, message: "Message", sender_role: str):
        for subscriber in self.subscribers:
            if subscriber.should_receive(message, sender_role):
                subscriber.inbox.append(message)

    def subscribe(self, agent: "Agent", filter_roles: list[str]):
        self.subscribers.append(Subscription(agent, filter_roles))
Pattern 5: Team Chat Rooms (Broadcast)
Used by: TinyClaw
// Every team has a persistent chat room
// All agents see all messages from teammates
// Agents mention teammates to delegate: "@writer please draft this"
interface TeamMessage {
from: string; // "researcher"
to?: string; // "@writer" or broadcast (no to)
team: string; // "content-team"
content: string;
timestamp: number;
}
// Fan-out: researcher mentions @writer and @reviewer
// Both process in parallel
// Responses flow back to team room
Pattern 6: Graph Edges and State Reducers
Used by: LangGraph
from operator import add
from typing import Annotated, TypedDict

class State(TypedDict):
    # add reducer: new messages are appended, not replaced
    messages: Annotated[list[str], add]
    # last-write-wins for simple values
    status: str
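What a reducer buys you is easiest to see by writing the merge step by hand. The sketch below is a framework-free illustration of the idea, not LangGraph's internals. `apply_update` and the `reducers` mapping are hypothetical names.

```python
from operator import add
from typing import Any, Callable

Reducer = Callable[[Any, Any], Any]


def apply_update(
    state: dict[str, Any],
    update: dict[str, Any],
    reducers: dict[str, Reducer],
) -> dict[str, Any]:
    """Merge a node's partial update into the state without mutating it."""
    merged = dict(state)
    for key, value in update.items():
        if key in reducers and key in merged:
            merged[key] = reducers[key](merged[key], value)  # e.g. list concatenation
        else:
            merged[key] = value  # no reducer: last write wins
    return merged
```

With `{"messages": add}` as the reducer mapping, two nodes that each return a one-item `messages` list both end up in the history, whereas a plain dictionary update would keep only the later one.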
Pattern 7: DO RPC (Direct Method Invocation)
Used by: Cloudflare Agents SDK
// Agent-to-agent communication via Durable Object RPC
// No hand-written HTTP layer: methods are called directly on a stub
class CoordinatorAgent extends Agent<Env> {
  async delegateResearch(topic: string) {
    const researcher = this.env.RESEARCH_AGENT.get(
      this.env.RESEARCH_AGENT.idFromName(topic)
    );
    // RPC call on the stub; arguments and results are still serialized between objects
    const result = await researcher.research(topic);
    return result;
  }
}
Human-in-the-Loop Mechanisms
| Framework | Mechanism | Granularity |
|---|---|---|
| LangGraph | Interrupt nodes | Per-node in graph |
| CrewAI | Callbacks | Per-task |
| AutoGen | Chat interface | Per-message |
| OpenAI SDK | Human-in-loop API | Per-run |
| Agent Zero | Superior chain | Human is top of hierarchy |
| TinyClaw | Team chat | Agents and humans share chat room |
| OpenClaw | Chat platforms | Per-message |
| CF Agents | WebSocket | Real-time bidirectional |
Local CLI Execution
Used by: Most frameworks (OpenClaw, NanoBot, PicoClaw, ZeroClaw, NullClaw, Agent Zero, LangGraph, CrewAI)
The agent runs as a local process. Advantages: full machine access, no cloud costs, privacy. Disadvantages: requires a running machine, no horizontal scaling, no fault tolerance.
Container-Isolated Execution
Used by: NanoClaw, Agent Zero, AutoGPT
Each agent runs in its own container. NanoClaw uses Docker or Apple Container. Agent Zero uses Docker per subordinate. This provides OS-level isolation but adds startup latency.
Edge/Serverless Execution
Used by: Cloudflare Agents SDK
// Each agent is a Durable Object
// Runs on Cloudflare's edge network (300+ cities)
// Automatically hibernates when idle (no cost)
// Wakes on request, alarm, or queue message
export default {
async fetch(request: Request, env: Env) {
const agentId = env.AGENT.idFromName("my-agent");
const agent = env.AGENT.get(agentId);
return agent.fetch(request);
},
};
Advantages: global distribution, automatic scaling, pay-per-use, built-in persistence. Disadvantages: platform lock-in, CPU time limits, no local filesystem.
Sandboxing Comparison
| Framework | Sandbox Type | Isolation Level | Performance Impact |
|---|---|---|---|
| NanoClaw | Linux container | OS-level | Moderate (container startup) |
| Agent Zero | Docker | OS-level | Moderate |
| NullClaw | Multi-layer sandbox | Process-level | Minimal |
| ZeroClaw | Workspace isolation | Filesystem-level | None |
| CF Agents | Durable Object | V8 isolate | None |
| LangGraph | None built-in | None | None |
| OpenClaw | None | Runs as user | None |
Cost Control Mechanisms
| Framework | Mechanism | How It Works |
|---|---|---|
| OpenClaw | Model fallback chains | Try cheap model first, escalate on failure |
| Agent Zero | 3-slot model mixing | Different models for chat/utility/embedding |
| LangGraph | Token tracking | State includes token counts per node |
| CrewAI | Token tracking | Per-agent token budgets |
| CF Agents | DO CPU/memory limits | Hard limits per request, alarm-based budgets |
| OpenAI SDK | Token tracking | Per-run cost visibility |
| ZeroClaw | Tool allowlists | Limit which tools can run (prevents expensive operations) |
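A model fallback chain is small enough to sketch in full. The version below assumes each model is wrapped in a plain callable that raises on failure. `complete_with_fallback` and `AllModelsFailed` are hypothetical names, and real systems would also escalate on low-quality output, not only on exceptions.

```python
from typing import Callable


class AllModelsFailed(Exception):
    """Raised when every model in the chain has failed."""


def complete_with_fallback(
    prompt: str,
    chain: list[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (model_name, call) pair in order; return the first success."""
    errors = []
    for model_name, call in chain:
        try:
            return model_name, call(prompt)
        except Exception as exc:  # a production system would catch narrower errors
            errors.append(f"{model_name}: {exc}")
    raise AllModelsFailed("; ".join(errors))
```

Ordering the chain from cheapest to most capable means the expensive model is only paid for when the cheap one fails.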
Persistence Models
| Framework | State Backend | Checkpoint/Resume | Session Management | Crash Recovery |
|---|---|---|---|---|
| OpenClaw | SQLite + files | Session JSONL (branching) | Per-agent sessions | Resume from JSONL |
| NanoClaw | SQLite | Crash recovery built-in | Per-group | SQLite WAL |
| NanoBot | Knowledge graph | Persistent graph | In-memory sessions | Graph persists |
| ZeroClaw | SQLite/Markdown | File-based | Per-workspace | File recovery |
| NullClaw | SQLite | File-based | Per-workspace | Binary restart |
| TinyClaw | SQLite | Per-agent workspace | Per-team | Actor restart |
| Agent Zero | FAISS + files | FAISS persists to disk | Per-conversation | FAISS reload |
| LangGraph | PG/SQLite/Memory | Full checkpoint at every node | Thread-based | Checkpoint recovery |
| CrewAI | Configurable vector DB | Task-level | Per-crew | Task retry |
| AutoGen | Configurable | Session-based | Per-group | Session replay |
| OpenAI SDK | Session API | Session history | Auto-managed | Session resume |
| MetaGPT | In-memory + files | No built-in | Per-run | No |
| Claude SDK | File-based | No built-in | Conversation | No |
| CF Agents | DO SQLite | Automatic (every setState) | Per-DO instance | DO auto-recovery |
| Pydantic AI | Configurable | Durable execution | Per-conversation | Progress preserved |
| BabyAGI | Pinecone | Task list | Per-run | Vector reload |
| AutoGPT | PostgreSQL | Block state | Per-agent | DB recovery |
Key insight: LangGraph and Cloudflare Agents SDK have the strongest state persistence stories. LangGraph checkpoints at every graph node with full state snapshots. CF Agents auto-persist state to Durable Object SQLite on every setState() call. Both enable true resume-from-failure out of the box; most other frameworks in the table require manual work to get there.
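The checkpoint-per-step idea can be reproduced with the standard library alone. The sketch below assumes a linear pipeline and JSON-serializable state. The `CheckpointedRunner` class and its table layout are hypothetical and much simpler than either framework's real persistence layer.

```python
import json
import sqlite3
from typing import Callable

Node = Callable[[dict], dict]


class CheckpointedRunner:
    """Runs nodes in order, saving state after each one, keyed by thread ID."""

    def __init__(self, conn: sqlite3.Connection, nodes: list[tuple[str, Node]]) -> None:
        self.conn, self.nodes = conn, nodes
        conn.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints ("
            "thread_id TEXT PRIMARY KEY, next_index INTEGER, state TEXT)"
        )

    def run(self, thread_id: str, initial_state: dict) -> dict:
        row = self.conn.execute(
            "SELECT next_index, state FROM checkpoints WHERE thread_id = ?",
            (thread_id,),
        ).fetchone()
        # Resume from the saved position if this thread has run before.
        index, state = (row[0], json.loads(row[1])) if row else (0, initial_state)
        while index < len(self.nodes):
            _, node = self.nodes[index]
            state = node(state)  # if this raises, the previous checkpoint survives
            index += 1
            self.conn.execute(
                "INSERT OR REPLACE INTO checkpoints VALUES (?, ?, ?)",
                (thread_id, index, json.dumps(state)),
            )
            self.conn.commit()
        return state
```

Calling `run` again with the same thread ID after a crash skips every node that already completed, which is the behavior the two frameworks provide without extra code.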
| Framework | GitHub Stars | Last Active | Docs Quality | Plugin System | Community |
|---|---|---|---|---|---|
| OpenClaw | 302K | Daily | Excellent | Skills registry (5,400+) | Massive |
| NanoBot (HKUDS) | 27K | Daily | Good | ClawHub | Growing |
| NanoClaw | 22K | Daily | Good | Fork-and-modify | Growing |
| PicoClaw | 18K | Weekly | Fair | Limited | Growing |
| ZeroClaw | 17K | Weekly | Good | TOML manifests | Growing |
| NullClaw | 12K | Weekly | Fair | Vtable interfaces | Small |
| TinyClaw | 8K | Weekly | Fair | Agent configs | Small |
| CrewAI | 46K | Daily | Excellent | 7,000+ tools | Large |
| MetaGPT | 42K | Monthly | Good | Role system | Large |
| AutoGen | 38K | Daily | Good | Plugin system | Large |
| BabyAGI | 20K | Monthly | Fair | Limited | Declining |
| Swarm | 18K | Archived | Fair | Deprecated | Migrating to Agents SDK |
| OpenAI SDK | 15K | Daily | Excellent | MCP + tools | Growing |
| Pydantic AI | 15K | Daily | Excellent | MCP + A2A | Growing |
| SuperAGI | 15K | Stalled (Jan 2024) | Outdated | Marketplace | Dead |
| Agent Zero | 12K | Weekly | Good | Python functions | Niche |
| AutoGPT | 170K | Monthly | Fair | Block marketplace | Declining |
After analyzing 18 frameworks, several consensus patterns emerge:
1. Markdown Files as Persistent Memory
OpenClaw uses MEMORY.md. NanoClaw and Claude Code use CLAUDE.md. MetaGPT uses structured artifacts. The pattern recurs across the ecosystem: human-readable text files for state that humans need to inspect.
MEMORY.md # What the agent knows (curated facts)
SOUL.md # Who the agent is (personality, boundaries)
AGENTS.md # How the agent behaves (rules, injected every turn)
This is not technically optimal (linear scan, no semantic search), but it solves the debuggability problem that vector-only systems fail at.
2. Hybrid Search Is the Memory Sweet Spot
The frameworks with the best memory systems all combine vector similarity with keyword matching:
- OpenClaw: sqlite-vec + BM25
- ZeroClaw: cosine similarity (0.7) + FTS5 BM25 (0.3)
- NullClaw: vector + FTS5
Pure vector search misses exact terms (“wrangler.jsonc” won’t match “wrangler configuration”). Pure keyword search misses semantic similarity (“deployment tool” won’t match “wrangler”). The 70/30 split appears to be the emerging consensus.
3. Skills Are Just Markdown
OpenClaw proved it. ZeroClaw adopted it. The skill-as-markdown-file pattern has become the standard:
skills/
  my-skill/
    SKILL.md           # YAML frontmatter + markdown instructions
    (optional files)   # Templates, configs, reference code
No SDK, no compilation, no runtime dependency. The agent reads the skill file and follows the instructions. This works because modern LLMs are good enough to follow structured markdown instructions reliably.
4. Container Isolation for Security
NanoClaw and Agent Zero both use container isolation per agent. Among the locally run frameworks compared here, this is the strongest boundary on offer: workspace isolation and tool allowlists are enforced by the agent process itself, so a compromised or misbehaving agent can bypass them.
5. Polling, Not Pushing
NanoClaw polls IPC directories every second. OpenClaw watches file changes. BabyAGI runs a poll loop. The pattern: simple polling is more reliable than complex push systems for agent coordination.
6. Single Agent Scales Down, Multi-Agent Scales Up
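The Claw family (OpenClaw, NanoClaw, PicoClaw, ZeroClaw, NullClaw) is built around a single agent; where agent-to-agent messaging exists (OpenClaw's opt-in IPC, NanoClaw's filesystem IPC), it is an add-on rather than the core model. These frameworks scale down well: NullClaw runs on 1MB RAM.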
Multi-agent systems (CrewAI, AutoGen, TinyClaw, MetaGPT) scale up — they can handle complex workflows with specialized roles. But they add significant complexity.
The pattern: start with a single agent, add multi-agent only when a single agent demonstrably cannot handle the workload.
7. State Machines for Complex Workflows
LangGraph’s graph model and CrewAI’s Flows both treat agent workflows as state machines. This provides:
- Deterministic routing (edges determine next step)
- Checkpoint/resume (state snapshots at each node)
- Human-in-the-loop (interrupt at specific nodes)
- Observability (trace the path through the graph)
For anything beyond simple chat, state machines are emerging as the coordination primitive.
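The four properties listed above fall out of a very small core. The sketch below is a framework-free illustration with hypothetical names (`run_graph`, an `approved` list kept in the state). It is not how LangGraph or CrewAI Flows implement interrupts.

```python
from typing import Callable, Optional

Node = Callable[[dict], dict]
Router = Callable[[dict], Optional[str]]  # returns the next node name, or None to stop


def run_graph(
    nodes: dict[str, Node],
    edges: dict[str, Router],
    start: str,
    state: dict,
    interrupt_before: frozenset[str] = frozenset(),
) -> tuple[dict, Optional[str], list[str]]:
    """Walk the graph until it ends or reaches an unapproved interrupt node.

    Returns (state, paused_at, trace); paused_at is None when the run finished.
    """
    current: Optional[str] = start
    trace: list[str] = []
    while current is not None:
        if current in interrupt_before and current not in state.get("approved", []):
            return state, current, trace  # hand control to a human
        state = nodes[current](state)
        trace.append(current)  # observability: the path taken so far
        current = edges[current](state)  # deterministic routing
    return state, None, trace
```

Resuming after approval means calling `run_graph` again from the paused node with the approval recorded in the state. The returned trace doubles as a minimal audit log.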
How to Build a Hybrid Memory System
The best memory systems combine multiple retrieval strategies. Here is a TypeScript implementation inspired by OpenClaw and ZeroClaw:
import Database from "better-sqlite3"; // the constructor is the package's default export
interface Memory {
id: string;
content: string;
category: "fact" | "solution" | "preference" | "error";
embedding: Float32Array;
created_at: number;
importance: number;
}
interface SearchResult {
memory: Memory;
score: number;
source: "vector" | "keyword" | "both";
}
class HybridMemoryStore {
private db: Database.Database;
private vectorWeight = 0.7;
private keywordWeight = 0.3;
private decayHalfLifeDays = 30;
constructor(dbPath: string) {
this.db = new Database(dbPath);
this.initSchema();
}
private initSchema() {
this.db.exec(`
CREATE TABLE IF NOT EXISTS memories (
id TEXT PRIMARY KEY,
content TEXT NOT NULL,
category TEXT NOT NULL,
embedding BLOB,
created_at INTEGER NOT NULL,
importance REAL DEFAULT 1.0
);
CREATE VIRTUAL TABLE IF NOT EXISTS memories_fts
USING fts5(content, id UNINDEXED);
`);
}
async store(memory: Omit<Memory, "id" | "created_at">): Promise<string> {
const id = crypto.randomUUID();
const created_at = Date.now();
this.db.prepare(`
INSERT INTO memories (id, content, category, embedding, created_at, importance)
VALUES (?, ?, ?, ?, ?, ?)
`).run(id, memory.content, memory.category,
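// Respect byteOffset/byteLength in case the Float32Array is a view into a larger buffer
Buffer.from(memory.embedding.buffer, memory.embedding.byteOffset, memory.embedding.byteLength),
created_at, memory.importance);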
this.db.prepare(`
INSERT INTO memories_fts (id, content) VALUES (?, ?)
`).run(id, memory.content);
return id;
}
async search(
query: string,
queryEmbedding: Float32Array,
limit: number = 10
): Promise<SearchResult[]> {
// Vector search
const vectorResults = this.vectorSearch(queryEmbedding, limit * 2);
// Keyword search
const keywordResults = this.keywordSearch(query, limit * 2);
// Merge with weights
const merged = this.mergeResults(vectorResults, keywordResults);
// Apply temporal decay
const decayed = merged.map((r) => ({
...r,
score: r.score * this.temporalDecay(r.memory.created_at),
}));
return decayed
.sort((a, b) => b.score - a.score)
.slice(0, limit);
}
private vectorSearch(
queryEmbedding: Float32Array,
limit: number
): SearchResult[] {
const rows = this.db.prepare(`
SELECT id, content, category, embedding, created_at, importance
FROM memories WHERE embedding IS NOT NULL
`).all() as any[];
return rows
.map((row) => {
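// Copy the bytes first: a Node Buffer can be a view into a larger pooled ArrayBuffer
const embedding = new Float32Array(new Uint8Array(row.embedding).buffer);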
const similarity = this.cosineSimilarity(queryEmbedding, embedding);
return {
memory: { ...row, embedding },
score: similarity * this.vectorWeight,
source: "vector" as const,
};
})
.sort((a, b) => b.score - a.score)
.slice(0, limit);
}
private keywordSearch(query: string, limit: number): SearchResult[] {
const rows = this.db.prepare(`
SELECT m.id, m.content, m.category, m.embedding, m.created_at, m.importance,
bm25(memories_fts) as bm25_score
FROM memories_fts f
JOIN memories m ON f.id = m.id
WHERE memories_fts MATCH ?
ORDER BY bm25_score
LIMIT ?
`).all(query, limit) as any[];
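return rows.map((row) => {
// bm25() is negative and unbounded (more negative means a better match).
// Squash its magnitude into [0, 1) so it is comparable with cosine similarity.
const relevance = Math.abs(row.bm25_score);
return {
memory: row,
score: (relevance / (1 + relevance)) * this.keywordWeight,
source: "keyword" as const,
};
});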
}
private mergeResults(
vectorResults: SearchResult[],
keywordResults: SearchResult[]
): SearchResult[] {
const merged = new Map<string, SearchResult>();
for (const r of vectorResults) {
merged.set(r.memory.id, r);
}
for (const r of keywordResults) {
const existing = merged.get(r.memory.id);
if (existing) {
existing.score += r.score;
existing.source = "both";
} else {
merged.set(r.memory.id, r);
}
}
return Array.from(merged.values());
}
private temporalDecay(createdAt: number): number {
const ageMs = Date.now() - createdAt;
const ageDays = ageMs / (1000 * 60 * 60 * 24);
return Math.pow(0.5, ageDays / this.decayHalfLifeDays);
}
// Public because AutoLearner.consolidate() also calls it
cosineSimilarity(a: Float32Array, b: Float32Array): number {
let dot = 0, normA = 0, normB = 0;
for (let i = 0; i < a.length; i++) {
dot += a[i] * b[i];
normA += a[i] * a[i];
normB += b[i] * b[i];
}
const denom = Math.sqrt(normA) * Math.sqrt(normB);
return denom === 0 ? 0 : dot / denom; // avoid NaN for zero vectors
}
}
How to Build an Auto-Learning System
Inspired by Agent Zero’s approach, here is how to add auto-extraction to any agent:
interface ExtractedKnowledge {
content: string;
category: "fact" | "solution" | "preference" | "error";
confidence: number;
}
class AutoLearner {
private memoryStore: HybridMemoryStore;
private extractionModel: string;
private embeddingModel: string;
constructor(
memoryStore: HybridMemoryStore,
extractionModel = "claude-3-5-haiku-20241022", // cheap model for extraction
embeddingModel = "nomic-embed-text"
) {
this.memoryStore = memoryStore;
this.extractionModel = extractionModel;
this.embeddingModel = embeddingModel;
}
async learnFromConversation(
userMessage: string,
assistantResponse: string
): Promise<ExtractedKnowledge[]> {
const extracted = await this.extract(userMessage, assistantResponse);
for (const item of extracted) {
if (item.confidence < 0.7) continue; // Skip low-confidence extractions
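// Check for duplicates via hybrid search. Note that search() returns a weighted,
// time-decayed score rather than a raw cosine similarity, so tune the 0.9
// threshold below against your own data.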
const embedding = await this.embed(item.content);
const existing = await this.memoryStore.search(item.content, embedding, 3);
const isDuplicate = existing.some((r) => r.score > 0.9);
if (isDuplicate) continue;
await this.memoryStore.store({
content: item.content,
category: item.category,
embedding,
importance: item.confidence,
});
}
return extracted;
}
private async extract(
userMessage: string,
assistantResponse: string
): Promise<ExtractedKnowledge[]> {
const prompt = `Extract factual knowledge from this conversation turn.
For each piece of knowledge, classify it:
- fact: Something true about the user, their project, or their environment
- solution: A problem-solution pair that worked
- preference: A user preference or style choice
- error: A mistake or anti-pattern discovered
Return only a JSON array of objects with the keys "content", "category", and "confidence" (a number from 0 to 1). Only include high-confidence items.
User: ${userMessage}
Assistant: ${assistantResponse}
Extract:`;
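// callLLM stands in for your own LLM client wrapper
const response = await callLLM(this.extractionModel, prompt);
try {
return JSON.parse(response);
} catch {
return []; // the model returned something other than valid JSON; skip this turn
}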
}
private async embed(text: string): Promise<Float32Array> {
return callEmbedding(this.embeddingModel, text);
}
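async consolidate(): Promise<number> {
// Run periodically to merge related memories.
// Assumes HybridMemoryStore also exposes getAll(), delete(), and a public
// cosineSimilarity(); the pairwise scan is O(n^2), so batch it for large stores.
const allMemories = await this.memoryStore.getAll();
const consumed = new Set<string>(); // ids that have already been merged away
let mergeCount = 0;
for (let i = 0; i < allMemories.length; i++) {
if (consumed.has(allMemories[i].id)) continue;
for (let j = i + 1; j < allMemories.length; j++) {
if (consumed.has(allMemories[j].id)) continue;
const similarity = this.memoryStore.cosineSimilarity(
allMemories[i].embedding,
allMemories[j].embedding
);
if (similarity > 0.85) {
// Merge: ask LLM to combine
const merged = await this.merge(allMemories[i], allMemories[j]);
await this.memoryStore.store(merged);
await this.memoryStore.delete(allMemories[i].id);
await this.memoryStore.delete(allMemories[j].id);
consumed.add(allMemories[i].id);
consumed.add(allMemories[j].id);
mergeCount++;
break; // allMemories[i] no longer exists, so move on to the next i
}
}
}
return mergeCount;
}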
private async merge(a: Memory, b: Memory): Promise<Omit<Memory, "id" | "created_at">> {
const prompt = `Merge these two related memories into one concise statement:
1: ${a.content}
2: ${b.content}
Merged:`;
const content = await callLLM(this.extractionModel, prompt);
const embedding = await this.embed(content);
return {
content,
category: a.category,
embedding,
importance: Math.max(a.importance, b.importance),
};
}
}
How to Build a Skill Loader
A TypeScript implementation of the SKILL.md pattern:
import { readdir, readFile } from "fs/promises";
import { join } from "path";
import { parse as parseYaml } from "yaml";
interface SkillMetadata {
name: string;
description: string;
requires?: {
binaries?: string[];
env?: string[];
};
tags?: string[];
platforms?: string[];
}
interface Skill {
metadata: SkillMetadata;
instructions: string;
path: string;
}
class SkillLoader {
private skillPaths: string[];
private loadedSkills: Map<string, Skill> = new Map();
constructor(skillPaths: string[]) {
// Precedence: workspace > user > bundled (last wins on conflict)
this.skillPaths = skillPaths;
}
async loadAll(): Promise<Map<string, Skill>> {
for (const basePath of this.skillPaths) {
try {
const dirs = await readdir(basePath, { withFileTypes: true });
for (const dir of dirs) {
if (!dir.isDirectory()) continue;
const skillPath = join(basePath, dir.name, "SKILL.md");
try {
const skill = await this.parseSkillFile(skillPath);
// Later paths override earlier (workspace > user > bundled)
this.loadedSkills.set(skill.metadata.name, skill);
} catch {
// Skip invalid skill files
}
}
} catch {
// Skip missing directories
}
}
return this.loadedSkills;
}
private async parseSkillFile(path: string): Promise<Skill> {
const content = await readFile(path, "utf-8");
// Parse YAML frontmatter
// \r?\n accepts both Unix and Windows line endings
const frontmatterMatch = content.match(/^---\r?\n([\s\S]*?)\r?\n---\r?\n([\s\S]*)$/);
if (!frontmatterMatch) {
throw new Error(`Invalid SKILL.md format: ${path}`);
}
const metadata = parseYaml(frontmatterMatch[1]) as SkillMetadata;
const instructions = frontmatterMatch[2].trim();
return { metadata, instructions, path };
}
async getAvailableSkills(
context: { env: Record<string, string>; platform: string }
): Promise<Skill[]> {
const available: Skill[] = [];
for (const skill of this.loadedSkills.values()) {
// Check environment requirements
if (skill.metadata.requires?.env) {
const missing = skill.metadata.requires.env.filter(
(e) => !context.env[e]
);
if (missing.length > 0) continue;
}
// Check platform requirements
if (
skill.metadata.platforms &&
!skill.metadata.platforms.includes("all") &&
!skill.metadata.platforms.includes(context.platform)
) {
continue;
}
available.push(skill);
}
return available;
}
formatForSystemPrompt(skills: Skill[]): string {
// OpenClaw-style: inject compact XML list into system prompt
const lines = skills.map(
(s) =>
`<skill name="${s.metadata.name}" description="${s.metadata.description}" />`
);
return `<available_skills>\n${lines.join("\n")}\n</available_skills>`;
}
}
// Usage:
const loader = new SkillLoader([
"./skills", // Workspace skills (highest priority)
`${process.env.HOME}/.openclaw/skills`, // User skills
"./bundled-skills", // Built-in skills (lowest priority)
]);
const skills = await loader.loadAll();
const available = await loader.getAvailableSkills({
env: process.env as Record<string, string>,
platform: process.platform,
});
const systemPromptSkills = loader.formatForSystemPrompt(available);
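To make the file format concrete, here is a minimal sketch of a SKILL.md the loader could accept, together with the frontmatter split on its own. The skill name, the `DEPLOY_TOKEN` variable, and the steps are hypothetical, and `splitFrontmatter` is an illustrative helper rather than part of OpenClaw. YAML parsing is left out so the snippet has no dependencies.

```typescript
// Hypothetical SKILL.md content; the skill name, env var, and steps are made up.
const sampleSkillFile = [
  "---",
  "name: deploy-preview",
  "description: Deploy the current branch to a preview environment",
  "requires:",
  "  env: [DEPLOY_TOKEN]",
  "platforms: [linux, darwin]",
  "---",
  "1. Run the test suite.",
  "2. Build the project.",
  "3. Upload the build output and report the preview URL.",
].join("\n");

// Illustrative helper: separates the YAML frontmatter from the markdown body.
function splitFrontmatter(content: string): { frontmatter: string; body: string } {
  const match = content.match(/^---\r?\n([\s\S]*?)\r?\n---\r?\n([\s\S]*)$/);
  if (!match) throw new Error("Missing YAML frontmatter");
  return { frontmatter: match[1], body: match[2].trim() };
}

const parts = splitFrontmatter(sampleSkillFile);
console.log(parts.frontmatter); // the YAML block, ready for a YAML parser
console.log(parts.body);        // the instructions handed to the agent
```

Only the frontmatter needs to be machine-readable. The body stays free-form markdown, which is why this pattern needs no SDK or compilation step.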
How to Build a Multi-Executor Router
For systems like Code Turtle that need to route tasks to different LLM backends based on cost and capability:
interface ExecutorConfig {
  name: string;
  model: string;
  provider: "anthropic" | "openai" | "google" | "openrouter" | "workers-ai";
  costPerMTokInput: number;
  costPerMTokOutput: number;
  maxContextTokens: number;
  capabilities: Set<"code" | "reasoning" | "simple" | "embedding">;
  rateLimit: { requests: number; perSeconds: number };
  currentUsage: { requests: number; windowStart: number };
}

class ExecutorRouter {
  private executors: ExecutorConfig[];
  private dailyBudget: number;
  private dailySpend: number = 0;

  constructor(executors: ExecutorConfig[], dailyBudget: number) {
    this.executors = executors;
    this.dailyBudget = dailyBudget;
  }

  selectExecutor(task: {
    type: "code" | "reasoning" | "simple";
    estimatedInputTokens: number;
    estimatedOutputTokens: number;
    priority: "high" | "normal" | "low";
  }): ExecutorConfig | null {
    // Filter by capability and by whether the task fits the context window
    const capable = this.executors.filter(
      (e) =>
        e.capabilities.has(task.type) &&
        task.estimatedInputTokens + task.estimatedOutputTokens <=
          e.maxContextTokens
    );

    // Filter by rate limit (fixed window; reset once the window has elapsed)
    const now = Date.now() / 1000;
    const available = capable.filter((e) => {
      if (now - e.currentUsage.windowStart > e.rateLimit.perSeconds) {
        e.currentUsage = { requests: 0, windowStart: now };
      }
      return e.currentUsage.requests < e.rateLimit.requests;
    });
    if (available.length === 0) return null;

    // Estimate cost for each
    const withCost = available.map((e) => ({
      executor: e,
      estimatedCost:
        (task.estimatedInputTokens / 1_000_000) * e.costPerMTokInput +
        (task.estimatedOutputTokens / 1_000_000) * e.costPerMTokOutput,
    }));

    // Budget check
    const affordable = withCost.filter(
      (e) => this.dailySpend + e.estimatedCost <= this.dailyBudget
    );
    if (affordable.length === 0) {
      // Budget exceeded -- only allow free tier
      const free = withCost.filter((e) => e.estimatedCost === 0);
      return free.length > 0 ? free[0].executor : null;
    }

    // Strategy by priority
    if (task.priority === "high") {
      // Most expensive affordable model; price is used as a rough proxy for quality
      return affordable.sort(
        (a, b) => b.estimatedCost - a.estimatedCost
      )[0].executor;
    }
    // Normal/Low: cheapest capable model
    return affordable.sort(
      (a, b) => a.estimatedCost - b.estimatedCost
    )[0].executor;
  }

  recordUsage(executor: ExecutorConfig, actualCost: number): void {
    executor.currentUsage.requests++;
    this.dailySpend += actualCost;
  }
}
// Example configuration:
const executors: ExecutorConfig[] = [
  {
    name: "claude-sonnet",
    model: "claude-sonnet-4-5-20250929",
    provider: "anthropic",
    costPerMTokInput: 3,
    costPerMTokOutput: 15,
    maxContextTokens: 200_000,
    capabilities: new Set(["code", "reasoning", "simple"]),
    rateLimit: { requests: 50, perSeconds: 60 },
    currentUsage: { requests: 0, windowStart: 0 },
  },
  {
    name: "claude-haiku",
    model: "claude-3-5-haiku-20241022",
    provider: "anthropic",
    costPerMTokInput: 0.8,
    costPerMTokOutput: 4,
    maxContextTokens: 200_000,
    capabilities: new Set(["simple", "code"]),
    rateLimit: { requests: 100, perSeconds: 60 },
    currentUsage: { requests: 0, windowStart: 0 },
  },
  {
    name: "gemini-flash",
    model: "gemini-2.0-flash",
    provider: "google",
    costPerMTokInput: 0.1,
    costPerMTokOutput: 0.4,
    maxContextTokens: 1_000_000,
    capabilities: new Set(["simple", "reasoning"]),
    rateLimit: { requests: 15, perSeconds: 60 },
    currentUsage: { requests: 0, windowStart: 0 },
  },
  {
    name: "workers-ai-llama",
    model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
    provider: "workers-ai",
    costPerMTokInput: 0, // Free tier
    costPerMTokOutput: 0,
    maxContextTokens: 8_192,
    capabilities: new Set(["simple"]),
    rateLimit: { requests: 300, perSeconds: 60 },
    currentUsage: { requests: 0, windowStart: 0 },
  },
];
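As a worked example of the cost estimate the router relies on, the sketch below applies the per-million-token prices from the configuration above to a hypothetical task of 5,000 input tokens and 1,000 output tokens. The `Pricing` type and `estimateCost` helper are illustrative names, not part of any framework.

```typescript
interface Pricing {
  name: string;
  inputPerMTok: number;  // USD per million input tokens
  outputPerMTok: number; // USD per million output tokens
}

// Same formula the router uses: tokens / 1M * price, summed over input and output.
function estimateCost(p: Pricing, inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * p.inputPerMTok +
    (outputTokens / 1_000_000) * p.outputPerMTok
  );
}

// Prices copied from the example configuration.
const pricing: Pricing[] = [
  { name: "claude-sonnet", inputPerMTok: 3, outputPerMTok: 15 },
  { name: "claude-haiku", inputPerMTok: 0.8, outputPerMTok: 4 },
  { name: "gemini-flash", inputPerMTok: 0.1, outputPerMTok: 0.4 },
  { name: "workers-ai-llama", inputPerMTok: 0, outputPerMTok: 0 },
];

// Hypothetical task size, chosen to fit even the smallest context window above.
const sampleTask = { inputTokens: 5_000, outputTokens: 1_000 };

for (const p of pricing) {
  const cost = estimateCost(p, sampleTask.inputTokens, sampleTask.outputTokens);
  console.log(`${p.name}: $${cost.toFixed(4)}`);
}
// claude-sonnet: $0.0300
// claude-haiku: $0.0080
// gemini-flash: $0.0009
// workers-ai-llama: $0.0000
```

For this task Sonnet costs a little under four times what Haiku does and over thirty times what Gemini Flash does. That gap is what makes routing low-priority work to the cheapest capable model worthwhile.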
| Don’t | Do Instead | Why |
|---|---|---|
| Store all memory in a vector database only | Use hybrid search (vector + keyword) with human-readable MEMORY.md | Vector-only misses exact terms and is impossible to debug by inspection |
| Give agents unrestricted shell access | Use container isolation (NanoClaw) or tool allowlists (ZeroClaw) | One rm -rf / away from disaster. Security is not optional |
| Use one model for everything | Route by task complexity: Haiku for simple, Sonnet for code, Opus for reasoning | An Opus call can cost up to roughly 60x a Haiku call per token, depending on model generation. Cost control requires routing |
| Build multi-agent when single-agent suffices | Start single-agent, add coordination only when demonstrably needed | Multi-agent adds latency, cost, and debugging complexity |
| Rely on LLM memory alone (no persistent store) | Persist key facts to files or database after every session | LLMs forget everything between context windows. Persistent memory is infrastructure, not optional |
| Use push-based agent coordination | Use polling (check for new tasks every N seconds) | Push systems are harder to debug, harder to make reliable, harder to replay |
| Skip idempotency in task execution | Check if task was already completed before executing | Agents retry. Networks flake. With at-least-once delivery, the same task will eventually be delivered more than once |
| Store embeddings without an LRU cache | Cache recent embeddings (ZeroClaw caches 10K) | Embedding API calls are slow and expensive. Cache hits are free |
| Use a stalled/dead framework (SuperAGI, early BabyAGI) | Use actively maintained frameworks | Unpatched security vulnerabilities, no bug fixes, community has moved on |
| Lock into a single LLM provider | Abstract the LLM interface (provider trait/interface) | Pricing changes, outages, and new models are constant. Provider agnosticism is insurance |
| Inject all skills into every prompt | Filter skills by context, load only relevant ones | 5,400 skills in the prompt would consume the entire context window. Selective injection is mandatory |
| Build custom memory from scratch | Adopt the hybrid search pattern (SQLite + FTS5 + vector) | This is a solved problem. OpenClaw, ZeroClaw, and NullClaw all converge on the same design |
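The embedding-cache row can be sketched with a small LRU built on `Map`, which preserves insertion order. This is a minimal illustration, not ZeroClaw's implementation: the `EmbeddingCache` class and `embedWithCache` helper are hypothetical names, and the embedder is synchronous here for brevity, whereas a real embedding API call would be asynchronous.

```typescript
class EmbeddingCache {
  private entries = new Map<string, number[]>();
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(text: string): number[] | undefined {
    const hit = this.entries.get(text);
    if (hit === undefined) return undefined;
    // Re-insert so this key becomes the most recently used
    this.entries.delete(text);
    this.entries.set(text, hit);
    return hit;
  }

  set(text: string, vector: number[]): void {
    if (this.entries.has(text)) {
      this.entries.delete(text);
    } else if (this.entries.size >= this.capacity) {
      // Map iterates in insertion order, so the first key is the least recently used
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(text, vector);
  }

  has(text: string): boolean {
    return this.entries.has(text);
  }
}

// Calls the embedder only on a cache miss.
function embedWithCache(
  text: string,
  cache: EmbeddingCache,
  embed: (t: string) => number[]
): number[] {
  const cached = cache.get(text);
  if (cached) return cached;
  const vector = embed(text);
  cache.set(text, vector);
  return vector;
}
```

A capacity in the range of the 10K entries mentioned in the table keeps memory bounded while still absorbing repeated queries.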
Personal AI Assistant (Chat + Automation)
Recommended: OpenClaw (full-featured) or NanoBot (lightweight)
Choose OpenClaw if you want the largest ecosystem, the most messaging platform integrations, and a battle-tested memory system. Choose NanoBot if you want something lighter with knowledge graph memory and Python extensibility. NanoClaw is a third option if you want container security and use only Claude.
Edge / IoT / Embedded
Recommended: PicoClaw (Go, <10MB) or NullClaw (Zig, 678KB)
PicoClaw for the widest architecture support (RISC-V, ARM, MIPS, x86) and Go’s ecosystem. NullClaw for the absolute smallest footprint and hardware peripheral support.
Multi-Agent Team Workflows
Recommended: CrewAI (role-based) or LangGraph (graph-based)
CrewAI for intuitive role definitions and the largest tool ecosystem (7,000+). LangGraph for the strongest checkpointing, persistence, and human-in-the-loop story. AutoGen if you need enterprise features and Azure integration.
TinyClaw is interesting for lightweight team collaboration with fan-out patterns.
Production Enterprise Deployment
Recommended: LangGraph + PostgresSaver or Microsoft Agent Framework
LangGraph is GA at v1.0.10 with full checkpoint recovery and production tooling. Microsoft Agent Framework for Azure-native shops that need GDPR, OpenTelemetry, and enterprise support.
Serverless / Edge-Native Agents
Recommended: Cloudflare Agents SDK
Among the frameworks compared here, this is the only one where each agent is a Durable Object with auto-persistent state, built-in scheduling, WebSocket communication, and queue integration. None of the others matches this for edge-native deployment.
Auto-Learning Agents
Recommended: Agent Zero (adopt the pattern) + your framework of choice
None of the other frameworks compared here performs automatic knowledge extraction. Implement Agent Zero’s extract-embed-consolidate pattern on top of whatever framework you choose.
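A rough sketch of what an extract-embed-consolidate loop could look like follows. It is not Agent Zero's code, and every name in it is hypothetical. The extraction step is a stand-in that picks lines tagged `FACT:`, where a real system would ask an LLM. The embedder is injected, and the 0.9 similarity threshold is an arbitrary assumption.

```typescript
interface MemoryItem {
  text: string;
  vector: number[];
}
type Embedder = (text: string) => number[];

function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Extract (stand-in): keep lines tagged "FACT:"; a real system would use an LLM here.
function extractFacts(transcript: string): string[] {
  return transcript
    .split("\n")
    .filter((line) => line.startsWith("FACT:"))
    .map((line) => line.slice("FACT:".length).trim());
}

// Embed + consolidate: a new fact replaces its nearest neighbour when they are
// near-duplicates, otherwise it is appended as a new memory.
function consolidate(
  store: MemoryItem[],
  facts: string[],
  embed: Embedder,
  threshold = 0.9
): MemoryItem[] {
  const result = store.map((m) => ({ ...m }));
  for (const text of facts) {
    const vector = embed(text);
    let best = -1;
    let bestSim = -Infinity;
    for (let i = 0; i < result.length; i++) {
      const sim = cosine(result[i].vector, vector);
      if (sim > bestSim) {
        bestSim = sim;
        best = i;
      }
    }
    if (best >= 0 && bestSim >= threshold) {
      result[best] = { text, vector }; // newer wording wins
    } else {
      result.push({ text, vector });
    }
  }
  return result;
}

// Toy embedder with hand-picked vectors, only for demonstration.
const toyVectors: Record<string, number[]> = {
  "user prefers pnpm": [1, 0],
  "user prefers pnpm over npm": [0.99, 0.1],
  "deploys happen on fridays": [0, 1],
};
const toyEmbed: Embedder = (t) => toyVectors[t] ?? [0, 0];

const transcript = [
  "FACT: user prefers pnpm over npm",
  "some unrelated chatter",
  "FACT: deploys happen on fridays",
].join("\n");

const memory = consolidate(
  [{ text: "user prefers pnpm", vector: [1, 0] }],
  extractFacts(transcript),
  toyEmbed
);
console.log(memory.map((m) => m.text));
```

The consolidation step is what stops the store from filling up with rephrasings of the same fact after every session.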
Cost-Optimized Autonomous Workers
Recommended: Build a multi-executor router (see implementation above)
No framework handles this well out of the box. You need to build a routing layer that picks the cheapest capable model per task and enforces daily budgets. The executor router pattern in this article is a starting point.
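Enforcing a daily budget also means returning the spend counter to zero when a new day starts. One way to handle that is a small tracker that rolls over on the UTC date, sketched below. `DailyBudget` and `dayKey` are hypothetical names, and rolling over on UTC rather than a local timezone is an assumption.

```typescript
// Returns the UTC calendar day as "YYYY-MM-DD".
function dayKey(d: Date): string {
  return d.toISOString().slice(0, 10);
}

class DailyBudget {
  private spent = 0;
  private day: string;
  private readonly limit: number;

  constructor(limit: number, now: Date = new Date()) {
    this.limit = limit;
    this.day = dayKey(now);
  }

  // True if spending `cost` now would stay within today's limit.
  canSpend(cost: number, now: Date = new Date()): boolean {
    this.rollover(now);
    return this.spent + cost <= this.limit;
  }

  record(cost: number, now: Date = new Date()): void {
    this.rollover(now);
    this.spent += cost;
  }

  remaining(now: Date = new Date()): number {
    this.rollover(now);
    return Math.max(0, this.limit - this.spent);
  }

  // Reset the counter the first time we are called on a new UTC day.
  private rollover(now: Date): void {
    const key = dayKey(now);
    if (key !== this.day) {
      this.day = key;
      this.spent = 0;
    }
  }
}
```

Passing `now` explicitly keeps the tracker testable. In production the default `new Date()` is enough, and a router can call `canSpend` before selecting an executor and `record` after the call returns with the actual cost.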
Code Review / Development Agents
Recommended: Claude Agent SDK or OpenAI Agents SDK
Claude Agent SDK gives you the full Claude Code toolset (Read, Write, Edit, Bash). OpenAI Agents SDK gives you provider flexibility with clean handoff patterns.
Recommendation Matrix
| Use Case | First Choice | Runner-Up | Avoid |
|---|---|---|---|
| Personal assistant | OpenClaw | NanoBot | SuperAGI |
| Edge / IoT | PicoClaw | NullClaw | OpenClaw (too heavy) |
| Multi-agent teams | CrewAI | LangGraph | BabyAGI |
| Enterprise | LangGraph | Microsoft Agent Framework | Swarm (deprecated) |
| Serverless | CF Agents SDK | — | Local-only frameworks |
| Auto-learning | Agent Zero pattern | — | Frameworks without persistence |
| Cost-optimized | Custom router | — | Single-provider frameworks |
| Code agents | Claude Agent SDK | OpenAI Agents SDK | AutoGPT |
| Type safety | Pydantic AI | LangGraph | Untyped frameworks |
| Software company sim | MetaGPT | CrewAI | Single-agent frameworks |
Our autonomous issue worker needs:
- Multi-executor backends: The executor router pattern above solves this. Abstract the LLM interface, route by task complexity and budget. Start with Claude Sonnet for code, Haiku for simple tasks, Workers AI for free-tier simple tasks.
- A skill system: Adopt the SKILL.md pattern from OpenClaw. Skills as markdown files with YAML frontmatter. No SDK, no compilation. The skill loader implementation above is directly usable.
- A coordinator (L2): LangGraph’s graph model is the best pattern for multi-step workflows. But for a single-agent system, a simple state machine with checkpoint/resume is sufficient. Start with Pydantic AI’s durable execution or implement checkpoint/resume over D1.
- Memory across runs: Implement the hybrid memory system above. SQLite + FTS5 + vector embeddings. Store MEMORY.md for human inspection. Add auto-learning (Agent Zero pattern) once the basic system works.
- Cost optimization: The executor router with daily budgets, rate limit awareness, and free-tier fallback. Track actual costs per task and adjust routing thresholds based on real data.
Patterns to adopt: SKILL.md skills, hybrid memory search, auto-learning extraction, multi-executor routing, checkpoint/resume state
Patterns to skip: multi-agent coordination (overkill for our use case), push-based communication (polling is simpler), full graph workflow engine (a state machine is sufficient)
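The checkpoint/resume state machine mentioned for the coordinator can be quite small. The sketch below is one possible shape, with hypothetical step names and an in-memory store standing in for D1 or any other persistent backend. The key property is that the checkpoint is saved only after a step succeeds, so a crashed run resumes at the step that failed.

```typescript
type Step = "plan" | "implement" | "test" | "done";
type WorkStep = Exclude<Step, "done">;

interface Checkpoint {
  taskId: string;
  step: Step;
}

interface CheckpointStore {
  load(taskId: string): Checkpoint | undefined;
  save(checkpoint: Checkpoint): void;
}

// In-memory stand-in; a real deployment would persist to a database.
class InMemoryStore implements CheckpointStore {
  private data = new Map<string, Checkpoint>();
  load(taskId: string): Checkpoint | undefined {
    return this.data.get(taskId);
  }
  save(checkpoint: Checkpoint): void {
    this.data.set(checkpoint.taskId, checkpoint);
  }
}

// Fixed linear transitions; a richer workflow would branch here.
const NEXT: Record<WorkStep, Step> = {
  plan: "implement",
  implement: "test",
  test: "done",
};

function runTask(
  taskId: string,
  store: CheckpointStore,
  handlers: Record<WorkStep, () => void>
): Checkpoint {
  let checkpoint: Checkpoint = store.load(taskId) ?? { taskId, step: "plan" };
  for (;;) {
    const step = checkpoint.step;
    if (step === "done") return checkpoint;
    handlers[step](); // if this throws, the stored checkpoint still points at `step`
    checkpoint = { taskId, step: NEXT[step] };
    store.save(checkpoint);
  }
}
```

Because a step can run again after a crash, each handler should be idempotent, which ties back to the idempotency entry in the anti-patterns table.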