OpenClaw vs OpenHuman vs Hermes Agent: Three Architectures of the Open-Source Agent Stack

Pick the wrong open-source agent and you rebuild from zero. OpenClaw, OpenHuman, and Hermes Agent are not the same animal — one sandboxes your machine, one memorizes you, one learns from its own failures. Here's the architecture, side by side.

At a glance

Three projects, three different answers to the same question: what does an autonomous agent actually need to be useful beyond a demo? The metadata table below sets the basics; the chart and matrix that follow show where each one bets hardest.

Project	Released	Primary niche	Deployment shape
OpenClaw	2026.3.31 (Task Brain release)	Local-first runtime / system automation	Always-on local daemon
OpenHuman	May 13, 2026	Personal context / life-aware assistant	Native desktop app (Tauri)
Hermes Agent	Feb 26, 2026	Always-on autonomous research / task agent	Self-hosted always-on service

GitHub stars at writing. OpenClaw leads Hermes Agent by roughly 2.5x and OpenHuman by roughly an order of magnitude.

Where each agent leans hardest. Filled cells mark a project's strongest axes.

The three projects in 60 seconds

OpenClaw is, by a large margin, the most-starred open-source agent project ever built, with roughly 347,000 GitHub stars at writing. Its core thesis is that a local-first runtime can convert a static LLM into a proactive digital worker: give it file system access, a shell, a browser, email, calendar, and ten messaging platforms, then let it run without per-step human check-ins. The 2026.3.31 release introduced the Task Brain ledger, a SQLite-backed unified management layer that consolidates every task, subagent, cron job, and background process into a single queryable store. What is not obvious about OpenClaw: the eBPF sandboxing layer means skills run with the minimum kernel-level permissions needed to complete their job. The agent is proactive, but it is not operating with a blank check.

OpenHuman launched on May 13, 2026 from TinyHumans and accumulated roughly 7,800 GitHub stars within days of launch — and kept climbing fast, past 29,000 within about two weeks. The project's thesis is inversion: instead of asking the agent to learn who you are mid-conversation, it builds a compressed model of your life — from Gmail, Notion, GitHub, Slack, Stripe, Linear, Jira, and 110+ other connectors — before you type a single prompt. That memory tree, stored in local SQLite, becomes the initial context for every interaction. What is not obvious: the 20-minute polling interval means the agent's model of you is continuously refreshed without any action on your part — the memory is not static; it drifts toward the present on its own.

Hermes Agent was released February 26, 2026 by Nous Research, the team behind the Hermes family of fine-tuned models, and crossed 140,000 GitHub stars within three months. By OpenRouter telemetry it is also the most-used agent on that platform. Its thesis is persistence and self-improvement: the agent runs as a self-hosted always-on service, writes a structured episodic record after each task (what was tried, what succeeded, what failed), and retrieves those records before tackling similar future tasks. What is not obvious: the episodic memory loop is a measurable self-improvement mechanism across sessions, not a chatbot-style conversation history. Each future task starts with retrieved priors rather than a blank slate.

OpenClaw — deep dive

OpenClaw's control loop wraps a SQLite-backed Task Brain ledger and eBPF-isolated skill processes.

Control loop

OpenClaw's runtime operates on a continuous execution model. The agent receives a goal, decomposes it into steps, calls the appropriate skill for each step, evaluates the result, and decides the next step — all without pausing for human confirmation. This differs from a simple pipeline: the model is the decision-maker at each junction, not a fixed script.

Task Brain acts as the orchestration layer for this loop. Every step's state is recorded in the ledger before the skill executes, so if the process crashes mid-task, the loop can resume from the last committed state rather than restart from zero. Subagents forked to handle parallel branches write into the same ledger under the parent task's namespace, making multi-agent coordination auditable in a single query.
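The commit-before-execute behavior described above can be sketched in a few lines. This is a minimal illustration, not OpenClaw's implementation: the `steps` table, the `run_task` function, and the fixed `plan` list (standing in for the model's step-by-step decisions) are all assumptions made for the example.

```python
import sqlite3

def open_ledger(path=":memory:"):
    # Hypothetical schema; the real Task Brain layout is not documented here.
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS steps ("
        " task_id TEXT, seq INTEGER, skill TEXT, status TEXT, result TEXT,"
        " PRIMARY KEY (task_id, seq))"
    )
    return db

def run_task(db, task_id, plan, skills):
    """Run (skill, arg) steps in order, skipping steps already committed as done."""
    for seq, (skill, arg) in enumerate(plan):
        row = db.execute(
            "SELECT status FROM steps WHERE task_id=? AND seq=?", (task_id, seq)
        ).fetchone()
        if row and row[0] == "done":
            continue  # completed before an earlier crash; do not redo it
        # Record the step as 'running' BEFORE the skill executes.
        db.execute(
            "INSERT OR REPLACE INTO steps VALUES (?, ?, ?, 'running', NULL)",
            (task_id, seq, skill),
        )
        db.commit()
        result = skills[skill](arg)
        db.execute(
            "UPDATE steps SET status='done', result=? WHERE task_id=? AND seq=?",
            (str(result), task_id, seq),
        )
        db.commit()
```

Calling `run_task` a second time with the same `task_id` after a crash re-runs only the step that was left in `running`, which is the resume-from-last-committed-state property the section describes.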

Task Brain ledger

The Task Brain SQLite store introduced in the 2026.3.31 release is a unified management layer for everything that moves in the agent: tasks, subagents, cron-scheduled jobs, and background processes. Before this release, each of these was tracked separately; a crash in one subsystem left the others without visibility into what had actually completed. The ledger consolidates them under a single schema.

Each row in the ledger carries the task identifier, parent task (if any), the assigned skill, the current status, the tool call arguments, and the raw result. The agent reads this table — not an in-memory structure — as the source of truth for what to do next. This means the loop survives process restart: on reconnect, the agent queries the ledger, identifies pending steps, and continues.

Subagent coordination works through the same table. A parent task forks a child record; the child's status transitions from pending to running to complete or failed inside the ledger. The parent polls the ledger to detect completion, not an in-memory channel. This makes the entire execution graph inspectable with a standard SQLite viewer.
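As a rough sketch of what such a ledger could look like, the snippet below builds a single SQLite table with the columns listed above and lets a parent check on its children by query. The table name, column names, and helper functions are hypothetical and chosen for illustration only.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
# Assumed schema mirroring the fields named in the text:
# identifier, parent, skill, status, tool call arguments, raw result.
db.execute(
    "CREATE TABLE ledger ("
    " task_id TEXT PRIMARY KEY,"
    " parent_id TEXT REFERENCES ledger(task_id),"
    " skill TEXT,"
    " status TEXT CHECK (status IN ('pending','running','complete','failed')),"
    " args TEXT,"
    " result TEXT)"
)

def fork(task_id, parent_id, skill, args):
    """Insert a new record in 'pending' state under an optional parent."""
    db.execute(
        "INSERT INTO ledger VALUES (?, ?, ?, 'pending', ?, NULL)",
        (task_id, parent_id, skill, json.dumps(args)),
    )

def set_status(task_id, status, result=None):
    db.execute(
        "UPDATE ledger SET status=?, result=? WHERE task_id=?",
        (status, result, task_id),
    )

def children_finished(parent_id):
    """True when no child of parent_id is still pending or running."""
    open_count = db.execute(
        "SELECT COUNT(*) FROM ledger WHERE parent_id=?"
        " AND status IN ('pending','running')",
        (parent_id,),
    ).fetchone()[0]
    return open_count == 0
```

Because the parent learns about completion through `children_finished` rather than an in-memory channel, the same query works after a restart and from any external SQLite viewer.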

Sandboxing model

Skills run as sandboxed processes whose kernel-level permissions are enforced by eBPF hooks at runtime. A skill that needs file read access to a specific directory gets exactly that — not write access, not network access, not shell execution — unless the skill's declared manifest requests and is granted each capability explicitly. This is least-privilege execution: the permission set is narrow by default, widened only on deliberate declaration.

The threat model OpenClaw is designed against is prompt injection via the environment — a malicious document or web page that attempts to redirect the agent into taking actions outside the user's intent. Because skills run in eBPF-constrained processes, even a successful prompt injection cannot escalate to capabilities the skill was never granted. The attacker's surface area is bounded by the skill's declared manifest, not by everything the host OS allows.
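The deny-by-default decision logic can be illustrated in user space, with the caveat that this is only the policy check and not kernel enforcement; real eBPF hooks run in the kernel. The `Manifest` fields and the `allowed` function below are assumptions for the example, not OpenClaw's manifest format.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

@dataclass(frozen=True)
class Manifest:
    # Hypothetical capability fields; everything is denied unless declared.
    read_paths: tuple = ()
    write_paths: tuple = ()
    network: bool = False
    shell: bool = False

def _under(path, roots):
    """True if path is one of roots or lies beneath one of them."""
    if path is None:
        return False
    p = PurePosixPath(path)
    if ".." in p.parts:
        return False  # refuse traversal instead of trying to resolve it
    return any(
        p == PurePosixPath(r) or PurePosixPath(r) in p.parents for r in roots
    )

def allowed(manifest, action, target=None):
    if action == "read":
        return _under(target, manifest.read_paths)
    if action == "write":
        return _under(target, manifest.write_paths)
    if action == "network":
        return manifest.network
    if action == "shell":
        return manifest.shell
    return False  # unknown actions are denied
```

A skill declared with `Manifest(read_paths=("/home/u/docs",))` can read files under that directory and nothing else. An injected instruction to write, open a socket, or spawn a shell fails the check regardless of what the prompt says.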

Integration surface

OpenClaw's integration surface covers the file system, shell, browser (via Playwright), email, and calendar. On the messaging side, as of April 2026, direct integration covers WhatsApp, Telegram, Discord, Signal, Slack, iMessage, Matrix, LINE, QQ Bot, and Microsoft Teams — ten platforms. The breadth signals a design philosophy: the agent lives where users live, not in a separate tool they have to visit.

What the integration list reveals is that OpenClaw is not designed as a developer-only tool. The messaging integrations — particularly iMessage and WhatsApp — suggest that users who never open a terminal are intended users. The agent is supposed to reach them in the communication channel they already have open.

OpenHuman — deep dive

OpenHuman compresses data from 118+ connectors into a Markdown memory tree before the agent reads a single prompt.

Connector model

OpenHuman ships with 118+ third-party connectors covering Gmail, Notion, GitHub, Slack, Stripe, Calendar, Drive, Linear, Jira, and dozens of category peers. Each connector declares what it can pull: emails, calendar events, issues, pull requests, transactions, documents. On authorization, OpenHuman begins traversing these connections automatically — no additional configuration required per connector.

The automatic fetcher runs every 20 minutes against all active connections. It pulls only deltas, not full re-syncs, and compresses new data into Markdown blocks that are appended to the relevant branch of the memory tree. The result is a continuously refreshed model of the user's work life that does not require the user to initiate a sync. While the app is running, the agent's context is never more than about 20 minutes behind the connected services.
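A delta fetch of this kind is commonly built around a per-connector cursor. The sketch below assumes that design; the `Connector` class, its in-memory `items` list (a stand-in for the remote service), and the trivial `to_markdown` formatter are hypothetical, and a real compression step would summarize rather than just reformat.

```python
from dataclasses import dataclass

POLL_SECONDS = 20 * 60  # polling interval stated in the article

@dataclass
class Connector:
    name: str
    items: list      # stand-in for the remote service: (timestamp, text) pairs
    cursor: int = 0  # timestamp of the newest item already fetched

    def fetch_delta(self):
        """Return only items newer than the cursor, then advance the cursor."""
        new = [(ts, text) for ts, text in self.items if ts > self.cursor]
        if new:
            self.cursor = max(ts for ts, _ in new)
        return new

def to_markdown(name, delta):
    lines = [f"## {name}"] + [f"- [{ts}] {text}" for ts, text in delta]
    return "\n".join(lines)

def poll_once(connectors, tree):
    """One fetcher pass: append a Markdown block per connector that has news."""
    for c in connectors:
        delta = c.fetch_delta()
        if delta:
            tree.setdefault(c.name, []).append(to_markdown(c.name, delta))
```

A scheduler would call `poll_once` every `POLL_SECONDS`; a pass that finds nothing new appends nothing, so an idle connector costs no tree growth.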

Memory tree compression

The memory tree is a hierarchical structure — not a flat log — that organizes compressed Markdown blocks by domain: communications, tasks, code activity, finances, calendar, and so on. Each branch corresponds to a connector category. When the fetcher pulls new data, it compresses the delta into a Markdown block and appends it to the appropriate branch, pruning older blocks when the branch exceeds a size threshold. The tree shape matters: it lets the agent selectively load the branch relevant to a query rather than scanning the full history.
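The append, prune, and selective-load behavior can be sketched with one queue per branch. This is an illustrative model only: the `MemoryTree` class, the character-count threshold, and the rule of always keeping the newest block are assumptions, not OpenHuman's actual storage logic.

```python
from collections import defaultdict, deque

class MemoryTree:
    def __init__(self, max_chars_per_branch=2000):
        self.max_chars = max_chars_per_branch
        self.branches = defaultdict(deque)  # branch name -> Markdown blocks

    def append(self, branch, block):
        blocks = self.branches[branch]
        blocks.append(block)
        # Drop the oldest blocks while the branch is over its size threshold,
        # but never drop the block that was just added.
        while len(blocks) > 1 and sum(len(b) for b in blocks) > self.max_chars:
            blocks.popleft()

    def load(self, *branch_names):
        """Return only the requested branches, joined as one Markdown string."""
        return "\n\n".join(
            block
            for name in branch_names
            for block in self.branches.get(name, ())
        )
```

A calendar question would call `tree.load("calendar")` and leave the finances and code-activity branches out of the context window entirely.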

Compression is deliberate. Raw connector output — a full email thread, an entire GitHub PR — would exhaust a context window quickly. The compression step distills the signal: who said what, what changed, what the status is. The resulting Markdown blocks are dense but human-readable, which matters for debugging when the agent draws on wrong context.

The tree is stored in local SQLite, not a cloud database. This means the memory survives connectivity loss and never leaves the user's machine — an important property for users whose connectors include sensitive financial or legal data.

User-context-first prompting

Most agents treat the user as the target of the interaction: the user types a goal, the agent pursues it. OpenHuman inverts this. The user's memory tree is loaded as the initial context before the prompt is read. The agent starts already knowing the user's current project status, outstanding tasks, recent communications, and calendar obligations. The user's prompt arrives into a context window that is already saturated with user-specific information.

This changes the nature of what the user has to say. Instead of briefing the agent on their situation at the start of every session, the user can issue terse instructions — "draft a follow-up to the Stripe conversation from Tuesday" — and expect the agent to have the relevant context without further explanation. The user is not the system prompt; the user is the pre-loaded world state. The prompt is the delta on top of that state.

Desktop runtime

OpenHuman is built on Rust and Tauri and ships as a native desktop application on macOS, Windows, and Linux. The Tauri choice offers a small binary footprint relative to Electron alternatives, native OS integration for file system and IPC access, and a process model that confines the web view without requiring Chromium-level privileges for the agent's core logic. At v0.53.43 (Early Beta) at the time of writing, the application is under active development but fully usable for daily tasks.

Local-first data storage is not a limitation — it is a deliberate trust model. Because the memory tree never leaves the machine, users with sensitive connector data (legal, financial, medical) keep that data on-device rather than in a vendor's store. The trust model is more nuanced than "fully local," though: by default, account sign-in, model routing, web-search, and connector OAuth still route through OpenHuman's managed backend — it is your data that stays local, not every service. The optional local AI support narrows that surface: when a local model is configured, query data stays on the device too.

Hermes Agent — deep dive

Hermes Agent's planner retrieves episodic memories from past tasks before deciding what to do next.

Planner loop

Hermes Agent structures each task as a four-phase loop: decompose the goal into subtasks, select the appropriate tools for each subtask, execute the tool calls, and evaluate the results against the goal. If evaluation fails or returns a partial result, the loop re-enters at the decompose phase with updated observations. The model directs this loop — it is not a fixed pipeline — which means the agent can revise its plan mid-execution based on what it finds.
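The four phases and the re-entry rule can be written down as a small driver function. In this sketch the `decompose`, `select_tool`, and `evaluate` callables stand in for model calls, and `max_rounds` is an added safety bound; none of these names come from Hermes Agent itself.

```python
def run_planner(goal, decompose, select_tool, evaluate, max_rounds=3):
    """Decompose, select, execute, evaluate; re-enter at decompose on failure."""
    observations = []
    results = []
    for round_no in range(1, max_rounds + 1):
        subtasks = decompose(goal, observations)   # phase 1: decompose
        results = []
        for sub in subtasks:
            tool = select_tool(sub)                # phase 2: select a tool
            results.append(tool(sub))              # phase 3: execute
        if evaluate(goal, results):                # phase 4: evaluate
            return {"ok": True, "rounds": round_no, "results": results}
        # Feed what was learned back into the next decomposition.
        observations.extend(results)
    return {"ok": False, "rounds": max_rounds, "results": results}
```

Because `decompose` receives the accumulated observations, the second round can produce a different subtask list than the first, which is what distinguishes this loop from a fixed pipeline.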

"Always-on" in Hermes Agent means the service runs as a persistent background process, not a session-per-conversation tool. Tasks can be queued externally, triggered by schedules, or initiated by other services. The planner loop runs on each incoming task independently. There is no concept of "starting a conversation" — there is only "submitting a task to the queue." This makes Hermes Agent composable as a backend service, not just a user-facing assistant.

Tool library

Hermes Agent ships with 40+ built-in tools covering web search, web fetching, code execution, file operations, API calls, and data processing. Each tool is a registered callable with a typed schema. The planner selects tools by matching the schema's declared capabilities against the subtask description. Registering a new tool means providing a name, a description the planner uses for selection, and a typed input/output schema — there is no plugin API separate from this registration contract.
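A registration contract of this shape (name, description, typed input and output) might look like the following. The `register` decorator, the `REGISTRY` dict, and the dict-of-types schema format are assumptions made for illustration; the actual Hermes Agent registration API is not shown in this article.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Tool:
    name: str
    description: str     # text a planner would read when selecting tools
    input_schema: dict   # argument name -> expected Python type
    output_type: type
    fn: Callable

REGISTRY = {}

def register(name, description, input_schema, output_type):
    """Decorator that adds a callable to the registry under the given contract."""
    def wrap(fn):
        REGISTRY[name] = Tool(name, description, input_schema, output_type, fn)
        return fn
    return wrap

def call(name, **kwargs):
    """Validate arguments and result against the tool's declared schema."""
    tool = REGISTRY[name]
    extra = set(kwargs) - set(tool.input_schema)
    if extra:
        raise TypeError(f"{name}: unexpected arguments {sorted(extra)}")
    for key, typ in tool.input_schema.items():
        if not isinstance(kwargs.get(key), typ):
            raise TypeError(f"{name}: argument {key!r} must be {typ.__name__}")
    result = tool.fn(**kwargs)
    if not isinstance(result, tool.output_type):
        raise TypeError(f"{name}: result must be {tool.output_type.__name__}")
    return result

@register("word_count", "Count whitespace-separated words in a text.",
          {"text": str}, int)
def word_count(text):
    return len(text.split())
```

Adding a capability is one decorated function; there is no manifest or separate plugin package, which is the extensibility-speed point the paragraph makes.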

This differs meaningfully from OpenClaw's Skill model. In OpenClaw, a Skill is a sandboxed process with eBPF-enforced kernel-level permissions: adding a Skill requires declaring a capability manifest and accepting that the kernel will enforce it. In Hermes Agent, tools are registered callables whose calls are isolated by process boundaries but not by kernel-enforced capability constraints. The trade-off is extensibility speed: adding a Hermes tool is faster, but the blast radius of a malicious or buggy tool is larger.

Episodic memory

After every task completes — whether successfully or not — Hermes Agent writes a structured episodic record to its memory store. Each record contains the task description, the approach taken (which tools were selected, in what order, with what arguments), the outcome (succeeded, failed, partially completed), and any notable observations from the execution. This is not a conversation log; it is a structured case base indexed for retrieval.

Before a new task begins, the planner queries the episodic store for records with similar characteristics. If past records exist, they are prepended to the planner's context. The planner reads prior approaches — including failed ones — before deciding how to decompose the current task. A task that failed three times with a particular tool sequence will show those failures in the retrieved records; the planner can adjust its approach before it repeats the same mistake.
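The write-then-retrieve cycle can be approximated with a small in-memory store. The similarity measure here is plain Jaccard overlap on task-description tokens, chosen for simplicity; the article does not say how Hermes Agent indexes or matches records, and the `Episode` fields and `EpisodicStore` class are hypothetical.

```python
import re
from dataclasses import dataclass

@dataclass
class Episode:
    task: str
    tools: list    # tool names in the order they were used
    outcome: str   # e.g. "succeeded", "failed", "partial"
    notes: str = ""

def _tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))

class EpisodicStore:
    def __init__(self):
        self.episodes = []

    def write(self, episode):
        self.episodes.append(episode)

    def retrieve(self, task, k=3, min_overlap=0.2):
        """Return up to k past episodes ranked by Jaccard token overlap."""
        q = _tokens(task)
        scored = []
        for ep in self.episodes:
            t = _tokens(ep.task)
            union = q | t
            score = len(q & t) / len(union) if union else 0.0
            if score >= min_overlap:
                scored.append((score, ep))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [ep for _, ep in scored[:k]]

def build_context(store, task):
    """Prepend retrieved priors, including failures, to the new task."""
    lines = [
        f"- {ep.outcome}: {ep.task} via {' > '.join(ep.tools)} ({ep.notes})"
        for ep in store.retrieve(task)
    ]
    return "\n".join(["Prior attempts:"] + lines + ["", f"Task: {task}"])
```

Failed episodes are retrieved on the same footing as successful ones, so a tool sequence that timed out last time appears in the planner's context before it plans again.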

Self-improvement is measurable across sessions. If the same class of task initially takes four tool calls and two retries, and after ten rounds of episodic retrieval it takes two tool calls and zero retries, that delta is visible in the execution logs. The memory is not decorative — it has a causal relationship to task performance that compounds over time.
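One simple way to quantify that delta is to compare the first few and last few runs of the same task class. The log record shape (`tool_calls`, `retries`) and the sample numbers below are made up to mirror the example in the text; they are not measured Hermes Agent data.

```python
from statistics import mean

def improvement(runs, window=3):
    """Mean of the first `window` runs minus mean of the last `window` runs."""
    early, late = runs[:window], runs[-window:]
    return {
        key: mean(r[key] for r in early) - mean(r[key] for r in late)
        for key in ("tool_calls", "retries")
    }

# Illustrative log for one task class over ten rounds (hypothetical values).
runs = (
    [{"tool_calls": 4, "retries": 2}] * 3
    + [{"tool_calls": 3, "retries": 1}] * 4
    + [{"tool_calls": 2, "retries": 0}] * 3
)
delta = improvement(runs)
```

A positive value for a key means the later runs needed fewer of that resource than the early ones; a value near zero means retrieval is not changing behavior for that task class.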

Model-agnostic backend

Hermes Agent is decoupled from any specific LLM. The operator configures which model backend the planner uses — a local model via Ollama, a cloud API, or any OpenRouter-compatible endpoint. Changing the model does not require changing the agent's tool library, task queue configuration, or episodic memory schema. The agent's behavior is shaped by the model's capabilities, but its architecture does not hardwire any of them.
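The decoupling amounts to having the planner depend on a narrow interface rather than on a vendor client. In the sketch below, `ModelBackend`, the two toy backends, and the `Planner` class are all hypothetical; a real backend would wrap an HTTP client for Ollama, OpenRouter, or another endpoint behind the same `complete` method.

```python
from typing import Protocol

class ModelBackend(Protocol):
    def complete(self, prompt: str) -> str: ...

class EchoBackend:
    """Toy stand-in for one model endpoint."""
    def complete(self, prompt: str) -> str:
        return f"echo:{prompt}"

class UpperBackend:
    """Toy stand-in for a different model endpoint."""
    def complete(self, prompt: str) -> str:
        return prompt.upper()

class Planner:
    def __init__(self, backend: ModelBackend, tools: dict):
        self.backend = backend
        self.tools = tools

    def plan(self, task: str) -> str:
        # The prompt is built from the tool registry, not from model specifics.
        tool_list = ", ".join(sorted(self.tools))
        return self.backend.complete(f"Tools: {tool_list}\nTask: {task}")
```

Swapping `planner.backend` changes how the plan is produced while the `tools` mapping stays untouched, which is the property the paragraph above describes.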

The OpenRouter telemetry signal, with Hermes Agent as the most-used agent on the platform, is consistent with the model-agnostic design working in practice: it suggests that users are selecting different models for different deployments and that the agent functions across that range. For operators who want to run a capable model today and upgrade later without rewriting agent logic, this decoupling is the core deployment advantage over architectures that embed model-specific prompting patterns deep in the planner.

Cross-cutting comparison

Memory model

Three persistence shapes for three different jobs.

OpenClaw's memory is transactional: the SQLite task ledger records what the agent is doing and has done in the current execution context. It is not designed to accumulate knowledge about the user across weeks — it is designed to make the current task resumable and auditable. When the task completes, the ledger entry is closed; it does not feed future task planning.

OpenHuman chooses a life-context tree: a hierarchical, continuously updated Markdown structure that models the user's world rather than the agent's current job. The memory tree exists before any task is submitted and persists regardless of whether the agent is active. OpenClaw's ledger is task-scoped; OpenHuman's tree is perpetually accumulating. Hermes Agent splits the difference with an episodic case base: memory that is task-scoped like OpenClaw's but accumulates across sessions like OpenHuman's.

The episodic case base stores outcomes, not state. Where OpenHuman's tree answers "who is this user and what is their current context," Hermes Agent's records answer "what happened last time we tried this class of task and what should we do differently." The three shapes serve three distinct retrieval purposes: resumability, user-context saturation, and cross-session approach improvement.

Tool / skill model

What the agent reaches for — and what reaches into it.

OpenClaw uses a Skill abstraction: each Skill is a sandboxed process with a declared capability manifest enforced by eBPF hooks. The agent reaches for a Skill; the Skill executes with the specific kernel permissions it declared. This makes OpenClaw's tool model the safest of the three but also the most heavyweight to extend — adding a Skill requires writing a manifest and testing it against eBPF enforcement.

OpenHuman's model is a hybrid. On the input side, connectors pull data from external services into the memory tree — these are read-only data integrations, not action-taking tools. On the action side, OpenHuman ships native tools: web search, web fetch, filesystem access, git workflows, linting, testing, grep-like code search, speech-to-text, and text-to-speech. These native tools are tightly integrated with the desktop runtime rather than sandboxed at the process level.

Hermes Agent opts for a registered callable model: 40+ built-in tools with typed schemas, extensible by registering additional callables using the same contract. Compared to OpenClaw's Skill SDK, registration is faster and requires less infrastructure; compared to OpenHuman's native tools, it is more composable as a remote service but runs without kernel-level isolation. The extensibility speed advantage comes at the cost of blast-radius containment.

Security and sandboxing

The blast-radius story for each runtime.

OpenClaw's eBPF sandboxing is the strongest isolation story of the three. Kernel-enforced least-privilege means a compromised or prompt-injected skill cannot access resources beyond its declared manifest regardless of what it attempts. The attack surface is bounded by capability declaration, not by the agent's or user's awareness of what the skill could theoretically reach. This is the right model for an agent with broad system access (file system, shell, ten messaging platforms), where the potential blast radius without isolation is large.

OpenHuman operates at OS-level desktop trust. The native desktop app runs with user-level process permissions, not kernel-enforced capability constraints per tool. For a desktop assistant whose primary surface is reading and summarizing data from connectors, this is an acceptable trade-off: the threat model is data exposure through connector misuse rather than privilege escalation. The local data storage reduces the blast radius of a breach — the attacker who compromises OpenHuman does not gain cloud credentials or a remote database; they gain what was already on the machine.

Hermes Agent uses process isolation: each tool call executes in a separate process, limiting the damage of a buggy tool to that process's scope. This is a meaningful boundary — a tool that crashes or misbehaves does not take down the agent — but it does not approach OpenClaw's kernel-enforced capability constraints. For a self-hosted service deployed on a $5/month VPS handling automated research tasks, process isolation is a pragmatic match for the threat model: the surface area is the VPS environment, and the operator controls what tools are registered.
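Per-call process isolation can be demonstrated with the standard library alone. This sketch runs a tool's source in a child Python interpreter and passes arguments and results as JSON; the `run_tool_isolated` function and the convention that a tool defines `main(args)` are assumptions for illustration, not Hermes Agent's mechanism.

```python
import json
import subprocess
import sys

def run_tool_isolated(source, args, timeout=5.0):
    """Run `source` (Python code defining main(args)) in a child interpreter."""
    runner = (
        source
        + "\nimport json, sys\n"
        + "print(json.dumps(main(json.loads(sys.stdin.read()))))\n"
    )
    try:
        proc = subprocess.run(
            [sys.executable, "-c", runner],
            input=json.dumps(args),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "error": "timeout"}
    if proc.returncode != 0:
        err_lines = proc.stderr.strip().splitlines()
        return {"ok": False, "error": err_lines[-1] if err_lines else "nonzero exit"}
    return {"ok": True, "result": json.loads(proc.stdout)}
```

A tool that crashes, exits, or hangs takes down only its child process, and the caller gets a structured error back. The child still runs with the operator's full user permissions, though, which is the gap between this boundary and a kernel-enforced capability manifest.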

Deployment topology

Where the agent lives shapes what it can do.

OpenClaw runs as an always-on local daemon. It lives on the machine continuously, not just when a user opens a window. This enables cron-scheduled tasks, event-triggered workflows, and background processes that run without user presence. The daemon is the integration point for the full messaging stack — WhatsApp, Telegram, Discord, and the rest — which require persistent connectivity to deliver inbound messages to the agent. A session-based deployment cannot serve this use case.

OpenHuman is a native desktop application: it runs when the user opens it, persists the memory tree between sessions via SQLite, but does not run background processes when the app is closed. This shapes the assistant model significantly — OpenHuman is a tool the user invokes, not a service the user delegates to indefinitely. Hermes Agent sits at the opposite end: a self-hosted always-on service that runs whether or not anyone is watching. Tasks arrive via API or schedule; the agent processes them in the background and stores results. The operator's interface is a log, not a chat window.

The deployment topology choice determines the hosting cost curve. OpenClaw and OpenHuman run on the user's existing hardware at no incremental cost. Hermes Agent requires a self-hosted server — but Nous Research explicitly targets a $5/month VPS as the baseline deployment target, making it accessible to individual developers without a cloud budget. That cost point is competitive with the API costs of running the same tasks through a managed agent service.

Extensibility

How the system grows beyond what shipped on day one.

OpenClaw extends through the Skill SDK: write a skill, declare its capability manifest, register it with the runtime, and the agent gains that capability with eBPF-enforced constraints. The SDK is the primary extension surface; the messaging integrations and direct integrations are built on top of the same Skill primitives. The consequence is that every extension inherits the security model — strong isolation, explicit capability declaration — but also the engineering overhead of that model.

OpenHuman extends through connector plugins on the data ingestion side and through native tool additions on the action side. Adding a connector means writing a plugin that implements the fetcher interface; the new connector's output enters the memory tree compression pipeline automatically. OpenHuman's extensibility is primarily about enriching the user's context model, not about adding new action capabilities.

Hermes Agent extends through tool registration and model swap: add a new callable to the registry and the planner gains access to it immediately; swap the LLM backend and the entire agent reasons differently without touching any tool code. The dual extension surface, covering both the tool library and the model, is Hermes Agent's distinctive extensibility advantage. The other two projects also support configurable model backends, but neither treats replacing the "brain" independently of the "hands" as a first-class extension path.

When to pick which

Use case profile	Use OpenClaw if…	Use OpenHuman if…	Use Hermes Agent if…
DevOps automation / always-on local sysadmin	You need a persistent local daemon that executes shell commands, monitors services, and reacts to cron triggers with eBPF-sandboxed skill isolation.	Not a strong fit — OpenHuman is session-based and does not run background processes when closed.	You want to self-host the automation logic remotely and invoke it via API rather than running it on the target machine itself.
Personal assistant / inbox + calendar + notes	You primarily communicate via the supported messaging platforms and want the agent to handle replies and scheduling without a separate app open.	You want the agent to already know your context — open tasks, recent emails, calendar conflicts — before you say a word, pulled from 118+ connectors.	You are comfortable running a self-hosted service and want an agent that improves its handling of your recurring personal tasks over time via episodic memory.
Autonomous research / long-running multi-step tasks	You need a local runtime that runs extended multi-step workflows with full resume-from-crash capability and no per-step human approval.	Your research tasks require deep personal context — knowing your existing notes, current projects, or prior work — as the starting point for each research session.	You want a service-oriented agent that queues research tasks, runs them against a 40+ tool library, and gets measurably faster at recurring research patterns over weeks.
Regulated-data workflows (offline, audit trail)	You need kernel-enforced per-skill capability constraints and a full SQLite audit trail of every task and tool call that survives process restart.	You want fetched connector data kept on the device — it lands in local SQLite, and optional local AI keeps queries on-device too (sign-in and connector OAuth still use the managed backend by default).	You self-host on infrastructure you control, accept process-level isolation as your security boundary, and need the episodic record as your audit trail.
Multi-model experimentation / avoiding LLM lock-in	You are willing to accept the model constraints of OpenClaw's configured LLM in exchange for the integrated local runtime and sandboxing story.	You want optional local AI for privacy-sensitive queries but accept cloud routing for complex tasks where local models fall short.	You need to swap LLM backends freely — from local Ollama to GPT-4o to a fine-tuned Hermes model — without changing any agent logic, tool registration, or memory schema.

FAQ

Which has the most GitHub stars?

OpenClaw leads decisively with approximately 347,000 GitHub stars as of May 2026 — the most-starred open-source agent project ever built. Hermes Agent crossed 140,000 stars within three months of its February 26, 2026 launch. OpenHuman, released May 13, 2026, is the newest — roughly 7,800 stars in its first days, climbing fast past 29,000 within about two weeks.

Can I run any of them on a Raspberry Pi or a small VPS?

Hermes Agent explicitly targets a $5/month VPS as its baseline deployment, making it the lightest server-side option — it runs without a desktop environment and with a small LLM or a remote API backend. OpenHuman requires a desktop OS (macOS, Windows, or Linux with a display), so a headless Raspberry Pi is not a natural fit. OpenClaw runs as a local daemon and will run on any machine that supports eBPF, which limits it to Linux kernels 4.15+ on x86-64 and ARM64; a Raspberry Pi 4 running a modern 64-bit Raspberry Pi OS is in scope, but the hardware budget matters for LLM inference.

Which is the most secure for production use?

OpenClaw offers the strongest technical isolation: eBPF hooks enforce kernel-level least-privilege per skill, and a prompt-injected skill cannot escalate beyond its declared capability manifest regardless of what it attempts. OpenHuman's OS-level desktop trust model is appropriate for a personal assistant but does not provide kernel-enforced capability constraints per tool. Hermes Agent uses process isolation — meaningful for crash containment but not comparable to kernel-enforced capability boundaries. For production workflows where a malicious payload in agent input could attempt privilege escalation, OpenClaw's sandboxing model is the most defensible.

Which works best as a personal assistant?

OpenHuman is purpose-built for this use case: 118+ connectors pull from Gmail, Calendar, Notion, Slack, GitHub, and more into a local memory tree that pre-loads your context before you type a prompt. You spend zero time briefing the agent on your situation. OpenClaw covers the messaging surface well (WhatsApp, iMessage, Slack, and seven other platforms) and excels at acting on your behalf within those channels, but its memory model is task-scoped rather than life-context-scoped. Hermes Agent can serve as a personal assistant but requires self-hosting and benefits most from repeated task patterns where episodic memory improves performance over time.

Are they interoperable?

Not natively. OpenClaw's Skill format, OpenHuman's connector and memory tree schema, and Hermes Agent's tool registration contract are independent interfaces — there is no shared protocol, no cross-project plugin format, and no standard for exchanging memory between them. A team that wanted to combine, say, OpenHuman's memory tree with Hermes Agent's planner would need to build a custom adapter. The projects are architecturally distinct enough that shallow interoperability (shared tools or shared memory) would require non-trivial integration work rather than a configuration toggle.

Which model backends do they support?

Hermes Agent is the most flexible: it is model-agnostic by design and works with any LLM backend the operator configures — local models via Ollama, OpenRouter endpoints, direct cloud APIs, or any OpenAI-compatible interface. OpenClaw supports configurable LLM backends covering both local and cloud options, but the integration is tighter to the runtime and less foregrounded as a differentiator. OpenHuman routes to cloud LLMs by default but includes optional local AI support for users who want queries to stay fully on-device; the model routing layer selects between local and cloud based on task complexity.