Hermes Agent / May 23, 2026

How I built a bounded Hermes sub-agent system

A practical operator guide to building specialist agents without turning them into an unrestricted swarm: roles, profile shells, runtime packages, launcher routing, tool boundaries, logs, QA gates, and human approval.

Hermes sub-agent system field guide diagram showing Research Scout, QA Auditor, Build Architect, Content Operator, and Automation Steward routed through launcher gates around a Hermes Atlas command core.

Why this guide exists

I wanted Hermes Agent to feel less like one giant assistant and more like a real operator system. Atlas was already useful as my primary Hermes operator, but the next step was obvious: break repeatable work into specialist lanes.

The tempting mistake is to call that autonomy too early. A system can look impressive while still being unsafe, vague, or expensive. The version that actually worked for me was bounded: Atlas routes the work, each specialist has a lane, the launcher controls execution, outputs land in assigned folders, and approval gates decide what moves forward.

The point: the strongest specialist system is not "agents do whatever they want." It is "the right specialist gets the right task, with the right tools, inside the right boundary, and a human-controlled approval path."

What this system is

A bounded Hermes sub-agent system is a structured way to split work across specialized agent roles instead of giving one assistant every responsibility. The operator still owns the decision path. The specialists make the work sharper.

In my setup, Atlas stayed as the router and coordinator. The specialist agents became narrow lanes for research, quality review, build planning, documentation, and automation governance.

01 Atlas receives the task and decides whether a specialist lane is useful.
02 The launcher maps that task to a profile, skill, toolset, and output path.
03 The specialist creates a bounded artifact, log, or recommendation.
04 QA or the operator reviews the output before anything important changes.

What this system is not

This is not a permissionless agent swarm. It is not five agents with every tool, every credential, and full authority to act. That may sound exciting, but it is exactly how an operator system becomes noisy, expensive, and hard to trust.

  • It is not fully autonomous by default.
  • It is not a replacement for human approval.
  • It is not broad OAuth for every profile.
  • It is not automatic publishing, messaging, deletion, or production access.
  • It is not "profiles alone are security boundaries."

Important: Hermes profiles can separate state and configuration, but I do not treat profile folders by themselves as hard security boundaries. The safer model is launcher-enforced routing, explicit toolset allowlists, assigned paths, logs, and approval gates.

The core architecture

The architecture is simple on purpose. One operator receives the work. Specialists handle bounded lanes. A launcher decides what each specialist can access. Logs and QA artifacts create receipts.

Primary operator Atlas routes the work, keeps context, and synthesizes decisions.
Specialist profiles Each specialist has a named profile shell and role identity.
Role docs and SOUL files The role explains purpose, tone, allowed work, and refusal behavior.
Skills Each lane gets a narrow skill that matches its job.
Runtime packages Reusable prompts, templates, examples, and operating instructions.
Launcher script The launcher maps specialist, profile, model, toolset, skill, and paths.
Toolset allowlists Research gets read-only lanes. Builders do not automatically get shell or production access.
Receipts Every serious run should create an output, run log, or QA audit.

The five specialist roles

These were the five roles we built into the Hermes specialist system. They are useful because they are different. If every specialist gets the same tools and the same mission, the system is only pretending to specialize.

Research Scout

Research Scout gathers evidence, source trails, and bounded findings. In a safer setup, it starts with approved local/export inputs, then graduates to read-only public or approved-folder sources.

  • Allowed: evidence summaries, source comparison, citation trails, public/read-only research.
  • Restricted: broad browsing, account login, private material, paid/API-billed sources without approval.
  • Example task: compare three AI video providers and summarize cost, rights, and setup paths.

QA / Critique Auditor

QA is the control layer. It audits packages, checks claims, catches missing boundaries, and keeps "ready" from being used too early.

  • Allowed: artifact audits, boundary checks, readiness reviews, source/status validation.
  • Restricted: making final approval decisions, changing production systems, running broad tools.
  • Example task: audit a Research Scout report for citation quality and overreach.

Build Architect

Build Architect designs implementation plans and handoffs. It should not be treated as a production engineer by default. The first safe lane is design-only.

  • Allowed: system plans, acceptance criteria, folder structures, rollback plans, handoff specs.
  • Restricted: direct code execution, production edits, broad shell access, deployment actions.
  • Example task: design a safe first pilot workflow for an AI-generated scenic video pipeline.

Content / Documentation Operator

Content Operator turns approved source material into guides, templates, handoffs, summaries, and operator-facing packets.

  • Allowed: documentation packages, source-bound guides, templates, checklists, field notes.
  • Restricted: inventing facts, publishing externally, using private material without approval.
  • Example task: turn a QA-cleared setup log into a public-facing operator guide.

Automation Steward

Automation Steward is not an automation executor at first. It is the gatekeeper that says whether something is ready to be scheduled, monitored, or made recurring.

  • Allowed: readiness reviews, no-go memos, rollback rules, monitoring criteria, pilot proposals.
  • Restricted: creating cron jobs, webhooks, delivery, external actions, or production triggers without explicit approval.
  • Example task: evaluate whether a morning briefing workflow is ready for a limited save-only scheduled pilot.

Step-by-step build process

This is the sequence I would use again. It is slower than just handing out tools, but it produces a system you can actually trust.

  1. Define the operating model. Decide who routes work, who approves work, and what is blocked by default.
  2. Create specialist role docs. Write purpose, allowed work, restricted work, refusal behavior, and example tasks.
  3. Create profile shells. Make the specialists visible without adding credentials or broad tools yet.
  4. Assign tool boundaries. Start with local/export and file-only lanes wherever possible.
  5. Create runtime packages. Include prompts, templates, checklists, and expected outputs.
  6. Build the launcher. Hard-map profile, toolset, skill, source path, and output path.
  7. Create a command deck. Give the operator one visual surface for what exists, what is active, and what is blocked.
  8. Run manual tests. Do not schedule a specialist until manual runs become boring and reliable.
  9. Add read-only OAuth only where needed. Research and QA may need read-only source access before other specialists do.
  10. Run launcher-routed live invocation. Prove each specialist can run through the launcher without direct/freeform access.
  11. Audit results. Have QA review important outputs and boundary claims.
  12. Only then consider limited automation. Use short pilots, explicit stop rules, logs, and renewal approval.

Example folder structure

Your exact paths will be different. The important pattern is separation: source docs, runtime packages, profile shells, launchers, outputs, logs, and audits should not be mixed together.

Example structure
hermes-ecosystem/
  runtime/
    research-scout/
    qa-critique-auditor/
    build-architect/
    content-documentation-operator/
    automation-steward/
  profiles/
    research-scout/
    qa-critique-auditor/
    build-architect/
    content-documentation-operator/
    automation-steward/
  operator-console/
    specialist-command-deck.html
    invocation-snippet-library.md
    status-matrix.md
  launchers/
    specialist-invocation-launcher.py
  runs/
    research-scout/
    qa-critique-auditor/
    build-architect/
    content-documentation-operator/
    automation-steward/
  audits/
  approvals/
  source-of-truth/

Example launcher pattern

The launcher is the difference between "I have profiles" and "I can route specialists safely." It should make the allowed path explicit and boring.

Safe launcher-style pseudo-code
SPECIALISTS = {
  "qa": {
    "profile": "qa-critique-auditor",
    "toolsets": ["file"],
    "skill": "qa-file-document-audit",
    "output_root": "runs/qa-critique-auditor/"
  },
  "research": {
    "profile": "research-scout",
    "toolsets": ["file"],
    "skill": "research-scout-readonly-research",
    "output_root": "runs/research-scout/"
  }
}

def launch_specialist(name, task_file, output_file):
    spec = SPECIALISTS[name]
    assert output_file.startswith(spec["output_root"])
    assert task_file.startswith("approved-inputs/")

    run([
      "hermes",
      "--profile", spec["profile"],
      "--toolsets", ",".join(spec["toolsets"]),
      "--skills", spec["skill"],
      "run",
      "--input", task_file,
      "--output", output_file
    ])

Do not put secrets in the launcher. Credentials belong in restricted environment/config storage, not repo files, exports, logs, screenshots, or specialist prompts.

Approval and safety model

The approval model is what keeps the system useful. Without it, "specialists" quickly become an excuse to let tools act without context.

  • Read-only first. Start with file reads and approved source packets before any write or external access.
  • No broad OAuth by default. Use dedicated test accounts, narrow scopes, and approved folders.
  • No publishing by default. Social posting, messaging, upload, delivery, or external sync needs separate approval.
  • No important file changes without explicit approval. Back up before editing anything that affects runtime behavior.
  • Log outputs. Every serious run should create a run log, output artifact, or QA receipt.
  • Use fake fixtures first. Prove the workflow before using private, production, customer, or source-code material.

Testing checklist

I do not consider a specialist "live" because a folder exists. It has to pass a small, boring test.

Profile loads The named profile can be launched without broad inherited behavior.
Role behavior is correct The specialist acts like its lane, not like a general assistant.
Tool boundaries work The launcher applies the expected toolsets and skills.
Output path is correct The artifact lands where the operator expected.
No surprise OAuth The run does not request new credentials, scopes, or account access.
No external action No publishing, messaging, uploading, purchasing, deleting, or syncing occurs.
QA can critique it The output has enough source/status detail for an auditor to review.
Stop rules are clear The operator knows when the specialist must stop instead of guessing.

Common mistakes

Most specialist systems fail because they scale permission faster than reliability. These are the mistakes I would avoid.

  • Calling it autonomous too early. A manual specialist conveyor is still valuable.
  • Giving every specialist every tool. That defeats the point of specialization.
  • Skipping role docs. Without a role document, the specialist has no stable lane.
  • Treating profile folders as hard security boundaries. Verify actual launch behavior and tool inheritance.
  • Adding OAuth too early. Start with local/export sources, then narrow read-only scopes.
  • Skipping logs. If you cannot explain what happened, you cannot safely improve it.
  • Automating before manual runs are boring. If a workflow is not reliable manually, scheduling it will amplify the mess.

Final operating principle

The version that worked was not a swarm. It was a specialist operating system with lanes, receipts, and approvals.

Atlas routes. Specialists handle bounded work. The launcher enforces the shape of the run. QA challenges important outputs. The operator decides what gets approved, expanded, scheduled, or stopped.

Build the capability, then earn the automation. Profiles make the system visible. Launchers make it usable. Logs make it reviewable. Approval gates make it trustworthy.