Hermes Agent / May 21, 2026

How I added Honcho memory to Hermes Agent

A practical operator guide to setting up a conservative long-term memory layer for a Telegram-connected Hermes agent without changing the agent's voice, over-injecting context, or handing memory to every specialist at once.

Hermes plus Honcho memory guide diagram showing a controlled memory layer, tools-only recall, per-session setup, and operator approval.

Why this guide exists

I wanted Atlas, my Hermes Agent running through Telegram, to keep growing without losing the thing that already made him useful: stable behavior, strong operator memory, and the ability to stay grounded inside the work we are actually doing.

Honcho looked like the right next layer, but I did not want to flip every memory feature on at once. The goal was not to make Atlas more noisy. The goal was to add a cloud-backed memory layer while protecting his existing workflow, voice, and approval boundaries.

The framing: Honcho gives Hermes a more advanced memory layer, but the smart move is to start conservative: tools-only recall, measured usage, controlled cadence, and balance tracking before turning on deeper automatic context.

What Hermes Agent is

Hermes Agent is the operating layer I use to run Atlas through Telegram. It can use tools, skills, profiles, memory, cron jobs, and external providers, but the important part is the operator model: Atlas routes work, keeps boundaries visible, and helps coordinate specialist lanes without letting every tool become live by default.

In my setup, Hermes was already better at memory than my older OpenClaw workflows. Atlas could remember durable preferences, search session history, use built-in memory, and stay aligned with how I like work handled. That is why I treated the Honcho setup as an additive layer, not a replacement.

What Honcho adds

Honcho is an external memory provider for Hermes. The official Hermes docs describe it as an AI-native backend that can maintain user modeling, session context, peer separation, semantic search, and derived conclusions over time. In plain operator terms: it gives Hermes a place to build longer-range memory beyond the local built-in files.

That extra power is exactly why I did not start with the most aggressive mode. More memory is only useful if it improves the work. If it starts injecting half-baked context into every reply, the agent can feel less precise, not more.

Built-in memory stays active Hermes keeps its normal local memory layer even when an external provider is added.
Honcho becomes additive It adds provider tools and external memory capability instead of replacing the baseline.
Recall mode matters Tools-only recall lets Atlas choose when to ask Honcho, rather than injecting context every turn.
Cadence matters Dialectic reasoning can cost money and add noise, so it should be paced deliberately.

Before and after

Before Honcho, Atlas was responding from the current Telegram/session context, Hermes built-in memory, durable profile notes, and search tools that could recover prior session context when needed. That setup already worked well.

After Honcho, the goal is not for every message to become heavier. In the conservative setup, Atlas still gets the current conversation and built-in Hermes memory first. Honcho sits beside that as an external memory provider, available through tools and writing conversation memory in the background.

Before Current context + built-in Hermes memory + session search when needed.
After Same baseline, plus Honcho tools and external memory accumulation.
Protected Atlas's normal voice and workflow are not supposed to change immediately.
Delayed Automatic context injection waits until there is enough signal to justify it.

The route I chose

The official quick start can use the interactive memory setup flow. I chose a more operator-controlled path because I already had a Telegram-connected Hermes environment, a working Atlas personality, and a strong preference not to paste or duplicate secrets into the wrong files.

The route was: store the Honcho key safely, set Hermes to use Honcho as the external memory provider, configure Honcho conservatively, restart the gateway, and verify status before asking Atlas to depend on it.

Important: I did not attach Honcho to every specialist profile. I did not ingest Bixby, source code, private records, production data, or broad history. This was an Atlas-first memory pilot.

My starting configuration

This is the conservative profile I wanted: tools-only recall, per-session starts, async writes, a modest context budget, and dialectic reasoning paced at every five turns.

  • Workspace: hermes-atlas
  • AI peer: atlas
  • User peer: Jonathan
  • Recall mode: tools
  • Session strategy: per-session
  • Context budget: 800 tokens
  • Dialectic cadence: every 5 turns
  • Write frequency: async

Why tools-only? I wanted Honcho available, not dominant. In tools-only mode, Atlas can call Honcho when useful, but Honcho does not automatically stuff context into every reply.

Step-by-step setup overview

These steps are written from my VPS setup. Replace paths and workspace names with your own. Do not paste secrets into chat, exports, source control, or public logs.

Before you run commands: This reflects my Hermes setup and file paths. Review every command, replace paths/workspace names with your own, and back up your config first. Do not paste API keys into chat, source control, or public files.

1. Back up the current Hermes memory files
mkdir -p /home/hermes/.hermes/backups/honcho-manual-setup-$(date +%Y%m%d_%H%M%S)
BACKUP_DIR=$(ls -td /home/hermes/.hermes/backups/honcho-manual-setup-* | head -1)
cp -a /home/hermes/.hermes/config.yaml "$BACKUP_DIR/config.yaml.bak" 2>/dev/null || true
cp -a /home/hermes/.hermes/honcho.json "$BACKUP_DIR/honcho.json.bak" 2>/dev/null || true
cp -a /home/hermes/.hermes/.env "$BACKUP_DIR/env.bak" 2>/dev/null || true
echo "$BACKUP_DIR"
2. Store Honcho secrets in the Hermes env file
touch /home/hermes/.hermes/.env
chmod 600 /home/hermes/.hermes/.env

# Add these with nano or another terminal editor.
# Do not print real values back to the screen.
HONCHO_API_KEY=your_key_here
HONCHO_WORKSPACE_ID=hermes-atlas
HONCHO_BASE_URL=https://api.honcho.dev
3. Point Hermes at Honcho as the external provider
python3 - <<'PY'
from pathlib import Path
import yaml

p = Path("/home/hermes/.hermes/config.yaml")
data = {}
if p.exists():
    data = yaml.safe_load(p.read_text()) or {}

data.setdefault("memory", {})
data["memory"]["provider"] = "honcho"

p.write_text(yaml.safe_dump(data, sort_keys=False))
PY
4. Create the conservative Honcho config
cat > /home/hermes/.hermes/honcho.json <<'EOF'
{
  "workspace": "hermes-atlas",
  "peerName": "Jonathan",
  "dialecticCadence": 5,
  "hosts": {
    "hermes": {
      "enabled": true,
      "aiPeer": "atlas",
      "workspace": "hermes-atlas",
      "peerName": "Jonathan",
      "recallMode": "tools",
      "writeFrequency": "async",
      "sessionStrategy": "per-session",
      "observationMode": "unified",
      "observation": {
        "user": { "observeMe": true, "observeOthers": false },
        "ai": { "observeMe": false, "observeOthers": true }
      },
      "dialecticReasoningLevel": "low",
      "dialecticDynamic": true,
      "dialecticCadence": 5,
      "dialecticDepth": 1,
      "dialecticMaxChars": 500,
      "contextCadence": 3,
      "contextTokens": 800,
      "messageMaxChars": 12000,
      "dialecticMaxInputChars": 8000,
      "saveMessages": true
    }
  }
}
EOF

chmod 600 /home/hermes/.hermes/honcho.json
5. Restart and verify
systemctl --user restart hermes-gateway
sleep 3
systemctl --user status hermes-gateway --no-pager -l
hermes honcho status
hermes memory status

What went wrong

The first status check showed Honcho was connected, but the cadence still read every 1 turn instead of every 5 turns. That mattered because the whole point was to start with controlled cadence, not an aggressive memory loop.

The fix was to make sure dialecticCadence existed at the root level of honcho.json as well as inside the host block. After restart, hermes honcho status correctly reported:

Expected verification line
Dialectic cad: every 5 turns

Lesson: do not assume the file edit worked. Trust the status command. If the status output does not match the intended setup, inspect the expected schema before moving on.

How I verified it worked

A working setup was not "the file exists." A working setup meant Hermes could see Honcho, the plugin was active, the gateway was running, the API key was masked, and the intended conservative settings appeared in status output.

Gateway running hermes-gateway.service was active after restart.
Provider active hermes memory status showed provider: honcho.
Honcho available The Honcho plugin showed installed and available.
Cadence corrected Status reported dialectic cadence every 5 turns.
Clean status pattern
Built-in:  always active
Provider:  honcho
Plugin:    installed
Status:    available
Recall mode: tools
Session strat: per-session
Context budget: 800 tokens
Dialectic cad: every 5 turns

How /new behaves now

Starting a new Telegram session with /new still means the active thread starts clean. It does not mean Atlas loses every durable memory layer.

With this setup, a new session should still have Hermes built-in memory available, and Honcho can accumulate or retrieve memory externally. Because recall mode is tools-only, that Honcho context should not automatically flood every new reply.

/new Clears the immediate working session.
Built-in Hermes built-in memory remains available.
Honcho External memory remains available through provider tools.
Voice Atlas should still feel like Atlas because auto-injection is limited.

What I am testing next

The next test is not "turn everything on." The next test is to use Atlas normally and watch whether Honcho helps with useful recall without adding noise.

  • Does Atlas remember durable operator preferences more reliably?
  • Does tools-only recall keep responses clean?
  • Does cost stay predictable?
  • Does cadence stay controlled?
  • Does memory help without pulling in private or unrelated lanes?

Blocked for now: broad specialist memory, private project ingestion, Bixby memory ingestion, source code ingestion, automatic memory sync across profiles, and switching to hybrid/context mode without a review window.

Rollback

If Atlas starts acting noisier, if cost climbs, or if the memory layer feels like it is steering instead of helping, rollback should be boring.

Rollback outline
# Option 1: use Hermes if available
hermes memory off

# Option 2: restore from backup
cp -a /home/hermes/.hermes/backups/YOUR_BACKUP/config.yaml.bak /home/hermes/.hermes/config.yaml
cp -a /home/hermes/.hermes/backups/YOUR_BACKUP/honcho.json.bak /home/hermes/.hermes/honcho.json 2>/dev/null || true
cp -a /home/hermes/.hermes/backups/YOUR_BACKUP/env.bak /home/hermes/.hermes/.env 2>/dev/null || true

systemctl --user restart hermes-gateway
hermes memory status

If you want to fully revoke the cloud memory lane, revoke the API key in Honcho, remove the local env value, restart the gateway, and verify that Hermes reports built-in memory only.

Final takeaway

The best memory setup is not the one with the most switches turned on. It is the one that makes the agent more useful without changing why you trusted it in the first place.

For me, that means Hermes built-in memory stays the foundation. Honcho becomes the controlled long-term layer. Atlas gets room to grow, but the operator boundary stays intact.

My recommendation: start with tools-only recall, per-session strategy, measured cadence, and active balance tracking. Let the memory layer prove itself before enabling deeper automatic context.

References

Back to home