Fully Local Hermes Agent Stack on my Ryzen AI
I have been using Agent Zero, but after watching many reviews, I got curious about Hermes and decided to give it a try as a container on my Ryzen AI PC.
Running autonomous agent architectures completely locally on consumer hardware is the absolute frontier of modern AI engineering. Moving from terminal-centric frameworks like Agent Zero over to Nous Research’s Hermes Agent ecosystem unlocks an entirely new tier of advanced tool orchestration.
If you are looking to build a sandboxed, hardware-accelerated AI agent lab that you can control from any device on your home network, here is the exact architectural blueprint, the hurdles I encountered, and how to set it up flawlessly.
The Architecture: How It Works
To maximize efficiency and keep my primary operating system safe, this setup splits the workload into a localized, containerized split-brain architecture:
- The Host Engine (Ollama): Sits directly on the Windows 11 host, leveraging the Ryzen AI NPU/GPU stack to serve large language models at maximum tokens-per-second.
- The Sandbox (Docker): A Linux container (s6-overlay Ubuntu environment) running the Hermes Agent gateway. It isolates 90 pre-bundled core skills, ranging from Playwright web browsers and file managers to GitHub PR workflows and Kanban boards, safely away from your host machine’s sensitive files.
The Complete docker-compose.yml Configuration
This highly optimized configuration opens up the web interface to your entire local Wi-Fi/Ethernet network, bridges communication smoothly back to your Windows Ollama engine, and keeps advanced security variables documented but safely commented out for an isolated home network.
Save this as docker-compose.yml in your working directory (e.g., C:\LocalData\DockerApps\hermesAgent):
YAML
services:
  hermes-agent:
    image: nousresearch/hermes-agent:latest
    container_name: hermes-agent
    restart: unless-stopped
    ports:
      # Exposes ports to 0.0.0.0 so other devices on your local network can connect
      - "8642:8642"
      - "9119:9119"
    volumes:
      # Maps your physical Windows directory to the container's virtual Linux storage
      - C:\LocalData\DockerApps\hermesAgent\data:/opt/data
    environment:
      - HERMES_DASHBOARD=1
      - HERMES_DASHBOARD_HOST=0.0.0.0
      - CONFIG_PATH=/opt/data/config.yaml
      # =================================================================
      # SECURITY & ALLOWLISTS (Commented out for isolated home networks)
      # =================================================================
      # # By default, the gateway denies unapproved external users.
      # # To enforce strict token or API key authentication, uncomment these:
      # - GATEWAY_ALLOW_ALL_USERS=false
      # - GATEWAY_API_KEY=your_super_secure_secret_password_here
      #
      # # If you eventually link Hermes to messaging apps (Telegram, Discord),
      # # restrict execution to ONLY your user ID so network guests can't trigger tools:
      # - TELEGRAM_ALLOWED_USERS=your_telegram_numeric_id
      # - DISCORD_ALLOWED_USERS=your_discord_numeric_id
      # =================================================================
    extra_hosts:
      - "host.docker.internal:host-gateway"
    # Runs the gateway with the insecure flag to allow external LAN web UI access
    command: ["gateway", "run", "--insecure"]
    tty: true
The Core Configuration: config.yaml
Because Hermes Agent bypasses generic configurations when deployed as a customized node, you must manually supply an explicit configuration profile. Create a folder named data right next to your compose file, and save this file inside it as config.yaml:
Local
provider: "ollama"
base_url: "http://host.docker.internal:11434/v1"
model: "gemma4:latest" # Or "hermes3:latest" depending on your preference
terminal:
  backend: "local"
memory:
  memory_enabled: true
skills:
  config: {}
OpenRouter
provider: openrouter
base_url: https://openrouter.ai/api/v1
model: google/gemini-1.5-flash
terminal:
  backend: local
memory:
  memory_enabled: true
skills:
  config: {}
onboarding:
  seen:
    tool_progress_prompt: true
Lessons: Gotchas Solved
Setting this up isn’t always plug-and-play. If you are building this yourself, look out for these five pitfalls I debugged and resolved along the way:
1. The Volume Trailing Path Trap (FileNotFoundError)
If you encounter a Python traceback pointing to status.py trying to write a lock file and throwing a FileNotFoundError, check your pathing. When specifying volume mounts in Docker for Windows, ensure your host path mirrors your exact directory structure (C:\LocalData\DockerApps\hermesAgent\data). If the path is wrong, Docker mounts an empty virtual loopback, breaking the container’s internal state managers.
2. The Local LAN Lockout (--insecure flag)
The Hermes Web UI is incredibly powerful—it can execute bash scripts, manipulate files, and spin up browser agents. Because of this, Hermes automatically activates an OAuth gate if it detects it is binding to any interface other than localhost (127.0.0.1).
To allow a tablet, laptop, or phone on your home network to access http://<YOUR-HOST-IP>:9119, you must pass the explicit command array ["gateway", "run", "--insecure"] in your compose file to intentionally authorize local area network visibility.
3. The OpenAI Compatibility Suffix (/v1)
When pointing Hermes to an Ollama instance, setting the base URL to http://host.docker.internal:11434 will trigger immediate HTTP 404: page not found errors during API execution calls. Ollama hosts its OpenAI-compatible routing maps under a specific subdirectory. Changing your base endpoint to http://host.docker.internal:11434/v1 instantly fixes the connection handshake.
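To make the suffix rule concrete, here is a minimal Python sketch that appends /v1 to a base URL only when it is missing and then builds the chat endpoint an OpenAI-style client would call. The helper names openai_base_url and chat_completions_url are hypothetical and are not part of Hermes Agent or Ollama; this only illustrates the URL shape.

```python
from urllib.parse import urlsplit, urlunsplit


def openai_base_url(base_url: str) -> str:
    """Return base_url with exactly one trailing /v1 path segment."""
    parts = urlsplit(base_url)
    path = parts.path.rstrip("/")
    if not path.endswith("/v1"):
        path += "/v1"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def chat_completions_url(base_url: str) -> str:
    """URL an OpenAI-compatible client would POST chat requests to."""
    return openai_base_url(base_url) + "/chat/completions"


if __name__ == "__main__":
    # Bare Ollama address, as in the failing configuration described above
    print(chat_completions_url("http://host.docker.internal:11434"))
```

Running it against the bare address shows the path the agent needs to reach; an address that already ends in /v1 is returned unchanged.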
4. Direct Model Tag Mismatches (ollama pull hermes3)
If you fire up the agent and immediately hit an HTTP 404: model not found error despite knowing the weights are on your hard drive, check your exact string matching. Ollama requires absolute precision when resolving model names. If your config.yaml is pointing toward a specific sub-tag like hermes3:8b, but your host machine only has the general library tag downloaded, the handshake will instantly drop. The quick fix is pulling the exact flagship tag directly from your host Windows terminal with ollama pull hermes3, which populates your library as hermes3:latest. Aligning this string perfectly on both sides of the container network bridge ensures the agent registers your hardware instantly.
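A quick way to catch this mismatch before starting the agent is to compare the model string in config.yaml with what Ollama reports as installed. The sketch below is an assumption-laden helper, not part of Hermes Agent: it relies on Ollama's /api/tags listing endpoint and on the convention that a name without a tag means :latest. The function names are hypothetical.

```python
import json
from typing import List, Optional
from urllib.request import urlopen


def normalize_tag(name: str) -> str:
    """Treat a model name without an explicit tag as ':latest'."""
    last_segment = name.rsplit("/", 1)[-1]
    return name if ":" in last_segment else f"{name}:latest"


def find_model(configured: str, installed: List[str]) -> Optional[str]:
    """Return the installed name matching the configured one, or None."""
    wanted = normalize_tag(configured)
    for name in installed:
        if normalize_tag(name) == wanted:
            return name
    return None


def list_installed(host: str = "http://localhost:11434") -> List[str]:
    """Ask a running Ollama server which models it has (needs network access)."""
    with urlopen(f"{host}/api/tags", timeout=5) as response:
        data = json.load(response)
    return [model["name"] for model in data.get("models", [])]


if __name__ == "__main__":
    installed = ["hermes3:latest"]  # example data; use list_installed() on a real host
    print(find_model("hermes3", installed))     # matches hermes3:latest
    print(find_model("hermes3:8b", installed))  # no match, so the agent would get a 404
```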
5. The Grand Irony: Hermes Models vs. Hermes Agent
Perhaps the funniest hurdle you will encounter when launching the Hermes Agent framework is trying to run it with a Hermes 3 LLM model. If you point the container at hermes3:latest, the application will throw a warning claiming the model is “not agentic.”
Don’t panic: Hermes 3 is highly capable of tool calling, but it utilizes specialized XML formatting structures (<tool_call>), while the hermes-agent orchestration platform defaults to standard JSON payloads. To keep your workflow entirely local and hardware-accelerated without buying API keys for proprietary cloud models, swap your configuration back to an open-weights model with native standard JSON tool-calling engines—such as Gemma 4 or Qwen 3.6. This bridges the gap perfectly!
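To illustrate the format gap, the sketch below converts a Hermes-style <tool_call> block into the JSON tool-call shape used by OpenAI-compatible APIs. It is only an illustration of the two formats as described above; it is not how Hermes Agent parses model output, and the function name and the read_file tool are hypothetical.

```python
import json
import re
from typing import Dict, List

# Matches the JSON body between <tool_call> and </tool_call>
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def xml_tool_calls_to_json(text: str) -> List[Dict]:
    """Convert <tool_call>{...}</tool_call> blocks to OpenAI-style tool calls."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        payload = json.loads(match.group(1).strip())
        calls.append(
            {
                "type": "function",
                "function": {
                    "name": payload["name"],
                    # OpenAI-style payloads carry arguments as a JSON string
                    "arguments": json.dumps(payload.get("arguments", {})),
                },
            }
        )
    return calls


if __name__ == "__main__":
    reply = (
        "Let me check the config.\n"
        '<tool_call>\n{"name": "read_file", '
        '"arguments": {"path": "/opt/data/config.yaml"}}\n</tool_call>'
    )
    print(xml_tool_calls_to_json(reply))
```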
Spin Up and Execution
Once your files match the templates above, initialization requires just a few standard commands in Windows PowerShell:
PowerShell
# Navigate to your deployment folder
cd C:\LocalData\DockerApps\hermesAgent
# Tear down any lingering broken container definitions
docker compose down
# Launch the complete stack detached in the background
docker compose up -d
Accessing the Dashboard Over Wi-Fi
Find your local IPv4 address by running ipconfig in your host terminal. You can now walk over to your couch, grab a secondary machine or smartphone, and open the fully interactive orchestrator dashboard via: 👉 http://<YOUR_IP_ADDRESS>:9119 (e.g., http://192.168.3.11:9119)
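If the page does not load from the second device, a small reachability check can tell you whether the published ports are answering at all. This is a minimal sketch using only the Python standard library; the helper names are hypothetical, and 8642 and 9119 are simply the ports published in the compose file above.

```python
import socket


def dashboard_url(host_ip: str, port: int = 9119) -> str:
    """Build the dashboard address for a given host IP."""
    return f"http://{host_ip}:{port}"


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    host_ip = "192.168.3.11"  # replace with the IPv4 address reported by ipconfig
    for port in (8642, 9119):
        state = "open" if port_open(host_ip, port) else "unreachable"
        print(f"{dashboard_url(host_ip, port)} is {state}")
```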
Dropping into Interactive Chat
If you want to pull up the terminal execution interface directly on your host machine to watch your agent parse thoughts in real-time, execute:
PowerShell
docker exec -it hermes-agent /opt/hermes/.venv/bin/hermes
Model Selection: Gemma 4 vs. Hermes 3
To close out this setup guide, it’s worth noting the engine powering your agent loop. In my local library, I tested this utilizing Google’s lightning-fast Gemma 4.
- Gemma 4 brings massive efficiency, incredible speed via highly optimized architectures, and native multi-lingual and multimodal competencies that run like a dream on AMD Ryzen AI processing nodes.
- Hermes 3 (Nous Research fine-tune of Llama 3.1) is engineered explicitly for agent loops. It features designated structural training tags (<SCRATCHPAD>, <PLAN>) meant to stop an agent from hallucinating when running system tool calls over and over.
The differences between the two break down across these core criteria:
| Feature | Google Gemma 4 | Nous Research Hermes 3 |
| --- | --- | --- |
| Primary Developer | Google DeepMind | Nous Research (Open-source collective) |
| Architectural Heritage | Gemini 3 core research | Meta Llama 3.1 fine-tune optimization |
| Media Modalities | Native Text, Code, Images, Video, Audio | Text, Advanced Code, System Logics |
| Licensing Framework | Open-source (Permissive Apache 2.0) | Open-weights (Llama 3.1 Community License) |
| Primary Strengths | Blistering throughput speed, highly efficient execution on edge hardware, multilingual fluidities. | Hyper-steerable instruction adherence, structural tags (<SCRATCHPAD>, <PLAN>), uncensored logic. |
| Best Used For | Fast day-to-day interactions, vision analysis, high-concurrency multitasking. | Autonomous agent orchestration loops, sandboxed execution, terminal automations. |
If you are transitioning from Agent Zero and want to push the boundaries of local tool automation, Hermes 3 is on paper the model trained for long agent loops. In this particular stack, though, Hermes Agent flagged it as “not agentic” (gotcha 5 above), so Gemma 4 is the model I actually run, and it is also the better pick if speed and hardware efficiency on your Ryzen AI machine are your primary targets.
Docker Permissions Issue (and How to Kill the Log Spam)
Now that Nous Research’s Hermes Agent is running inside a local Docker container with its data mapped to a local drive, you can quickly run into a classic, silent operational blocker: a host-to-container filesystem permission mismatch. The symptom is a container log flooded with the same configuration warning every 60 seconds:
Plaintext
WARNING gateway.config: Failed to process config.yaml — falling back to .env / gateway.json values. Check /opt/data/config.yaml for syntax errors. Error: [Errno 13] Permission denied: '/opt/data/config.yaml'
It means the gateway process cannot read its own config file. The engine can fall back to standard .env variables to stay alive, but a Permission denied error on /opt/data also means that anything the agent needs to write there, such as its procedural memory snapshots, is likely to fail as well.
The Anatomy of the Error
When a Docker container initializes, it often maps a folder from your local host machine into the container’s virtual filesystem (e.g., mapping a local ./data folder to /opt/data to ensure your agent’s databases persist when the container reboots).
If you inspect the filesystem from directly within the running container using an interactive session (ls -la), the hidden friction point becomes instantly clear:
Plaintext
drwx------ 1 hermes hermes 4096 May 28 14:19 .
-rw-r----- 1 root root 212 May 28 05:59 config.yaml
-rw------- 1 root root 53993 May 28 01:53 .skills_prompt_snapshot.json
This output reveals two major structural bugs:
- The Ownership Loop: Essential system files like config.yaml and the agent’s core procedural memory bank (.skills_prompt_snapshot.json) are owned by root because they were generated during a root-level build or sync sequence. Meanwhile, the background application process drops down to run under an unprivileged user account (like hermes).
- The Directory Isolation: The parent directory (.) is locked down to drwx------. This means only the explicit owner can enter or read the path. This is highly destructive for applications utilizing SQLite databases (state.db, kanban.db), as it prevents the database engine from cleanly creating and cycling its temporary Write-Ahead Log (-wal) and Shared Memory (-shm) journal sidecar files.
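The listing above can be read mechanically. The sketch below applies the classic Unix owner/group/other rule to the modes shown, ignoring ACLs and root's override. The numeric IDs (0 for root, 1000 for hermes, 1001 for some other account) are assumptions for illustration, and can_access is a hypothetical helper, not something shipped in the container.

```python
import stat
from typing import Set

ROOT_ID = 0       # root's uid and gid
HERMES_ID = 1000  # assumed uid/gid of the unprivileged hermes user
OTHER_ID = 1001   # assumed uid/gid of any other unprivileged account


def can_access(mode: int, owner_uid: int, owner_gid: int,
               uid: int, gids: Set[int], want: str) -> bool:
    """Check whether an unprivileged uid gets the permissions in want ('r', 'w', 'x')."""
    if uid == owner_uid:
        shift = 6  # owner bits
    elif owner_gid in gids:
        shift = 3  # group bits
    else:
        shift = 0  # other bits
    granted = (mode >> shift) & 0o7
    needed = sum({"r": 4, "w": 2, "x": 1}[flag] for flag in want)
    return granted & needed == needed


if __name__ == "__main__":
    # config.yaml from the listing: -rw-r----- root root
    print(stat.filemode(stat.S_IFREG | 0o640))
    print(can_access(0o640, ROOT_ID, ROOT_ID, HERMES_ID, {HERMES_ID}, "r"))
    # parent directory from the listing: drwx------ hermes hermes
    print(stat.filemode(stat.S_IFDIR | 0o700))
    print(can_access(0o700, HERMES_ID, HERMES_ID, OTHER_ID, {OTHER_ID}, "x"))
```

Under these assumptions the first check shows why hermes gets Errno 13 on config.yaml: it is neither the owner nor in the root group, and the "other" bits grant nothing. The second shows that no account other than the directory owner can even enter /opt/data.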
The Fix: Recalibrating the Container Filesystem
To permanently resolve this log spam and unlock full execution autonomy, you need to jump directly inside the running container as root and normalize both ownership and traversal permissions in one clean sweep.
Step 1: Open an Interactive Terminal Inside the Container
Execute into your running agent instance with administrative privileges:
Bash
docker exec -it --user root <your-container-name> bash
(If the container baseline is built on a minimal image lacking bash, substitute bash with sh at the end of the line).
Step 2: Unify Ownership and Directories
Once inside the prompt, navigate to your data node (cd /opt/data) and execute the following corrective stack:
Bash
# 1. Force the application user to own every single file and subfolder recursively
chown -R hermes:hermes /opt/data
# 2. Relax the parent and all operational subdirectories (logs, skills, memories) to standard traversal permissions
find /opt/data -type d -exec chmod 755 {} \;
# 3. Standardize file read/write scopes to universal system readability
find /opt/data -type f -exec chmod 644 {} \;
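For anyone who prefers to script the chmod half of this from Python, here is a rough equivalent of the two find commands. It is a sketch, not a replacement for the commands above: it deliberately leaves ownership to chown, skips symlinks, and normalize_permissions is a hypothetical name.

```python
import os


def normalize_permissions(root: str, dir_mode: int = 0o755,
                          file_mode: int = 0o644) -> int:
    """Set dir_mode on every directory and file_mode on every regular file under root.

    Returns the number of paths that were changed.
    """
    os.chmod(root, dir_mode)  # the top directory itself, like `find` would include it
    changed = 1
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                os.chmod(path, dir_mode)
                changed += 1
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                os.chmod(path, file_mode)
                changed += 1
    return changed


if __name__ == "__main__":
    # Run as root inside the container, after chown -R hermes:hermes /opt/data
    print(normalize_permissions("/opt/data"))
```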
Step 3: Verify the Layout
Run a final filesystem status check to ensure everything is aligned:
Bash
ls -la
The parent directory and individual system assets will now reflect a perfectly optimized, balanced configuration layout:
Plaintext
drwxr-xr-x 1 hermes hermes 4096 May 28 14:19 .
drwxr-xr-x 1 hermes hermes 4096 May 28 01:55 logs
-rw-r--r-- 1 hermes hermes 212 May 28 05:59 config.yaml
-rw-r--r-- 1 hermes hermes 53993 May 28 01:53 .skills_prompt_snapshot.json
By ensuring the directory structure sits at 755 (drwxr-xr-x) and individual assets sit at 644 (-rw-r--r--), your agentic sub-processes can seamlessly handle transactional database writes, log cycles, and configuration updates. Type exit to step out of the sandbox, and enjoy a completely clear terminal window as your local partner goes to work in total background silence.

Setting up Hermes Agent inside this sandboxed container network is the definitive way to do it. Happy hacking!
What to Do next:
You will come up w

Ask it:
Can you please give me a status report of the Hermes configuration and see if it is all working?
It will respond something like this:
I resolved several dependency issues in your opt/hermes environment by installing the missing Python libraries (PyYAML, python-dotenv, python-rich, python-httpx) required for the CLI to run correctly.
Status Summary:
* Core: The main hermes CLI successfully executed hermes status.
* Gateway Service: Running (PID 126 via s6).
* Model: Currently using google/gemini-3.1-flash-lite via OpenRouter.
* API Keys: OpenRouter API key is detected (sk-o...68d7). Most other providers are not currently configured.
* Authentication: Not logged into additional services like Nous Portal, Discord, or Telegram.
If you plan to use specific platforms (like Discord or Telegram) or need tools that require other API keys (like search or browsing), you can configure them using hermes setup or add the required keys to your .env file. You can run hermes doctor if you encounter any specific functional issues moving forward.
Screenshots
A couple of screenshots of the resources it takes:






