Building the Ultimate Private AI Lab
Note: Written with the help of my research and editorial team 🙂 including Google Gemini, Google NotebookLM, Microsoft Copilot, Perplexity.ai, Claude.ai, and others as needed.
As a follow-up to my How to Host Multiple Public Websites on Your Windows PC post from a few months back, I am writing this one.
If you want to move away from expensive AI subscriptions and bring your data back home, building a local AI home lab is the answer. Today, I’m documenting how I set up my new Ryzen AI 9 HX370 server to act as a private AI powerhouse for both chat and automation.
The Goal
A fully self-hosted AI stack where Ollama runs the models, Open WebUI provides the interface, and n8n handles complex automations, all accessible from anywhere without a Virtual Private Server (VPS). I call it my “Zero VPS” Strategy.
1. The Hardware: Why the Ryzen AI 9?
We are using the Ryzen AI 9 HX370. While it has a powerful Radeon 890M iGPU, getting ROCm (AMD’s GPU software) to play nice in Docker on Windows is currently “bleeding edge.” For maximum stability, we chose to run the stack in Docker on Windows (WSL2) using CPU optimization and the fast AVX-512 instruction set.
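If you want to confirm that AVX-512 actually shows up inside WSL2 (and therefore inside your containers), a quick look at /proc/cpuinfo is enough. This is just a minimal sanity-check sketch, not part of the stack itself:

```python
# Run inside your WSL2 distro: confirms the CPU exposes AVX-512 flags
# to Linux, which is what the CPU-optimized containers will see.
def has_avx512() -> bool:
    with open("/proc/cpuinfo") as f:
        flags = f.read()
    return "avx512" in flags

if __name__ == "__main__":
    print("AVX-512 available:", has_avx512())
```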
My Hardware Specs:
- MINISFORUM A1 Z1 Pro-370 Mini PC (Copilot+ AI PC)
- AMD Ryzen AI 9 HX370 (Zen 5): 12 cores / 24 threads, up to 5.1 GHz, 80 TOPS
- GPU: integrated AMD Radeon 890M, plus NPU and CPU engines for AI workloads
- 64 GB DDR5 RAM, 2 TB storage
- Wi-Fi 7, Bluetooth 5.4, 2× USB4, 2× RJ45
- Windows 11 Pro
The Result: You now have a production-grade AI lab, accessible from anywhere, that costs $0/month in subscriptions. Whether you are chatting with locally installed LLMs (Llama 3.2, Qwen) or with the hundreds of models available via OpenRouter, the control is entirely in your hands, and everything can be driven from a local n8n installation (no VPS required!).
2. The “Perfect” Docker-Compose File
This configuration ensures all three services talk to each other internally while exposing the right ports for external access from your laptop or phone.
docker-compose.yml
```yaml
services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "11434:11434"           # Allows access from your laptop
    environment:
      - OLLAMA_HOST=0.0.0.0     # Tells Ollama to listen on the network
      - OLLAMA_KEEP_ALIVE=-1    # Keeps models in RAM for instant responses
    networks:
      - ai-network
    restart: unless-stopped

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open_webui_data:/app/backend/data
    networks:
      - ai-network
    restart: unless-stopped

  n8n:
    image: n8nio/n8n:latest
    container_name: n8n
    ports:
      - "5678:5678"
    environment:
      - N8N_HOST=192.168.12.88                  # Replace with your Ryzen server IP
      - WEBHOOK_URL=http://192.168.12.88:5678/  # Replace with your Ryzen server IP
      - OLLAMA_HOST=http://ollama:11434
      - N8N_SECURE_COOKIE=false                 # Required for non-HTTPS local access
    volumes:
      - n8n_data:/home/node/.n8n
    networks:
      - ai-network
    restart: unless-stopped

networks:
  ai-network:

volumes:
  ollama_data:
  n8n_data:
  open_webui_data:
```
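After saving the file, docker compose up -d starts all three containers. Here is a small, optional sanity-check sketch you can run from your laptop; it assumes the ports above and the same 192.168.12.88 server address used in the compose file, and it uses Ollama's /api/tags endpoint to list any models you have pulled:

```python
# Minimal health check for the stack, run from another machine on the LAN.
import json
import urllib.request

SERVER = "192.168.12.88"  # replace with your Ryzen server's IP

# Ollama's REST API: /api/tags lists locally pulled models.
with urllib.request.urlopen(f"http://{SERVER}:11434/api/tags", timeout=5) as resp:
    models = json.load(resp).get("models", [])
    print("Ollama models:", [m["name"] for m in models] or "none pulled yet")

# Open WebUI (port 3000) and n8n (port 5678) just need to answer HTTP.
for name, port in [("Open WebUI", 3000), ("n8n", 5678)]:
    with urllib.request.urlopen(f"http://{SERVER}:{port}/", timeout=5) as resp:
        print(f"{name}: HTTP {resp.status}")
```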
3. Scaling with OpenRouter (Hybrid AI)
For tasks too heavy for local hardware, we integrated OpenRouter to reach frontier models like Claude 3.5 Sonnet.
- In Open WebUI: Go to Settings > Admin > Connections and add https://openrouter.ai/api/v1 with your API key.
- In n8n: Use the “OpenRouter” node to access high-intelligence models when your local Llama needs a “second opinion.”
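Outside of Open WebUI and n8n, the same OpenRouter connection can be scripted directly, since the API is OpenAI-compatible chat completions. A minimal sketch; the model name is just an example, and OPENROUTER_API_KEY is assumed to be set in your environment:

```python
# Minimal OpenRouter call via its OpenAI-compatible chat completions API.
import json
import os
import urllib.request

req = urllib.request.Request(
    "https://openrouter.ai/api/v1/chat/completions",
    data=json.dumps({
        "model": "anthropic/claude-3.5-sonnet",   # example hosted model
        "messages": [{"role": "user", "content": "Summarize what n8n does in one sentence."}],
    }).encode(),
    headers={
        "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
        "Content-Type": "application/json",
    },
)
with urllib.request.urlopen(req, timeout=60) as resp:
    reply = json.load(resp)
print(reply["choices"][0]["message"]["content"])
```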
4. Performance & Benchmarks
One of the strongest arguments for this “Hybrid” approach is the balance between local privacy and cloud power. While local models are “free” to run once you own the hardware, cloud models via OpenRouter offer incredible speed and zero impact on your Ryzen server’s storage.
| Model Source | Model Name | Local Storage | Expected Speed (TPS) | Best Use Case |
| --- | --- | --- | --- | --- |
| Local (Ollama) | Llama 3.2 1B | ~1.3 GB | 45-50 t/s | Instant classification & routing. |
| Local (Ollama) | Llama 3.2 3B | ~2.0 GB | 25-30 t/s | Private drafting & daily assistance. |
| Local (Ollama) | Qwen 2.5 14B | ~9.0 GB | 8-12 t/s | Deep logic & local coding help. |
| Cloud (OpenRouter) | GPT-4o | 0 GB | 60-80 t/s | High-speed, high-intelligence tasks. |
| Cloud (OpenRouter) | Claude 3.5 Sonnet | 0 GB | 50-70 t/s | Advanced coding & complex reasoning. |
| Cloud (OpenRouter) | Gemini 2.5 Flash | 0 GB | 200+ t/s | Massive context & ultra-fast bursts. |
Key Takeaway:
- Local Storage: OpenRouter models require 0 GB of local storage. This is perfect for when you want to use a massive 400B parameter model that would never fit on a standard SSD.
- Throughput (TPS): While the Ryzen AI 9 is impressively fast, cloud providers run massive GPU clusters. Using Gemini 2.5 Flash via OpenRouter can give you speeds that feel like the text is appearing all at once, which is a game-changer for long-form content generation in n8n (a small routing sketch follows below).
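To make the hybrid idea concrete, here is a toy routing sketch: short or sensitive prompts stay on the local Ollama instance, everything else gets handed to OpenRouter. The threshold and model names are illustrative placeholders, not recommendations:

```python
# Toy router for the hybrid approach: keep private/small jobs local,
# send heavy jobs to OpenRouter. Threshold and models are illustrative only.
import json
import urllib.request

OLLAMA_URL = "http://192.168.12.88:11434/api/chat"  # local server from the compose file

def ask_ollama(model: str, prompt: str) -> str:
    """Call the local Ollama /api/chat endpoint (non-streaming)."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": False,
        }).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)["message"]["content"]

def route(prompt: str, sensitive: bool = False) -> str:
    # Private or short prompts stay on the Ryzen box; heavy ones go to the cloud.
    if sensitive or len(prompt) < 2000:
        return ask_ollama("llama3.2:3b", prompt)
    return "send to OpenRouter (see the snippet in section 3)"

if __name__ == "__main__":
    print(route("Draft a two-line thank-you note.", sensitive=True))
```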
5. Security & Remote Access: The Tailscale Way
Instead of opening risky ports on your router, we used Tailscale.
- Install Tailscale on the Ryzen Server and your Laptop.
- Log in with the same account.
- Install the Tailscale app on your phone as well if you want mobile access.
- Your server gets a private tailnet IP (100.x.x.x). You can now access n8n or Open WebUI from a coffee shop exactly as if you were at home (a quick test is sketched below).
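Once both machines are on the same tailnet, remote use really is just the same URLs with the 100.x.x.x address swapped in. As a quick test, here is a sketch that triggers an n8n workflow over Tailscale; the workflow path daily-brief and the tailnet IP are hypothetical placeholders, and it assumes you have already created a workflow with a Webhook trigger node:

```python
# Trigger an n8n workflow over Tailscale. Assumes a workflow with a
# Webhook trigger node at the (hypothetical) path "daily-brief".
import urllib.request

TAILSCALE_IP = "100.64.0.1"  # placeholder; use your server's 100.x.x.x address
url = f"http://{TAILSCALE_IP}:5678/webhook/daily-brief"

with urllib.request.urlopen(url, timeout=30) as resp:
    print("n8n responded:", resp.read().decode())
```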
Summary
You’ve just built an enterprise-grade AI infrastructure for the price of a single high-end PC. By leveraging n8n as the orchestrator, you can build automations that are private, cost $0 in monthly fees, and scale from tiny local models to frontier cloud models on demand.
The “Zero Storage” Strategy
You can start with 0 GB of local models and just use OpenRouter for everything while you learn the ropes. Then, as you get comfortable, you can pull local models like Llama 3.2, Qwen, or whatever is available, to save money on repetitive tasks.
This makes the “barrier to entry” for a home lab practically zero!
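When you are ready to go local, pulling a model is one command (docker exec ollama ollama pull llama3.2) or one API call. Here is a sketch against Ollama's /api/pull endpoint; the model tag is just an example from the table above:

```python
# Pull a local model through Ollama's REST API. /api/pull streams progress
# as newline-delimited JSON with "status" fields.
import json
import urllib.request

SERVER = "192.168.12.88"  # your Ryzen server's IP
req = urllib.request.Request(
    f"http://{SERVER}:11434/api/pull",
    data=json.dumps({"name": "llama3.2:1b"}).encode(),  # example model tag
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    for line in resp:
        print(json.loads(line).get("status", ""))
```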
