
Building Your Local AI Lab: Single Docker Image


To learn more about Local AI topics, check out the related posts in the Local AI Series.

Ollama, Open WebUI and n8n on Windows

Hosting your own AI models is no longer just for Linux gurus. With Docker Desktop on Windows, you can run a professional-grade automation and chat suite—n8n, Ollama, and Open WebUI—all on your own hardware.

In this guide, we will set up a “Generic” configuration that makes it easy to handle changing IP addresses and specific hardware tweaks for NVIDIA or AMD GPUs.

Prerequisites

  1. Docker Desktop: Installed and running with the WSL2 backend.
  2. GPU Drivers: Ensure your NVIDIA (GeForce Game Ready) or AMD (Adrenalin) drivers are up to date.
  3. Local IP: Find your computer’s IP (open PowerShell and type ipconfig). We’ll assume it’s 192.168.x.x.

Step 1: The Secret Sauce – The .env File

To prevent “Hardcoding Horror” (where you have to change your IP address in ten different places), we use a .env file. Create a folder named ai-lab and create a file inside it named .env:

Plaintext

# .env file - Update LOCAL_IP if your router changes your address
LOCAL_IP=192.168.4.24
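Under the hood, Compose reads the .env file and substitutes ${LOCAL_IP} wherever it appears in docker-compose.yml. A minimal Python sketch of that interpolation logic (simplified — real Compose also supports defaults like ${VAR:-fallback}):

```python
import re

def load_env(text):
    """Parse simple KEY=VALUE lines, skipping comments and blanks."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

def interpolate(template, env):
    """Replace ${VAR} placeholders with values from the env mapping."""
    return re.sub(r"\$\{(\w+)\}", lambda m: env.get(m.group(1), ""), template)

env = load_env("# .env file\nLOCAL_IP=192.168.4.24\n")
print(interpolate("http://${LOCAL_IP}:5678/", env))  # http://192.168.4.24:5678/
```

Change LOCAL_IP once in .env and every ${LOCAL_IP} reference in the compose file picks it up on the next `docker compose up -d`.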

Step 2: The Universal docker-compose.yml

This file is the “blueprint” for your lab. I have included commented-out sections for both NVIDIA and AMD users.

Note: By default, this script is set for AMD (ROCm). If you have NVIDIA, simply follow the comments to swap the image and resource tags.

YAML

services:
  # --- OLLAMA (The AI Engine) ---
  ollama:
    # FOR AMD: Use ollama/ollama:rocm
    # FOR NVIDIA: Use ollama/ollama:latest
    image: ollama/ollama:rocm 
    container_name: ollama
    
    # AMD SETTINGS: The 'devices' section below is active by default (comment it out for NVIDIA)
    devices:
      - "/dev/kfd:/dev/kfd"
      - "/dev/dri:/dev/dri"

    # NVIDIA SETTINGS: Un-comment the 'deploy' section below and comment out 'devices'
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: 1
    #           capabilities: [gpu]

    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "11434:11434"
    environment:
      - OLLAMA_HOST=0.0.0.0
    networks:
      - ai-network
    restart: unless-stopped

  # --- OPEN WEBUI (The Interface) ---
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open_webui_data:/app/backend/data
    networks:
      - ai-network
    restart: unless-stopped

  # --- n8n (The Automation Hub) ---
  n8n:
    image: n8nio/n8n:latest
    container_name: n8n
    ports:
      - "5678:5678"
    environment:
      - N8N_HOST=${LOCAL_IP}
      - WEBHOOK_URL=http://${LOCAL_IP}:5678/
      - N8N_SECURE_COOKIE=false
    volumes:
      - n8n_data:/home/node/.n8n
      - ./data:/home/node/data
    networks:
      - ai-network
    restart: unless-stopped

networks:
  ai-network:

volumes:
  ollama_data:
  n8n_data:
  open_webui_data:

Step 3: Launching the Lab

  1. Open PowerShell in your ai-lab folder.
  2. Run the command: docker compose up -d
  3. Wait a minute for the images to download. Docker will automatically read your .env file and “inject” your IP address into the n8n settings.
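To confirm all three containers came up, `docker compose ps --format json` on recent Compose v2 releases prints one JSON object per service (the exact output shape varies by Compose version — treat the field names here as an assumption to verify against your install). A small Python sketch that flags anything not running:

```python
import json

def unhealthy(ps_json_lines):
    """Return services whose State is not 'running', given JSON-lines
    output in the style of `docker compose ps --format json`."""
    bad = []
    for line in ps_json_lines.strip().splitlines():
        info = json.loads(line)
        if info.get("State") != "running":
            bad.append(info.get("Service", info.get("Name", "?")))
    return bad

# Sample output shape (assumed): one object per service, per line
sample = '{"Service":"ollama","State":"running"}\n{"Service":"n8n","State":"restarting"}'
print(unhealthy(sample))  # ['n8n']
```

An empty list means the whole stack is up; anything listed is worth a `docker logs <name>`.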

Step 4: Windows Firewall (Crucial!)

Windows is protective. If you want to access your AI from a tablet or another laptop on your WiFi, you must allow these ports through the Firewall:

  1. Search for “Windows Defender Firewall with Advanced Security.”
  2. Click Inbound Rules > New Rule.
  3. Select Port > TCP > Specific local ports: 3000, 5678, 11434.
  4. Select Allow the connection and apply it to “Private” networks.

How to Access Your New Lab

  • Chat with AI: http://[YOUR-IP]:3000 (Open WebUI)
  • Create Automations: http://[YOUR-IP]:5678 (n8n)
  • API Access: http://[YOUR-IP]:11434 (Ollama)
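A quick way to test all three endpoints from any machine on your network is a short Python script. The 192.168.4.24 address below is the example IP from the .env file — substitute your own:

```python
import urllib.error
import urllib.request

# Replace with the LOCAL_IP from your .env file
SERVICES = {
    "Open WebUI": "http://192.168.4.24:3000",
    "n8n": "http://192.168.4.24:5678",
    "Ollama": "http://192.168.4.24:11434",
}

def check(url, timeout=3):
    """True if something is listening at the URL (any HTTP answer counts)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server answered, even if with an error page
    except OSError:
        return False  # refused, unreachable, or timed out

if __name__ == "__main__":
    for name, url in SERVICES.items():
        print(f"{name}: {'up' if check(url) else 'DOWN'}")
```

If a service shows DOWN from another device but works on the host, the Windows Firewall rules from Step 4 are the usual culprit.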

Pro-Tip: Hardware Verification

To ensure your AMD or NVIDIA card is actually doing the work:

  • Open Task Manager > Performance > GPU.
  • In Open WebUI, ask the AI to “Write a 500-word essay on the future of robotics.”
  • If you see the GPU Memory or Dedicated GPU Memory usage climb, your hardware acceleration is working!
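Ollama can also report this itself: recent builds expose a /api/ps endpoint whose entries include size and size_vram fields (field names assumed from current Ollama releases — verify against your version with `curl http://localhost:11434/api/ps`). A small Python sketch that interprets such a response:

```python
def gpu_fraction(model):
    """Fraction of a loaded model resident in VRAM, per Ollama's /api/ps
    entry shape (assumed fields: 'size' and 'size_vram')."""
    size = model.get("size", 0)
    return model.get("size_vram", 0) / size if size else 0.0

def report(ps_response):
    """One human-readable line per loaded model."""
    return [
        f"{m['name']}: {gpu_fraction(m) * 100:.0f}% in VRAM"
        for m in ps_response.get("models", [])
    ]

# Live usage (assumes Ollama is reachable on localhost):
#   import json, urllib.request
#   with urllib.request.urlopen("http://localhost:11434/api/ps") as r:
#       print("\n".join(report(json.load(r))))

sample = {"models": [{"name": "gemma4:4b", "size": 4_000_000_000, "size_vram": 4_000_000_000}]}
print(report(sample))  # ['gemma4:4b: 100% in VRAM']
```

100% in VRAM means full hardware acceleration; a partial figure means the model spilled into system RAM and will run noticeably slower.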

Next: Download the Models into Ollama!

The Command to Pull Gemma 4

Open your Windows terminal and run this command:

Bash

docker exec -it ollama ollama pull gemma4:4b

Happy Self-Hosting! By running this locally, your data never leaves your network, and you have no monthly subscription fees.

If you run into problems:

Cleanup and Restart

To prevent port conflicts and get a clean start, run these in your terminal:

  1. Remove the failed containers:

PowerShell

docker compose down
docker rm -f ollama open-webui n8n

  2. Start the new setup:

PowerShell

docker compose up -d
If you prefer the Hybrid setup (Ollama running natively on Windows, everything else in Docker), use this docker-compose.yml instead — note that Open WebUI now reaches Ollama through your machine’s IP rather than a container name:

YAML

services:
  # --- OPEN WEBUI ---
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    ports:
      - "3000:8080"
    environment:
      # Use the IP from your .env file
      - OLLAMA_BASE_URL=http://${LOCAL_IP}:11434
    volumes:
      - open_webui_data:/app/backend/data
    networks:
      - ai-network
    restart: unless-stopped

  # --- n8n ---
  n8n:
    image: n8nio/n8n:latest
    container_name: n8n
    ports:
      - "5678:5678"
    environment:
      - N8N_HOST=${LOCAL_IP}
      - WEBHOOK_URL=http://${LOCAL_IP}:5678/
      - N8N_SECURE_COOKIE=false
    volumes:
      - n8n_data:/home/node/.n8n
      - ./data:/home/node/data
    networks:
      - ai-network
    restart: unless-stopped

networks:
  ai-network:

volumes:
  n8n_data:
  open_webui_data:

Update:

As of April 2026, Docker Desktop on Windows does not support passing AMD GPUs to containers. While NVIDIA users can use “GPU-PV,” AMD support is currently limited to the Linux version of Docker or the native Windows app.

The Trade-off

If you insist on 100% Docker on Windows:

  • The Good: Your setup is perfectly clean and contained.
  • The Bad: Ollama will run on your CPU, not your Radeon card. This means Gemma 4 will be about 10-20x slower (it will feel like it’s “stuttering” while typing).

If you are okay with that, here is your 100% Docker “CPU-Only” YAML file.

1. The 100% Docker docker-compose.yml

YAML

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "11434:11434"
    environment:
      - OLLAMA_HOST=0.0.0.0
    networks:
      - ai-network
    restart: unless-stopped

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open_webui_data:/app/backend/data
    networks:
      - ai-network
    restart: unless-stopped

  n8n:
    image: n8nio/n8n:latest
    container_name: n8n
    ports:
      - "5678:5678"
    environment:
      - N8N_HOST=${LOCAL_IP}
      - WEBHOOK_URL=http://${LOCAL_IP}:5678/
      - N8N_SECURE_COOKIE=false
    volumes:
      - n8n_data:/home/node/.n8n
      - ./data:/home/node/data
    networks:
      - ai-network
    restart: unless-stopped

networks:
  ai-network:

volumes:
  ollama_data:
  n8n_data:
  open_webui_data:

2. How to Pull Gemma 4 inside the Container

Once you run docker compose up -d, you have to tell the Docker container to download the model:

PowerShell

docker exec -it ollama ollama pull gemma4:e2b

(Note: I chose e2b for the CPU setup. Since CPUs are slower than GPUs, the 2.5B parameter “Edge” version of Gemma 4 will give you a much more usable speed than the larger versions.)


3. Summary of your “All-In-Docker” Lab

Feature             All-In-Docker (Current)    Hybrid (Native Ollama)
Location            100% inside Docker         Mixed (App + Docker)
GPU Usage           CPU Only (Slow)            AMD GPU (Fast)
Setup Difficulty    Easy                       Medium
Maintenance         Low                        Low

Choosing the Right Version of Gemma 4 for your AMD GPU

Since Gemma 4 was released recently (April 2026), it comes in several “flavors.” Depending on how much VRAM your Radeon card has, you might want to pull a different version:

Command                                          Best For…                                               VRAM Needed
docker exec -it ollama ollama pull gemma4:2b     Ultra Fast. Great for basic n8n automations.            ~2GB
docker exec -it ollama ollama pull gemma4:4b     The Standard. Best balance of speed and logic.          ~4-6GB
docker exec -it ollama ollama pull gemma4:26b    High Intelligence. Use this for complex coding/logic.   ~16GB+
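The VRAM figures above follow a common rule of thumb: a 4-bit-quantized model needs roughly half a byte per parameter for the weights, plus some headroom for the KV cache and runtime. A back-of-the-envelope sketch (the 20% overhead factor is an assumption, not a measured value — real usage also grows with context length):

```python
def est_memory_gb(params_billion, bits_per_weight=4, overhead=1.2):
    """Rough memory footprint of a quantized model: weight bytes x overhead."""
    bytes_weights = params_billion * 1e9 * bits_per_weight / 8
    return bytes_weights * overhead / 1e9

for b in (2.5, 4, 26):
    print(f"{b}B params @ 4-bit ≈ {est_memory_gb(b):.1f} GB")
```

The estimates land just under the table's figures, which leave extra margin for longer contexts — if a model size is within a gigabyte or two of your card's VRAM, expect spillover into system RAM.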

4. Verify in Open WebUI

Open http://[YOUR-IP]:3000 and refresh the page; the model you just pulled should appear in the model selector at the top of the chat. Select it and send a quick test prompt to confirm the download worked.