Local AI Sovereignty: Deploying Ollama, Gemma 4, OpenWebUI, and n8n
To learn more about Local AI topics, check out related posts in the Local AI Series
With 64GB of RAM and the latest Ryzen AI silicon, you are no longer a mere consumer of AI: you are a host. This setup leverages AMD’s XDNA architecture to run Gemma 4 and/or Qwen 3.5 locally, ensuring your data never leaves your machine while providing a professional-grade automation suite via Docker.
This is an update to my previous article on setting up local AI. If you have more than 64GB of RAM, you can read my other blog post here: Local AI Series
Step 1: Install Ollama Desktop (The Engine)
Ollama acts as the bridge between your Ryzen AI hardware and the Large Language Models.
- Download: Visit ollama.com and download the Windows installer.
- Install: Run the `.exe`. It will automatically configure the background service.
- Optimization: Ensure your AMD NPU drivers are updated to the latest version (April 2026). This allows Ollama to offload computation to the Ryzen NPU, keeping your CPU cool and your fans quiet.
- Verify: Open PowerShell and type `ollama --version` to confirm it’s active.
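Beyond `ollama --version`, you can confirm the API itself is serving by querying Ollama's `/api/tags` endpoint, which returns the locally installed models as JSON. A minimal sketch using only the Python standard library (the endpoint and response shape come from Ollama's REST API; the parsing helper is my own convention):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default API port

def model_names(tags_payload):
    """Extract model names from a parsed /api/tags response body."""
    return [m["name"] for m in tags_payload.get("models", [])]

def list_installed_models(base_url=OLLAMA_URL):
    """Fetch and parse the list of locally installed models."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return model_names(json.load(resp))

# Usage (requires the Ollama service to be running):
#   print(list_installed_models())
```

If this call succeeds, anything else on your machine (scripts, n8n, OpenWebUI) can reach the engine the same way.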
Step 2: Download Gemma 4
Google’s Gemma 4 is optimized specifically for local execution. With 64GB of RAM, you can comfortably run the 31B parameter version for high-reasoning tasks.
In your terminal, run:
Bash
ollama run gemma4:31b
Wait for the download to complete. Once finished, you can chat directly in the terminal to test performance.
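Once the model responds in the terminal, you can also drive it programmatically through Ollama's `/api/generate` endpoint. A sketch with the standard library only, assuming the `gemma4:31b` tag pulled above (helper names are mine, not part of any SDK):

```python
import json
import urllib.request

def build_generate_request(model, prompt, base_url="http://localhost:11434"):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def generate(model, prompt):
    """Send the prompt and return the model's full response text."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.load(resp)["response"]

# Usage (requires the Ollama service and a pulled model):
#   print(generate("gemma4:31b", "Summarize why local inference protects privacy."))
```

Setting `"stream": False` returns one complete JSON object instead of a stream of partial chunks, which keeps quick scripts simple.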
Step 3: Deploy the Docker Stack (The Interface & Logic)
We will now use Docker to wrap your engine in a beautiful UI (OpenWebUI) and a powerful workflow engine (n8n). One note: the stack below runs its own containerized Ollama service, and Ollama Desktop from Step 1 already occupies port 11434 on the host. If the ollama container fails to start with a port conflict, quit the desktop app first, or remove the ollama service from the compose file and point OLLAMA_BASE_URL at http://host.docker.internal:11434 instead.
- Create a Directory: Create a folder named `AI-Stack` on your drive.
- Create Data Folder: Inside `AI-Stack`, create a folder named `data` (this is required for n8n persistence).
- Compose File: Save the following as `docker-compose.yml` inside your `AI-Stack` folder:
YAML
services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "11434:11434"
    environment:
      - OLLAMA_HOST=0.0.0.0
    networks:
      - ai-network
    restart: unless-stopped

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open_webui_data:/app/backend/data
    networks:
      - ai-network
    restart: unless-stopped

  n8n:
    image: n8nio/n8n:latest
    container_name: n8n
    ports:
      - "5678:5678"
    environment:
      - N8N_HOST=192.168.4.88
      - WEBHOOK_URL=http://192.168.4.88:5678/
      - OLLAMA_HOST=http://ollama:11434
      - N8N_SECURE_COOKIE=false
      - N8N_BLOCKS_ENABLE_ALL=true
    volumes:
      - n8n_data:/home/node/.n8n
      - ./data:/home/node/data
    networks:
      - ai-network
    restart: unless-stopped

networks:
  ai-network:

volumes:
  ollama_data:
  n8n_data:
  open_webui_data:
- Launch: In your terminal, navigate to the `AI-Stack` folder and run the following (on newer Docker installations the command is `docker compose up -d`, without the hyphen):
Bash
docker-compose up -d
Step 4: Network Access & URLs
To access your tools from other computers on your local network (Wi-Fi/Ethernet), use the following URLs. The address 192.168.4.88 is this machine's LAN IP; substitute your own, which you can find by running `ipconfig` in PowerShell:
| Service | Local Access (Same PC) | Network Access (Other PC) |
|---|---|---|
| OpenWebUI | http://localhost:3000 | http://192.168.4.88:3000 |
| n8n | http://localhost:5678 | http://192.168.4.88:5678 |
| Ollama API | http://localhost:11434 | http://192.168.4.88:11434 |
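Before debugging anything in a browser, it can save time to check that each port is actually reachable. A small TCP probe in Python's standard library, using the article's example IP (replace it with your own):

```python
import socket

SERVICES = {  # host IP from the article's example network; substitute yours
    "OpenWebUI": ("192.168.4.88", 3000),
    "n8n": ("192.168.4.88", 5678),
    "Ollama API": ("192.168.4.88", 11434),
}

def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Usage (run from another machine on the LAN):
#   for name, (host, port) in SERVICES.items():
#       print(name, "up" if port_open(host, port) else "DOWN")
```

If a port shows as down from another PC but works locally, the usual culprit is the Windows firewall blocking inbound connections on those ports.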
Step 5: Enabling Web Search
Give Gemma 4 “eyes” on the internet by configuring Web Search in OpenWebUI:
- Open `http://localhost:3000`.
- Go to Settings > Web Search.
- Toggle Web Search to On.
- Set the Search Engine to `searxng` or `google_pse` (if using an API key). If you want a zero-config option, use the Tavily or DuckDuckGo providers within the settings list.
Step 6: Recommended Next LLMs
Your 64GB RAM allows for a “Model Zoo.” Here are the next three you should pull:
- The Logic King: Qwen 3.5 (32B or 35B MoE) – Alibaba’s Qwen 3.5 is currently the gold standard for n8n automation. It follows instructions perfectly and rarely “breaks” its JSON formatting.
  - Command: `ollama run qwen3.5:32b`
  - Why: Use this as your default model inside n8n for reliable tool-calling.
- Llama 4 Scout (30B): Best-in-class general reasoning.
  - Command: `ollama pull llama4:scout`
- DeepSeek V3.2 (Reasoning): Essential for coding and mathematical logic.
  - Command: `ollama pull deepseek-v3.2:reasoning`
- Mistral-Large-2026 (123B-Quantized): With 64GB, you can run a 4-bit quantized version of this giant for near-GPT-4o performance.
  - Command: `ollama pull mistral-large:q4_k_m`
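Since n8n workflows consume a model's output as structured data, it pays to validate the JSON defensively before wiring it into downstream nodes, even with a disciplined model like Qwen. A small helper of my own (not part of n8n or Ollama) that tolerates the markdown code fences local models often wrap around their JSON:

```python
import json
import re

def extract_json(text):
    """Parse JSON from model output, tolerating ```json ... ``` fences.

    Returns the parsed object, or None if nothing parseable is found.
    """
    # Strip a markdown code fence if the model wrapped its answer in one.
    fenced = re.search(r"```(?:json)?\s*(.*?)\s*```", text, re.DOTALL)
    candidate = fenced.group(1) if fenced else text.strip()
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None
```

In an n8n Code node, a `None` result is your signal to retry the prompt or route to an error branch rather than passing garbage downstream.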
Conclusion
By self-hosting this stack, you’ve created a private, high-speed AI laboratory. Your Ryzen AI processor will handle the heavy lifting, while n8n and OpenWebUI provide the brains and the beauty. Welcome to the future of local computing.
