Local AI Series

AI is no longer exclusive to data centers and cloud services. With the right software and a decent PC, you can run powerful AI models right from your desktop. This means you can create content, analyze data, and experiment with cutting-edge technology without an internet connection, subscription fees, or privacy concerns.

Running AI models locally, whether on a desktop, a phone, or another endpoint device, offers privacy, speed, and creative freedom. But with so many options from AMD, Intel, Apple, and NVIDIA, how do you choose the best “Local AI PC” for you? Let’s break down the leading architectures and what they bring to the table.

You don’t need a supercomputer to run AI models, but the right software makes all the difference. The articles below cover Local AI and my own learning journey.

Recent Modern EUC Related Posts:

Intel’s Panther Lake Architecture – Pay Attention

By Jorge Pereira • 2026-05-29

The next generation of AI-capable laptops is being shaped by a new set of requirements. Users increasingly expect thin-and-light systems to handle demanding workloads such as local AI assistants, content creation, software development, and advanced productivity applications without sacrificing battery life or portability. Intel’s upcoming Panther Lake platform, expected to launch as part of the…

Unlocking the NPU: FastFlowLM

By Jorge Pereira • 2026-05-24

How I Bypassed Ollama and LM Studio Limitations on My Ryzen AI NPU to Hit 50+ TPS

If you recently purchased a modern AI-PC, you bought into a promising vision: a dedicated, cutting-edge Neural Processing Unit (NPU) sitting right inside your silicon, designed to stream large language models (LLMs) smoothly without draining your battery or…

Digital Coworkers: The New AI Teammates for Work

By Jorge Pereira • 2026-05-17

The workplace is moving beyond chatbots and into a new category of software: digital coworkers. These are AI systems that do more than answer questions; they can execute background tasks, manage files, coordinate workflows, and keep working on complex jobs with less step-by-step prompting. What makes this shift important is simple: instead of asking AI…

- Tech Talk
LiteLLM – To Centrally Manage Multiple LLM Providers

By Jorge Pereira • 2026-05-16

There was a time when choosing an LLM provider was simple: you grabbed an OpenAI API key, plugged it into your environment variables, and started building. But the landscape has fundamentally shifted. Today, building production-ready AI agents or managing complex enterprise workflows requires navigating a sprawling, fragmented ecosystem. On any given day, your architecture might…

- Tech Talk
The Rise of the Enterprise Token Broker

By Jorge Pereira • 2026-05-14

As enterprises scale their AI operations from experimental “playgrounds” to full-scale agentic workflows, a new bottleneck has emerged: Token Controlling and API Key Chaos. With teams of 6–10 developers or automated agents hitting multiple providers (OpenAI, Anthropic, Gemini) and local servers simultaneously, managing individual accounts is no longer viable. Enter the AI Gateway—the centralized “Token…

Cloud AI vs Local AI – Cost Comparison

By Jorge Pereira • 2026-05-14

Back in 2024, I wrote a blog post: How Much Does It Cost to Operate AI ChatBots? As we move into the new era of the token economy, conversations about token costs and power are very much part of the story. A useful model must account for real model pricing, utilization, infrastructure, performance, and…

Windows 11 Taskbar: Now Open to AI Agents

By Jorge Pereira • 2026-05-14

Starting with the May 2026 security update, any developer can register AI agents via the Windows.UI.Shell.Tasks API to appear alongside Copilot. Users invoke them with “@” in search and monitor chain-of-thought via hover. This positions Windows as the OS for agents, not just AI features. In a move bigger than another Copilot tweak, Microsoft is turning Windows 11…

Small Local Models: Why Tiny AI Is Having a Big Moment

By Jorge Pereira • 2026-05-06

The narrative around artificial intelligence has long been dominated by “bigger equals better.” Models with trillions of parameters, trained on internet-scale datasets and powered by massive GPU clusters: these were the benchmarks of progress. But something interesting is happening at the other end of the spectrum. Small models (measured in billions, not trillions, of parameters) are proving they can…

- Tech Talk
Choosing Your Vector Search Infrastructure

By Jorge Pereira • 2026-04-24

To learn more about Local AI topics, check out related posts in the Local AI Series.

In the landscape of Generative AI, Retrieval-Augmented Generation (RAG), and semantic search systems, vector databases have shifted from niche machine learning tooling to core backend infrastructure. At the center of almost every architectural evaluation sits a classic engineering dilemma: Should…

AI: Don’t Just Chat — Practical Roadmap

By Jorge Pereira • 2026-04-22

For the last few months, I’ve been working on a new book series called AI: Don’t Just Chat. I wrote it because the pace of AI has become exhausting for a lot of people. The tools change constantly. The advice shifts. What worked a few months ago can already feel outdated. Even people who are deep…

- Tech Talk
Agent Zero FAISS Memory Error: What It Means, What to Keep, and What to Reset

By Jorge Pereira • 2026-04-21

To learn more about Local AI topics, check out related posts in the Local AI Series.

If you are using Agent Zero and suddenly see this error, you are not alone:

ValueError: Could not find document for id 6oqX1vwBxV, got ID 6oqX1vwBxV not found.

This problem can show up repeatedly, sometimes with different document…

Meet Gemma 4: Architecture, Origins, and What It Means for Open AI Models

By Jorge Pereira • 2026-04-09

Yesterday I posted Local AI Sovereignty: Deploying Ollama, Gemma 4, OpenWebUI, and n8n, in which I ran Gemma 4 locally on Ollama. Someone asked me a good question:

What and Why Gemma 4

Large language models (LLMs) have rapidly evolved from research concepts into foundational tools for modern work. Gemma 4 represents one of the…

What Kind of Computer Do I Need to Run Gemma 4 Locally?

By Jorge Pereira • 2026-04-05

Yesterday I posted Local AI Sovereignty: Deploying Ollama, Gemma 4, OpenWebUI, and n8n, in which I ran Gemma 4 locally on Ollama. Someone asked me a good question: What size computer (Windows PC) do I need for it?

What size computer do you need for Gemma 4?

Google’s April 2026 release of Gemma 4 changed…

Local AI Sovereignty: Deploying Ollama, Gemma 4, OpenWebUI, and n8n

By Jorge Pereira • 2026-04-04

With 64 GB of RAM and the latest Ryzen AI silicon, you are no longer a mere consumer of AI; you are a host. This setup leverages AMD’s XDNA architecture to run Gemma 4 and/or Qwen 3.5 locally, ensuring your data never leaves your machine while providing a professional-grade automation suite via Docker. This…
