Local AI Series

AI is no longer exclusive to data centers and cloud services. With the right software and a decent PC, you can run powerful AI models right from your desktop. This means you can create content, analyze data, and experiment with cutting-edge technology without an internet connection, subscription fees, or privacy concerns.
The future of AI isn’t confined to data centers or the cloud; it also runs on desktops, phones, and other endpoint devices. Running AI models locally offers privacy, speed, and creative freedom. But with so many options from AMD, Intel, Apple, and NVIDIA, how do you choose the best “Local AI PC” for you? Let’s break down the leading architectures and what they bring to the table.
You don’t need a supercomputer to run AI models, but the right software makes all the difference. The articles below cover Local AI and my own learning journey.
Recent Modern EUC Related Posts:

The Era of “No-Code” Productivity: How It Works
By Jorge Pereira • 2026-06-08
Imagine a highly skilled, tireless digital employee sitting next to you. You do not need to teach it Python, you do not need to show it how to navigate APIs, and you do not need to write a single line of code to get it to build a workflow. You just talk to it. This…

Beyond the Chatbot: A Look at My Current AI Lab
By Jorge Pereira • 2026-06-06
I get asked all the time: “What are you actually using for your AI Lab?” To answer that, I wanted to provide an update on how I’m building, testing, and deploying. But first, a little background. I’m a technology consultant and services principal, and I’ve been developing custom software since I was 15. I’ve been deep…

Unlocking the NPU: FastFlowLM
By Jorge Pereira • 2026-05-31
How I Bypassed Ollama and LM Studio Limitations on My Ryzen AI NPU to Hit 50+ TPS
If you recently purchased a modern AI PC, you bought into a promising vision: a dedicated, cutting-edge Neural Processing Unit (NPU) sitting right inside your silicon, designed to stream large language models (LLMs) smoothly without draining your battery or…

Intel’s Panther Lake Architecture – Pay Attention
By Jorge Pereira • 2026-05-29
The next generation of AI-capable laptops is being shaped by a new set of requirements. Users increasingly expect thin-and-light systems to handle demanding workloads such as local AI assistants, content creation, software development, and advanced productivity applications without sacrificing battery life or portability. Intel’s upcoming Panther Lake platform, expected to launch as part of the…

Prompt Engineering Era Is Officially Behind Us
By Jorge Pereira • 2026-05-24
For years, the discourse around Artificial Intelligence was dominated by the idea of the “Prompt Engineer”: the person who spent their time hunting for the exact combination of words, trigger phrases, and formatting tricks to coax a brittle model into producing a decent answer. If you…

Digital Coworkers: The New AI Teammates for Work
By Jorge Pereira • 2026-05-17
The workplace is moving beyond chatbots and into a new category of software: digital coworkers. These are AI systems that do more than answer questions; they can execute background tasks, manage files, coordinate workflows, and keep working on complex jobs with less step-by-step prompting. What makes this shift important is simple: instead of asking AI…

LiteLLM – To Centrally Manage Multiple LLM Providers
By Jorge Pereira • 2026-05-16
There was a time when choosing an LLM provider was simple: you grabbed an OpenAI API key, plugged it into your environment variables, and started building. But the landscape has fundamentally shifted. Today, building production-ready AI agents or managing complex enterprise workflows requires navigating a sprawling, fragmented ecosystem. On any given day, your architecture might…

The Rise of the Enterprise Token Broker
By Jorge Pereira • 2026-05-14
As enterprises scale their AI operations from experimental “playgrounds” to full-scale agentic workflows, a new bottleneck has emerged: token control and API key chaos. With teams of 6–10 developers or automated agents hitting multiple providers (OpenAI, Anthropic, Gemini) and local servers simultaneously, managing individual accounts is no longer viable. Enter the AI Gateway, the centralized “Token…

Cloud AI vs Local AI – Cost Comparison
By Jorge Pereira • 2026-05-14
Back in 2024 I wrote a blog post: How Much Does It Cost to Operate AI ChatBots? As we move into the new era of the token economy, conversations about token costs and power are very much part of the story. A useful model must account for real model pricing, utilization, infrastructure, performance, and…

Windows 11 Taskbar: Now Open to AI Agents
By Jorge Pereira • 2026-05-14
Starting with the May 2026 security update, any developer can register AI agents via the Windows.UI.Shell.Tasks API to appear alongside Copilot. Users invoke an agent with “@” in search and monitor its chain-of-thought by hovering. This positions Windows as the OS for agents, not just AI features. In a move bigger than another Copilot tweak, Microsoft is turning Windows 11…

Small Local Models: Why Tiny AI Is Having a Big Moment
By Jorge Pereira • 2026-05-06
The narrative around artificial intelligence has long been dominated by the idea that bigger equals better. Models with trillions of parameters, trained on internet-scale datasets, powered by massive GPU clusters: these were the benchmarks of progress. But something interesting is happening at the other end of the spectrum. Small models, measured in billions rather than trillions of parameters, are proving they can…

Choosing Your Vector Search Infrastructure
By Jorge Pereira • 2026-04-24
To learn more about Local AI topics, check out related posts in the Local AI Series.
In the landscape of Generative AI, Retrieval-Augmented Generation (RAG), and semantic search systems, vector databases have shifted from niche machine learning tooling to core backend infrastructure. At the center of almost every architectural evaluation sits a classic engineering dilemma: Should…

AI: Don’t Just Chat — Practical Roadmap
By Jorge Pereira • 2026-04-22
For the last few months, I’ve been working on a new book series called AI: Don’t Just Chat. I wrote it because the pace of AI has become exhausting for a lot of people. The tools change constantly. The advice shifts. What worked a few months ago can already feel outdated. Even people who are deep…

Agent Zero FAISS Memory Error: What It Means, What to Keep, and What to Reset
By Jorge Pereira • 2026-04-21
To learn more about Local AI topics, check out related posts in the Local AI Series.
If you are using Agent Zero and suddenly see this error, you are not alone:

```text
ValueError: Could not find document for id 6oqX1vwBxV, got ID 6oqX1vwBxV not found.
```

This problem can show up repeatedly, sometimes with different document…
…
