Run OpenAI Codex Locally for FREE with Ollama

Mervin Praison · Intermediate · 💻 AI-Assisted Coding · 4h ago
Run OpenAI Codex completely free and fully local with Ollama. Your code and data never leave your machine, and there's no Codex subscription to pay for. In this video I walk through setting up Ollama + Codex CLI and Codex Desktop step by step, including the context window settings that actually make it usable.

https://docs.ollama.com/integrations/codex
https://docs.ollama.com/integrations/codex-app

⏱ Timestamps
0:00 Run Codex free and local with Ollama
0:28 One-line setup overview
0:44 Why Ollama + which model to use (Gemma 3n E2B)
1:12 Install Ollama
1:28 Pull the Gemma 3n E2B model
1:44 Quick test in the terminal
2:06 Install Codex CLI
2:18 Launch Codex with Ollama as the backend
2:43 First task — reading a folder
2:50 Trying a refactor (and where small models hit a wall)
3:06 Switching to full Gemma 3
3:43 Refactor retry on the larger model
4:06 Use Ollama inside Codex Desktop app
4:26 Important: set the context window to 64,000 tokens
4:51 Final notes and trade-offs

🛠 Commands used
Install Ollama: curl -fsSL https://ollama.com/install.sh | sh
Pull the model: ollama run gemma3n:e2b
Install Codex CLI: npm install -g @openai/codex
Launch Codex with Ollama: ollama launch codex
Launch Codex Desktop with Ollama: ollama launch codex-app
Inside Codex, switch model: /model gemma3:latest

⚙️ Honest trade-offs
- Smaller models like Gemma 3n E2B are great for Q&A about your codebase, but they struggle with real refactors — they'll often hand you the code instead of editing the file.
- The larger Gemma 3 handles edits more reliably but needs more RAM.
- Increase the context window to 64,000 tokens in Ollama's settings — Codex needs it, and the default is too small.
- A Mac Studio with 32GB handles this comfortably. Smaller machines will want the smaller model.

#OpenAICodex #Ollama #LocalLLM #Gemma3 #AICoding #OpenSource #DeveloperTools
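If `ollama launch codex` isn't available in your Ollama version, Codex CLI can also be pointed at a local Ollama server by hand through its config file. A minimal sketch, assuming Codex CLI's `~/.codex/config.toml` provider format and Ollama's default OpenAI-compatible endpoint on port 11434 — check the linked Ollama docs for the exact keys your versions expect:

```toml
# ~/.codex/config.toml — point Codex CLI at a local Ollama server
model = "gemma3:latest"
model_provider = "ollama"

[model_providers.ollama]
name = "Ollama"
# Ollama exposes an OpenAI-compatible API at /v1 on its default port
base_url = "http://localhost:11434/v1"
```

With this in place, running plain `codex` should use the local model instead of OpenAI's hosted ones.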
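The 64,000-token context window doesn't have to be set per-app: a sketch of doing it globally, assuming your Ollama release supports the `OLLAMA_CONTEXT_LENGTH` environment variable (recent versions do — verify against the Ollama docs):

```shell
# Give every model Ollama serves a 64k context window.
# Restart the server afterwards (e.g. re-run `ollama serve`,
# or restart the desktop app) for the setting to take effect.
export OLLAMA_CONTEXT_LENGTH=64000
```

Without a large enough context, Codex's prompts get truncated and the model loses track of the files it is editing.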

