Use Your Local GPU for Llama 3 Inference from Anywhere! | Ollama + Zrok Setup
In this video, I’ll show you how to run Llama 3 (or any Ollama model) on your local GPU and make it accessible from anywhere using Zrok. This is perfect if you want to leverage your powerful home GPU for LLM inference without paying for expensive cloud GPUs!
Why do this?
By combining Ollama with Zrok, you can expose your local LLM server (running on your own hardware) to the public internet, so remote servers or clients can send requests directly to your local GPU.
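Once the share is live, any remote machine can hit your GPU through Ollama's standard HTTP API. Here's a rough example of what that looks like; the share URL below is just a placeholder (zrok assigns you your own), and it assumes you've already pulled the llama3 model:

# Hypothetical share URL; replace with the one zrok prints for you
curl https://example123.share.zrok.io/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'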
What do the commands do?
export OLLAMA_HOST="http:…
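The full command is cut off above, but the idea is: point OLLAMA_HOST at all interfaces so Ollama accepts connections from outside localhost, then hand that port to zrok. Roughly, the sequence looks like this (assuming Ollama's default port 11434 and an account token from your own zrok signup):

# Make Ollama listen on all interfaces, not just 127.0.0.1
export OLLAMA_HOST="http://0.0.0.0:11434"
ollama serve

# In a second terminal: link this machine to your zrok account (one-time)
zrok enable <your-account-token>

# Publicly share the local Ollama port; zrok prints the public URL to use
zrok share public http://localhost:11434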