📰 Dev.to · Billy Bob Gurr

Articles from Dev.to · Billy Bob Gurr · 2 articles

When I started running models locally, I thought quantization meant squeezing more into RAM. Turns o

Dev.to · Billy Bob Gurr · 2d ago

Most people default to Q4_K_M in llama.cpp because it's the "safe" choice. But I've found the real...

Why GPU Memory Bandwidth Matters More Than VRAM for Local LLMs

Dev.to · Billy Bob Gurr · 🧠 Large Language Models · ⚡ AI Lesson · 3d ago

You've probably read that you need a GPU with tons of VRAM to run local models. That's true, but only...