📰 Dev.to · Vilius
38 articles · Updated every 3 hours

Dev.to · Vilius
🧠 Large Language Models
⚡ AI Lesson
2w ago
We Asked 10 LLMs to Write Efficient Code. Only 4 Got Better.
By Vilius Vystartas | May 2026. Every LLM can write code that works. The question is: can they write...

Dev.to · Vilius
💻 AI-Assisted Coding
⚡ AI Lesson
2w ago
10 Models Tested: From 81.6% to 10%. The Free Tier is a Full-On Gamble.
By Vilius Vystartas | May 2026. I tested another 10 models across the same 10 agent coding tasks....

Dev.to · Vilius
2w ago
I Tested 10 More Models. Five Brand New Families Debuted. None Scored Below 75%.
By Vilius Vystartas | May 2026. I ran another 10 models through the same agent coding benchmark. Five...

Dev.to · Vilius
2w ago
Two Models Just Hit 90% on Agent Coding. One Cost Less Than a Penny.
By Vilius Vystartas | May 2026. Ten more models through the same 10 agent coding tasks. Two tied the...

Dev.to · Vilius
2w ago
The Hype Correction
Weekly roundup, May 23, 2026. Google and Microsoft just told us the same thing from opposite...

Dev.to · Vilius
🤖 AI Agents & Automation
⚡ AI Lesson
3w ago
$0.08 and 3,500 Lines: The Complete Failure of a Deterministic Agent Harness
I have a theory about why agent suggestions land so heavy. It's not that the suggestions are good....

Dev.to · Vilius
3w ago
The Protocol Stack Nobody Talks About
Six agent protocols launched in the last year. Everyone's obsessing over model selection. The...

Dev.to · Vilius
3w ago
Build It, Then Kill It
The hardest thing after building agent infrastructure for a few months isn't building more. It's...

Dev.to · Vilius
🤖 AI Agents & Automation
⚡ AI Lesson
3w ago
Power Sockets Don't Need Certification — and Neither Should Agent Infrastructure
I'm tired of talking about plumbing. Every conversation about AI agents right now is about...

Dev.to · Vilius
3w ago
I Tested 6 Local Models on Real Agent Tasks. The Best Scored 50%.
I had a SmolLM3-3B running on my laptop. It scored 93.3% on my code quality benchmark. I thought I...

Dev.to · Vilius
4w ago
My Agent Kept Forgetting Everything. My Hand Was Forced.
Agent Autopsy, Day 8. My agent ran benchmarks yesterday evening and then lost the plot. Tried to...

Dev.to · Vilius
1mo ago
Benchmark Results: SmolLM3 3B, Phi-4-mini, DeepSeek V4, Grok 4.20 — Agent Coding Tested
The second round of the Works With Agents agent coding benchmark is in — 32 models tested this time,...

Dev.to · Vilius
1mo ago
We Tested 10 Untested LLMs on Agent Coding — The Results Are In
Yesterday I promised to...

Dev.to · Vilius
1mo ago
The $0 Agent: My 2GB Local Model Beat Claude
The $0 Agent: My 2GB Local Model Beat Claude Agent learns fast — Day 11 I ran an agent...

Dev.to · Vilius
🧠 Large Language Models
⚡ AI Lesson
1mo ago
Benchmarking 10 Untested LLMs Tonight — DeepSeek V4, Grok 4.20, GPT-5.5 Pro
Tonight at 23:00 BST we're running fresh benchmarks on 10 LLMs we haven't tested before. The...

Dev.to · Vilius
1mo ago
My Agent Said It Would Fix the Width. It Rebuilt the Whole Site Instead.
I asked my agent to fix the width on one page. It replied with a confident plan — headers,...

Dev.to · Vilius
1mo ago
We Built an API. Nobody Used It.
A post by Vilius

Dev.to · Vilius
1mo ago
I Broke My Website. Then I Fixed It. Then My Fix Broke It Again.
Agent Autopsy, Day 4. I broke my website today. Not dramatically, just a small fix. A newsletter...

Dev.to · Vilius
1mo ago
How we almost wrote off 3 models as broken — the thinking-mode tax
By Vilius Vystartas |...

Dev.to · Vilius
1mo ago
1-bit, 545 megabytes, zero API keys — local AI that beats GPT-5.4
By Vilius Vystartas |...

Dev.to · Vilius
1mo ago
We benchmarked 10 LLMs on 10 real agent coding tasks — here are the results
By Vilius...

Dev.to · Vilius
🧠 Large Language Models
⚡ AI Lesson
1mo ago
I Ran 5 LLMs Through 10 Real Agent Coding Tasks. The Free One Won.
What I Tested: I gave 5 models the same 10 coding tasks: not LeetCode, not trivia. Tasks...

Dev.to · Vilius
1mo ago
AI Agents Are Finding Bugs in Your Tools. Here's How to Get Notified First.
The Shift Nobody's Talking About: Developers are deploying autonomous AI agents that scan...

Dev.to · Vilius
1mo ago
How to Give Your AI Agent a Shared Memory — in 3 Lines
The Problem My agent spent 45 minutes debugging a Python install flag. It found the fix —...