AI Benchmarks Are Lying to You? I Tested 8 Models
Synthetic benchmarks are lying to you. When the newest "State of the Art" AI scores 100% on tests but fails to plan a safe mountain climb, those numbers are worthless. I threw the leaderboards in the trash and tested 8 top AI models on REAL problems to find the actual winner.
In this video, I compare the biggest updates from OpenAI, Google, xAI, and Anthropic against open-source contenders and even a local model running offline on my PC. The results regarding ChatGPT-5.2 were shocking.
📥 Get my Test Prompts for FREE (No Paywall): https://www.patreon.com/posts/146852078/
📺 Watch next: Why I…
Watch on YouTube ↗
(saves to browser)
Chapters (7)
Why benchmarks are lying
1:18
The 8 Models & Testing Methodology
2:09
ChatGPT-5.2
6:29
Gemini 3 Pro Thinking
8:20
Grok 4.1 Beta
9:14
Claude Opus 4.5
10:17
Perplexity (Th
DeepCamp AI