AI Benchmarks Are Lying to You? I Tested 8 Models

Uploaded: 2025-12-28T16:27:50+00:00
Channel: Next Tech and AI


Synthetic benchmarks are lying to you. When the newest "State of the Art" AI scores 100% on tests but fails to plan a safe mountain climb, those numbers are worthless. I threw the leaderboards in the trash and tested 8 top AI models on REAL problems to find the actual winner.

In this video, I compare the biggest updates from OpenAI, Google, xAI, and Anthropic against open-source contenders and even a local model running offline on my PC. The results regarding ChatGPT-5.2 were shocking.

📥 Get my Test Prompts for FREE (No Paywall): https://www.patreon.com/posts/146852078/

📺 Watch next: Why I…


Chapters (7)

0:00 Why benchmarks are lying

1:18 The 8 Models & Testing Methodology

2:09 ChatGPT-5.2

6:29 Gemini 3 Pro Thinking

8:20 Grok 4.1 Beta

9:14 Claude Opus 4.5

10:17 Perplexity (Th
