Why Bigger Context Windows Make AI Worse

What's AI by Louis-François Bouchard · Beginner · 🧠 Large Language Models · 6h ago

Skills: LLM Engineering (80%)

► Try out Search Atlas with a 7-day free trial here: https://searchatlas.com/?utm_source=louis_bouchard&utm_medium=influencer_youtube&utm_campaign=q1_inf_cam&utm_content=primary_link

► Our recent webinar on AI engineering: https://youtu.be/ljOwBCdiHmg

► Learn more in our courses and social media: https://links.louisbouchard.ai/

► My Newsletter (My AI updates and news clearly explained): https://louisbouchard.substack.com/

Chapters:

0:00 Hey! Tap the Thumbs Up button and Subscribe. You'll learn a lot of cool stuff, I promise.

3:22 Why "More Tokens" Means Worse Results

4:52 "Lost in the Middle" Explained

5:36 The Cost & Complexity of Attention (N²)

7:34 1. Deterministic Trimming (Sliding Window)

8:18 2. Source-Level Filtering (Highest Impact)

9:21 3. Mechanical Compaction

10:06 4. Terminal Sequence Collapse

10:50 5. Semantic Summarization (Map Reduce vs. Stuffing)

12:10 6. Retrieval-Based Compaction & Contextual RAG

13:39 7. Knowledge Graphs & Graph RAG

14:40 8. Learned Prompt Compression (LLMLingua)

15:47 9. Multi-Tier Memory (MemGPT)

16:43 10. Agentic Context Engineering (ACE)

17:40 Bonus: Output Optimization Tricks

19:48 Best Practices: When (and When Not) to Compact

21:35 Multi-Agent & Model Routing Strategies

23:44 Actionable: Order of Operations for AI Engineers

#aiengineering #contextengineering #compaction
