📰 Dev.to · Alankrit Verma

7 articles

The Last Pivot: Why Quality Gates Killed My Final KV-Cache Speedup

Dev.to · Alankrit Verma 1mo ago

I wanted to answer one question: After packed-codebook TurboQuant failed, was there still a...

Beating Eager TurboQuant Was Not Enough: Why Dense GPU Attention Still Won

Dev.to · Alankrit Verma 1mo ago

I wanted to answer one question: If I remove eager overhead, can a TurboQuant-style compressed...

When A Good Approximation Still Loses

Dev.to · Alankrit Verma 1mo ago

This is Part 2 of a two-part technical write-up. Part 1 ended with the key architecture lesson: A...

A Smaller KV Cache Did Not Make Transformers Faster

Dev.to · Alankrit Verma 1mo ago

Long-context generation makes the KV cache hard to ignore. Every generated token reuses keys and...

Synthetic Population Testing for Recommendation Systems

Dev.to · Alankrit Verma 2mo ago

Offline evaluation is necessary for recommender systems. It is also not a full test of recommender...

Why Offline Evaluation Is Not Enough for Recommendation Systems?

Dev.to · Alankrit Verma 2mo ago

Offline evaluation is...

How GenAI Genesis Began

Dev.to · Alankrit Verma 3mo ago

TL;DR Alankrit Verma came to the University of Toronto as a shy, math-driven student on scholarship...