
📰 Dev.to · Papers Mache

26 articles · Updated every 3 hours

KV cache eviction improves long‑context performance
Dev.to · Papers Mache 📐 ML Fundamentals 📄 Paper ⚡ AI Lesson 3w ago
A learned, globally‑calibrated KV‑cache eviction policy can shave memory usage and, paradoxically,...
Self-evolving retrieval lifts benchmark scores 25%
Dev.to · Papers Mache 📄 Paper 3w ago
Agents that adapt their retrieval configurations while running deliver roughly a quarter more...
AI/ML Research Digest — May 16, 2026
Dev.to · Papers Mache 📄 Paper 3w ago
Distillation + low‑rank tricks cut compute: Combining knowledge distillation with low‑rank adapters...
Shared expert pool reduces parameters while maintaining performance
Dev.to · Papers Mache 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago
Conventional mixture‑of‑experts designs hand each transformer layer its own private expert set,...
HERMES++ answers language queries while predicting roads
Dev.to · Papers Mache 📄 Paper 1mo ago
The prevailing view has been that autonomous‑driving world models must choose between two extremes: a...
Entropy of first token predicts hallucinations
Dev.to · Papers Mache 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago
The entropy of the very first content‑bearing token already separates factual answers from...
AI/ML Research Digest — May 09, 2026
Dev.to · Papers Mache 📄 Paper 1mo ago
Diffusion as a unifying backbone for multimodal generation: Latent diffusion now drives both image...
Distillation that keeps confidence honest
Dev.to · Papers Mache 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago
On‑policy distillation has become the go‑to recipe for squeezing a large language model’s...
Diffusion models approach AR quality and improve inference speed
Dev.to · Papers Mache 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago
Diffusion language models have long promised parallel generation, yet their serving speed has lagged...
Flux Attention halves inference cost on long contexts
Dev.to · Papers Mache 📄 Paper 1mo ago
Dynamic sparse routing now delivers two‑ to three‑fold speedups on long‑context inference while...
Fast edit loops improve AI document workflow
Dev.to · Papers Mache 🛠️ AI Tools & Apps 📄 Paper ⚡ AI Lesson 1mo ago
The moment you hit “regenerate” and watch a 30‑second spinner eat your momentum, the allure of...
Hierarchical skill KB improves performance of weaker models
Dev.to · Papers Mache 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago
The dominant paradigm for teaching autonomous language‑model agents is to let each instance wander...
Adaptive reasoning reduces token usage up to 90% with minimal accuracy loss
Dev.to · Papers Mache 📄 Paper 1mo ago
Adaptive reasoning formats that let a model decide on the fly which reasoning steps are truly needed...
Tiny weight edits improve LLM safety
Dev.to · Papers Mache 📄 Paper 1mo ago
Targeted tweaks to specific attention heads can slash jailbreak success rates by several‑fold (e.g.,...
Micro LM delivers large‑model quality on device
Dev.to · Papers Mache 📄 Paper 1mo ago
Edge assistants have been forced to choose between a responsive first word and a thoughtful complete...
Stateless scheduler doubles LLM training speed
Dev.to · Papers Mache 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago
Fine‑tuning a 10 B‑parameter model on a single RTX 4090 feels like watching paint dry—most of the GPU...
VideoLLM runs live video QA at 2 FPS
Dev.to · Papers Mache 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago
Most video‑large language models still operate on pre‑recorded clips, pausing after each inference....
AI agent logs expose reproducibility gaps
Dev.to · Papers Mache 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 1mo ago
Across dozens of repeated executions, the same autonomous agent can flip from success to failure by a...
Post‑training tricks cut LLM cost without losing ability
Dev.to · Papers Mache 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago
Recent work shows that aligning synthetic data with a student’s style can recover reasoning ability...
AI/ML Research Digest — Apr 18, 2026
Dev.to · Papers Mache 📄 Paper 1mo ago
Semantic and Adaptive Evaluation of LLMs: Recent work moves past word‑overlap scores toward...