📰 Dev.to · Raihan

I built the first open benchmark for federal contracting AI. Here's what it shows about frontier LLMs.

Dev.to · Raihan 1mo ago

Frontier LLMs hallucinate FAR clause numbers somewhere between 0% and 32% of the time. A specialized 150M-parameter model trained in 4 minutes matches Claude Ha

Where small models beat frontier LLMs (and where they don't): a 125M PHI detector

Dev.to · Raihan 1mo ago

Last month I published a 184M-parameter intent classifier that matches frontier LLMs at 22× lower...

Matching frontier LLMs at 22× lower latency: a 184M-parameter intent classifier for healthcare text

Dev.to · Raihan 🧠 Large Language Models ⚡ AI Lesson 1mo ago

How a 184M-parameter DeBERTa fine-tune matches Claude Haiku 4.5 and GPT-4o within 4 points of accuracy on healthcare intent classification at 22× lower latency