📰 Dev.to · Shuntaro Okuma

Articles from Dev.to · Shuntaro Okuma · 3 articles · Updated every 3 hours

I Tested 12 LLMs With Few-Shot Examples. The Results Were Not What I Expected.

Dev.to · Shuntaro Okuma 2w ago

In a previous article, I tested 8 models across 4 tasks and reported on "few-shot collapse" — cases...

How I Measure My Dify Chatbot Quality with Scenario Testing

Dev.to · Shuntaro Okuma 2w ago

What I did: I designed multi-turn conversation scenarios for a Dify chatbot, ran them...

When More Examples Make Your LLM Worse: Discovering Few-Shot Collapse

Dev.to · Shuntaro Okuma 1mo ago

I found that adding few-shot examples can actively degrade LLM performance. Here are three patterns I discovered and an open-source tool to detect them.