I Increased Retrieval From Top-5 to Top-20. My Answers Got Worse
📰 Dev.to · Md Ayan Arshad
Retrieving more candidates in a RAG pipeline can produce worse answers if the extra candidates are not filtered before they reach the generator
Action Steps
- Retrieve the top-20 candidates using a vector search library or database such as Faiss or Pinecone
- Filter the retrieved candidates using a ranking model or a simple lexical scoring function like BM25
- Evaluate the impact of increased retrieval on answer quality using metrics like accuracy or F1-score
- Adjust the filtering threshold or ranking model to optimize answer quality
- Test the updated RAG system with a new set of questions to validate the improvements
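
The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the article's implementation: it assumes the top-20 candidates have already been retrieved as plain strings, it uses a small pure-Python BM25 scorer as the filter, and the names `bm25_scores`, `filter_candidates`, `token_f1`, `keep`, and `min_score` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplification for this sketch)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Fall back to 1.0 so empty documents do not cause a division by zero.
    avg_len = sum(len(t) for t in tokenized) / n_docs or 1.0
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    query_terms = set(tokenize(query))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def filter_candidates(query, candidates, keep=5, min_score=0.0):
    """Rerank retrieved candidates and keep at most `keep` that score above `min_score`."""
    scores = bm25_scores(query, candidates)
    ranked = sorted(zip(scores, candidates), key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:keep] if score > min_score]


def token_f1(prediction, reference):
    """Token-overlap F1 between a generated answer and a reference answer."""
    pred, ref = tokenize(prediction), tokenize(reference)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, `keep` and `min_score` are the knobs to adjust: pass the 20 retrieved candidates through `filter_candidates`, generate an answer from what survives, and compare `token_f1` averaged over a held-out question set before and after each change.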
Who Needs to Know This
This lesson is useful for NLP engineers and data scientists building question-answering systems, because it shows why filtering matters in RAG retrieval
Key Insight
💡 More retrieval candidates do not always lead to better answers; proper filtering is crucial
DeepCamp AI