Faithfulness-QA: A Counterfactual Entity Substitution Dataset for Training Context-Faithful RAG Models
📰 ArXiv cs.AI
arXiv:2604.25313v2 Announce Type: cross Abstract: Retrieval-Augmented Generation (RAG) models frequently produce answers grounded in parametric memory rather than the retrieved context, undermining the core promise of retrieval augmentation. A fundamental obstacle to fixing this unfaithfulness is the lack of training data that explicitly requires models to prefer context over internal knowledge. We introduce Faithfulness-QA, a large-scale dataset of 99,094 samples constructed through counterfactual entity substitution.
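The core construction idea named in the title, counterfactual entity substitution, can be illustrated with a minimal sketch (this is an assumption about the general technique, not the paper's actual pipeline; all names and the example passage below are hypothetical):

```python
# Minimal sketch of counterfactual entity substitution for building
# context-faithfulness training data: swap the gold answer entity in a
# retrieved passage for a plausible same-type replacement, so a model
# answering faithfully must follow the context, not parametric memory.

def substitute_entity(passage: str, original: str, counterfactual: str):
    """Return (edited_passage, new_gold_answer).

    `original` is the entity a model likely knows from pretraining;
    `counterfactual` is a same-type replacement entity.
    """
    if original not in passage:
        raise ValueError("entity not found in passage")
    return passage.replace(original, counterfactual), counterfactual

passage = "The Eiffel Tower is located in Paris and was completed in 1889."
edited, gold = substitute_entity(passage, "Paris", "Rome")
# Asked "Where is the Eiffel Tower?" with `edited` as context, a
# context-faithful model should produce the counterfactual answer.
```

A training sample would then pair the question, the edited passage, and the counterfactual answer, penalizing the model for falling back on its parametric belief.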