Faithfulness-QA: A Counterfactual Entity Substitution Dataset for Training Context-Faithful RAG Models
📰 ArXiv cs.AI
arXiv:2604.25313v2 Announce Type: cross Abstract: Retrieval-Augmented Generation (RAG) models frequently produce answers grounded in parametric memory rather than the retrieved context, undermining the core promise of retrieval augmentation. A fundamental obstacle to fixing this unfaithfulness is the lack of training data that explicitly requires models to prefer context over internal knowledge. We introduce Faithfulness-QA, a large-scale dataset of 99,094 samples constructed through counterfactual entity substitution.
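The core construction idea named in the title, counterfactual entity substitution, can be illustrated with a minimal sketch (this is an assumption about the general technique, not the paper's actual pipeline; all names and the example passage below are hypothetical):

```python
# Minimal sketch of counterfactual entity substitution for building
# context-faithfulness training data: swap the gold answer entity in a
# retrieved passage for a plausible same-type replacement, so a model
# answering faithfully must follow the context, not parametric memory.

def substitute_entity(passage: str, original: str, counterfactual: str):
    """Return (edited_passage, new_gold_answer).

    `original` is the entity a model likely knows from pretraining;
    `counterfactual` is a same-type replacement entity.
    """
    if original not in passage:
        raise ValueError("entity not found in passage")
    return passage.replace(original, counterfactual), counterfactual

passage = "The Eiffel Tower is located in Paris and was completed in 1889."
edited, gold = substitute_entity(passage, "Paris", "Rome")
# Asked "Where is the Eiffel Tower?" with `edited` as context, a
# context-faithful model should produce the counterfactual answer.
```

A training sample would then pair the question, the edited passage, and the counterfactual answer, penalizing the model for falling back on its parametric belief.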