FACT-E: Causality-Inspired Evaluation for Trustworthy Chain-of-Thought Reasoning
📰 ArXiv cs.AI
arXiv:2604.10693v1 Announce Type: new Abstract: Chain-of-Thought (CoT) prompting has improved LLM reasoning, but models often generate explanations that appear coherent while containing unfaithful intermediate steps. Existing self-evaluation approaches suffer from inherent biases: a model may confidently endorse its own chain as coherent even when the implication from one step to the next does not hold, making faithfulness evaluation unreliable. We propose FACT-E, a causality-inspired framework for evaluating CoT qual
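The failure mode the abstract describes — a model endorsing a whole chain as coherent while individual step-to-step implications fail — can be sketched as an external, per-transition audit. This is a hypothetical illustration, not the paper's actual FACT-E algorithm; `entails` stands in for any entailment judge (an NLI model, an LLM judge) and is stubbed with a toy heuristic here.

```python
def audit_chain(steps, entails):
    """Return indices i where step i does not support step i+1.

    Checks each adjacent pair with an external judge instead of
    asking the model to self-evaluate the chain as a whole.
    `entails(premise, conclusion)` is any boolean entailment judge.
    """
    return [i for i in range(len(steps) - 1)
            if not entails(steps[i], steps[i + 1])]

def toy_entails(premise, conclusion):
    # Toy stand-in for a real entailment model: a conclusion counts as
    # supported only if it shares at least one word with the premise.
    return bool(set(premise.lower().split()) & set(conclusion.lower().split()))

chain = ["x equals 2", "so 2x equals 4", "therefore the moon is cheese"]
print(audit_chain(chain, toy_entails))  # → [1]: the final step is unsupported
```

Auditing transitions independently sidesteps the self-evaluation bias: the judge never sees the chain's overall fluency, only whether each step actually follows from the previous one.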