MACD: Model-Aware Contrastive Decoding via Counterfactual Data
📰 ArXiv cs.AI
arXiv:2602.01740v3. Abstract: Video language models (Video-LLMs) are prone to hallucinations, generating plausible but ungrounded content when visual evidence is weak, ambiguous, or biased. Existing methods, such as contrastive decoding (CD), rely on random perturbations to construct contrastive data for hallucination mitigation, but often fail to target the visual cues that drive hallucination or align with model weaknesses. We propose Model-Aware Counterfactual Data based