Focus Matters: Phase-Aware Suppression for Hallucination in Vision-Language Models
📰 ArXiv cs.AI
arXiv:2604.03556v1 Announce Type: cross

Abstract: Large Vision-Language Models (LVLMs) have achieved impressive progress in multimodal reasoning, yet they remain prone to object hallucinations, generating descriptions of objects that are not present in the input image. Recent approaches attempt to mitigate hallucinations by suppressing unreliable visual signals in the vision encoder, but many rely on iterative optimization for each input, resulting in substantial inference latency. In this work,
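The suppression idea the abstract alludes to can be illustrated with a minimal, hypothetical sketch: zero out the visual tokens that receive the least aggregate attention in a single pass, rather than optimizing per input. This is purely illustrative (the function name, threshold scheme, and NumPy setup are assumptions, not the paper's actual method):

```python
import numpy as np

def suppress_unreliable_tokens(visual_tokens, attn_scores, keep_ratio=0.8):
    """Illustrative single-pass suppression (not the paper's method).

    visual_tokens: (N, D) array of vision-encoder outputs.
    attn_scores:   (N,) aggregate attention each token receives.
    keep_ratio:    fraction of tokens to keep; the rest are zeroed.
    """
    n_keep = max(1, int(len(attn_scores) * keep_ratio))
    # Keep the n_keep most-attended tokens; suppress the rest.
    keep_idx = np.argsort(attn_scores)[-n_keep:]
    mask = np.zeros(len(attn_scores), dtype=bool)
    mask[keep_idx] = True
    return visual_tokens * mask[:, None]

# Toy example: 10 visual tokens of dimension 4.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(10, 4))
scores = rng.random(10)
out = suppress_unreliable_tokens(tokens, scores, keep_ratio=0.5)
```

Because the mask is computed in one shot from existing attention statistics, this style of suppression adds negligible inference latency, in contrast to the iterative per-input optimization the abstract criticizes.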