VL-Calibration: Decoupled Confidence Calibration for Large Vision-Language Models Reasoning
📰 ArXiv cs.AI
arXiv:2604.09529v1 Announce Type: cross Abstract: Large Vision-Language Models (LVLMs) achieve strong multimodal reasoning but frequently exhibit hallucinations and incorrect responses with high certainty, which hinders their use in high-stakes domains. Existing verbalized confidence calibration methods, largely developed for text-only LLMs, typically optimize a single holistic confidence score against binary answer-level correctness. This design is mismatched to LVLMs: an incorrect prediction m
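To make the "single holistic confidence score with binary answer-level correctness" setup concrete, here is a minimal, hypothetical sketch (not code from the paper) of expected calibration error (ECE), the standard metric such methods optimize: predictions are binned by verbalized confidence, and each bin's average confidence is compared to its binary accuracy.

```python
# Hypothetical sketch, not from the paper: expected calibration error (ECE)
# over holistic verbalized confidences and binary answer-level correctness,
# the setup the abstract says existing text-only calibration methods assume.
def expected_calibration_error(confidences, correct, n_bins=10):
    """confidences: floats in [0, 1]; correct: 0/1 answer-level labels."""
    n = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        # Half-open bins [lo, hi); the last bin also includes 1.0.
        idx = [i for i, c in enumerate(confidences)
               if lo <= c < hi or (b == n_bins - 1 and c == hi)]
        if not idx:
            continue
        avg_conf = sum(confidences[i] for i in idx) / len(idx)
        accuracy = sum(correct[i] for i in idx) / len(idx)
        # Weighted gap between stated confidence and observed accuracy.
        ece += (len(idx) / n) * abs(avg_conf - accuracy)
    return ece
```

For example, a model that says 0.9 on two answers but gets only one right, and 0.1 on two answers but also gets one right, is poorly calibrated even though overall accuracy is 50%. The abstract's point is that this scalar, answer-level view discards the structure of LVLM errors.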