Distorted or Fabricated? A Survey on Hallucination in Video LLMs

📰 ArXiv cs.AI

arXiv:2604.12944v1

Abstract: Despite significant progress in video-language modeling, hallucinations, outputs that appear plausible yet contradict the content of the input video, remain a persistent challenge in Video Large Language Models (Vid-LLMs). This survey presents a comprehensive analysis of hallucinations in Vid-LLMs and introduces a systematic taxonomy that categorizes them into two core types: dynamic distortion and content fabrication, each comprising […]

Published 15 Apr 2026