VeriSim: A Configurable Framework for Evaluating Medical AI Under Realistic Patient Noise

📰 ArXiv cs.AI

arXiv:2604.10441v1

Abstract: Medical large language models (LLMs) achieve impressive performance on standardized benchmarks, yet these evaluations fail to capture the complexity of real clinical encounters where patients exhibit memory gaps, limited health literacy, anxiety, and other communication barriers. We introduce VeriSim, a truth-preserving patient simulation framework that injects controllable, clinically evidence-grounded noise into patient responses while maintaining…

Published 14 Apr 2026