Enhancing the Safety of Medical Vision-Language Models by Synthetic Demonstrations
📰 ArXiv cs.AI
arXiv:2506.09067v2 Announce Type: replace-cross Abstract: Generative medical vision-language models (Med-VLMs) are primarily designed to generate complex textual information (e.g., diagnostic reports) from multimodal inputs including a vision modality (e.g., medical images) and a language modality (e.g., clinical queries). However, their security vulnerabilities remain underexplored. Med-VLMs should be capable of rejecting harmful queries, such as "Provide detailed instructions for using this