IMCBench: A benchmark for multimodal LLMs in Image-grounded Medical Conversations

Content preview

arXiv:2606.28556v1 Announce Type: new

Abstract: Recent advances in large language models and vision-language models have enabled reasoning over multimodal data, offering opportunities for clinical applications such as decision support and triaging. However, existing medical AI benchmarks are fragmented: some support multi-turn dialogues but lack images, while others provide multimodal inputs but focus on single-turn QA tasks. To address this gap, we introduce IMCBench, an image-grounded, multi-

Related posts

What Drives Interactive Improvement from Feedback?

Contrastive Reflection for Iterative Prompt Optimization

How Can AI Find My Model? A Model-Finding Experimental Study Considering Data Formats, Embeddings, and Retrieval Strategies

BayesBench: Evaluating LLM Belief Trajectories Under Multi-Turn Evidence Accumulation