U2-BENCH: Benchmarking Large Vision-Language Models on Ultrasound Understanding

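Summary

Ultrasound is an imaging modality critical to global healthcare, yet its interpretation remains challenging because image quality varies with the operator, noise, and anatomical structure.
Large vision-language models (LVLMs) have demonstrated impressive multimodal capabilities across natural and medical domains, but their performance on ultrasound remains largely unexplored.
We introduce U2-BENCH, the first comprehensive benchmark for evaluating LVLMs on ultrasound understanding across classification, detection, regression, and text generation tasks.
U2-BENCH aggregates 7,241 cases spanning 15 anatomical regions and defines 8 clinically inspired tasks, such as diagnosis, view recognition, lesion localization, clinical value estimation, and report generation, across 50 ultrasound application scenarios.
We evaluate 20 state-of-the-art LVLMs, both open- and closed-source, general-purpose and medical-specific.
Our results reveal strong performance on image-level classification, but persistent challenges in spatial reasoning and clinical language generation.
U2-BENCH establishes a rigorous and unified testbed for assessing and accelerating LVLM research in the uniquely multimodal domain of medical ultrasound imaging.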

Abstract (original)

Ultrasound is a widely-used imaging modality critical to global healthcare, yet its interpretation remains challenging due to its varying image quality on operators, noises, and anatomical structures. Although large vision-language models (LVLMs) have demonstrated impressive multimodal capabilities across natural and medical domains, their performance on ultrasound remains largely unexplored. We introduce U2-BENCH, the first comprehensive benchmark to evaluate LVLMs on ultrasound understanding across classification, detection, regression, and text generation tasks. U2-BENCH aggregates 7,241 cases spanning 15 anatomical regions and defines 8 clinically inspired tasks, such as diagnosis, view recognition, lesion localization, clinical value estimation, and report generation, across 50 ultrasound application scenarios. We evaluate 20 state-of-the-art LVLMs, both open- and closed-source, general-purpose and medical-specific. Our results reveal strong performance on image-level classification, but persistent challenges in spatial reasoning and clinical language generation. U2-BENCH establishes a rigorous and unified testbed to assess and accelerate LVLM research in the uniquely multimodal domain of medical ultrasound imaging.

arXiv information

Authors	Anjie Le, Henan Liu, Yue Wang, Zhenyu Liu, Rongkun Zhu, Taohan Weng, Jinze Yu, Boyang Wang, Yalun Wu, Kaiwen Yan, Quanlin Sun, Meirui Jiang, Jialun Pei, Siya Liu, Haoyun Zheng, Zhoujun Li, Alison Noble, Jacques Souquet, Xiaoqing Guo, Manxi Lin, Hongcheng Guo
Published	2025-05-30 17:02:23+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
