MedPromptX: Grounded Multimodal Prompting for Chest X-ray Diagnosis

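Abstract (translated)

Chest X-ray images are commonly used to predict acute and chronic cardiopulmonary conditions, but efforts to integrate them with structured clinical data are hampered by incomplete electronic health records (EHR).
This paper introduces MedPromptX, the first clinical decision support system that integrates multimodal large language models (MLLMs), few-shot prompting (FP), and visual grounding (VG) to combine imagery with EHR data for chest X-ray diagnosis.
A pre-trained MLLM is used to fill in missing EHR information, giving a comprehensive view of the patient's medical history.
In addition, FP reduces the need for extensive MLLM training while effectively addressing hallucination.
However, determining the optimal number of few-shot examples and selecting high-quality candidates can be burdensome, even though both strongly affect model performance.
We therefore propose a new technique that dynamically refines the few-shot data to adjust in real time to new patient scenarios.
VG also narrows the search area in X-ray images, which improves the identification of abnormalities.
We also release MedPromptX-VQA, a new in-context visual question answering dataset containing interleaved images and EHR data derived from the MIMIC-IV and MIMIC-CXR-JPG databases.
Results show that MedPromptX achieves state-of-the-art (SOTA) performance, with an 11% improvement in F1-score over the baselines.
Code and data are publicly available at https://github.com/BioMedIA-MBZUAI/MedPromptX.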

Abstract (original)

Chest X-ray images are commonly used for predicting acute and chronic cardiopulmonary conditions, but efforts to integrate them with structured clinical data face challenges due to incomplete electronic health records (EHR). This paper introduces MedPromptX, the first clinical decision support system that integrates multimodal large language models (MLLMs), few-shot prompting (FP) and visual grounding (VG) to combine imagery with EHR data for chest X-ray diagnosis. A pre-trained MLLM is utilized to complement the missing EHR information, providing a comprehensive understanding of patients’ medical history. Additionally, FP reduces the necessity for extensive training of MLLMs while effectively tackling the issue of hallucination. Nevertheless, the process of determining the optimal number of few-shot examples and selecting high-quality candidates can be burdensome, yet it profoundly influences model performance. Hence, we propose a new technique that dynamically refines few-shot data for real-time adjustment to new patient scenarios. Moreover, VG narrows the search area in X-ray images, thereby enhancing the identification of abnormalities. We also release MedPromptX-VQA, a new in-context visual question answering dataset encompassing interleaved images and EHR data derived from MIMIC-IV and MIMIC-CXR-JPG databases. Results demonstrate the SOTA performance of MedPromptX, achieving an 11% improvement in F1-score compared to the baselines. Code and data are publicly available on https://github.com/BioMedIA-MBZUAI/MedPromptX.
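The abstract states that the few-shot data is refined dynamically for each new patient, but it does not describe the mechanism. As an illustration only, the sketch below assumes one plausible approach: score each candidate example by cosine similarity to the query patient's feature vector, drop candidates below a threshold, and keep at most a fixed number of the rest. The function names, the feature vectors, and the `max_shots` and `min_similarity` parameters are all hypothetical and are not taken from the MedPromptX code.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_few_shot(
    query: Sequence[float],
    candidates: List[Tuple[str, Sequence[float]]],
    max_shots: int = 3,
    min_similarity: float = 0.5,
) -> List[str]:
    """Return ids of the candidates most similar to the query patient.

    Hypothetical selection rule (an assumption, not the paper's method):
    filter by a similarity threshold, then keep the top `max_shots`.
    """
    scored = [(cosine_similarity(query, vec), case_id) for case_id, vec in candidates]
    kept = [(score, case_id) for score, case_id in scored if score >= min_similarity]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [case_id for _, case_id in kept[:max_shots]]


# Toy feature vectors standing in for encoded image + EHR features.
candidates = [
    ("case_a", [1.0, 0.0, 0.0]),
    ("case_b", [0.9, 0.1, 0.0]),
    ("case_c", [0.0, 1.0, 0.0]),
    ("case_d", [0.7, 0.7, 0.0]),
]
query = [1.0, 0.0, 0.0]

print(select_few_shot(query, candidates))  # ['case_a', 'case_b', 'case_d']
```

Because the threshold is applied before the cap, the number of examples in the prompt can shrink when few candidates resemble the new patient, which is one way the "optimal number" of shots could vary per query.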

arXiv information

Authors	Mai A. Shaaban, Adnan Khan, Mohammad Yaqub
Published	2025-01-27 18:46:41+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
