MedCoDi-M: A Multi-Prompt Foundation Model for Multimodal Medical Data Generation

Abstract

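Artificial intelligence is revolutionizing medical practice, improving diagnostic accuracy and healthcare delivery.
However, its adaptation in medical settings still faces significant challenges related to data availability and privacy constraints.
Synthetic data has emerged as a promising solution that mitigates these issues, addressing data scarcity while preserving privacy.
Recently, latent diffusion models have emerged as a powerful tool for generating high-quality synthetic data.
Meanwhile, the integration of different modalities has gained interest, highlighting the need for models that can handle multimodal medical data. Existing approaches struggle to integrate complementary information and lack the ability to generate modalities simultaneously.
To address this challenge, we present MedCoDi-M, a 6.77-billion-parameter model designed for multimodal medical data generation. Following the foundation model paradigm, it exploits contrastive learning and a large quantity of data to build a shared latent space that captures the relationships between different data modalities.
Further, we introduce the Multi-Prompt training technique, which significantly boosts MedCoDi-M's generation under different settings.
We extensively validate MedCoDi-M. First, we benchmark it against five competitors on the MIMIC-CXR dataset, a state-of-the-art dataset for chest X-ray and radiological report generation.
Second, we perform a Visual Turing Test with expert radiologists to assess the realism and clinical relevance of the generated data, ensuring alignment with real-world scenarios.
Finally, we assess the utility of MedCoDi-M in addressing key challenges in the medical field, such as anonymization, data scarcity, and imbalanced learning.
The results are promising, demonstrating the applicability of MedCoDi-M in medical contexts.
The project page is at https://cosbidev.github.io/MedCoDi-M/.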

Abstract (original)

Artificial Intelligence is revolutionizing medical practice, enhancing diagnostic accuracy and healthcare delivery. However, its adaptation in medical settings still faces significant challenges, related to data availability and privacy constraints. Synthetic data has emerged as a promising solution to mitigate these issues, addressing data scarcity while preserving privacy. Recently, Latent Diffusion Models have emerged as a powerful tool for generating high-quality synthetic data. Meanwhile, the integration of different modalities has gained interest, emphasizing the need of models capable of handle multimodal medical data. Existing approaches struggle to integrate complementary information and lack the ability to generate modalities simultaneously. To address this challenge, we present MedCoDi-M, a 6.77-billion-parameter model, designed for multimodal medical data generation, that, following Foundation Model paradigm, exploits contrastive learning and large quantity of data to build a shared latent space which capture the relationships between different data modalities. Further, we introduce the Multi-Prompt training technique, which significantly boosts MedCoDi-M’s generation under different settings. We extensively validate MedCoDi-M: first we benchmark it against five competitors on the MIMIC-CXR dataset, a state-of-the-art dataset for Chest X-ray and radiological report generation. Secondly, we perform a Visual Turing Test with expert radiologists to assess the realism and clinical relevance of the generated data, ensuring alignment with real-world scenarios. Finally, we assess the utility of MedCoDi-M in addressing key challenges in the medical field, such as anonymization, data scarcity and imbalance learning. The results are promising, demonstrating the applicability of MedCoDi-M in medical contexts. Project page is at https://cosbidev.github.io/MedCoDi-M/.
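
The abstract says that contrastive learning is used to build a shared latent space across modalities, but it does not give the loss or the encoders. As a rough illustration only, the sketch below shows a generic CLIP-style symmetric contrastive (InfoNCE) loss between paired image and report embeddings. The function names, the toy 2-D embeddings, and the temperature of 0.07 are all assumptions made for this example and are not taken from MedCoDi-M.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cross_entropy_rows(logits):
    """Mean cross-entropy where the correct class of row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Generic symmetric InfoNCE loss over a batch of paired embeddings.

    Pair i (image_emb[i], text_emb[i]) is the positive; every other
    combination in the batch acts as a negative. The temperature value
    is an illustrative assumption.
    """
    img = [l2_normalize(v) for v in image_emb]
    txt = [l2_normalize(v) for v in text_emb]
    # Cosine similarity matrix scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(i_vec, t_vec)) / temperature for t_vec in txt]
        for i_vec in img
    ]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (cross_entropy_rows(logits) + cross_entropy_rows(logits_t))


# Toy embeddings (hypothetical): two "X-ray" vectors and two "report" vectors.
xray_emb = [[1.0, 0.0], [0.0, 1.0]]
matched_reports = [[0.9, 0.1], [0.1, 0.9]]
mismatched_reports = [[0.1, 0.9], [0.9, 0.1]]

loss_matched = symmetric_contrastive_loss(xray_emb, matched_reports)
loss_mismatched = symmetric_contrastive_loss(xray_emb, mismatched_reports)
print(loss_matched < loss_mismatched)  # True
```

Minimizing a loss of this kind pulls paired embeddings together and pushes unpaired ones apart, which is one common way to obtain a shared latent space; the actual objective used by the authors may differ.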

arXiv information

Authors	Daniele Molino, Francesco Di Feola, Eliodoro Faiella, Deborah Fazzini, Domiziana Santucci, Linlin Shen, Valerio Guarrasi, Paolo Soda
Published	2025-01-08 16:53:56+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
