MedKLIP: Medical Knowledge Enhanced Language-Image Pre-Training in Radiology

Summary

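Title: MedKLIP: Medical Knowledge Enhanced Language-Image Pre-Training in Radiology

Summary:
– Enhance medical visual-language pre-training (VLP) with domain-specific knowledge by exploiting the paired image-text reports from daily radiological practice.
– Unlike existing works that process raw reports directly, adopt a triplet extraction module that extracts the medically relevant information.
– Propose a triplet encoding module with entity translation that queries a knowledge base, exploiting the rich domain knowledge of the medical field and implicitly building relationships between medical entities in the language embedding space.
– Use a Transformer-based fusion model to spatially align entity descriptions with visual signals at the image patch level, enabling medical diagnosis.
– Validate the architecture with experiments on numerous public benchmarks, including ChestX-ray14, RSNA Pneumonia, SIIM-ACR Pneumothorax, COVIDx CXR-2, COVID Rural, and EdemaSeverity. In both zero-shot and fine-tuning settings, the model shows strong performance on disease classification and grounding compared with previous methods.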
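
The entity translation and patch-level alignment steps listed above can be pictured with a small sketch. This is not the authors' implementation: the knowledge base entries, the function names `translate_entity` and `cross_attend`, and the toy vectors are all assumptions made for illustration, and a single-head scaled dot-product attention stands in for the Transformer-based fusion model.

```python
import math

# Hypothetical mini knowledge base (entity name -> description).
# The entries are illustrative only, not taken from the paper.
KNOWLEDGE_BASE = {
    "pneumonia": "an infection that inflames the air sacs in one or both lungs",
    "pneumothorax": "air in the pleural space causing partial or full lung collapse",
}


def translate_entity(entity, kb=KNOWLEDGE_BASE):
    """Replace an entity name with its description; fall back to the name."""
    return kb.get(entity.lower(), entity)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(query, patches):
    """Single-head scaled dot-product attention of one entity query over patches.

    Returns (weights, attended): one weight per patch, which can be read as a
    coarse grounding map, and the weighted sum of the patch features.
    """
    dim = len(query)
    scores = [
        sum(q * p for q, p in zip(query, patch)) / math.sqrt(dim)
        for patch in patches
    ]
    weights = softmax(scores)
    attended = [
        sum(w * patch[i] for w, patch in zip(weights, patches))
        for i in range(dim)
    ]
    return weights, attended


# Toy inputs (assumed): one 2-d entity query and three 2-d patch features.
query = [1.0, 0.0]
patches = [[0.0, 1.0], [4.0, 0.0], [0.0, -1.0]]

weights, attended = cross_attend(query, patches)
best_patch = max(range(len(patches)), key=weights.__getitem__)
```

In a real system the query would come from a text encoder applied to the translated description and the patches from an image encoder; here both are hand-written so that the attention step can be followed by eye.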

Summary (original)

In this paper, we consider enhancing medical visual-language pre-training (VLP) with domain-specific knowledge, by exploiting the paired image-text reports from the radiological daily practice. In particular, we make the following contributions: First, unlike existing works that directly process the raw reports, we adopt a novel triplet extraction module to extract the medical-related information, avoiding unnecessary complexity from language grammar and enhancing the supervision signals; Second, we propose a novel triplet encoding module with entity translation by querying a knowledge base, to exploit the rich domain knowledge in medical field, and implicitly build relationships between medical entities in the language embedding space; Third, we propose to use a Transformer-based fusion model for spatially aligning the entity description with visual signals at the image patch level, enabling the ability for medical diagnosis; Fourth, we conduct thorough experiments to validate the effectiveness of our architecture, and benchmark on numerous public benchmarks, e.g., ChestX-ray14, RSNA Pneumonia, SIIM-ACR Pneumothorax, COVIDx CXR-2, COVID Rural, and EdemaSeverity. In both zero-shot and fine-tuning settings, our model has demonstrated strong performance compared with the former methods on disease classification and grounding.

arXiv information

Authors	Chaoyi Wu, Xiaoman Zhang, Ya Zhang, Yanfeng Wang, Weidi Xie
Published	2023-04-03 09:57:51+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, OpenAI
