A Large-scale Medical Visual Task Adaptation Benchmark

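Summary

Visual task adaptation has proven effective for adapting pre-trained Vision Transformers (ViTs) to general downstream visual tasks using specialized learnable layers or tokens.
However, there is not yet a large-scale benchmark that fully explores the effect of visual task adaptation in the realistic and important medical domain, particularly across diverse medical visual modalities such as color images, X-ray, and CT.
To close this gap, we present Med-VTAB, a large-scale medical visual task adaptation benchmark consisting of 1.68 million medical images spanning diverse organs, modalities, and adaptation approaches.
Based on Med-VTAB, we investigate the scaling law of medical prompt tuning with respect to tunable parameters, and the generalizability of medical visual adaptation using non-medical and medical pre-trained weights.
We also study the impact of out-of-distribution patient IDs on medical visual adaptation, a realistic and challenging scenario.
Furthermore, results on Med-VTAB show that a single pre-trained model is insufficient for adapting to medical tasks.
We therefore introduce GMoE-Adapter, a novel method that combines medical and general pre-training weights through a gated mixture-of-experts adapter, achieving state-of-the-art results in medical visual task adaptation.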
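
The summary gives no implementation details for GMoE-Adapter, so the following is only a minimal sketch of the general idea of gated mixing between two feature sources. It assumes that features from a medical and a general pre-trained backbone are already available as plain vectors, and that a small gate scores each source from their concatenation. The names `gated_mixture`, `gate_weights`, `medical`, and `general` are hypothetical and do not come from the paper.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def gated_mixture(medical_feat, general_feat, gate_weights):
    """Blend two feature vectors with input-dependent gate weights.

    gate_weights is a hypothetical (2 x 2d) matrix that scores each
    feature source from the concatenation of both vectors.
    """
    gate_input = medical_feat + general_feat  # list concatenation
    gate = softmax(matvec(gate_weights, gate_input))
    mixed = [gate[0] * m + gate[1] * g
             for m, g in zip(medical_feat, general_feat)]
    return mixed, gate


random.seed(0)
dim = 4
# Random stand-ins for features from the two pre-trained backbones.
medical = [random.gauss(0, 1) for _ in range(dim)]
general = [random.gauss(0, 1) for _ in range(dim)]
# In a real adapter these gate weights would be learned; here they are random.
weights = [[random.gauss(0, 0.1) for _ in range(2 * dim)] for _ in range(2)]

mixed, gate = gated_mixture(medical, general, weights)
print("gate:", [round(g, 3) for g in gate])
```

Because the gate is a softmax, the two weights sum to one and each mixed feature is a convex combination of the medical and general values. The actual adapter may place the gate elsewhere or use more experts.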

Summary (original)

Visual task adaptation has been demonstrated to be effective in adapting pre-trained Vision Transformers (ViTs) to general downstream visual tasks using specialized learnable layers or tokens. However, there is not yet a large-scale benchmark to fully explore the effect of visual task adaptation on the realistic and important medical domain, particularly across diverse medical visual modalities, such as color images, X-ray, and CT. To close this gap, we present Med-VTAB, a large-scale Medical Visual Task Adaptation Benchmark consisting of 1.68 million medical images for diverse organs, modalities, and adaptation approaches. Based on Med-VTAB, we explore the scaling law of medical prompt tuning concerning tunable parameters and the generalizability of medical visual adaptation using non-medical/medical pre-train weights. Besides, we study the impact of patient ID out-of-distribution on medical visual adaptation, which is a real and challenging scenario. Furthermore, results from Med-VTAB indicate that a single pre-trained model falls short in medical task adaptation. Therefore, we introduce GMoE-Adapter, a novel method that combines medical and general pre-training weights through a gated mixture-of-experts adapter, achieving state-of-the-art results in medical visual task adaptation.
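
The abstract does not say how the patient ID out-of-distribution setting is constructed. One plausible reading is that evaluation uses images from patients who never appear during adaptation. The sketch below illustrates that reading only: it splits records so that no patient ID is shared between the two sets. The function `split_by_patient`, the record layout, and the file names are hypothetical.

```python
import random
from collections import defaultdict


def split_by_patient(records, test_fraction=0.2, seed=0):
    """Split (patient_id, image_name) records so no patient spans both sets."""
    by_patient = defaultdict(list)
    for patient_id, image_name in records:
        by_patient[patient_id].append(image_name)

    # Shuffle patients, not images, so the split is made at patient level.
    patient_ids = sorted(by_patient)
    random.Random(seed).shuffle(patient_ids)
    n_test = max(1, round(len(patient_ids) * test_fraction))
    test_ids = set(patient_ids[:n_test])

    train = [(p, img) for p in patient_ids if p not in test_ids
             for img in by_patient[p]]
    test = [(p, img) for p in patient_ids if p in test_ids
            for img in by_patient[p]]
    return train, test


# Synthetic example: 10 patients with 5 images each.
records = [(f"patient_{i % 10}", f"scan_{i}.png") for i in range(50)]
train, test = split_by_patient(records)
print(len(train), "training images,", len(test), "held-out images")
```

Splitting at the image level instead would let scans from the same patient land on both sides, which tends to overstate how well an adapted model generalizes to new patients.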

arXiv information

Authors	Shentong Mo, Xufang Luo, Yansen Wang, Dongsheng Li
Published	2024-04-19 13:25:27+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
