MedFLIP: Medical Vision-and-Language Self-supervised Fast Pre-Training with Masked Autoencoder

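Summary

In medical analysis, extensive research has explored the potential of mutual learning between masked autoencoders (MAEs) and multimodal data.
However, the effect of MAEs on intermodal learning remains a key open problem.
We introduce MedFLIP, a fast language-image pre-training method for medical analysis.
We explore MAEs for zero-shot learning across domains, which strengthens the model's ability to learn from limited data, a common scenario in medical diagnostics.
We verify that masking an image does not affect intermodal learning.
Furthermore, we propose an SVD loss that enhances representation learning for the characteristics of medical images, aiming to improve classification accuracy by exploiting the structural intricacies of such data.
Finally, we validate that using language improves zero-shot performance in medical image analysis.
MedFLIP's scaling of the masking process marks an advance in the field, offering a path to fast and accurate medical image analysis without the traditional computational bottlenecks.
Through experiments and validation, MedFLIP demonstrates efficient performance improvements and sets a reference point for future research and applications in medical diagnostics.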
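
The summary attributes the speed-up to image masking but does not give the masking scheme or ratio. As a rough illustration only, the sketch below shows generic MAE-style random patch masking, where the encoder processes just the visible patches. The function name `sample_visible_patches`, the 14 x 14 patch grid, and the 50% mask ratio are assumptions made for this example and are not taken from the paper.

```python
import random


def sample_visible_patches(num_patches, mask_ratio, seed=0):
    """Return sorted indices of the patches left visible after random masking.

    Hypothetical helper for illustration; not MedFLIP's actual implementation.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    # Keep at least one patch so the encoder always has some input.
    num_keep = max(1, int(num_patches * (1.0 - mask_ratio)))
    rng = random.Random(seed)
    # Sample without replacement, then sort to preserve spatial order.
    return sorted(rng.sample(range(num_patches), num_keep))


# Assumed setup: a 224 x 224 image cut into 16 x 16 patches gives a 14 x 14 grid.
num_patches = 14 * 14
visible = sample_visible_patches(num_patches, mask_ratio=0.5)
print(len(visible))  # 98 of the 196 patches reach the image encoder
```

With half of the patches dropped, the image encoder sees half as many tokens per image, which is where the reduction in pre-training compute would come from under these assumptions.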

Summary (original)

Within the domain of medical analysis, extensive research has explored the potential of mutual learning between Masked Autoencoders(MAEs) and multimodal data. However, the impact of MAEs on intermodality remains a key challenge. We introduce MedFLIP, a Fast Language-Image Pre-training method for Medical analysis. We explore MAEs for zero-shot learning with crossed domains, which enhances the model ability to learn from limited data, a common scenario in medical diagnostics. We verify that masking an image does not affect intermodal learning. Furthermore, we propose the SVD loss to enhance the representation learning for characteristics of medical images, aiming to improve classification accuracy by leveraging the structural intricacies of such data. Lastly, we validate using language will improve the zero-shot performance for the medical image analysis. MedFLIP scaling of the masking process marks an advancement in the field, offering a pathway to rapid and precise medical image analysis without the traditional computational bottlenecks. Through experiments and validation, MedFLIP demonstrates efficient performance improvements, setting an explored standard for future research and application in medical diagnostics.

arXiv information

Authors	Lei Li,Tianfang Zhang,Xinglin Zhang,Jiaqi Liu,Bingqi Ma,Yan Luo,Tao Chen
Published	2024-03-07 16:11:43+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
