Multimodal Whole Slide Foundation Model for Pathology

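Summary

Recent advances in foundation models have transformed computational pathology: these models use self-supervised learning (SSL) to encode histopathology regions of interest (ROIs) into versatile, transferable feature representations.
However, translating these advances to complex clinical challenges at the patient and slide level remains constrained by limited clinical data in disease-specific cohorts, especially for rare clinical conditions.
We propose TITAN, a multimodal whole slide foundation model pretrained on 335,645 whole slide images (WSIs) via visual self-supervised learning and vision-language alignment with corresponding pathology reports and 423,122 synthetic captions generated by a multimodal generative AI copilot for pathology.
Without any finetuning or clinical labels, TITAN can extract general-purpose slide representations and generate pathology reports that generalize to resource-limited clinical scenarios such as rare disease retrieval and cancer prognosis.
We evaluate TITAN on diverse clinical tasks and find that it outperforms both ROI and slide foundation models across machine learning settings including linear probing, few-shot and zero-shot classification, rare cancer retrieval, cross-modal retrieval, and pathology report generation.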

Abstract (original)

The field of computational pathology has been transformed with recent advances in foundation models that encode histopathology region-of-interests (ROIs) into versatile and transferable feature representations via self-supervised learning (SSL). However, translating these advancements to address complex clinical challenges at the patient and slide level remains constrained by limited clinical data in disease-specific cohorts, especially for rare clinical conditions. We propose TITAN, a multimodal whole slide foundation model pretrained using 335,645 WSIs via visual self-supervised learning and vision-language alignment with corresponding pathology reports and 423,122 synthetic captions generated from a multimodal generative AI copilot for pathology. Without any finetuning or requiring clinical labels, TITAN can extract general-purpose slide representations and generate pathology reports that generalize to resource-limited clinical scenarios such as rare disease retrieval and cancer prognosis. We evaluate TITAN on diverse clinical tasks and find that TITAN outperforms both ROI and slide foundation models across machine learning settings such as linear probing, few-shot and zero-shot classification, rare cancer retrieval and cross-modal retrieval, and pathology report generation.
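The abstract names slide retrieval and zero-shot classification as evaluation settings but does not describe how they are computed. As a rough illustration only, both are commonly built on cosine similarity between embeddings: retrieval ranks stored slide embeddings against a query, and zero-shot classification picks the class whose text-prompt embedding lies closest to the slide embedding. The sketch below assumes that generic setup; the function names, the toy 3-dimensional vectors, and the class labels are all hypothetical and are not taken from TITAN.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(query, database, k=3):
    """Return the k (slide_id, score) pairs most similar to the query embedding."""
    scored = [(slide_id, cosine_similarity(query, emb))
              for slide_id, emb in database.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def zero_shot_classify(slide_embedding, prompt_embeddings):
    """Pick the label whose text-prompt embedding is closest to the slide embedding."""
    return max(
        prompt_embeddings,
        key=lambda label: cosine_similarity(slide_embedding, prompt_embeddings[label]),
    )


# Toy embeddings, made up for illustration; real ones would come from a
# pretrained slide encoder and text encoder sharing one embedding space.
slide_db = {
    "slide_a": [0.9, 0.1, 0.0],
    "slide_b": [0.1, 0.9, 0.1],
    "slide_c": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]
class_prompts = {
    "tumor": [0.95, 0.05, 0.0],
    "normal": [0.0, 1.0, 0.0],
}

top = retrieve_top_k(query, slide_db, k=2)
label = zero_shot_classify(query, class_prompts)
print(top)
print(label)
```

Neither function needs labels or finetuning, which is why settings of this kind suit the resource-limited scenarios the abstract mentions.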

arXiv information

Authors	Tong Ding, Sophia J. Wagner, Andrew H. Song, Richard J. Chen, Ming Y. Lu, Andrew Zhang, Anurag J. Vaidya, Guillaume Jaume, Muhammad Shaban, Ahrong Kim, Drew F. K. Williamson, Bowen Chen, Cristina Almagro-Perez, Paul Doucet, Sharifa Sahai, Chengkuan Chen, Daisuke Komura, Akihiro Kawabe, Shumpei Ishikawa, Georg Gerber, Tingying Peng, Long Phi Le, Faisal Mahmood
Published	2024-11-29 12:39:57+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
