Developing a PET/CT Foundation Model for Cross-Modal Anatomical and Functional Imaging

Summary

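In oncology, positron emission tomography-computed tomography (PET/CT) is widely used for cancer diagnosis, staging, and treatment monitoring because it combines anatomical detail from CT with functional metabolic activity and molecular marker expression information from PET.
However, existing AI-driven PET/CT analyses rely mainly on task-specific models trained from scratch or on limited datasets, which limits their generalizability and robustness.
To address this, the authors propose a foundation model approach designed specifically for multimodal PET/CT imaging.
They introduce the Cross-Fraternal Twin Masked Autoencoder (FratMAE), a novel framework that integrates whole-body anatomical and functional or molecular information.
FratMAE employs separate Vision Transformer (ViT) encoders for PET and CT scans, together with cross-attention decoders that enable synergistic interaction between the modalities during masked autoencoder training.
In addition, textual metadata is incorporated to enhance PET representation learning.
By pre-training on PET/CT datasets, FratMAE captures intricate cross-modal relationships and global uptake patterns, achieves superior performance on downstream tasks, and demonstrates its potential as a generalizable foundation model.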

Abstract (original)

In oncology, Positron Emission Tomography-Computed Tomography (PET/CT) is widely used in cancer diagnosis, staging, and treatment monitoring, as it combines anatomical details from CT with functional metabolic activity and molecular marker expression information from PET. However, existing artificial intelligence-driven PET/CT analyses rely predominantly on task-specific models trained from scratch or on limited datasets, limiting their generalizability and robustness. To address this, we propose a foundation model approach specifically designed for multimodal PET/CT imaging. We introduce the Cross-Fraternal Twin Masked Autoencoder (FratMAE), a novel framework that effectively integrates whole-body anatomical and functional or molecular information. FratMAE employs separate Vision Transformer (ViT) encoders for PET and CT scans, along with cross-attention decoders that enable synergistic interactions between modalities during masked autoencoder training. Additionally, it incorporates textual metadata to enhance PET representation learning. By pre-training on PET/CT datasets, FratMAE captures intricate cross-modal relationships and global uptake patterns, achieving superior performance on downstream tasks and demonstrating its potential as a generalizable foundation model.
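
The abstract's description of separate PET and CT encoders coupled through cross-attention decoders can be illustrated with a toy, dependency-free sketch. Everything below is an assumption made for illustration, not the authors' implementation: the 75% mask ratio, the single attention head, the zero-valued mask token, the random vectors standing in for ViT encoder outputs, and all function names are hypothetical. Positional embeddings, the reconstruction loss, and the textual metadata branch are omitted.

```python
import math
import random


def random_mask(num_patches, mask_ratio, rng):
    """Split patch indices into (visible, masked) lists, MAE-style."""
    perm = list(range(num_patches))
    rng.shuffle(perm)
    n_masked = int(num_patches * mask_ratio)
    return sorted(perm[n_masked:]), sorted(perm[:n_masked])


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    """Single-head scaled dot-product attention; context acts as keys and values."""
    scale = math.sqrt(len(context[0]))
    attended = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in context]
        weights = softmax(scores)
        attended.append([
            sum(w * v[j] for w, v in zip(weights, context))
            for j in range(len(context[0]))
        ])
    return attended


def decoder_input(tokens, visible, mask_token):
    """Keep visible tokens in place and put a mask token at every masked position."""
    keep = set(visible)
    return [tok if i in keep else mask_token for i, tok in enumerate(tokens)]


rng = random.Random(0)
num_patches, dim, mask_ratio = 8, 4, 0.75  # assumed toy sizes and ratio

# Random vectors standing in for the outputs of the two separate encoders.
pet_tokens = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_patches)]
ct_tokens = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_patches)]

# Each modality is masked independently.
pet_visible, pet_masked = random_mask(num_patches, mask_ratio, rng)
ct_visible, ct_masked = random_mask(num_patches, mask_ratio, rng)

mask_token = [0.0] * dim  # a learned parameter in a real model
pet_queries = decoder_input(pet_tokens, pet_visible, mask_token)
ct_queries = decoder_input(ct_tokens, ct_visible, mask_token)

# The PET decoder attends to visible CT tokens, and the CT decoder to visible PET tokens.
pet_decoded = cross_attention(pet_queries, [ct_tokens[i] for i in ct_visible])
ct_decoded = cross_attention(ct_queries, [pet_tokens[i] for i in pet_visible])

print(len(pet_visible), len(pet_masked), len(pet_decoded), len(pet_decoded[0]))
```

In this sketch each decoder receives information about the other modality only through the attention step, which is one plausible reading of how masked patches in PET could be reconstructed with help from CT anatomy and vice versa.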

arXiv information

Authors	Yujin Oh, Robert Seifert, Yihan Cao, Christoph Clement, Justin Ferdinandus, Constantin Lapa, Alessandro Liebich, Michelle Amon, Johanna Enke, Sifan Song, Runqi Meng, Fang Zeng, Ning Guo, Xiang Li, Pedram Heidari, Axel Rominger, Kuangyu Shi, Quanzheng Li
Published	2025-03-04 17:49:07+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
