Cross-D Conv: Cross-Dimensional Transferable Knowledge Base via Fourier Shifting Operation

Summary

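In biomedical image analysis, the dichotomy between 2D and 3D data poses a significant challenge.
3D volumes offer superior real-world applicability, but they are less available for each modality and hard to train on at large scale. 2D samples, by contrast, are abundant but less comprehensive.
This paper introduces the \texttt{Cross-D Conv} operation, a novel approach that bridges the dimensional gap by learning phase shifts in the Fourier domain.
The method enables seamless weight transfer between 2D and 3D convolution operations, effectively facilitating cross-dimensional learning.
The proposed architecture leverages abundant 2D training data to improve 3D model performance, offering a practical solution to the multimodal data scarcity problem in 3D medical model pretraining.
Experimental validation on RadImagenet (2D) and multimodal volumetric sets demonstrates that the approach achieves comparable or superior performance in feature quality assessment.
The enhanced convolution operation opens new opportunities for developing efficient classification and segmentation models in medical imaging.
This work represents an advance in cross-dimensional, multimodal medical image analysis, offering a robust framework for using 2D priors in 3D model pretraining while retaining the computational efficiency of 2D training.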
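
As a rough illustration of the idea of a phase shift in the Fourier domain, the sketch below uses the standard Fourier shift theorem to translate a 2D kernel and then stacks several shifted copies into a 3D kernel. This is not the authors' implementation: the function names `fourier_shift_2d` and `lift_to_3d`, the fixed (rather than learned) shift values, and the stacking scheme are all assumptions made only for illustration.

```python
import numpy as np

def fourier_shift_2d(kernel, dy, dx):
    """Circularly shift a 2D array by (dy, dx) samples via the Fourier shift theorem.

    Hypothetical helper for illustration; shifts may be fractional.
    """
    h, w = kernel.shape
    fy = np.fft.fftfreq(h)[:, None]  # frequencies along rows, cycles per sample
    fx = np.fft.fftfreq(w)[None, :]  # frequencies along columns
    # A translation in space is a linear phase ramp in frequency.
    phase = np.exp(-2j * np.pi * (fy * dy + fx * dx))
    return np.fft.ifft2(np.fft.fft2(kernel) * phase).real

def lift_to_3d(kernel, shifts):
    """Stack phase-shifted copies of a 2D kernel along a new depth axis.

    Assumption: one (dy, dx) shift per depth slice. In a trainable layer
    these shifts would be parameters; here they are fixed numbers.
    """
    return np.stack([fourier_shift_2d(kernel, dy, dx) for dy, dx in shifts], axis=0)

# A 5x5 kernel with a single nonzero weight at the center.
kernel = np.zeros((5, 5))
kernel[2, 2] = 1.0

# Three depth slices: shifted up one row, unshifted, shifted down one row.
k3d = lift_to_3d(kernel, [(-1, 0), (0, 0), (1, 0)])
```

With integer shifts the result matches `np.roll` on the original kernel, which makes the sketch easy to check. Fractional shifts interpolate the kernel between grid positions, which is what makes a shift amount usable as a continuous parameter.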

Abstract (original)

In biomedical imaging analysis, the dichotomy between 2D and 3D data presents a significant challenge. While 3D volumes offer superior real-world applicability, they are less available for each modality and not easy to train in large scale, whereas 2D samples are abundant but less comprehensive. This paper introduces \texttt{Cross-D Conv} operation, a novel approach that bridges the dimensional gap by learning the phase shifting in the Fourier domain. Our method enables seamless weight transfer between 2D and 3D convolution operations, effectively facilitating cross-dimensional learning. The proposed architecture leverages the abundance of 2D training data to enhance 3D model performance, offering a practical solution to the multimodal data scarcity challenge in 3D medical model pretraining. Experimental validation on the RadImagenet (2D) and multimodal volumetric sets demonstrates that our approach achieves comparable or superior performance in feature quality assessment. The enhanced convolution operation presents new opportunities for developing efficient classification and segmentation models in medical imaging. This work represents an advancement in cross-dimensional and multimodal medical image analysis, offering a robust framework for utilizing 2D priors in 3D model pretraining while maintaining computational efficiency of 2D training.

arXiv information

Authors	Mehmet Can Yavuz, Yang Yang
Published	2025-01-22 18:23:26+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
