Can representation learning for multimodal image registration be improved by supervision of intermediate layers?

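Summary

Multimodal imaging and correlative analysis typically require image alignment.
Contrastive learning can generate representations of multimodal images, reducing the challenging task of multimodal image registration to a monomodal one.
Previously, additional supervision on intermediate layers in contrastive learning improved biomedical image classification.
We evaluate whether a similar approach improves the representations learned for registration and thereby boosts registration performance.
We explore three approaches for adding contrastive supervision to the latent features of the bottleneck layer in the U-Nets that encode the multimodal images, and we evaluate three different critic functions.
Our results show that representations learned without additional supervision on the latent features perform best in the downstream task of registration on two public biomedical datasets.
We investigate the performance drop by drawing on recent insights in contrastive learning for classification and in self-supervised learning.
We visualize the spatial relations of the learned representations using multidimensional scaling and show that additional supervision on the bottleneck layer can lead to partial dimensional collapse of the intermediate embedding space.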

Summary (original)

Multimodal imaging and correlative analysis typically require image alignment. Contrastive learning can generate representations of multimodal images, reducing the challenging task of multimodal image registration to a monomodal one. Previously, additional supervision on intermediate layers in contrastive learning has improved biomedical image classification. We evaluate if a similar approach improves representations learned for registration to boost registration performance. We explore three approaches to add contrastive supervision to the latent features of the bottleneck layer in the U-Nets encoding the multimodal images and evaluate three different critic functions. Our results show that representations learned without additional supervision on latent features perform best in the downstream task of registration on two public biomedical datasets. We investigate the performance drop by exploiting recent insights in contrastive learning in classification and self-supervised learning. We visualize the spatial relations of the learned representations by means of multidimensional scaling, and show that additional supervision on the bottleneck layer can lead to partial dimensional collapse of the intermediate embedding space.
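
To make two of the ideas in the abstract concrete, the following is a minimal NumPy sketch of a contrastive (InfoNCE-style) loss with a pluggable critic function, together with a simple singular-value check for dimensional collapse. It is an illustration under stated assumptions and not the authors' implementation: the abstract does not name the three critic functions, so the cosine and negative squared Euclidean critics, the temperature `tau`, the toy embeddings, and all function names here are hypothetical choices.

```python
import numpy as np


def cosine_critic(a, b):
    """Pairwise cosine similarity between rows of a and rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def neg_sq_l2_critic(a, b):
    """Pairwise negative squared Euclidean distance (higher = more similar)."""
    diff = a[:, None, :] - b[None, :, :]
    return -np.sum(diff ** 2, axis=-1)


def info_nce(z_a, z_b, critic, tau=0.5):
    """InfoNCE-style loss; row i of z_a and row i of z_b are the positive pair.

    tau is an assumed temperature, not a value taken from the paper.
    """
    logits = critic(z_a, z_b) / tau
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))


def singular_value_spectrum(z):
    """Singular values of the centered embedding matrix, largest first."""
    centered = z - z.mean(axis=0, keepdims=True)
    return np.linalg.svd(centered, compute_uv=False)


def numerical_rank(spectrum, rel_tol=1e-8):
    """Count singular values that are non-negligible relative to the largest."""
    return int(np.sum(spectrum > rel_tol * spectrum[0]))


# Toy data standing in for latent features of two modalities (hypothetical).
rng = np.random.default_rng(0)
z_a = rng.normal(size=(64, 16))
z_b = z_a + 0.1 * rng.normal(size=(64, 16))   # well-aligned second "modality"
z_unrelated = rng.normal(size=(64, 16))       # no correspondence to z_a

loss_aligned = info_nce(z_a, z_b, cosine_critic)
loss_unrelated = info_nce(z_a, z_unrelated, cosine_critic)
loss_aligned_l2 = info_nce(z_a, z_b, neg_sq_l2_critic)

# Simulate collapse by forcing the embeddings into a 4-dimensional subspace.
basis = rng.normal(size=(16, 4))
z_collapsed = z_a @ basis @ basis.T

rank_full = numerical_rank(singular_value_spectrum(z_a))
rank_collapsed = numerical_rank(singular_value_spectrum(z_collapsed))

print("InfoNCE (aligned, cosine):  ", loss_aligned)
print("InfoNCE (unrelated, cosine):", loss_unrelated)
print("numerical rank full / collapsed:", rank_full, "/", rank_collapsed)
```

In this toy setting the aligned pairs yield a lower loss than unrelated pairs, and the collapsed embeddings show a singular-value spectrum with only a few non-negligible values. Inspecting the spectrum of real intermediate embeddings in the same way is one common diagnostic for partial dimensional collapse; the abstract itself reports using multidimensional scaling for visualization.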

arXiv information

Authors	Elisabeth Wetzer, Joakim Lindblad, Nataša Sladoje
Published	2023-03-01 10:51:27+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
