Tell2Reg: Establishing spatial correspondence between images by the same language prompts

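Summary

Spatial correspondence can be represented by pairs of segmented regions, so that an image registration network aims to segment corresponding regions rather than predict displacement fields or transformation parameters.
In this work, we show that such a corresponding region pair can be predicted by applying the same language prompt to two different images, using pre-trained large multimodal models based on GroundingDINO and SAM.
This enables a fully automated, training-free registration algorithm that is potentially generalisable to a wide range of image registration tasks.
In this paper, we present experimental results on one challenging task, registering inter-subject prostate MR images, which involves highly variable intensity and morphology between patients.
Tell2Reg is training-free, eliminating the costly and time-consuming data curation and labelling previously required for this registration task.
The approach outperforms the unsupervised learning-based registration methods tested and performs comparably to weakly-supervised methods.
Additional qualitative results are also presented to suggest, for the first time, a potential correlation between language semantics and spatial correspondence, including the spatial invariance of language-prompted regions and the difference in language prompts between the local and global correspondences obtained.
Code is available at https://github.com/yanwenCi/Tell2Reg.git.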

Abstract (original)

Spatial correspondence can be represented by pairs of segmented regions, such that the image registration networks aim to segment corresponding regions rather than predicting displacement fields or transformation parameters. In this work, we show that such a corresponding region pair can be predicted by the same language prompt on two different images using the pre-trained large multimodal models based on GroundingDINO and SAM. This enables a fully automated and training-free registration algorithm, potentially generalisable to a wide range of image registration tasks. In this paper, we present experimental results using one of the challenging tasks, registering inter-subject prostate MR images, which involves both highly variable intensity and morphology between patients. Tell2Reg is training-free, eliminating the need for costly and time-consuming data curation and labelling that was previously required for this registration task. This approach outperforms unsupervised learning-based registration methods tested, and has a performance comparable to weakly-supervised methods. Additional qualitative results are also presented to suggest that, for the first time, there is a potential correlation between language semantics and spatial correspondence, including the spatial invariance in language-prompted regions and the difference in language prompts between the obtained local and global correspondences. Code is available at https://github.com/yanwenCi/Tell2Reg.git.
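The idea of representing correspondence as region pairs obtained from the same prompt can be illustrated with a minimal sketch. It is not the authors' implementation: `segment` is a hypothetical stand-in for a GroundingDINO + SAM pipeline, masks are simplified to sets of pixel coordinates, and the centroid shift is only a toy summary of the correspondence carried by the pairs, not a measure used in the paper.

```python
from typing import Callable, Optional

Mask = set[tuple[int, int]]  # (row, col) pixels inside one region
# Hypothetical interface: returns a mask for the prompt, or None if nothing is found.
Segmenter = Callable[[object, str], Optional[Mask]]


def paired_rois(image_a, image_b, prompts, segment: Segmenter):
    """Keep a prompt only if it yields a non-empty region in both images."""
    pairs = []
    for prompt in prompts:
        mask_a = segment(image_a, prompt)
        mask_b = segment(image_b, prompt)
        if mask_a and mask_b:
            pairs.append((prompt, mask_a, mask_b))
    return pairs


def centroid(mask: Mask):
    """Mean (row, col) position of the pixels in a mask."""
    return (sum(r for r, _ in mask) / len(mask),
            sum(c for _, c in mask) / len(mask))


def mean_centroid_shift(pairs):
    """Average centroid displacement from image A to image B over all pairs."""
    if not pairs:
        raise ValueError("no corresponding region pairs found")
    shifts = []
    for _, mask_a, mask_b in pairs:
        (ra, ca), (rb, cb) = centroid(mask_a), centroid(mask_b)
        shifts.append((rb - ra, cb - ca))
    n = len(shifts)
    return (sum(s[0] for s in shifts) / n, sum(s[1] for s in shifts) / n)


# Toy data: each "image" is a dict from prompt to mask, so the stand-in
# segmenter is a simple lookup. A real segmenter would run on pixel data.
def toy_segment(image, prompt):
    return image.get(prompt)


image_a = {"gland": {(0, 0), (0, 1), (1, 0), (1, 1)}, "bladder": {(5, 5)}}
image_b = {"gland": {(2, 3), (2, 4), (3, 3), (3, 4)}}

pairs = paired_rois(image_a, image_b, ["gland", "bladder"], toy_segment)
shift = mean_centroid_shift(pairs)
print([p[0] for p in pairs], shift)
```

In this toy case only the prompt found in both images contributes a region pair; a prompt that matches in just one image is discarded.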

arXiv information

Authors	Wen Yan, Qianye Yang, Shiqi Huang, Yipei Wang, Shonit Punwani, Mark Emberton, Vasilis Stavrinides, Yipeng Hu, Dean Barratt
Published	2025-02-05 12:25:02+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
