Cross-domain and Cross-dimension Learning for Image-to-Graph Transformers

Abstract

Direct image-to-graph transformation is a challenging task that involves solving object detection and relationship prediction in a single model. Due to this task’s complexity, large training datasets are rare in many domains, making the training of deep-learning methods challenging. This data sparsity necessitates transfer learning strategies akin to the state-of-the-art in general computer vision. In this work, we introduce a set of methods enabling cross-domain and cross-dimension learning for image-to-graph transformers. We propose (1) a regularized edge sampling loss to effectively learn object relations in multiple domains with different numbers of edges, (2) a domain adaptation framework for image-to-graph transformers aligning image- and graph-level features from different domains, and (3) a projection function that allows using 2D data for training 3D transformers. We demonstrate our method’s utility in cross-domain and cross-dimension experiments, where we utilize labeled data from 2D road networks for simultaneous learning in vastly different target domains. Our method consistently outperforms standard transfer learning and self-supervised pretraining on challenging benchmarks, such as retinal or whole-brain vessel graph extraction.
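
To make the idea of reusing 2D data for a 3D model more concrete, the following is a minimal sketch and not the authors' projection function, whose details the abstract does not give. It assumes the simplest possible lifting: the 2D image is placed in one axis-aligned slice of a zero-filled volume, and node coordinates (assumed normalized to [0, 1]) receive a constant depth coordinate. The helper name `lift_2d_to_3d` and its parameters `depth` and `slice_index` are hypothetical.

```python
import numpy as np


def lift_2d_to_3d(image, nodes_xy, depth=8, slice_index=None):
    """Embed a 2D image and its graph nodes into a 3D volume.

    Hypothetical helper for illustration only; the paper's actual
    projection function may differ substantially.
    """
    image = np.asarray(image)
    nodes_xy = np.asarray(nodes_xy, dtype=float)
    if slice_index is None:
        slice_index = depth // 2  # default: place the image in the middle slice

    height, width = image.shape
    # Zero-filled volume with the image written into a single slice.
    volume = np.zeros((depth, height, width), dtype=image.dtype)
    volume[slice_index] = image

    # Assumption: node coordinates are normalized to [0, 1]. The new third
    # coordinate is the normalized centre of the chosen slice.
    z = (slice_index + 0.5) / depth
    z_column = np.full((len(nodes_xy), 1), z)
    nodes_xyz = np.concatenate([nodes_xy, z_column], axis=1)
    return volume, nodes_xyz


if __name__ == "__main__":
    image = np.arange(12, dtype=np.float32).reshape(3, 4)
    nodes = [[0.1, 0.2], [0.5, 0.9]]
    volume, nodes_xyz = lift_2d_to_3d(image, nodes, depth=4)
    print(volume.shape)  # (depth, height, width)
    print(nodes_xyz)
```

The edge list needs no change under this lifting, because edges refer to node indices rather than coordinates.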

arXiv information

Authors	Alexander H. Berger, Laurin Lux, Suprosanna Shit, Ivan Ezhov, Georgios Kaissis, Martin J. Menten, Daniel Rueckert, Johannes C. Paetzold
Published	2024-12-05 15:19:47+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
