AbHE: All Attention-based Homography Estimation

Summary

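Homography estimation is a fundamental computer vision task that aims to recover the transformation between multi-view images so that they can be aligned.
Unsupervised homography estimation trains a convolutional neural network for feature extraction and transformation-matrix regression.
State-of-the-art homography methods are built on convolutional neural networks, yet little work has examined transformers, which have shown strong results on high-level vision tasks.
This paper proposes a strong baseline model based on the Swin Transformer that combines a convolutional neural network for local features with a transformer module for global features.
In addition, a cross non-local layer is introduced to coarsely search for matching features within the feature maps.
In the homography regression stage, an attention layer is applied to the channels of the correlation volume, which can drop some weakly correlated feature points.
Experiments show that the method outperforms the state-of-the-art method on 8 degree-of-freedom (DOF) homography estimation.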
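
The 8-DOF figure comes from the fact that a homography is a 3x3 matrix defined only up to scale, so it has eight free parameters, and four point correspondences (two equations each) are enough to determine it. The sketch below illustrates that counting argument in plain Python. It is not the network described in the paper: the function names are hypothetical, and it assumes that the bottom-right entry can be normalized to 1 and that no three of the four points are collinear.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so the caller's lists are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def homography_from_4_points(src, dst):
    """Return a 3x3 matrix H (with h33 fixed to 1) mapping src[i] to dst[i]."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # From u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), and likewise for v.
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)  # the eight unknowns: 8 DOF
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def warp_point(h, point):
    """Apply H to a 2D point using homogeneous coordinates."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

Calling `homography_from_4_points` on the corners of a unit square and a shifted copy of them yields a matrix whose last row is approximately `[0, 0, 1]`, which is the pure-translation special case. A learned estimator such as the one summarized here regresses these parameters from image features instead of solving for them from known correspondences.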

Summary (original)

Homography estimation is a basic computer vision task, which aims to obtain the transformation from multi-view images for image alignment. Unsupervised learning homography estimation trains a convolutional neural network for feature extraction and transformation matrix regression. While the state-of-the-art homography method is based on convolutional neural networks, little work focuses on transformers, which show superiority in high-level vision tasks. In this paper, we propose a strong baseline model based on the Swin Transformer, which combines a convolutional neural network for local features and a transformer module for global features. Moreover, a cross non-local layer is introduced to coarsely search for matched features within the feature maps. In the homography regression stage, we adopt an attention layer for the channels of the correlation volume, which can drop some weakly correlated feature points. The experiment shows that in 8 degree-of-freedom (DOF) homography estimation our method outperforms the state-of-the-art method.
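
The abstract does not spell out the cross non-local layer, so the following is only a rough sketch of the general idea behind cross attention between two feature maps: each position in one image scores every position in the other and aggregates the best-matching features. It assumes the feature maps are already flattened into lists of vectors and omits the learned projections that a real non-local block would use. The function names are hypothetical and not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cross_attention(feats_a, feats_b):
    """Attend from every position in A to every position in B.

    feats_a, feats_b: lists of feature vectors (flattened feature maps).
    Returns (weights, attended): weights[i][j] is how strongly position i
    of A matches position j of B, and attended[i] is the weighted mean of
    the B features for position i.
    """
    dim = len(feats_a[0])
    scale = math.sqrt(dim)  # scaled dot-product, an assumption of this sketch
    weights, attended = [], []
    for query in feats_a:
        row = softmax([dot(query, key) / scale for key in feats_b])
        weights.append(row)
        attended.append([
            sum(w * key[d] for w, key in zip(row, feats_b))
            for d in range(dim)
        ])
    return weights, attended
```

Each row of `weights` sums to 1, and its largest entry can be read as a coarse guess at which position in the second image matches a given position in the first. That is the sense in which such a layer searches for matched features coarsely.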

arXiv information

Authors	Mingxiao Huo, Zhihao Zhang, Xianqiang Yang
Published	2022-12-07 02:04:41+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
