2T-UNET: A Two-Tower UNet with Depth Clues for Robust Stereo Depth Estimation

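Abstract

Stereo correspondence matching is a key part of the multi-step stereo depth estimation pipeline.
This paper revisits depth estimation with a simple two-tower convolutional neural network that avoids an explicit stereo matching step.
The proposed algorithm is called 2T-UNet.
The idea behind 2T-UNet is to replace cost volume construction with twin convolutional towers.
The two towers are allowed to have different weights.
The inputs to 2T-UNet's twin encoders also differ from those of existing stereo methods.
A stereo network typically takes a left-right image pair as input to determine the scene geometry.
In 2T-UNet, however, the right stereo image is one input, and the left stereo image together with its monocular depth clue information is the other.
The depth clues provide complementary cues that help improve the quality of the predicted scene geometry.
2T-UNet outperforms state-of-the-art monocular and stereo depth estimation methods on the challenging Scene Flow dataset, both quantitatively and qualitatively.
The architecture performs very well on complex natural scenes, which highlights its usefulness for a range of real-time applications.
Pretrained weights and code will be made available.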
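
The input arrangement described above can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify layer types, how the two towers are fused, or how the depth clue is encoded. The sketch assumes the clue is a single-channel map stacked onto the left image, that each tower is a single per-pixel linear layer with ReLU, and that the towers are fused by channel concatenation. The names `init_params`, `tower` and `two_tower_forward` are hypothetical.

```python
import numpy as np


def init_params(rng, hidden=8):
    """Create separate (unshared) weights for the two towers and a depth head."""
    def layer(c_in, c_out):
        return {"w": rng.normal(0.0, 0.1, (c_in, c_out)), "b": np.zeros(c_out)}

    return {
        "left": layer(4, hidden),       # left RGB (3) + depth clue (1)
        "right": layer(3, hidden),      # right RGB only
        "head": layer(2 * hidden, 1),   # maps fused features to one depth value
    }


def tower(x, p):
    """Toy stand-in for an encoder tower: per-pixel linear map plus ReLU."""
    return np.maximum(x @ p["w"] + p["b"], 0.0)


def two_tower_forward(left_rgb, left_depth_clue, right_rgb, params):
    # Tower input 1: left image stacked with its monocular depth clue (assumed 1 channel).
    left_in = np.concatenate([left_rgb, left_depth_clue[..., None]], axis=-1)
    feat_left = tower(left_in, params["left"])
    # Tower input 2: the right image alone, processed with its own weights.
    feat_right = tower(right_rgb, params["right"])
    # Assumed fusion: channel concatenation; no cost volume is built.
    fused = np.concatenate([feat_left, feat_right], axis=-1)
    depth = fused @ params["head"]["w"] + params["head"]["b"]
    return depth[..., 0]  # (H, W) depth map
```

Calling `two_tower_forward` with two `(H, W, 3)` images and an `(H, W)` depth clue returns an `(H, W)` map. The point of the sketch is only the data flow: two differently shaped inputs, two independent weight sets, and no explicit matching step between the views.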

Abstract (original)

Stereo correspondence matching is an essential part of the multi-step stereo depth estimation process. This paper revisits the depth estimation problem, avoiding the explicit stereo matching step using a simple two-tower convolutional neural network. The proposed algorithm is entitled as 2T-UNet. The idea behind 2T-UNet is to replace cost volume construction with twin convolution towers. These towers have an allowance for different weights between them. Additionally, the input for twin encoders in 2T-UNet are different compared to the existing stereo methods. Generally, a stereo network takes a right and left image pair as input to determine the scene geometry. However, in the 2T-UNet model, the right stereo image is taken as one input and the left stereo image along with its monocular depth clue information, is taken as the other input. Depth clues provide complementary suggestions that help enhance the quality of predicted scene geometry. The 2T-UNet surpasses state-of-the-art monocular and stereo depth estimation methods on the challenging Scene flow dataset, both quantitatively and qualitatively. The architecture performs incredibly well on complex natural scenes, highlighting its usefulness for various real-time applications. Pretrained weights and code will be made readily available.

arXiv information

Authors	Rohit Choudhary, Mansi Sharma, Rithvik Anil
Published	2022-10-27 12:34:41+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
