Stereo Anywhere: Robust Zero-Shot Deep Stereo Matching Even Where Either Stereo or Mono Fail

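Summary

Stereo Anywhere is a new stereo-matching framework that combines geometric constraints with robust priors from monocular depth Vision Foundation Models (VFMs).
A dual-branch architecture couples these two complementary sources, integrating stereo matching with learned contextual cues.
Building on this design, the framework introduces new cost volume fusion mechanisms that handle critical challenges such as textureless regions, occlusions, and non-Lambertian surfaces.
Using a new optical illusion dataset, MonoTrap, and extensive evaluation across multiple benchmarks, the authors show that a model trained only on synthetic data achieves state-of-the-art zero-shot generalization, significantly outperforming existing solutions while remaining robust in challenging cases such as mirrors and transparent surfaces.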

Abstract (original)

We introduce Stereo Anywhere, a novel stereo-matching framework that combines geometric constraints with robust priors from monocular depth Vision Foundation Models (VFMs). By elegantly coupling these complementary worlds through a dual-branch architecture, we seamlessly integrate stereo matching with learned contextual cues. Following this design, our framework introduces novel cost volume fusion mechanisms that effectively handle critical challenges such as textureless regions, occlusions, and non-Lambertian surfaces. Through our novel optical illusion dataset, MonoTrap, and extensive evaluation across multiple benchmarks, we demonstrate that our synthetic-only trained model achieves state-of-the-art results in zero-shot generalization, significantly outperforming existing solutions while showing remarkable robustness to challenging cases such as mirrors and transparencies.
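To make the "textureless regions" challenge concrete, here is a minimal sketch of a classical matching cost volume on a single scanline. It is not the paper's dual-branch network or its fusion mechanism; the function names (`cost_volume`, `winner_take_all`, `ambiguous_pixels`), the absolute-difference cost, and the toy intensity values are all assumptions chosen for illustration.

```python
def cost_volume(left, right, max_disp):
    """Absolute-difference matching cost for one scanline (illustrative only).

    cost[d][x] compares left[x] with right[x - d]; candidates whose match
    would fall outside the image receive an infinite cost.
    """
    width = len(left)
    inf = float("inf")
    return [
        [abs(left[x] - right[x - d]) if x - d >= 0 else inf for x in range(width)]
        for d in range(max_disp + 1)
    ]


def winner_take_all(cost):
    """For every pixel, pick the disparity with the lowest cost."""
    width = len(cost[0])
    return [min(range(len(cost)), key=lambda d: cost[d][x]) for x in range(width)]


def ambiguous_pixels(cost, tol=1e-9):
    """Return pixels where two or more disparities tie for the lowest cost."""
    width = len(cost[0])
    tied = []
    for x in range(width):
        column = [cost[d][x] for d in range(len(cost))]
        best = min(column)
        if sum(1 for c in column if c - best <= tol) > 1:
            tied.append(x)
    return tied


# Toy data (hypothetical): the left scanline is the right one shifted by 2 pixels.
right_textured = [3, 9, 1, 7, 4, 8, 2, 6]
left_textured = [5, 5, 3, 9, 1, 7, 4, 8]
textured = cost_volume(left_textured, right_textured, max_disp=3)

# A uniform (textureless) scanline: every valid disparity matches equally well.
flat = cost_volume([5] * 8, [5] * 8, max_disp=3)

print("textured disparities:", winner_take_all(textured))
print("textured ambiguous pixels:", ambiguous_pixels(textured))
print("flat ambiguous pixels:", ambiguous_pixels(flat))
```

On the textured scanline no pixel has tied candidates, while on the flat scanline every pixel with more than one valid candidate is tied. Matching cost alone cannot resolve that ambiguity, which is the kind of situation where an additional cue such as a monocular depth prior can help.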

arXiv information

Authors	Luca Bartolomei, Fabio Tosi, Matteo Poggi, Stefano Mattoccia
Published	2024-12-05 18:59:58+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
