Structure-Centric Robust Monocular Depth Estimation via Knowledge Distillation

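Summary

Monocular depth estimation, enabled by self-supervised learning, is a key technique for 3D perception in computer vision.
However, it faces significant challenges in real-world scenarios such as adverse weather, motion blur, and poorly lit night scenes.
Our research shows that monocular depth estimation can be divided into three sub-problems: depth structure consistency, local texture disambiguation, and semantic-structural correlation.
Our approach addresses the lack of robustness of existing self-supervised monocular depth estimation models to interfering textures by adopting a structure-centric perspective and exploiting the scene structure characteristics revealed by semantics and illumination.
We devise a novel approach that reduces over-reliance on local textures and improves robustness to missing or interfering patterns.
In addition, we incorporate a semantic expert model as the teacher and construct inter-model feature dependencies via learnable isomorphic graphs, enabling the aggregation of semantic structural knowledge.
Our approach achieves state-of-the-art out-of-distribution monocular depth estimation performance across a range of public adverse-scenario datasets.
It shows notable scalability and compatibility without requiring extensive model engineering.
This demonstrates the potential for customizing models for diverse industrial applications.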

Summary (original)

Monocular depth estimation, enabled by self-supervised learning, is a key technique for 3D perception in computer vision. However, it faces significant challenges in real-world scenarios, which encompass adverse weather variations, motion blur, as well as scenes with poor lighting conditions at night. Our research reveals that we can divide monocular depth estimation into three sub-problems: depth structure consistency, local texture disambiguation, and semantic-structural correlation. Our approach tackles the non-robustness of existing self-supervised monocular depth estimation models to interference textures by adopting a structure-centered perspective and utilizing the scene structure characteristics demonstrated by semantics and illumination. We devise a novel approach to reduce over-reliance on local textures, enhancing robustness against missing or interfering patterns. Additionally, we incorporate a semantic expert model as the teacher and construct inter-model feature dependencies via learnable isomorphic graphs to enable aggregation of semantic structural knowledge. Our approach achieves state-of-the-art out-of-distribution monocular depth estimation performance across a range of public adverse scenario datasets. It demonstrates notable scalability and compatibility, without necessitating extensive model engineering. This showcases the potential for customizing models for diverse industrial applications.

arXiv information

Authors	Runze Chen, Haiyong Luo, Fang Zhao, Jingze Yu, Yupeng Jia, Juan Wang, Xuepeng Ma
Published	2024-10-09 15:20:29+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
