Camera Height Doesn’t Change: Unsupervised Monocular Scale-Aware Road-Scene Depth Estimation

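Summary

Monocular depth estimators either need explicit scale supervision from auxiliary sensors or suffer from scale ambiguity, which makes them hard to deploy in downstream applications.
The sizes of objects in the scene are a possible source of scale, but inaccurate localization makes them hard to exploit.
This paper introduces StableCamH, a new scale-aware monocular depth estimation method that requires no auxiliary sensors or supervision.
The key idea is to use prior knowledge of object heights in the scene while aggregating the height cues into a single invariant measure shared by every frame of a road video sequence: the camera height.
Formulating monocular depth estimation as camera height optimization yields robust and accurate unsupervised end-to-end training.
To realize StableCamH, the authors devise a new learning-based size prior that converts a car's appearance directly into its dimensions.
Extensive experiments on KITTI and Cityscapes show StableCamH's effectiveness, its state-of-the-art accuracy relative to related methods, and its generalizability.
The StableCamH training framework can be used with any monocular depth estimation method and is expected to become a basic building block for future research.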

Summary (original)

Monocular depth estimators either require explicit scale supervision through auxiliary sensors or suffer from scale ambiguity, which renders them difficult to deploy in downstream applications. A possible source of scale is the sizes of objects found in the scene, but inaccurate localization makes them difficult to exploit. In this paper, we introduce a novel scale-aware monocular depth estimation method called StableCamH that does not require any auxiliary sensor or supervision. The key idea is to exploit prior knowledge of object heights in the scene but aggregate the height cues into a single invariant measure common to all frames in a road video sequence, namely the camera height. By formulating monocular depth estimation as camera height optimization, we achieve robust and accurate unsupervised end-to-end training. To realize StableCamH, we devise a novel learning-based size prior that can directly convert car appearance into its dimensions. Extensive experiments on KITTI and Cityscapes show the effectiveness of StableCamH, its state-of-the-art accuracy compared with related methods, and its generalizability. The training framework of StableCamH can be used for any monocular depth estimation method and will hopefully become a fundamental building block for further work.
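
The camera-height idea in the abstract can be illustrated with a toy sketch. This is not the StableCamH implementation: the function names, the SVD plane fit, and the median aggregation are all assumptions chosen for illustration, and the "size prior" is reduced to a given metric car height. The sketch only shows why a noisy per-object scale cue becomes more stable once it is folded into one camera height shared by the whole sequence.

```python
import numpy as np


def relative_camera_height(road_points):
    """Distance from the camera origin to a plane fitted to road points.

    road_points: (N, 3) array in camera coordinates, in up-to-scale units.
    """
    centroid = road_points.mean(axis=0)
    # The plane normal is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(road_points - centroid)
    normal = vt[-1]
    return float(abs(normal @ centroid))


def frame_camera_height(road_points, object_rel_heights, object_prior_heights):
    """Metric camera height for one frame, from object-height cues (hypothetical helper)."""
    rel = np.asarray(object_rel_heights, dtype=float)
    prior = np.asarray(object_prior_heights, dtype=float)
    # Each object gives one scale estimate: metric prior / up-to-scale measurement.
    frame_scale = float(np.median(prior / rel))
    return frame_scale * relative_camera_height(road_points)


def sequence_camera_height(per_frame_heights):
    """Aggregate per-frame estimates into a single value (median is an assumption)."""
    return float(np.median(per_frame_heights))


def synthetic_frame(unknown_scale, true_cam_height=1.6, car_height=1.5, seed=0):
    """Toy frame: flat road below the camera plus one car, divided by an unknown scale."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5.0, 5.0, size=50)
    z = rng.uniform(2.0, 30.0, size=50)
    road_metric = np.column_stack([x, np.full(50, true_cam_height), z])
    return road_metric / unknown_scale, np.array([car_height / unknown_scale])


if __name__ == "__main__":
    heights = []
    for i, k in enumerate([0.5, 2.0, 7.0]):
        road, car_rel = synthetic_frame(k, seed=i)
        heights.append(frame_camera_height(road, car_rel, [1.5]))
    print(sequence_camera_height(heights))
```

Under these assumptions, a metric depth map for any frame would follow by multiplying its up-to-scale depth by the sequence camera height divided by that frame's `relative_camera_height`.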

arXiv information

Authors	Genki Kinoshita, Ko Nishino
Published	2023-12-07 18:50:01+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
