A Compacted Structure for Cross-domain learning on Monocular Depth and Flow Estimation

Abstract

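Accurate recovery of motion and depth is important for many robot vision tasks, including autonomous driving.
Most previous studies achieve cooperative multi-task interaction through either pre-defined loss functions or cross-domain prediction.
This paper presents a multi-task scheme that achieves mutual assistance through Flow to Depth (F2D), Depth to Flow (D2F), and an Exponential Moving Average (EMA).
The F2D and D2F mechanisms enable multi-scale information integration between the optical flow and depth domains using differentiable shallow networks.
A dual-head mechanism predicts optical flow for rigid and non-rigid motion in a divide-and-conquer manner, which significantly improves optical flow estimation performance.
Furthermore, EMA is used in multi-task training to make the predictions more robust and stable.
Experimental results on the KITTI datasets show that the multi-task scheme outperforms other multi-task schemes and markedly improves the prediction results.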
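
The abstract does not say what the EMA is applied to or how it is configured. As a rough illustration only, the sketch below assumes the common variant in which a smoothed "shadow" copy of the model parameters is updated after each optimiser step. The function name `ema_update`, the dictionary-of-scalars parameter layout, and the decay values are all hypothetical and are not taken from the paper.

```python
def ema_update(shadow, params, decay=0.99):
    """Blend the current parameters into the running average, in place.

    Assumption: EMA over parameters, shadow <- decay * shadow + (1 - decay) * param.
    The abstract does not confirm that this is the variant the authors use.
    """
    for name, value in params.items():
        shadow[name] = decay * shadow[name] + (1.0 - decay) * value
    return shadow


# Toy usage: a single scalar stands in for the network weights.
params = {"w": 1.0}
shadow = dict(params)  # initialise the average from the starting weights

for step in range(3):
    params["w"] += 1.0  # stand-in for one optimiser step
    ema_update(shadow, params, decay=0.5)

print(shadow["w"])
```

A decay close to 1 makes the averaged parameters change slowly, which is the usual motivation for using them to obtain more stable predictions. The value 0.5 in the usage loop is chosen only to keep the arithmetic easy to follow.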

Abstract (original)

Accurate motion and depth recovery is important for many robot vision tasks including autonomous driving. Most previous studies have achieved cooperative multi-task interaction via either pre-defined loss functions or cross-domain prediction. This paper presents a multi-task scheme that achieves mutual assistance by means of our Flow to Depth (F2D), Depth to Flow (D2F), and Exponential Moving Average (EMA). F2D and D2F mechanisms enable multi-scale information integration between optical flow and depth domain based on differentiable shallow nets. A dual-head mechanism is used to predict optical flow for rigid and non-rigid motion based on a divide-and-conquer manner, which significantly improves the optical flow estimation performance. Furthermore, to make the prediction more robust and stable, EMA is used for our multi-task training. Experimental results on KITTI datasets show that our multi-task scheme outperforms other multi-task schemes and provide marked improvements on the prediction results.

arXiv information

Authors	Yu Chen, Xu Cao, Xiaoyi Lin, Baoru Huang, Xiao-Yun Zhou, Jian-Qing Zheng, Guang-Zhong Yang
Published	2022-08-25 10:46:29+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
