Enhancing Monocular Depth Estimation with Multi-Source Auxiliary Tasks

Abstract

Monocular depth estimation (MDE) is a challenging task in computer vision, often hindered by the cost and scarcity of high-quality labeled datasets. We tackle this challenge using auxiliary datasets from related vision tasks for an alternating training scheme with a shared decoder built on top of a pre-trained vision foundation model, while giving a higher weight to MDE. Through extensive experiments we demonstrate the benefits of incorporating various in-domain auxiliary datasets and tasks to improve MDE quality on average by ~11%. Our experimental analysis shows that auxiliary tasks have different impacts, confirming the importance of task selection, highlighting that quality gains are not achieved by merely adding data. Remarkably, our study reveals that using semantic segmentation datasets as Multi-Label Dense Classification (MLDC) often results in additional quality gains. Lastly, our method significantly improves the data efficiency for the considered MDE datasets, enhancing their quality while reducing their size by at least 80%. This paves the way for using auxiliary data from related tasks to improve MDE quality despite limited availability of high-quality labeled data. Code is available at https://jugit.fz-juelich.de/ias-8/mdeaux.
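
The abstract describes two ideas that a small sketch may make more concrete: alternating between tasks during training while favoring MDE, and treating semantic segmentation labels as multi-label dense classification targets. The Python below is only an illustration under stated assumptions, not the authors' implementation. It assumes that "higher weight" means a higher sampling probability per training step (it could equally mean a loss weight), and that MLDC targets are per-pixel multi-hot vectors. The task names, the weights, and the helpers `alternating_schedule` and `to_multi_hot` are all hypothetical.

```python
import random
from collections import Counter


def alternating_schedule(task_weights, num_steps, seed=0):
    """Yield one task name per training step.

    Assumption: tasks are sampled in proportion to their weight, so a
    task with a larger weight (here MDE) is visited more often.
    """
    rng = random.Random(seed)
    names = list(task_weights)
    weights = [task_weights[name] for name in names]
    for _ in range(num_steps):
        yield rng.choices(names, weights=weights, k=1)[0]


def to_multi_hot(mask, num_classes):
    """Convert an H x W mask of class ids into an H x W x C multi-hot target.

    Each pixel becomes a vector with 1.0 at its class index and 0.0
    elsewhere. Ids outside [0, num_classes) produce an all-zero vector.
    """
    return [
        [[1.0 if c == class_id else 0.0 for c in range(num_classes)]
         for class_id in row]
        for row in mask
    ]


# Hypothetical task mix: depth is weighted three times higher than each
# auxiliary task. These numbers are illustrative, not from the paper.
task_weights = {"depth": 3.0, "segmentation_mldc": 1.0, "other_aux_task": 1.0}
counts = Counter(alternating_schedule(task_weights, num_steps=1000))

# A tiny 2 x 2 segmentation mask with three classes.
targets = to_multi_hot([[0, 2], [1, 1]], num_classes=3)
```

In a real training loop, each sampled task name would select a batch from that task's dataset and a task-specific loss, with all tasks passing through the shared decoder. Multi-hot targets of this shape are typically paired with a per-class sigmoid loss such as binary cross-entropy rather than a softmax over classes; whether the paper does exactly this is not stated in the abstract.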

arXiv information

Authors	Alessio Quercia, Erenus Yildiz, Zhuo Cao, Kai Krajsek, Abigail Morrison, Ira Assent, Hanno Scharr
Published	2025-01-22 12:04:58+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
