Learning a Category-level Object Pose Estimator without Pose Annotations

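Summary

3D object pose estimation is a challenging task.
Previous works have always required thousands of object images with annotated poses to learn the 3D pose correspondence, and labeling them is laborious and time-consuming.
In this paper, we propose to learn a category-level 3D object pose estimator without pose annotations.
Instead of using manually annotated images, we leverage diffusion models (e.g., Zero-1-to-3) to generate a set of images under controlled pose differences, and we propose to learn the object pose estimator from those images.
Using the original diffusion model directly produces images with noisy poses and artifacts.
To tackle this issue, we first use an image encoder, learned through a specially designed contrastive pose learning scheme, to filter out unreasonable details and extract image feature maps.
In addition, we propose a novel learning strategy that lets the model learn object poses from the generated image sets without knowing the alignment of their canonical poses.
Experimental results show that our method can estimate category-level object pose from a single-shot setting (used as the pose definition), while significantly outperforming other state-of-the-art methods on few-shot category-level object pose estimation benchmarks.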

Summary (original)

3D object pose estimation is a challenging task. Previous works always require thousands of object images with annotated poses for learning the 3D pose correspondence, which is laborious and time-consuming for labeling. In this paper, we propose to learn a category-level 3D object pose estimator without pose annotations. Instead of using manually annotated images, we leverage diffusion models (e.g., Zero-1-to-3) to generate a set of images under controlled pose differences and propose to learn our object pose estimator with those images. Directly using the original diffusion model leads to images with noisy poses and artifacts. To tackle this issue, firstly, we exploit an image encoder, which is learned from a specially designed contrastive pose learning, to filter the unreasonable details and extract image feature maps. Additionally, we propose a novel learning strategy that allows the model to learn object poses from those generated image sets without knowing the alignment of their canonical poses. Experimental results show that our method has the capability of category-level object pose estimation from a single shot setting (as pose definition), while significantly outperforming other state-of-the-art methods on the few-shot category-level object pose estimation benchmarks.
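The abstract does not spell out how poses are learned without canonical alignment, so the following is only an illustrative sketch of the underlying geometry, not the authors' method. It assumes a set of views generated with known pose offsets from one reference image, simplified here to rotations about a single vertical axis. The names `offsets_deg`, `rot_y`, and `pairwise_angles` are hypothetical. The point it shows: the pairwise pose differences within a generated set stay the same whatever the unknown canonical pose of the reference image is, so they can act as supervision even when absolute poses are unavailable.

```python
import numpy as np

def rot_y(deg):
    """Rotation matrix about the vertical (y) axis by `deg` degrees."""
    t = np.deg2rad(deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def geodesic_deg(Ra, Rb):
    """Angle (degrees) of the relative rotation Ra @ Rb.T."""
    cos = (np.trace(Ra @ Rb.T) - 1.0) / 2.0
    return np.rad2deg(np.arccos(np.clip(cos, -1.0, 1.0)))

# Assumption: controlled pose offsets requested from the generator,
# expressed relative to the reference image.
offsets_deg = [0, 30, 60, 90]
relative = [rot_y(d) for d in offsets_deg]

def absolute_poses(R_ref):
    """Absolute pose of every generated view, given the reference pose."""
    return [R_rel @ R_ref for R_rel in relative]

def pairwise_angles(poses):
    """Matrix of pose differences between all pairs of views."""
    n = len(poses)
    return np.array([[geodesic_deg(poses[i], poses[j]) for j in range(n)]
                     for i in range(n)])

# Two different (unknown in practice) canonical poses of the reference image.
angles_a = pairwise_angles(absolute_poses(rot_y(17.0)))
angles_b = pairwise_angles(absolute_poses(rot_y(-140.0)))

# The pairwise differences agree, regardless of the reference pose.
print(np.allclose(angles_a, angles_b, atol=1e-4))
```

In this toy setting each image set fixes its poses only up to one shared rotation, which is consistent with the abstract's remark that a single annotated shot serves as the pose definition.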

arXiv information

Authors	Fengrui Tian, Yaoyao Liu, Adam Kortylewski, Yueqi Duan, Shaoyi Du, Alan Yuille, Angtian Wang
Published	2024-04-08 15:59:29+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
