Category-Level Pose Retrieval with Contrastive Features Learnt with Occlusion Augmentation

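Abstract

Pose estimation is usually tackled as either a bin classification or a regression problem.
In both cases, the idea is to predict the pose of an object directly.
This is a non-trivial task, because similar poses can differ in appearance and dissimilar poses can look alike.
Instead, we follow the key idea that comparing two poses is easier than directly predicting one.
Render-and-compare approaches have been employed to that end, but they tend to be unstable, computationally expensive, and slow for real-time applications.
We propose performing category-level pose estimation by learning an alignment metric in an embedding space, using a contrastive loss with a dynamic margin and a continuous pose-label space.
For efficient inference, we use a simple real-time image retrieval scheme with a pre-rendered and pre-embedded reference set of renderings.
To achieve robustness to real-world conditions, we employ synthetic occlusions, bounding box perturbations, and appearance augmentations.
Our approach achieves state-of-the-art performance on PASCAL3D and OccludedPASCAL3D and surpasses competing methods on KITTI3D in a cross-dataset evaluation setting.
The code is currently available at https://github.com/gkouros/contrastive-pose-retrieval.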
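
To make the idea of a contrastive loss with a dynamic margin over continuous pose labels more concrete, here is a minimal sketch in plain Python. It is not the authors' formulation, which the abstract does not spell out. The azimuth-only pose distance, the `positive_threshold` cut-off, the `max_margin` scale, and all function names are assumptions chosen for illustration. The only property it tries to capture is that pairs with more dissimilar poses are pushed further apart in the embedding space.

```python
import math


def pose_distance(a_deg, b_deg):
    """Wrapped azimuth difference scaled to [0, 1].

    Assumption: the pose is reduced to a single azimuth angle in degrees.
    """
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff) / 180.0


def embedding_distance(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def dynamic_margin_contrastive_loss(embeddings, poses_deg,
                                    positive_threshold=0.05, max_margin=1.0):
    """Mean pairwise contrastive loss whose margin grows with pose distance.

    Pairs closer in pose than `positive_threshold` (hypothetical parameter)
    are pulled together; all other pairs are pushed apart until their
    embedding distance reaches `max_margin * pose_distance`.
    """
    total, count = 0.0, 0
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            d = embedding_distance(embeddings[i], embeddings[j])
            p = pose_distance(poses_deg[i], poses_deg[j])
            if p <= positive_threshold:
                total += d ** 2                     # pull similar poses together
            else:
                margin = max_margin * p             # margin depends on the pair
                total += max(0.0, margin - d) ** 2  # push dissimilar poses apart
            count += 1
    return total / count if count else 0.0


# Two views with opposite azimuths that collapse onto the same embedding
# are penalised; the penalty vanishes once they are `margin` apart.
collapsed = dynamic_margin_contrastive_loss([[0.0, 0.0], [0.0, 0.0]], [0.0, 180.0])
separated = dynamic_margin_contrastive_loss([[0.0, 0.0], [1.0, 0.0]], [0.0, 180.0])
```

In a real training setup the embeddings would come from a learned encoder and the loss would be written in an automatic-differentiation framework; the nested loops here only show the arithmetic.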

Abstract (original)

Pose estimation is usually tackled as either a bin classification or a regression problem. In both cases, the idea is to directly predict the pose of an object. This is a non-trivial task due to appearance variations between similar poses and similarities between dissimilar poses. Instead, we follow the key idea that comparing two poses is easier than directly predicting one. Render-and-compare approaches have been employed to that end, however, they tend to be unstable, computationally expensive, and slow for real-time applications. We propose doing category-level pose estimation by learning an alignment metric in an embedding space using a contrastive loss with a dynamic margin and a continuous pose-label space. For efficient inference, we use a simple real-time image retrieval scheme with a pre-rendered and pre-embedded reference set of renderings. To achieve robustness to real-world conditions, we employ synthetic occlusions, bounding box perturbations, and appearance augmentations. Our approach achieves state-of-the-art performance on PASCAL3D and OccludedPASCAL3D and surpasses the competing methods on KITTI3D in a cross-dataset evaluation setting. The code is currently available at https://github.com/gkouros/contrastive-pose-retrieval.
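
The retrieval step described in the abstract can be pictured as a nearest-neighbour lookup: embed the query image, compare it against a reference set of rendering embeddings computed ahead of time, and return the pose attached to the best match. The sketch below assumes cosine similarity as the comparison and uses hand-written 2-D vectors in place of real encoder outputs. The function names, the toy data, and the (azimuth, elevation, in-plane rotation) pose tuples are illustrative assumptions, not the interface of the authors' repository.

```python
import math


def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def build_reference_set(embeddings, poses):
    """Pair each precomputed rendering embedding with its known pose."""
    if len(embeddings) != len(poses):
        raise ValueError("embeddings and poses must have the same length")
    return [(normalize(e), p) for e, p in zip(embeddings, poses)]


def retrieve_pose(query_embedding, reference_set):
    """Return (pose, similarity) of the reference most similar to the query."""
    q = normalize(query_embedding)
    best_pose, best_sim = None, -math.inf
    for ref, pose in reference_set:
        sim = sum(a * b for a, b in zip(q, ref))
        if sim > best_sim:
            best_pose, best_sim = pose, sim
    return best_pose, best_sim


# Toy reference set: three renderings labelled with
# (azimuth, elevation, in-plane rotation) in degrees.
reference_set = build_reference_set(
    embeddings=[[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]],
    poses=[(0.0, 10.0, 0.0), (90.0, 10.0, 0.0), (180.0, 10.0, 0.0)],
)

# A query embedding that lies closest to the second rendering.
pose, similarity = retrieve_pose([0.1, 0.9], reference_set)
```

Because the reference embeddings are normalised once when the set is built, each query costs one dot product per reference. A larger reference set would likely call for a vectorised or approximate nearest-neighbour search rather than this linear scan.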

arXiv information

Authors	Georgios Kouros, Shubham Shrivastava, Cédric Picron, Sushruth Nagesh, Punarjay Chakravarty, Tinne Tuytelaars
Published	2022-10-12 13:00:54+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
