Diffusion Features for Zero-Shot 6DoF Object Pose Estimation

Summary

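Zero-shot object pose estimation recovers object poses from images without requiring object-specific training.
Recent approaches achieve this with vision foundation models (VFMs), pre-trained models that effectively serve as general-purpose feature extractors.
The characteristics these VFMs exhibit depend on the training data, network architecture, and training paradigm.
The prevailing choice in this field is the self-supervised Vision Transformer (ViT).
This study evaluates the influence of Latent Diffusion Model (LDM) backbones on zero-shot pose estimation.
To compare the two model families on common ground, a recent approach is adopted and modified.
The result is a template-based, multi-stage method for estimating poses in a zero-shot fashion using LDMs.
The effectiveness of the proposed approach is evaluated empirically on three standard datasets for object-specific 6DoF pose estimation.
The experiments show an Average Recall improvement of up to 27% over the ViT baseline.
The source code is available at https://github.com/BvG1993/DZOP.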

Summary (original)

Zero-shot object pose estimation enables the retrieval of object poses from images without necessitating object-specific training. In recent approaches this is facilitated by vision foundation models (VFM), which are pre-trained models that are effectively general-purpose feature extractors. The characteristics exhibited by these VFMs vary depending on the training data, network architecture, and training paradigm. The prevailing choice in this field are self-supervised Vision Transformers (ViT). This study assesses the influence of Latent Diffusion Model (LDM) backbones on zero-shot pose estimation. In order to facilitate a comparison between the two families of models on a common ground we adopt and modify a recent approach. Therefore, a template-based multi-staged method for estimating poses in a zero-shot fashion using LDMs is presented. The efficacy of the proposed approach is empirically evaluated on three standard datasets for object-specific 6DoF pose estimation. The experiments demonstrate an Average Recall improvement of up to 27% over the ViT baseline. The source code is available at: https://github.com/BvG1993/DZOP.

arXiv information

Authors	Bernd Von Gimborn, Philipp Ausserlechner, Markus Vincze, Stefan Thalhammer
Published	2024-11-25 18:53:56+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
