Object Pose Estimation Annotation Pipeline for Multi-view Monocular Camera Systems in Industrial Settings

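Summary

Localizing objects, and more specifically estimating their pose, in large industrial spaces such as warehouses and production facilities is essential for material flow operations.
Traditional approaches rely on artificial artifacts installed in the environment or on excessively expensive equipment, and are not suitable at scale.
A more practical approach is to use the cameras already present in such spaces to address the underlying pose estimation problem and localize the objects of interest.
Leveraging state-of-the-art deep learning methods for object pose estimation requires collecting and annotating large amounts of data.
In this work, we provide an approach for annotating large datasets of monocular images without manual labor.
Our approach localizes the cameras in space, unifies their locations with a motion capture system, and uses a set of linear mappings to project 3D models of the objects of interest at their ground-truth 6D poses.
We test our pipeline on a custom dataset collected with a system of eight cameras in an industrial setting that mimics the intended area of operation.
Our approach provided annotations of consistent quality for a dataset containing 26,482 object instances in a fraction of the time human annotators would need.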
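
The projection step described above can be illustrated with a minimal sketch. It assumes a pinhole camera without lens distortion, and poses expressed as 4x4 homogeneous transforms in a shared motion capture frame. The function names (`make_transform`, `project_model`), the intrinsic matrix `K`, and all numeric values are hypothetical and chosen only for illustration; the abstract does not specify the authors' actual mappings, so this may differ from their implementation.

```python
import numpy as np

def make_transform(R, t):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def project_model(points_obj, T_mocap_obj, T_mocap_cam, K):
    """Project Nx3 model points (object frame) to Nx2 pixel coordinates.

    T_mocap_obj: object pose in the motion capture frame (assumed ground truth).
    T_mocap_cam: camera pose in the same frame (assumed known from localization).
    K:           3x3 pinhole intrinsic matrix (assumed, no distortion).
    """
    n = len(points_obj)
    pts_h = np.hstack([points_obj, np.ones((n, 1))])      # Nx4 homogeneous points
    T_cam_obj = np.linalg.inv(T_mocap_cam) @ T_mocap_obj  # object pose in camera frame
    pts_cam = (T_cam_obj @ pts_h.T)[:3]                   # 3xN points in camera frame
    uvw = K @ pts_cam                                     # homogeneous pixel coordinates
    return (uvw[:2] / uvw[2]).T                           # perspective division

# Hypothetical values: focal length 800 px, principal point at (320, 240).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Camera placed 1 m along x in the mocap frame, axes aligned with the mocap axes.
T_mocap_cam = make_transform(np.eye(3), [1.0, 0.0, 0.0])
# Object placed 2 m in front of the camera along its optical (z) axis.
T_mocap_obj = make_transform(np.eye(3), [1.0, 0.0, 2.0])

# Two model points: the object origin and a point 10 cm along the object's x axis.
model_points = np.array([[0.0, 0.0, 0.0],
                         [0.1, 0.0, 0.0]])

pixels = project_model(model_points, T_mocap_obj, T_mocap_cam, K)
print(pixels)  # object origin lands on the principal point (320, 240)
```

In this sketch every step before the final perspective division is a matrix product in homogeneous coordinates, which is one plausible reading of "a set of linear mappings". With eight cameras, the same object pose would be projected once per camera using that camera's own `T_mocap_cam` and `K`.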

Summary (original)

Object localization, and more specifically object pose estimation, in large industrial spaces such as warehouses and production facilities, is essential for material flow operations. Traditional approaches rely on artificial artifacts installed in the environment or excessively expensive equipment, which are not suitable at scale. A more practical approach is to utilize existing cameras in such spaces in order to address the underlying pose estimation problem and to localize objects of interest. In order to leverage state-of-the-art methods in deep learning for object pose estimation, large amounts of data need to be collected and annotated. In this work, we provide an approach to the annotation of large datasets of monocular images without the need for manual labor. Our approach localizes cameras in space, unifies their location with a motion capture system, and uses a set of linear mappings to project 3D models of objects of interest at their ground truth 6D pose locations. We test our pipeline on a custom dataset collected from a system of eight cameras in an industrial setting that mimics the intended area of operation. Our approach was able to provide consistent quality annotations for our dataset with 26,482 object instances at a fraction of the time required by human annotators.

arXiv information

Authors	Hazem Youssef, Frederik Polachowski, Jérôme Rutinowski, Moritz Roidl, Christopher Reining
Published	2023-10-23 13:21:44+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
