NeuralLabeling: A versatile toolset for labeling vision datasets using Neural Radiance Fields

Summary

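We introduce NeuralLabeling, a labeling approach and toolset for annotating 3D scenes with bounding boxes or meshes and for generating segmentation masks, affordance maps, 2D bounding boxes, 3D bounding boxes, 6DOF object poses, depth maps, and object meshes.
NeuralLabeling uses Neural Radiance Fields (NeRF) as its renderer, so labeling can be done with 3D spatial tools that take geometric cues such as occlusions into account, while relying only on images captured from multiple viewpoints as input.
To demonstrate the applicability of NeuralLabeling to a practical robotics problem, we added ground-truth depth maps to 30,000 frames of transparent-object RGB images and noisy depth maps of glasses placed in a dishwasher, captured with an RGBD sensor, producing the Dishwasher30k dataset.
We show that training a simple deep neural network with supervision from the annotated depth maps yields higher reconstruction performance than training with the previously applied weakly supervised approach.
We also show how instance segmentation and depth completion datasets generated with NeuralLabeling can be incorporated into a robot application that grasps transparent objects placed in a dishwasher with an accuracy of 83.3%, compared with 16.3% without depth completion.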

Abstract (original)

We present NeuralLabeling, a labeling approach and toolset for annotating 3D scenes using either bounding boxes or meshes and generating segmentation masks, affordance maps, 2D bounding boxes, 3D bounding boxes, 6DOF object poses, depth maps, and object meshes. NeuralLabeling uses Neural Radiance Fields (NeRF) as a renderer, allowing labeling to be performed using 3D spatial tools while incorporating geometric clues such as occlusions, relying only on images captured from multiple viewpoints as input. To demonstrate the applicability of NeuralLabeling to a practical problem in robotics, we added ground truth depth maps to 30000 frames of transparent object RGB and noisy depth maps of glasses placed in a dishwasher captured using an RGBD sensor, yielding the Dishwasher30k dataset. We show that training a simple deep neural network with supervision using the annotated depth maps yields a higher reconstruction performance than training with the previously applied weakly supervised approach. We also show how instance segmentation and depth completion datasets generated using NeuralLabeling can be incorporated into a robot application for grasping transparent objects placed in a dishwasher with an accuracy of 83.3%, compared to 16.3% without depth completion.
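The abstract lists 2D bounding boxes among the labels derived from 3D annotations but does not say how they are computed. As a rough illustration only, the sketch below projects the eight corners of an axis-aligned 3D box through a pinhole camera model and takes the extent of the projected points. The function name `project_box_to_2d`, the axis-aligned box in camera coordinates, and the intrinsics `fx`, `fy`, `cx`, `cy` are all assumptions made for this example; they are not taken from NeuralLabeling, which may work differently (for instance, by accounting for occlusions).

```python
from itertools import product


def project_box_to_2d(box_min, box_max, fx, fy, cx, cy):
    """Project an axis-aligned 3D box to a 2D box (hypothetical helper).

    Assumes the box is given in camera coordinates with z pointing forward,
    and that the camera follows an ideal pinhole model without distortion.
    Returns (u_min, v_min, u_max, v_max) in pixels.
    """
    us, vs = [], []
    # zip pairs (min, max) per axis; product enumerates the 8 corners.
    for x, y, z in product(*zip(box_min, box_max)):
        if z <= 0:
            raise ValueError("box corner is behind the camera")
        us.append(fx * x / z + cx)  # pinhole projection, horizontal axis
        vs.append(fy * y / z + cy)  # pinhole projection, vertical axis
    return min(us), min(vs), max(us), max(vs)


# A 2 x 2 x 2 box between 2 and 4 units in front of the camera.
print(project_box_to_2d((-1, -1, 2), (1, 1, 4), fx=100, fy=100, cx=320, cy=240))
# prints (270.0, 190.0, 370.0, 290.0)
```

The nearest face of the box dominates the 2D extent here, because projected size scales with 1/z. This sketch ignores image bounds and occlusion, both of which a real labeling tool would need to handle.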

arXiv information

Authors	Floris Erich, Naoya Chiba, Yusuke Yoshiyasu, Noriaki Ando, Ryo Hanai, Yukiyasu Domae
Published	2024-07-22 01:39:19+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
