Simultaneous Multi-View Object Recognition and Grasping in Open-Ended Domains

Abstract

To aid humans in everyday tasks, robots need to know which objects exist in the scene, where they are, and how to grasp and manipulate them in different situations. Therefore, object recognition and grasping are two key functionalities for autonomous robots. Most state-of-the-art approaches treat object recognition and grasping as two separate problems, even though both use visual input. Furthermore, the knowledge of the robot is fixed after the training phase. In such cases, if the robot encounters new object categories, it must be retrained to incorporate new information without catastrophic forgetting. In order to resolve this problem, we propose a deep learning architecture with an augmented memory capacity to handle open-ended object recognition and grasping simultaneously. In particular, our approach takes multi-views of an object as input and jointly estimates pixel-wise grasp configuration as well as a deep scale- and rotation-invariant representation as output. The obtained representation is then used for open-ended object recognition through a meta-active learning technique. We demonstrate the ability of our approach to grasp never-seen-before objects and to rapidly learn new object categories using very few examples on-site in both simulation and real-world settings. A video of these experiments is available online at: https://youtu.be/n9SMpuEkOgk

arXiv information

Authors	Hamidreza Kasaei, Sha Luo, Remo Sasso, Mohammadreza Kasaei
Published	2022-12-06 11:34:30+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
