Multi-label affordance mapping from egocentric vision

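Summary

Pixel-accurate affordance detection and segmentation is an important component of many complex interaction-based systems, such as robots and assistive devices.
We present a new approach to affordance perception that enables accurate multi-label segmentation.
Our approach can automatically extract grounded affordances from first-person videos of interactions, using a 3D map of the environment that provides the affordance location with pixel-level precision.
We use this method to build EPIC-Aff, the largest and most complete affordance dataset, based on the EPIC-Kitchen dataset. EPIC-Aff provides interaction-grounded, multi-label, metric and spatial affordance annotations.
Next, we propose a new approach to affordance segmentation based on multi-label detection, which allows multiple affordances to co-exist in the same space, for example when they are associated with the same object.
We present several multi-label detection strategies using several segmentation architectures.
The experimental results highlight the importance of multi-label detection.
Finally, we show how the metric representation can be exploited to build a map of interaction hotspots in spatial action-centric zones, and how that representation can be used to perform task-oriented navigation.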
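
To make the multi-label idea concrete, the following is a minimal sketch of the difference between single-label and multi-label decoding of per-pixel scores. It is not the paper's implementation: the affordance names, the logit values, the 0.5 threshold, and both helper functions are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical affordance classes, for illustration only.
AFFORDANCES = ["cut", "take", "wash"]

def single_label_mask(logits):
    """Single-label decoding: every pixel gets exactly one class (argmax)."""
    return np.argmax(logits, axis=0)

def multi_label_masks(logits, threshold=0.5):
    """Multi-label decoding: an independent sigmoid per class, thresholded,
    so several affordances can be active at the same pixel."""
    probs = 1.0 / (1.0 + np.exp(-logits))
    return probs > threshold

# Toy logits with shape (classes, height, width).
logits = np.full((3, 2, 2), -4.0)
logits[0, 0, 0] = 3.0  # "cut" at pixel (0, 0)
logits[1, 0, 0] = 2.0  # "take" at the same pixel
logits[2, 1, 1] = 5.0  # "wash" at pixel (1, 1)

single = single_label_mask(logits)
multi = multi_label_masks(logits)

# Single-label decoding keeps only the strongest class at pixel (0, 0),
# while multi-label decoding keeps both co-located affordances.
labels_at_00 = [name for name, on in zip(AFFORDANCES, multi[:, 0, 0]) if on]
print(AFFORDANCES[single[0, 0]], labels_at_00)
```

With an argmax, the weaker of two affordances sharing a pixel is discarded; per-class thresholds retain both, which is the property the summary attributes to multi-label detection.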

Summary (original)

Accurate affordance detection and segmentation with pixel precision is an important piece in many complex systems based on interactions, such as robots and assistive devices. We present a new approach to affordance perception which enables accurate multi-label segmentation. Our approach can be used to automatically extract grounded affordances from first-person videos of interactions using a 3D map of the environment, providing pixel-level precision for the affordance location. We use this method to build the largest and most complete dataset on affordances based on the EPIC-Kitchen dataset, EPIC-Aff, which provides interaction-grounded, multi-label, metric and spatial affordance annotations. Then, we propose a new approach to affordance segmentation based on multi-label detection, which enables multiple affordances to co-exist in the same space, for example if they are associated with the same object. We present several strategies for multi-label detection using several segmentation architectures. The experimental results highlight the importance of multi-label detection. Finally, we show how our metric representation can be exploited to build a map of interaction hotspots in spatial action-centric zones and how to use that representation to perform task-oriented navigation.
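
The hotspot map mentioned in the last sentence can be pictured with a small sketch. This is an assumption-laden illustration, not the authors' pipeline: it supposes that interactions are already available as metric ground-plane coordinates with an integer affordance label, and the functions `hotspot_grid` and `best_cell`, the cell size, and the sample points are all hypothetical.

```python
import numpy as np

def hotspot_grid(points_xy, labels, num_labels, cell_size, extent):
    """Count interactions per affordance label in each ground-plane cell.

    points_xy: (N, 2) metric coordinates; labels: (N,) integer label ids;
    extent: (xmin, xmax, ymin, ymax). Returns an array (num_labels, nx, ny).
    """
    xmin, xmax, ymin, ymax = extent
    nx = int(np.ceil((xmax - xmin) / cell_size))
    ny = int(np.ceil((ymax - ymin) / cell_size))
    grid = np.zeros((num_labels, nx, ny), dtype=int)
    # Convert metric coordinates to cell indices, clamped to the map extent.
    ix = np.clip(((points_xy[:, 0] - xmin) / cell_size).astype(int), 0, nx - 1)
    iy = np.clip(((points_xy[:, 1] - ymin) / cell_size).astype(int), 0, ny - 1)
    # np.add.at accumulates correctly when several points fall in one cell.
    np.add.at(grid, (labels, ix, iy), 1)
    return grid

def best_cell(grid, label, cell_size, extent):
    """Metric center of the cell with the most interactions for a label."""
    ix, iy = np.unravel_index(np.argmax(grid[label]), grid[label].shape)
    return (extent[0] + (ix + 0.5) * cell_size,
            extent[2] + (iy + 0.5) * cell_size)

# Made-up interaction locations in meters: label 0 near one corner,
# label 1 near the opposite corner.
points = np.array([[0.2, 0.3], [0.4, 0.1], [2.6, 1.7], [2.7, 1.6], [2.9, 1.9]])
labels = np.array([0, 0, 1, 1, 1])
extent = (0.0, 3.0, 0.0, 2.0)

grid = hotspot_grid(points, labels, num_labels=2, cell_size=1.0, extent=extent)
goal = best_cell(grid, label=1, cell_size=1.0, extent=extent)
print(goal)
```

Under these assumptions, a task-oriented navigation goal for a given affordance is simply the center of its densest cell; a real system would hand that goal to a path planner.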

arXiv information

Authors	Lorenzo Mur-Labadia,Jose J. Guerrero,Ruben Martinez-Cantin
Published	2023-09-05 10:56:23+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
