RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation

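Summary

This paper proposes RAM, a retrieve-and-transfer framework for zero-shot robotic manipulation that generalizes across objects, environments, and embodiments. Unlike existing approaches that learn manipulation from expensive in-domain demonstrations, RAM uses a retrieval-based affordance transfer paradigm to acquire general-purpose manipulation capabilities from abundant out-of-domain data. First, RAM extracts unified affordances at scale from diverse demonstration sources, including robotic data, human-object interaction (HOI) data, and custom data, and uses them to build a comprehensive affordance memory. Then, given a language instruction, RAM hierarchically retrieves the most similar demonstration from the affordance memory and transfers its out-of-domain 2D affordance to an in-domain, executable 3D affordance in a zero-shot, embodiment-agnostic manner. Extensive simulation and real-world evaluations show that RAM consistently outperforms prior work on diverse everyday tasks. RAM also shows significant potential for downstream applications such as automatic and efficient data collection, one-shot visual imitation, and LLM/VLM-integrated long-horizon manipulation. For details, see the project website at https://yxkryptonite.github.io/RAM/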

Summary (original)

This work proposes a retrieve-and-transfer framework for zero-shot robotic manipulation, dubbed RAM, featuring generalizability across various objects, environments, and embodiments. Unlike existing approaches that learn manipulation from expensive in-domain demonstrations, RAM capitalizes on a retrieval-based affordance transfer paradigm to acquire versatile manipulation capabilities from abundant out-of-domain data. First, RAM extracts unified affordance at scale from diverse sources of demonstrations including robotic data, human-object interaction (HOI) data, and custom data to construct a comprehensive affordance memory. Then given a language instruction, RAM hierarchically retrieves the most similar demonstration from the affordance memory and transfers such out-of-domain 2D affordance to in-domain 3D executable affordance in a zero-shot and embodiment-agnostic manner. Extensive simulation and real-world evaluations demonstrate that our RAM consistently outperforms existing works in diverse daily tasks. Additionally, RAM shows significant potential for downstream applications such as automatic and efficient data collection, one-shot visual imitation, and LLM/VLM-integrated long-horizon manipulation. For more details, please check our website at https://yxkryptonite.github.io/RAM/.
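
The retrieve-then-lift idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `Demo` record, the task labels, the two-stage retrieval (filter by task, then rank by cosine similarity), and the pinhole back-projection are all assumptions chosen for illustration, and the sketch skips the step that maps the retrieved 2D contact point from the demonstration image into the target image.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Demo:
    """Hypothetical affordance-memory entry (not the paper's data format)."""
    task: str                    # assumed task label, e.g. "open_drawer"
    embedding: List[float]       # assumed visual feature of the demo image
    contact_px: Tuple[int, int]  # 2D contact point (u, v) in the demo image


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(memory: List[Demo], task: str,
             query_embedding: List[float]) -> Optional[Demo]:
    """Two-stage retrieval: coarse filter by task, then rank by similarity."""
    candidates = [d for d in memory if d.task == task]
    if not candidates:
        return None
    return max(candidates, key=lambda d: cosine(d.embedding, query_embedding))


def backproject(u: float, v: float, depth: float,
                fx: float, fy: float, cx: float, cy: float
                ) -> Tuple[float, float, float]:
    """Lift a pixel (u, v) with known depth to a 3D point in the camera
    frame using the standard pinhole camera model."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


if __name__ == "__main__":
    memory = [
        Demo("open_drawer", [1.0, 0.0], (10, 20)),
        Demo("open_drawer", [0.0, 1.0], (30, 40)),
        Demo("pick_mug", [0.0, 1.0], (5, 5)),
    ]
    best = retrieve(memory, "open_drawer", [0.1, 0.9])
    print(best)
    # Assumed intrinsics (fx, fy, cx, cy) for a 640x480 camera.
    print(backproject(920, 240, 2.0, 600.0, 600.0, 320.0, 240.0))
```

In a real system the embeddings would come from a visual feature extractor and the depth from an RGB-D sensor; both are placeholders here.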

arXiv information

Authors	Yuxuan Kuang, Junjie Ye, Haoran Geng, Jiageng Mao, Congyue Deng, Leonidas Guibas, He Wang, Yue Wang
Published	2024-07-05 17:50:38+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
