Affordance-Guided Reinforcement Learning via Visual Prompting

Summary

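Robots equipped with reinforcement learning (RL) have the potential to learn a wide range of skills from a reward signal alone.
However, obtaining a robust, dense reward signal for general manipulation tasks remains a challenge.
Existing learning-based approaches require substantial data, such as human demonstrations of success and failure, to learn task-specific reward functions.
Recently, robotics has also seen growing adoption of large multimodal foundation models that can perform visual reasoning in physical contexts and generate coarse robot motions for manipulation tasks.
Motivated by this range of capabilities, this work presents Keypoint-based Affordance Guidance for Improvements (KAGI), a method that leverages rewards shaped by vision-language models (VLMs) for autonomous RL.
State-of-the-art VLMs have demonstrated impressive zero-shot reasoning about affordances through keypoints, and these keypoints are used to define dense rewards that guide autonomous robotic learning.
On real-world manipulation tasks specified by natural language descriptions, KAGI improves the sample efficiency of autonomous RL and enables successful task completion in 30K online fine-tuning steps.
KAGI is also robust to reductions in the number of in-domain demonstrations used for pre-training, reaching similar performance in 45K online fine-tuning steps.
Project website: https://sites.google.com/view/affordance-guided-rl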
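
The summary says that keypoints produced by a VLM are used to define dense rewards, but it does not give the form of that reward. As a rough illustration only, and not the authors' formulation, the sketch below assumes the VLM has already returned an ordered list of 2D waypoints and rewards the gripper by its negative Euclidean distance to the current waypoint. The function names, the 2D coordinates, and the `reach_tol` threshold are all hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Sequence[float]


def keypoint_reward(gripper: Point, target: Point, scale: float = 1.0) -> float:
    """Hypothetical dense reward: grows (toward 0) as the gripper nears the target keypoint."""
    return -scale * math.dist(gripper, target)


def waypoint_reward(
    gripper: Point,
    waypoints: Sequence[Point],
    next_index: int,
    reach_tol: float = 0.05,  # assumed tolerance, not taken from the paper
) -> Tuple[float, int]:
    """Reward toward the current waypoint; move on to the next one once it is reached."""
    distance = math.dist(gripper, waypoints[next_index])
    # Advance only if there is a later waypoint to move toward.
    if distance <= reach_tol and next_index < len(waypoints) - 1:
        next_index += 1
    return -distance, next_index
```

Unlike a sparse success signal, a reward of this shape gives the policy feedback at every step, which is the property the summary attributes to dense rewards.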

Summary (original)

Robots equipped with reinforcement learning (RL) have the potential to learn a wide range of skills solely from a reward signal. However, obtaining a robust and dense reward signal for general manipulation tasks remains a challenge. Existing learning-based approaches require significant data, such as human demonstrations of success and failure, to learn task-specific reward functions. Recently, there is also a growing adoption of large multi-modal foundation models for robotics that can perform visual reasoning in physical contexts and generate coarse robot motions for manipulation tasks. Motivated by this range of capability, in this work, we present Keypoint-based Affordance Guidance for Improvements (KAGI), a method leveraging rewards shaped by vision-language models (VLMs) for autonomous RL. State-of-the-art VLMs have demonstrated impressive reasoning about affordances through keypoints in zero-shot, and we use these to define dense rewards that guide autonomous robotic learning. On real-world manipulation tasks specified by natural language descriptions, KAGI improves the sample efficiency of autonomous RL and enables successful task completion in 30K online fine-tuning steps. Additionally, we demonstrate the robustness of KAGI to reductions in the number of in-domain demonstrations used for pre-training, reaching similar performance in 45K online fine-tuning steps. Project website: https://sites.google.com/view/affordance-guided-rl

arXiv information

Authors	Olivia Y. Lee, Annie Xie, Kuan Fang, Karl Pertsch, Chelsea Finn
Published	2025-03-05 06:53:17+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
