Graph Inverse Reinforcement Learning from Diverse Videos

Summary

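Research on inverse reinforcement learning (IRL) from third-person videos has shown promising results in removing the need to hand-design rewards for robotic tasks.
However, most prior work is still limited to training on a relatively restricted domain of videos.
In this paper, we argue that the true potential of third-person IRL lies in increasing the diversity of videos for better scaling.
To learn a reward function from diverse videos, we propose performing graph abstraction on the videos and then temporal matching in the graph space to measure task progress.
Our insight is that a task can be described by interactions among entities, which form a graph. This graph abstraction helps remove irrelevant information such as textures, yielding more robust reward functions.
We evaluate our approach, GraphIRL, on cross-embodiment learning in X-MAGICAL and on learning from human demonstrations for real-robot manipulation.
We show significantly improved robustness to diverse video demonstrations over previous approaches, and achieve better results than manual reward design on a real-robot pushing task.
Videos are available at https://sateeshkumar21.github.io/GraphIRL.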
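
The graph-abstraction idea can be illustrated with a toy sketch. This is not the authors' implementation: the paper learns its reward function from videos, whereas the entity names, 2D coordinates, and hand-coded distance features below are assumptions made for illustration, and a fixed distance to a goal frame stands in for learned temporal matching.

```python
import math
from itertools import combinations

Point = tuple[float, float]


def graph_features(entities: dict[str, Point]) -> list[float]:
    """Edge features of a fully connected entity graph: pairwise distances.

    Only entity positions are used, so appearance (texture, color) cannot
    influence the features. Hypothetical helper, not from the paper.
    """
    names = sorted(entities)  # fixed ordering so features line up across frames
    return [math.dist(entities[a], entities[b]) for a, b in combinations(names, 2)]


def progress_reward(frame: dict[str, Point], goal: dict[str, Point]) -> float:
    """Negative Euclidean distance to the goal frame in graph-feature space."""
    return -math.dist(graph_features(frame), graph_features(goal))


# Hypothetical pushing scene: an agent pushes a block toward a target.
goal = {"agent": (0.9, 0.0), "block": (1.0, 0.0), "target": (1.0, 0.0)}
start = {"agent": (0.0, 0.0), "block": (0.5, 0.0), "target": (1.0, 0.0)}
later = {"agent": (0.6, 0.0), "block": (0.8, 0.0), "target": (1.0, 0.0)}

# A frame closer to task completion should receive a higher reward.
print(progress_reward(start, goal) < progress_reward(later, goal))
```

In this sketch the reward depends only on the relative arrangement of entities, so a frame later in the push scores higher than the starting frame regardless of how the scene is rendered.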

Summary (original)

Research on Inverse Reinforcement Learning (IRL) from third-person videos has shown encouraging results on removing the need for manual reward design for robotic tasks. However, most prior works are still limited by training from a relatively restricted domain of videos. In this paper, we argue that the true potential of third-person IRL lies in increasing the diversity of videos for better scaling. To learn a reward function from diverse videos, we propose to perform graph abstraction on the videos followed by temporal matching in the graph space to measure the task progress. Our insight is that a task can be described by entity interactions that form a graph, and this graph abstraction can help remove irrelevant information such as textures, resulting in more robust reward functions. We evaluate our approach, GraphIRL, on cross-embodiment learning in X-MAGICAL and learning from human demonstrations for real-robot manipulation. We show significant improvements in robustness to diverse video demonstrations over previous approaches, and even achieve better results than manual reward design on a real robot pushing task. Videos are available at https://sateeshkumar21.github.io/GraphIRL .

arXiv information

Authors	Sateesh Kumar, Jonathan Zamora, Nicklas Hansen, Rishabh Jangir, Xiaolong Wang
Published	2022-08-01 17:47:27+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
