Imitation Learning with Human Eye Gaze via Multi-Objective Prediction

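Summary

Teaching learning agents through human demonstrations has been widely studied and successfully applied across multiple domains.
However, most imitation learning work uses only behavioral information from the demonstrator, that is, which actions were taken, and ignores other useful information.
In particular, eye gaze information can offer valuable insight into where the demonstrator directs visual attention, and has the potential to improve agent performance and generalization.
This work proposes Gaze Regularized Imitation Learning (GRIL), a novel context-aware imitation learning architecture that learns from human demonstrations and eye gaze at the same time, in order to solve tasks where visual attention provides important context.
GRIL is applied to a visual navigation task in which an unmanned quadrotor is trained to search for and navigate to a target vehicle in a photorealistic simulated environment.
We show that GRIL outperforms several state-of-the-art gaze-based imitation learning algorithms, simultaneously learns to predict human visual attention, and generalizes to scenarios not present in the training data.
Supplemental videos and code are available at https://sites.google.com/view/gaze-regularized-il/.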

Summary (original)

Approaches for teaching learning agents via human demonstrations have been widely studied and successfully applied to multiple domains. However, the majority of imitation learning work utilizes only behavioral information from the demonstrator, i.e. which actions were taken, and ignores other useful information. In particular, eye gaze information can give valuable insight towards where the demonstrator is allocating visual attention, and holds the potential to improve agent performance and generalization. In this work, we propose Gaze Regularized Imitation Learning (GRIL), a novel context-aware, imitation learning architecture that learns concurrently from both human demonstrations and eye gaze to solve tasks where visual attention provides important context. We apply GRIL to a visual navigation task, in which an unmanned quadrotor is trained to search for and navigate to a target vehicle in a photorealistic simulated environment. We show that GRIL outperforms several state-of-the-art gaze-based imitation learning algorithms, simultaneously learns to predict human visual attention, and generalizes to scenarios not present in the training data. Supplemental videos and code can be found at https://sites.google.com/view/gaze-regularized-il/.
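
The idea of learning from demonstrated actions and eye gaze at the same time can be illustrated with a combined training objective. The following is only a minimal sketch under stated assumptions, not the GRIL implementation: the abstract does not specify the loss functions or their weighting, so the use of mean squared error, the additive form, and the names `mse`, `multi_objective_loss`, and `gaze_weight` are all hypothetical choices made for illustration.

```python
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equal-length, non-empty vectors."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multi_objective_loss(
    pred_action: Sequence[float],
    demo_action: Sequence[float],
    pred_gaze: Sequence[float],
    human_gaze: Sequence[float],
    gaze_weight: float = 0.5,  # hypothetical weighting, not from the paper
) -> float:
    """Action-imitation loss plus a weighted gaze-prediction loss.

    Assumption: both objectives are squared errors and are simply added;
    the source abstract does not state the actual formulation.
    """
    action_loss = mse(pred_action, demo_action)  # imitate demonstrated action
    gaze_loss = mse(pred_gaze, human_gaze)       # predict where the human looked
    return action_loss + gaze_weight * gaze_loss


if __name__ == "__main__":
    # One sample: a 2-D control command and a 2-D gaze point (illustrative values).
    loss = multi_objective_loss([1.0, 0.0], [0.0, 0.0], [0.5, 0.5], [0.5, 0.0])
    print(loss)
```

In a setup like this, setting `gaze_weight` to zero would reduce the objective to plain behavior cloning on actions, which is one way to see the gaze term as an additional prediction target alongside the action target.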

arXiv information

Authors	Ravi Kumar Thakur, MD-Nazmus Samin Sunbeam, Vinicius G. Goecks, Ellen Novoseller, Ritwik Bera, Vernon J. Lawhern, Gregory M. Gremillion, John Valasek, Nicholas R. Waytowich
Published	2023-07-22 19:46:36+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
