Understanding via Gaze: Gaze-based Task Decomposition for Imitation Learning of Robot Manipulation

Summary

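In imitation learning for robot manipulation, it is essential to decompose object manipulation tasks into multiple semantic actions.
This decomposition makes it possible to reuse learned skills in varying contexts and to combine acquired skills to perform novel tasks, rather than merely replicating demonstrated motions.
Gaze, an evolutionary tool for understanding ongoing events, plays a critical role in human object manipulation and correlates strongly with motion planning.
In this study, we propose a simple yet robust task decomposition method based on gaze transitions.
We hypothesize that an imitation agent's gaze control, which fixates on specific landmarks and transitions between them, naturally segments demonstrated manipulations into sub-tasks.
Notably, our method achieves consistent task decomposition across all demonstrations, a desirable property in contexts such as machine learning.
Using teleoperation, a common modality in imitation learning for robot manipulation, we collected demonstration data for various tasks, applied our segmentation method, and evaluated the characteristics and consistency of the resulting sub-tasks.
Furthermore, extensive testing across a wide range of hyperparameter variations demonstrated that the proposed method has the robustness needed for application to different robotic systems.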

Summary (original)

In imitation learning for robotic manipulation, decomposing object manipulation tasks into multiple semantic actions is essential. This decomposition enables the reuse of learned skills in varying contexts and the combination of acquired skills to perform novel tasks, rather than merely replicating demonstrated motions. Gaze, an evolutionary tool for understanding ongoing events, plays a critical role in human object manipulation, where it strongly correlates with motion planning. In this study, we propose a simple yet robust task decomposition method based on gaze transitions. We hypothesize that an imitation agent’s gaze control, fixating on specific landmarks and transitioning between them, naturally segments demonstrated manipulations into sub-tasks. Notably, our method achieves consistent task decomposition across all demonstrations, which is desirable in contexts such as machine learning. Using teleoperation, a common modality in imitation learning for robotic manipulation, we collected demonstration data for various tasks, applied our segmentation method, and evaluated the characteristics and consistency of the resulting sub-tasks. Furthermore, through extensive testing across a wide range of hyperparameter variations, we demonstrated that the proposed method possesses the robustness necessary for application to different robotic systems.
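The abstract does not spell out the segmentation algorithm, so the following is only a minimal sketch of the general idea of splitting a demonstration at gaze transitions. It assumes gaze is recorded as one 2D point per frame and treats a frame as a transition whenever the gaze moved more than a threshold since the previous frame. The function name `segment_by_gaze` and the parameters `speed_threshold` and `min_fixation_len` are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Tuple


def segment_by_gaze(
    gaze: List[Tuple[float, float]],
    speed_threshold: float = 0.1,   # hypothetical: max per-frame gaze shift during a fixation
    min_fixation_len: int = 3,      # hypothetical: shortest run of frames kept as a segment
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices (end exclusive) of fixation segments.

    Illustrative assumption only: a candidate sub-task is a run of frames
    in which the gaze stays roughly still, and a large gaze shift between
    consecutive frames marks a boundary between sub-tasks.
    """
    if not gaze:
        return []

    # Frame i counts as "moving" if the gaze shifted more than the threshold
    # relative to frame i - 1. The first frame has no predecessor.
    moving = [False]
    for (x0, y0), (x1, y1) in zip(gaze, gaze[1:]):
        moving.append(math.hypot(x1 - x0, y1 - y0) > speed_threshold)

    segments: List[Tuple[int, int]] = []
    start = None
    for i, is_moving in enumerate(moving):
        if not is_moving and start is None:
            start = i  # a fixation begins
        elif is_moving and start is not None:
            # a gaze transition ends the current fixation
            if i - start >= min_fixation_len:
                segments.append((start, i))
            start = None

    # Close a fixation that runs to the end of the demonstration.
    if start is not None and len(gaze) - start >= min_fixation_len:
        segments.append((start, len(gaze)))
    return segments


# Toy demonstration: gaze rests on one landmark, jumps, then rests on another.
demo = [(0.0, 0.0)] * 5 + [(0.5, 0.5)] + [(1.0, 1.0)] * 5
print(segment_by_gaze(demo))
```

In this toy input the two resting periods come out as separate segments, and the frames where the gaze is in flight are excluded. Both thresholds would need tuning for a real gaze sensor and frame rate, and the sketch says nothing about how the paper's method handles them.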

arXiv information

Authors	Ryo Takizawa, Yoshiyuki Ohmura, Yasuo Kuniyoshi
Published	2025-01-25 04:33:43+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
