A Probabilistic Model for Skill Acquisition with Switching Latent Feedback Controllers

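Summary

Manipulation tasks often consist of subtasks, each representing a distinct skill.
Mastering these skills is essential for robots, because it improves their autonomy, efficiency, adaptability, and ability to work in their environment.
Learning from demonstrations lets robots acquire new skills quickly without starting from scratch. Demonstrations typically sequence several skills to accomplish a task.
Behaviour cloning approaches to learning from demonstration commonly rely on mixture density network output heads to predict robot actions.
In this work, we first reinterpret the mixture density network as a library of feedback controllers (or skills) conditioned on latent states.
This follows from the observation that a one-layer linear network is functionally equivalent to a classical feedback controller, with the network weights corresponding to the controller gains.
Using this insight, we derive a probabilistic graphical model that combines these elements and describes skill acquisition as segmentation in a latent space. Each skill policy acts as a feedback control law in that latent space.
Our approach significantly improves not only the task success rate but also robustness to observation noise when trained on human demonstrations.
Our physical robot experiments further show that the induced robustness improves deployment of the model on robots.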

Abstract (original)

Manipulation tasks often consist of subtasks, each representing a distinct skill. Mastering these skills is essential for robots, as it enhances their autonomy, efficiency, adaptability, and ability to work in their environment. Learning from demonstrations allows robots to rapidly acquire new skills without starting from scratch, with demonstrations typically sequencing skills to achieve tasks. Behaviour cloning approaches to learning from demonstration commonly rely on mixture density network output heads to predict robot actions. In this work, we first reinterpret the mixture density network as a library of feedback controllers (or skills) conditioned on latent states. This arises from the observation that a one-layer linear network is functionally equivalent to a classical feedback controller, with network weights corresponding to controller gains. We use this insight to derive a probabilistic graphical model that combines these elements, describing the skill acquisition process as segmentation in a latent space, where each skill policy functions as a feedback control law in this latent space. Our approach significantly improves not only task success rate, but also robustness to observation noise when trained with human demonstrations. Our physical robot experiments further show that the induced robustness improves model deployment on robots.
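The equivalence between a one-layer linear network and a classical feedback controller can be illustrated with a small numerical check. The sketch below is not the authors' implementation: it assumes a proportional control law u = K (z_goal - z) and a square, invertible weight matrix, and the names linear_layer, feedback_controller, and gains_from_weights are hypothetical. Under those assumptions, a layer a = W z + b matches the controller when K = -W and K z_goal = b.

```python
import numpy as np


def linear_layer(z, W, b):
    """One-layer linear network: a = W z + b."""
    return W @ z + b


def feedback_controller(z, K, z_goal):
    """Proportional feedback law (assumed form): u = K (z_goal - z)."""
    return K @ (z_goal - z)


def gains_from_weights(W, b):
    """Map network parameters to controller parameters.

    Matching W z + b = K (z_goal - z) term by term gives
    K = -W and K z_goal = b. Solving for z_goal assumes that
    W is square and invertible (an assumption of this sketch).
    """
    K = -W
    z_goal = np.linalg.solve(K, b)
    return K, z_goal


# Hypothetical latent state and randomly drawn layer parameters.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 3))
b = rng.normal(size=3)
z = rng.normal(size=3)

K, z_goal = gains_from_weights(W, b)
a_net = linear_layer(z, W, b)
a_ctrl = feedback_controller(z, K, z_goal)

# Both parameterisations produce the same action for the same latent state.
print(np.allclose(a_net, a_ctrl))
```

In this reading, each component of a mixture density network head would carry its own (K, z_goal) pair, which is one way to picture the "library of feedback controllers" the abstract describes.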

arXiv information

Authors	Juyan Zhang, Dana Kulic, Michael Burke
Published	2025-05-20 07:55:44+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
