To the Noise and Back: Diffusion for Shared Autonomy


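Shared autonomy is an operational concept in which a user and an autonomous agent jointly control a robotic system.
Recent work relaxes some of the assumptions made by traditional approaches by formulating shared autonomy with model-free deep reinforcement learning (RL).
In particular, these methods no longer require knowledge of the goal space (e.g., that the goals are discrete or constrained) or of the environment dynamics.
This paper presents a new approach to shared autonomy that uses a modulation of the forward and reverse diffusion processes of diffusion models.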


Shared autonomy is an operational concept in which a user and an autonomous agent collaboratively control a robotic system. In many settings, it offers advantages over the extremes of full teleoperation and full autonomy. Traditional approaches to shared autonomy rely on knowledge of the environment dynamics, a discrete space of user goals that is known a priori, or knowledge of the user's policy, all of which are unrealistic assumptions in many domains. Recent works relax some of these assumptions by formulating shared autonomy with model-free deep reinforcement learning (RL). In particular, they no longer need knowledge of the goal space (e.g., that the goals are discrete or constrained) or of the environment dynamics. However, they do need a task-specific reward function to train the policy, and specifying such a reward can be a difficult and brittle process. Moreover, these formulations inherently rely on human-in-the-loop training, which in turn requires preparing a policy that mimics users' behavior. In this paper, we present a new approach to shared autonomy that employs a modulation of the forward and reverse diffusion processes of diffusion models. Our approach does not assume known environment dynamics or a known space of user goals, and in contrast to previous work, it requires neither reward feedback nor access to the user's policy during training. Instead, our framework learns a distribution over a space of desired behaviors. It then employs a diffusion model to translate the user's actions into a sample from this distribution. Crucially, we show that this process can be carried out in a manner that preserves the user's control authority. We evaluate our framework on a series of challenging continuous control tasks and analyze its ability to effectively correct user actions while maintaining their autonomy.
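The abstract does not spell out how the forward and reverse processes are modulated. One plausible reading is sketched below: the user's action is noised only part of the way along the forward process, then denoised back with the reverse process, so that a small amount of noising keeps the result close to the user's input and a large amount pulls it toward the learned distribution of desired behaviors. Everything named here is an assumption made for illustration, not the authors' implementation: the ratio `gamma`, the noise schedule, the helper names, and above all the toy denoiser, which replaces a trained network with the closed-form noise predictor for an isotropic Gaussian target and ignores any conditioning on the environment state.

```python
import numpy as np

def make_schedule(T=50, beta_start=1e-4, beta_end=0.2):
    """Linear DDPM-style noise schedule (values chosen for illustration)."""
    betas = np.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def toy_eps_model(x_t, t, alpha_bars, target_mean, target_std):
    """Stand-in for a learned noise predictor (hypothetical).

    If desired actions follow N(target_mean, target_std^2 I), the expected
    noise given x_t has this closed form, so no training is needed here.
    """
    ab = alpha_bars[t]
    var_t = ab * target_std**2 + (1.0 - ab)  # marginal variance of x_t
    return np.sqrt(1.0 - ab) * (x_t - np.sqrt(ab) * target_mean) / var_t

def assist(user_action, gamma, target_mean, target_std=0.1, T=50, seed=0):
    """Noise the user's action for k = round(gamma * T) steps, then denoise.

    gamma = 0 returns the action untouched; gamma = 1 runs the full
    forward and reverse chain.
    """
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(T)
    x = np.asarray(user_action, dtype=float).copy()
    k = int(round(gamma * T))
    if k == 0:
        return x

    # Forward process: jump directly to step k using the closed form.
    ab_k = alpha_bars[k - 1]
    x = np.sqrt(ab_k) * x + np.sqrt(1.0 - ab_k) * rng.standard_normal(x.shape)

    # Reverse process: ancestral sampling from step k back to step 0.
    for t in range(k - 1, -1, -1):
        eps = toy_eps_model(x, t, alpha_bars, target_mean, target_std)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

if __name__ == "__main__":
    user = np.array([-1.0, 0.5])    # hypothetical user command
    target = np.array([1.0, 0.0])   # hypothetical mean of desired actions
    for g in (0.0, 0.04, 1.0):
        print(g, assist(user, g, target))
```

In this toy setting, `gamma` acts as the dial between the two concerns the abstract names: lower values leave more of the user's input intact, while higher values yield actions that look more like samples from the desired-behavior distribution.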


Authors: Takuma Yoneda, Luzhe Sun, Ge Yang, Bradly Stadie, Matthew Walter
Published: 2023-02-24 14:03:24+00:00
arXiv page: arxiv_id (pdf)


Categories: cs.LG, cs.RO