Pretrained Bayesian Non-parametric Knowledge Prior in Robotic Long-Horizon Reinforcement Learning

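Summary

Reinforcement learning (RL) methods typically learn each new task from scratch and often disregard prior knowledge that could accelerate learning.
Some methods do incorporate previously learned skills, but they usually rely on a fixed structure, such as a single Gaussian distribution, to define the skill prior.
This rigid assumption can limit the diversity and flexibility of skills, particularly in complex, long-horizon tasks.
In this work, we introduce a method that models potential primitive skill motions as having non-parametric properties with an unknown number of underlying features.
We use a Bayesian non-parametric model, specifically Dirichlet Process Mixtures enhanced with birth and merge heuristics, to pre-train a skill prior that effectively captures the diverse nature of skills.
In addition, the learned skills are explicitly trackable within the prior space, which improves interpretability and control.
By integrating this flexible skill prior into an RL framework, our approach outperforms existing methods on long-horizon manipulation tasks, enabling more efficient skill transfer and task success in complex environments.
Our findings show that a richer, non-parametric representation of skill priors significantly improves both the learning and the execution of challenging robotic tasks.
All data, code, and videos are available at https://ghiara.github.io/HELIOS/.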

Abstract (original)

Reinforcement learning (RL) methods typically learn new tasks from scratch, often disregarding prior knowledge that could accelerate the learning process. While some methods incorporate previously learned skills, they usually rely on a fixed structure, such as a single Gaussian distribution, to define skill priors. This rigid assumption can restrict the diversity and flexibility of skills, particularly in complex, long-horizon tasks. In this work, we introduce a method that models potential primitive skill motions as having non-parametric properties with an unknown number of underlying features. We utilize a Bayesian non-parametric model, specifically Dirichlet Process Mixtures, enhanced with birth and merge heuristics, to pre-train a skill prior that effectively captures the diverse nature of skills. Additionally, the learned skills are explicitly trackable within the prior space, enhancing interpretability and control. By integrating this flexible skill prior into an RL framework, our approach surpasses existing methods in long-horizon manipulation tasks, enabling more efficient skill transfer and task success in complex environments. Our findings show that a richer, non-parametric representation of skill priors significantly improves both the learning and execution of challenging robotic tasks. All data, code, and videos are available at https://ghiara.github.io/HELIOS/.
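As a rough illustration of why a Dirichlet process prior does not fix the number of skill clusters in advance, the sketch below samples cluster assignments from a Chinese restaurant process, the partition distribution induced by a Dirichlet process. It is not the authors' method: the paper's model additionally fits mixture components to skill data and applies birth and merge heuristics, none of which is shown here. The function name `crp_assignments`, the concentration value `alpha=1.0`, and the item count are assumptions chosen for illustration only.

```python
import random


def crp_assignments(n_items, alpha, seed=0):
    """Sample cluster labels for n_items from a Chinese restaurant process.

    alpha is the concentration parameter (hypothetical value chosen by the
    caller); larger alpha tends to produce more clusters.
    """
    rng = random.Random(seed)
    counts = []  # counts[k] = number of items already assigned to cluster k
    labels = []
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels


# Illustrative usage: the number of clusters is an outcome, not an input.
labels = crp_assignments(n_items=200, alpha=1.0)
num_clusters = len(set(labels))
```

The number of clusters in `labels` grows with the data instead of being set beforehand, which is the property the abstract contrasts with a single fixed Gaussian prior.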

arXiv information

Authors	Yuan Meng, Xiangtong Yao, Kejia Chen, Yansong Wu, Liding Zhang, Zhenshan Bing, Alois Knoll
Published	2025-03-27 20:43:36+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
