EDGI: Equivariant Diffusion for Planning with Embodied Agents

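Summary

Embodied agents operate in a structured world and often solve tasks that exhibit spatial, temporal, and permutation symmetries.
Most algorithms for planning and model-based reinforcement learning (MBRL) ignore this rich geometric structure, which leads to sample inefficiency and poor generalization.
The paper introduces the Equivariant Diffuser for Generating Interactions (EDGI), an algorithm for MBRL and planning that is equivariant with respect to the product of the spatial symmetry group $\mathrm{SE(3)}$, the discrete-time translation group $\mathbb{Z}$, and the object permutation group $\mathrm{S}_n$.
EDGI follows the Diffuser framework (Janner et al. 2022): it treats both learning a world model and planning in it as a conditional generative modeling problem, and trains a diffusion model on an offline trajectory dataset.
The authors introduce a new $\mathrm{SE(3)} \times \mathbb{Z} \times \mathrm{S}_n$-equivariant diffusion model that supports multiple representations.
This model is integrated into a planning loop, where conditioning and classifier-based guidance softly break the symmetry for specific tasks as needed.
On navigation and object manipulation tasks, EDGI improves sample efficiency and generalization.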

Summary (original)

Embodied agents operate in a structured world, often solving tasks with spatial, temporal, and permutation symmetries. Most algorithms for planning and model-based reinforcement learning (MBRL) do not take this rich geometric structure into account, leading to sample inefficiency and poor generalization. We introduce the Equivariant Diffuser for Generating Interactions (EDGI), an algorithm for MBRL and planning that is equivariant with respect to the product of the spatial symmetry group $\mathrm{SE(3)}$, the discrete-time translation group $\mathbb{Z}$, and the object permutation group $\mathrm{S}_n$. EDGI follows the Diffuser framework (Janner et al. 2022) in treating both learning a world model and planning in it as a conditional generative modeling problem, training a diffusion model on an offline trajectory dataset. We introduce a new $\mathrm{SE(3)} \times \mathbb{Z} \times \mathrm{S}_n$-equivariant diffusion model that supports multiple representations. We integrate this model in a planning loop, where conditioning and classifier-based guidance allow us to softly break the symmetry for specific tasks as needed. On navigation and object manipulation tasks, EDGI improves sample efficiency and generalization.
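To make the equivariance claim concrete, the following is a minimal numerical sketch, not the EDGI architecture. It assumes a trajectory is stored as an array of shape (T, n, 3) holding the 3D positions of n objects over T time steps, and it uses a cyclic `np.roll` as a finite-window stand-in for the time translation group $\mathbb{Z}$. The names `toy_denoiser`, `random_rotation`, `apply_group`, and `equivariance_gap` are hypothetical and chosen for illustration only. A map $f$ is equivariant when transforming the input and then applying $f$ gives the same result as applying $f$ and then transforming the output.

```python
import numpy as np


def toy_denoiser(x):
    """Hypothetical equivariant map on trajectories of shape (T, n, 3).

    Not the EDGI model: it only mixes relative quantities, which is enough
    to make it commute with the transformations in apply_group.
    """
    # Pull each object toward the per-step centroid (symmetric in the objects).
    object_mix = x.mean(axis=1, keepdims=True) - x
    # Discrete second difference in time, with cyclic boundaries (assumption).
    time_mix = np.roll(x, 1, axis=0) + np.roll(x, -1, axis=0) - 2.0 * x
    return x + 0.5 * object_mix + 0.25 * time_mix


def random_rotation(rng):
    """Return a 3x3 rotation matrix (orthogonal, determinant +1) built via QR."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # flip one axis to turn a reflection into a rotation
    return q


def apply_group(x, rotation, translation, shift, perm):
    """Act with one element of SE(3) x Z x S_n on a (T, n, 3) trajectory."""
    y = x @ rotation.T + translation  # SE(3): rotate, then translate
    y = np.roll(y, shift, axis=0)     # Z: cyclic time shift (finite-window stand-in)
    return y[:, perm, :]              # S_n: relabel the objects


def equivariance_gap(f, x, g):
    """Largest absolute difference between f(g . x) and g . f(x)."""
    return float(np.max(np.abs(f(apply_group(x, *g)) - apply_group(f(x), *g))))


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 5, 3))
g = (random_rotation(rng), rng.normal(size=3), 3, rng.permutation(5))

print("toy denoiser gap:", equivariance_gap(toy_denoiser, x, g))
# Scaling each coordinate axis differently singles out a frame, so it is not equivariant.
print("axis-scaling gap:", equivariance_gap(lambda t: t * np.array([1.0, 2.0, 3.0]), x, g))
```

The first gap should sit at floating-point round-off level, while the second should be clearly nonzero. A model that satisfies this property by construction does not have to learn the same behavior separately for every rotated, shifted, or relabeled copy of a trajectory, which is the intuition behind the sample-efficiency claim in the abstract.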

arXiv information

Authors	Johann Brehmer, Joey Bose, Pim de Haan, Taco Cohen
Published	2023-03-22 09:19:39+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
