Modeling the Real World with High-Density Visual Particle Dynamics

Summary

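We introduce High-Density Visual Particle Dynamics (HD-VPD), a learned world model that emulates the physical dynamics of real scenes by processing massive latent point clouds of more than 100,000 particles.
To stay efficient at this scale, we introduce a new family of Point Cloud Transformers (PCTs) called Interlacers, which interleave linear-attention Performer layers with graph-based neighbor attention layers.
We demonstrate HD-VPD by modeling the dynamics of high degree-of-freedom bi-manual robots observed by two RGB-D cameras.
Compared with the previous graph neural network approach, Interlacer dynamics is twice as fast at the same prediction quality and reaches higher quality when using four times as many particles.
We show how HD-VPD can evaluate the quality of motion plans on robotic box-pushing and can-grasping tasks.
Videos and particle dynamics rendered by HD-VPD are available at https://sites.google.com/view/hd-vpd.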

Abstract (original)

We present High-Density Visual Particle Dynamics (HD-VPD), a learned world model that can emulate the physical dynamics of real scenes by processing massive latent point clouds containing 100K+ particles. To enable efficiency at this scale, we introduce a novel family of Point Cloud Transformers (PCTs) called Interlacers leveraging intertwined linear-attention Performer layers and graph-based neighbour attention layers. We demonstrate the capabilities of HD-VPD by modeling the dynamics of high degree-of-freedom bi-manual robots with two RGB-D cameras. Compared to the previous graph neural network approach, our Interlacer dynamics is twice as fast with the same prediction quality, and can achieve higher quality using 4x as many particles. We illustrate how HD-VPD can evaluate motion plan quality with robotic box pushing and can grasping tasks. See videos and particle dynamics rendered by HD-VPD at https://sites.google.com/view/hd-vpd.
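The abstract describes Interlacers only at a high level: global linear-attention (Performer) layers interleaved with graph-based neighbor attention layers. The following is a minimal NumPy sketch of that interleaving idea, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the brute-force k-nearest-neighbor graph, the absence of learned projections and normalization layers, and the layer ordering. The linear attention uses positive random features in the style of Performer, so its cost grows linearly with the number of points rather than quadratically.

```python
import numpy as np

def positive_random_features(x, proj):
    # Performer-style positive features: exp(w.x - |x|^2 / 2) / sqrt(m).
    # x: (n, d) inputs, proj: (m, d) Gaussian projection matrix.
    m = proj.shape[0]
    sq_norm = 0.5 * np.sum(x * x, axis=-1, keepdims=True)
    return np.exp(x @ proj.T - sq_norm) / np.sqrt(m)

def linear_attention(q, k, v, proj):
    # Approximate softmax attention without forming the (n, n) matrix.
    scale = q.shape[-1] ** -0.25
    qf = positive_random_features(q * scale, proj)   # (n, m)
    kf = positive_random_features(k * scale, proj)   # (n, m)
    kv = kf.T @ v                                    # (m, d_v), linear in n
    norm = qf @ kf.sum(axis=0)                       # (n,)
    return (qf @ kv) / norm[:, None]

def knn_indices(pos, k):
    # Brute-force k nearest neighbors (O(n^2)); illustration only.
    # Each point's own index comes first because its distance is 0.
    d2 = np.sum((pos[:, None, :] - pos[None, :, :]) ** 2, axis=-1)
    return np.argsort(d2, axis=1)[:, :k]

def neighbor_attention(x, nbrs):
    # Softmax attention restricted to each point's k graph neighbors.
    kx = x[nbrs]                                     # (n, k, d)
    scores = np.einsum("nd,nkd->nk", x, kx) / np.sqrt(x.shape[-1])
    scores = scores - scores.max(axis=1, keepdims=True)
    w = np.exp(scores)
    w = w / w.sum(axis=1, keepdims=True)
    return np.einsum("nk,nkd->nd", w, kx)

def interlaced_block(x, pos, proj, k=8, num_pairs=2):
    # Hypothetical interleaving: global layer, then local layer, repeated.
    nbrs = knn_indices(pos, k)
    for _ in range(num_pairs):
        x = x + linear_attention(x, x, x, proj)      # global mixing
        x = x + neighbor_attention(x, nbrs)          # local mixing
    return x

rng = np.random.default_rng(0)
n, d, m = 256, 16, 32
pos = rng.normal(size=(n, 3))           # particle positions
feats = 0.1 * rng.normal(size=(n, d))   # latent particle features
proj = rng.normal(size=(m, d))          # random feature projection
out = interlaced_block(feats, pos, proj)
print(out.shape)  # prints (256, 16)
```

The sketch uses only a few hundred points so the brute-force neighbor search stays cheap. At the 100K+ particle scale described in the abstract, the quadratic distance matrix would have to be replaced by a spatial index, while the linear-attention layers would keep their linear cost.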

arXiv information

Authors	William F. Whitney, Jacob Varley, Deepali Jain, Krzysztof Choromanski, Sumeet Singh, Vikas Sindhwani
Published	2024-06-28 10:13:50+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
