PoliFormer: Scaling On-Policy RL with Transformers Results in Masterful Navigators

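Summary

PoliFormer (Policy Transformer) is an RGB-only indoor navigation agent trained end-to-end with large-scale reinforcement learning; although it is trained purely in simulation, it generalizes to the real world without adaptation.
It pairs a foundational vision transformer encoder with a causal transformer decoder, which enables long-term memory and reasoning.
It is trained for hundreds of millions of interactions across diverse environments, using parallelized, multi-machine rollouts for efficient, high-throughput training.
PoliFormer is a masterful navigator, producing state-of-the-art results across two distinct embodiments, the LoCoBot and Stretch RE-1 robots, and four navigation benchmarks.
It breaks through the plateau of previous work, achieving an unprecedented 85.5% success rate in object goal navigation on the CHORES-S benchmark, an absolute improvement of 28.5 percentage points.
Without any fine-tuning, PoliFormer can also be easily extended to a variety of downstream applications such as object tracking, multi-object navigation, and open-vocabulary navigation.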

Summary (original)

We present PoliFormer (Policy Transformer), an RGB-only indoor navigation agent trained end-to-end with reinforcement learning at scale that generalizes to the real-world without adaptation despite being trained purely in simulation. PoliFormer uses a foundational vision transformer encoder with a causal transformer decoder enabling long-term memory and reasoning. It is trained for hundreds of millions of interactions across diverse environments, leveraging parallelized, multi-machine rollouts for efficient training with high throughput. PoliFormer is a masterful navigator, producing state-of-the-art results across two distinct embodiments, the LoCoBot and Stretch RE-1 robots, and four navigation benchmarks. It breaks through the plateaus of previous work, achieving an unprecedented 85.5% success rate in object goal navigation on the CHORES-S benchmark, a 28.5% absolute improvement. PoliFormer can also be trivially extended to a variety of downstream applications such as object tracking, multi-object navigation, and open-vocabulary navigation with no finetuning.
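The phrase "28.5% absolute improvement" refers to a difference in percentage points, not a ratio. As a rough illustration, the sketch below works out what the two figures quoted in the abstract (85.5% and 28.5) imply. The function names are hypothetical, and the derived baseline and relative gain are arithmetic consequences of those two figures, not numbers stated in the abstract itself.

```python
def implied_baseline(new_rate_pct: float, absolute_gain_pct: float) -> float:
    """Success rate of the prior best result implied by an absolute gain."""
    return new_rate_pct - absolute_gain_pct


def relative_gain_pct(new_rate_pct: float, absolute_gain_pct: float) -> float:
    """Gain expressed as a percentage of the implied baseline."""
    baseline = implied_baseline(new_rate_pct, absolute_gain_pct)
    return 100.0 * absolute_gain_pct / baseline


# Figures quoted in the abstract (CHORES-S object goal navigation).
NEW_SUCCESS_RATE = 85.5   # percent
ABSOLUTE_GAIN = 28.5      # percentage points

baseline = implied_baseline(NEW_SUCCESS_RATE, ABSOLUTE_GAIN)
relative = relative_gain_pct(NEW_SUCCESS_RATE, ABSOLUTE_GAIN)

print(f"implied prior success rate: {baseline:.1f}%")
print(f"relative improvement over that baseline: {relative:.1f}%")
```

Read this way, the same result can be described either as a gain of 28.5 percentage points or as a considerably larger relative gain, which is why the abstract specifies "absolute".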

arXiv information

Authors	Kuo-Hao Zeng, Zichen Zhang, Kiana Ehsani, Rose Hendrix, Jordi Salvador, Alvaro Herrasti, Ross Girshick, Aniruddha Kembhavi, Luca Weihs
Published	2024-06-28 17:51:10+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
