Merging and Disentangling Views in Visual Reinforcement Learning for Robotic Manipulation

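Abstract

Vision is widely used in manipulation, particularly through visual servoing.
Making it robust requires multiple cameras to expand the field of view.
That is computationally challenging.
Merging multiple views and using Q-learning makes it possible to design more effective representations and to optimize sample efficiency.
Such a solution can be expensive to deploy.
To mitigate this, we introduce the Merge And Disentanglement (MAD) algorithm, which efficiently merges views to increase sample efficiency while augmenting with single-view features to allow lightweight deployment and ensure robust policies.
We demonstrate the efficiency and robustness of the approach using Meta-World and ManiSkill3.
For the project website and code, see https://aalmuzairee.github.io/mad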

Abstract (original)

Vision is well-known for its use in manipulation, especially using visual servoing. To make it robust, multiple cameras are needed to expand the field of view. That is computationally challenging. Merging multiple views and using Q-learning allows the design of more effective representations and optimization of sample efficiency. Such a solution might be expensive to deploy. To mitigate this, we introduce a Merge And Disentanglement (MAD) algorithm that efficiently merges views to increase sample efficiency while augmenting with single-view features to allow lightweight deployment and ensure robust policies. We demonstrate the efficiency and robustness of our approach using Meta-World and ManiSkill3. For project website and code, see https://aalmuzairee.github.io/mad
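The merge-and-augment idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how views are merged or how features are computed, so everything here is an assumption made for illustration: the element-wise sum used as the merge operator, the hypothetical `encode` function (a shared linear map with ReLU), and the names `merge_and_augment`, `views`, and `weights`. It is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)


def encode(view, weights):
    """Hypothetical per-view encoder: flatten the image, apply a shared linear map and ReLU."""
    return np.maximum(view.reshape(-1) @ weights, 0.0)


def merge_and_augment(views, weights):
    """Return one merged feature plus the individual single-view features."""
    single = [encode(v, weights) for v in views]
    # Assumed merge operator: element-wise sum over views (the abstract does not specify one).
    merged = np.sum(single, axis=0)
    return merged, single


# Three hypothetical 8x8 RGB camera views and a shared 16-dimensional encoder.
views = [rng.random((8, 8, 3)) for _ in range(3)]
weights = rng.standard_normal((8 * 8 * 3, 16))

merged, single = merge_and_augment(views, weights)

# Downstream training batch: the merged feature augmented with each single-view feature.
batch = np.stack([merged, *single])
print(batch.shape)  # (4, 16)
```

Under these assumptions, a policy trained on every row of `batch` sees both the merged input and each camera alone, which is one way to read the abstract's claim that single-view augmentation allows lightweight deployment with fewer cameras.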

arXiv information

Authors	Abdulaziz Almuzairee, Rohan Patil, Dwait Bhatt, Henrik I. Christensen
Published	2025-05-07 17:59:28+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
