UniGraspTransformer: Simplified Policy Distillation for Scalable Dexterous Robotic Grasping

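Summary

We introduce UniGraspTransformer, a universal Transformer-based network for dexterous robotic grasping that simplifies training while improving scalability and performance.
Unlike earlier methods such as UniDexGrasp++, which require complex multi-step training pipelines, UniGraspTransformer follows a streamlined process. First, a dedicated policy network is trained for each individual object with reinforcement learning to generate successful grasp trajectories.
Then these trajectories are distilled into a single universal network.
This approach lets UniGraspTransformer scale effectively, incorporating up to 12 self-attention blocks to handle thousands of objects in diverse poses.
It also generalizes well to both idealized and real-world inputs, as evaluated in state-based and vision-based settings.
Notably, UniGraspTransformer generates a broader range of grasp poses for objects of various shapes and orientations, yielding more diverse grasp strategies.
Experiments show significant improvements over the state-of-the-art UniDexGrasp++ across object categories: in the vision-based setting, success rates improve by 3.5%, 7.7%, and 10.1% on seen objects, unseen objects within seen categories, and completely unseen objects, respectively.
Project page: https://dexhand.github.io/UniGraspTransformer.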

Abstract (original)

We introduce UniGraspTransformer, a universal Transformer-based network for dexterous robotic grasping that simplifies training while enhancing scalability and performance. Unlike prior methods such as UniDexGrasp++, which require complex, multi-step training pipelines, UniGraspTransformer follows a streamlined process: first, dedicated policy networks are trained for individual objects using reinforcement learning to generate successful grasp trajectories; then, these trajectories are distilled into a single, universal network. Our approach enables UniGraspTransformer to scale effectively, incorporating up to 12 self-attention blocks for handling thousands of objects with diverse poses. Additionally, it generalizes well to both idealized and real-world inputs, evaluated in state-based and vision-based settings. Notably, UniGraspTransformer generates a broader range of grasping poses for objects in various shapes and orientations, resulting in more diverse grasp strategies. Experimental results demonstrate significant improvements over state-of-the-art, UniDexGrasp++, across various object categories, achieving success rate gains of 3.5%, 7.7%, and 10.1% on seen objects, unseen objects within seen categories, and completely unseen objects, respectively, in the vision-based setting. Project page: https://dexhand.github.io/UniGraspTransformer.
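The two-stage process described in the abstract (per-object expert policies, then distillation of their trajectories into one network) can be illustrated with a toy sketch. This is not the authors' implementation: the experts here are fixed linear functions standing in for RL-trained policies, the student is a linear model rather than a Transformer, and every name (`make_expert`, `collect_trajectories`, `features`, `distill`) is hypothetical. It only shows the general idea of fitting a single object-conditioned student to samples produced by several dedicated experts.

```python
import random

def make_expert(weights):
    """Hypothetical stand-in for an RL-trained per-object policy: action = w . state."""
    def policy(state):
        return sum(w * s for w, s in zip(weights, state))
    return policy

def collect_trajectories(experts, n_per_object, rng):
    """Query each expert on random states and record (object_id, state, action)."""
    data = []
    for obj_id, policy in enumerate(experts):
        for _ in range(n_per_object):
            state = [rng.uniform(-1.0, 1.0) for _ in range(2)]
            data.append((obj_id, state, policy(state)))
    return data

def features(obj_id, state, n_objects):
    """Object-conditioned input: the state is placed in the slot of its object."""
    x = [0.0] * (n_objects * len(state))
    for i, s in enumerate(state):
        x[obj_id * len(state) + i] = s
    return x

def mse(theta, data, n_objects):
    """Mean squared error between student predictions and expert actions."""
    total = 0.0
    for obj_id, state, action in data:
        x = features(obj_id, state, n_objects)
        pred = sum(t * xi for t, xi in zip(theta, x))
        total += (pred - action) ** 2
    return total / len(data)

def distill(data, n_objects, epochs=200, lr=0.1, seed=0):
    """Fit one student to all expert samples with SGD on squared error."""
    rng = random.Random(seed)
    theta = [0.0] * (n_objects * len(data[0][1]))
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)
        for obj_id, state, action in samples:
            x = features(obj_id, state, n_objects)
            err = sum(t * xi for t, xi in zip(theta, x)) - action
            theta = [t - lr * err * xi for t, xi in zip(theta, x)]
    return theta

rng = random.Random(0)
# Three toy "objects", each with its own dedicated expert (assumed weights).
experts = [make_expert([1.0, -0.5]), make_expert([0.2, 0.8]), make_expert([-1.0, 0.3])]
data = collect_trajectories(experts, n_per_object=50, rng=rng)
loss_before = mse([0.0] * 6, data, n_objects=3)
student = distill(data, n_objects=3)
loss_after = mse(student, data, n_objects=3)
print(loss_before, loss_after)
```

In this toy setting the student can represent every expert exactly, so the distillation loss falls to nearly zero. A real system would replace the linear student with a larger network and the synthetic states with recorded grasp trajectories.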

arXiv information

Authors	Wenbo Wang, Fangyun Wei, Lei Zhou, Xi Chen, Lin Luo, Xiaohan Yi, Yizhong Zhang, Yaobo Liang, Chang Xu, Yan Lu, Jiaolong Yang, Baining Guo
Published	2025-03-04 15:26:33+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
