HybridGen: VLM-Guided Hybrid Planning for Scalable Data Generation of Imitation Learning

Summary

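Acquiring large-scale, diverse demonstration data is essential for improving the generalization of robotic imitation learning.
However, generating such data for complex manipulation tasks is difficult in real-world settings.
We introduce HybridGen, an automated framework that integrates a Vision-Language Model (VLM) with hybrid planning.
HybridGen uses a two-stage pipeline. First, the VLM parses expert demonstrations and decomposes each task into expert-dependent segments (object-centric pose transformations for precise control) and plannable segments (diverse trajectories synthesized via path planning).
Second, pose transformations substantially expand the first-stage data.
Importantly, HybridGen generates large volumes of training data without requiring a specific data format, so it applies to a wide range of imitation learning algorithms, a property that is also demonstrated empirically across multiple algorithms.
Evaluations across seven tasks and their variants show that agents trained with HybridGen achieve substantial performance and generalization gains, averaging a 5% improvement over state-of-the-art methods.
Notably, on the most challenging task variants, HybridGen reaches an average success rate of 59.7%, well above MimicGen's 49.5%.
These results demonstrate its effectiveness and practicality.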
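
The abstract does not spell out how an object-centric pose transformation is computed. As a minimal sketch, assuming the common formulation in which the end-effector pose relative to the object is held fixed when the object appears at a new pose, a demonstrated segment could be adapted as follows. The helper names (`pose_z`, `rigid_inverse`, `transform_segment`) and all numbers are hypothetical illustrations, not taken from HybridGen.

```python
import math


def matmul(a, b):
    """Multiply two 4x4 homogeneous matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(t):
    """Invert a rigid transform [R | p] as [R^T | -R^T p]."""
    rt = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [t[i][3] for i in range(3)]
    new_p = [-sum(rt[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [rt[i] + [new_p[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def pose_z(yaw, x, y, z):
    """Hypothetical helper: pose with a rotation about z and a translation."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0, x],
            [s, c, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


def transform_segment(ee_poses_src, obj_pose_src, obj_pose_new):
    """Map demonstrated end-effector poses to a new object pose.

    Assumption: the end-effector pose expressed in the object frame
    is preserved, so T_ee_new = T_obj_new * inv(T_obj_src) * T_ee_src.
    """
    delta = matmul(obj_pose_new, rigid_inverse(obj_pose_src))
    return [matmul(delta, ee) for ee in ee_poses_src]


# Source demo: object at x = 0.5 m, gripper descending above it.
obj_src = pose_z(0.0, 0.5, 0.0, 0.0)
ee_src = [pose_z(0.0, 0.5, 0.0, 0.20), pose_z(0.0, 0.5, 0.0, 0.05)]

# New scene: object moved and rotated by 90 degrees about z.
obj_new = pose_z(math.pi / 2, 0.2, 0.3, 0.0)
ee_new = transform_segment(ee_src, obj_src, obj_new)
```

In this sketch the adapted gripper poses sit directly above the relocated object at the same heights as in the source demonstration, which is the property an object-centric transformation is meant to preserve.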

Summary (original)

The acquisition of large-scale and diverse demonstration data is essential for improving robotic imitation learning generalization. However, generating such data for complex manipulations is challenging in real-world settings. We introduce HybridGen, an automated framework that integrates Vision-Language Model (VLM) and hybrid planning. HybridGen uses a two-stage pipeline: first, a VLM parses expert demonstrations, decomposing tasks into expert-dependent (object-centric pose transformations for precise control) and plannable segments (synthesizing diverse trajectories via path planning); second, pose transformations substantially expand the first-stage data. Crucially, HybridGen generates a large volume of training data without requiring specific data formats, making it broadly applicable to a wide range of imitation learning algorithms, a characteristic which we also demonstrate empirically across multiple algorithms. Evaluations across seven tasks and their variants demonstrate that agents trained with HybridGen achieve substantial performance and generalization gains, averaging a 5% improvement over state-of-the-art methods. Notably, in the most challenging task variants, HybridGen achieves significant improvement, reaching a 59.7% average success rate, significantly outperforming MimicGen’s 49.5%. These results demonstrate its effectiveness and practicality.

arXiv information

Authors	Wensheng Wang, Ning Tan
Published	2025-03-17 13:49:43+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
