LVDiffusor: Distilling Functional Rearrangement Priors from Large Models into Diffusor

Summary

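Object rearrangement, a fundamental challenge in robotics, requires versatile strategies to cope with diverse objects, configurations, and functional needs.
To achieve this, an AI robot must learn functional rearrangement priors so that it can specify precise goals that satisfy the functional requirements.
Previous methods typically learn such priors either from laborious human annotation or from manually designed heuristics, which limits scalability and generalization.
This work proposes a new approach that uses large models to distill functional rearrangement priors.
Specifically, it collects diverse arrangement examples using both LLMs and VLMs, then distills those examples into a diffusion model.
At test time, the learned diffusion model is conditioned on the initial configuration and guides object placement so that the functional requirements are met.
In this way, the approach creates a handshaking point that combines the strengths of conditional generative models and large models.
Extensive experiments across multiple domains, including real-world scenarios, demonstrate that the approach is effective at generating compatible goals for object rearrangement tasks and significantly outperforms baseline methods.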

Abstract (original)

Object rearrangement, a fundamental challenge in robotics, demands versatile strategies to handle diverse objects, configurations, and functional needs. To achieve this, the AI robot needs to learn functional rearrangement priors in order to specify precise goals that meet the functional requirements. Previous methods typically learn such priors from either laborious human annotations or manually designed heuristics, which limits scalability and generalization. In this work, we propose a novel approach that leverages large models to distill functional rearrangement priors. Specifically, our approach collects diverse arrangement examples using both LLMs and VLMs and then distills the examples into a diffusion model. During test time, the learned diffusion model is conditioned on the initial configuration and guides the positioning of objects to meet functional requirements. In this manner, we create a handshaking point that combines the strengths of conditional generative models and large models. Extensive experiments on multiple domains, including real-world scenarios, demonstrate the effectiveness of our approach in generating compatible goals for object rearrangement tasks, significantly outperforming baseline methods.
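
The test-time step described above (a diffusion model conditioned on the initial configuration that guides object positions toward a goal) can be illustrated with a toy sketch. This is not the authors' implementation: `toy_score` is a hypothetical, hand-written stand-in for a learned conditional score network, the "align objects in a row" goal is an invented example of a functional requirement, and the sampler is a plain Euler integration of the probability-flow ODE for a variance-exploding noise schedule, chosen only because it is short and deterministic after the initial noise draw.

```python
import numpy as np

S0 = 0.05  # assumed spread of "good" layouts around the target (hypothetical)


def toy_target(initial):
    """Hypothetical functional goal: keep each x, align every object's y."""
    target = initial.copy()
    target[:, 1] = initial[:, 1].mean()
    return target


def toy_score(positions, initial, sigma):
    """Stand-in for a learned conditional score network (an assumption).

    Analytic score of a Gaussian N(target, (sigma^2 + S0^2) I), i.e. the
    goal distribution blurred with noise of scale sigma.
    """
    return (toy_target(initial) - positions) / (sigma ** 2 + S0 ** 2)


def sample_goal(initial, steps=200, sigma_max=1.0, sigma_min=0.01, seed=0):
    """Denoise noisy positions into a goal layout, conditioned on `initial`."""
    rng = np.random.default_rng(seed)
    sigmas = np.geomspace(sigma_max, sigma_min, steps)
    # Start from the initial layout perturbed at the largest noise level.
    x = initial + sigma_max * rng.standard_normal(initial.shape)
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        # Euler step of dx/dsigma = -sigma * score(x, sigma).
        x = x + (sigma - sigma_next) * sigma * toy_score(x, initial, sigma)
    return x


if __name__ == "__main__":
    initial = np.array([[0.0, 0.3], [0.5, -0.2], [1.0, 0.1]])
    print(sample_goal(initial))
```

In a real system the analytic score would be replaced by a network trained on the arrangement examples collected from the large models; the sampling loop itself would keep the same shape.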

arXiv information

Authors	Yiming Zeng, Mingdong Wu, Long Yang, Jiyao Zhang, Hao Ding, Hui Cheng, Hao Dong
Published	2024-03-08 16:53:13+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
