Language-Conditioned Affordance-Pose Detection in 3D Point Clouds

Abstract

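Affordance detection and pose estimation are essential in many robotic applications.
Combining them gives the robot stronger manipulation capabilities, since the generated poses facilitate the corresponding affordance task.
Previous methods for affordance-pose joint learning were limited to a predefined set of affordances, which restricted the adaptability of robots in real-world environments.
In this paper, we propose a new method for language-conditioned affordance-pose joint learning in 3D point clouds.
Given a 3D point cloud object, our method detects the affordance region and generates appropriate 6-DoF poses for any unconstrained affordance label.
Our method consists of an open-vocabulary affordance detection branch and a language-guided diffusion model that generates 6-DoF poses based on the affordance text.
We also introduce a new high-quality dataset for the task of language-driven affordance-pose joint learning.
Extensive experiments show that our proposed method works effectively across a wide range of open-vocabulary affordances and outperforms other baselines by a large margin.
In addition, we illustrate the usefulness of the method in real-world robotic applications.
Our code and dataset are publicly available at https://3DAPNet.github.io.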

Abstract (original)

Affordance detection and pose estimation are of great importance in many robotic applications. Their combination helps the robot gain an enhanced manipulation capability, in which the generated pose can facilitate the corresponding affordance task. Previous methods for affordance-pose joint learning are limited to a predefined set of affordances, thus limiting the adaptability of robots in real-world environments. In this paper, we propose a new method for language-conditioned affordance-pose joint learning in 3D point clouds. Given a 3D point cloud object, our method detects the affordance region and generates appropriate 6-DoF poses for any unconstrained affordance label. Our method consists of an open-vocabulary affordance detection branch and a language-guided diffusion model that generates 6-DoF poses based on the affordance text. We also introduce a new high-quality dataset for the task of language-driven affordance-pose joint learning. Intensive experimental results demonstrate that our proposed method works effectively on a wide range of open-vocabulary affordances and outperforms other baselines by a large margin. In addition, we illustrate the usefulness of our method in real-world robotic applications. Our code and dataset are publicly available at https://3DAPNet.github.io

arXiv information

Authors	Toan Nguyen, Minh Nhat Vu, Baoru Huang, Tuan Van Vo, Vy Truong, Ngan Le, Thieu Vo, Bac Le, Anh Nguyen
Published	2023-09-19 20:10:01+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
