Indoor and Outdoor 3D Scene Graph Generation via Language-Enabled Spatial Ontologies

Summary

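This paper proposes an approach for building 3D scene graphs in arbitrary indoor and outdoor environments.
Extending 3D scene graph construction beyond indoor settings is challenging.
The hierarchy of concepts that describes an outdoor environment is more complex than the indoor one, and defining such a hierarchy by hand is time-consuming and does not scale.
Furthermore, the lack of training data prevents the learning-based tools used in indoor settings from being applied directly.
To address these challenges, we propose two novel extensions.
First, we develop methods to build a spatial ontology that defines the concepts and relations relevant to indoor and outdoor robot operation.
In particular, we use a Large Language Model (LLM) to build the ontology, which greatly reduces the manual effort required.
Second, we leverage the spatial ontology for 3D scene graph construction by using Logic Tensor Networks (LTN) to add logical rules, or axioms (e.g., "a beach contains sand"). These axioms provide additional supervisory signals at training time, which reduces the need for labelled data, yields better predictions, and even allows predicting concepts unseen at training time.
We test the approach on a variety of datasets, including indoor, rural, and coastal environments, and show that it significantly improves the quality of 3D scene graph generation from sparsely annotated data.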

Abstract (original)

This paper proposes an approach to build 3D scene graphs in arbitrary indoor and outdoor environments. Such extension is challenging; the hierarchy of concepts that describe an outdoor environment is more complex than for indoors, and manually defining such hierarchy is time-consuming and does not scale. Furthermore, the lack of training data prevents the straightforward application of learning-based tools used in indoor settings. To address these challenges, we propose two novel extensions. First, we develop methods to build a spatial ontology defining concepts and relations relevant for indoor and outdoor robot operation. In particular, we use a Large Language Model (LLM) to build such an ontology, thus largely reducing the amount of manual effort required. Second, we leverage the spatial ontology for 3D scene graph construction using Logic Tensor Networks (LTN) to add logical rules, or axioms (e.g., ‘a beach contains sand’), which provide additional supervisory signals at training time thus reducing the need for labelled data, providing better predictions, and even allowing predicting concepts unseen at training time. We test our approach in a variety of datasets, including indoor, rural, and coastal environments, and show that it leads to a significant increase in the quality of the 3D scene graph generation with sparsely annotated data.
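
The abstract's axiom example ("a beach contains sand") can be illustrated with a small fuzzy-logic sketch. This is not the authors' implementation and not the API of any Logic Tensor Network library. It only shows, under stated assumptions, how a logical rule can be scored with truth values in [0, 1] and turned into a loss term that needs no labels. The function names (`implies`, `exists`, `forall`, `axiom_loss`), the Reichenbach implication, and the max/mean aggregators are illustrative choices. In a real system the scores would come from learned predicates and would be differentiable tensors.

```python
from typing import List, Sequence, Tuple


def implies(a: float, b: float) -> float:
    """Reichenbach fuzzy implication (an assumed choice): 1 - a + a*b."""
    return 1.0 - a + a * b


def exists(truths: Sequence[float]) -> float:
    """Fuzzy 'there exists' as the maximum; an empty set gives 0 (false)."""
    return max(truths, default=0.0)


def forall(truths: Sequence[float]) -> float:
    """Fuzzy 'for all' as the mean; an empty set is vacuously true."""
    return sum(truths) / len(truths) if truths else 1.0


# Each place is (score that the place is a beach,
#                scores that each object inside it is sand).
Place = Tuple[float, List[float]]


def axiom_satisfaction(places: Sequence[Place]) -> float:
    """Truth of: for all places p, Beach(p) -> exists o in p with Sand(o)."""
    return forall([implies(beach, exists(sand)) for beach, sand in places])


def axiom_loss(places: Sequence[Place]) -> float:
    """Loss that is 0 when the axiom fully holds and 1 when fully violated."""
    return 1.0 - axiom_satisfaction(places)


if __name__ == "__main__":
    # Hypothetical predicted scores, made up for illustration only.
    predictions = [(0.9, [0.8, 0.1]), (0.2, [])]
    print(f"axiom loss: {axiom_loss(predictions):.3f}")
```

A place predicted as a beach with no sand-like object inside it raises the loss, so the rule pushes the predictions toward consistency even where no ground-truth label is available.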

arXiv information

Authors	Jared Strader, Nathan Hughes, William Chen, Alberto Speranzon, Luca Carlone
Published	2024-04-24 21:57:57+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
