HAMMER: Heterogeneous, Multi-Robot Semantic Gaussian Splatting


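3D Gaussian Splatting offers expressive scene reconstruction, modeling a broad range of visual, geometric, and semantic information.
In real-world experiments, HAMMER creates higher-fidelity maps (2x) than competing baselines and is useful for downstream tasks such as semantic goal-conditioned navigation (e.g., "go to the couch").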


3D Gaussian Splatting offers expressive scene reconstruction, modeling a broad range of visual, geometric, and semantic information. However, efficient real-time map reconstruction with data streamed from multiple robots and devices remains a challenge. To that end, we propose HAMMER, a server-based collaborative Gaussian Splatting method that leverages widely available ROS communication infrastructure to generate 3D, metric-semantic maps from asynchronous robot data streams, with no prior knowledge of initial robot positions and with varying on-device pose estimators. HAMMER consists of (i) a frame alignment module that transforms local SLAM poses and image data into a global frame and requires no prior relative pose knowledge, and (ii) an online module for training semantic 3DGS maps from streaming data. HAMMER handles mixed perception modes, adjusts automatically for variations in image pre-processing among different devices, and distills CLIP semantic codes into the 3D scene for open-vocabulary language queries. In our real-world experiments, HAMMER creates higher-fidelity maps (2x) compared to competing baselines and is useful for downstream tasks, such as semantic goal-conditioned navigation (e.g., “go to the couch”). Accompanying content available at hammer-project.github.io.
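As a rough illustration of what "transforming local SLAM poses into a global frame" means, the sketch below applies one rigid alignment transform to a robot's camera poses. It is not HAMMER's implementation: the abstract does not say how the alignment is estimated, so the transform `T_global_local` is simply assumed to be known here, and the helper names `make_se3` and `to_global` are hypothetical.

```python
import numpy as np


def make_se3(rotation, translation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T


def to_global(T_global_local, local_poses):
    """Map camera-to-local poses to camera-to-global poses.

    Each pose is left-multiplied by the (assumed known) alignment transform
    that takes the robot's local SLAM frame to the shared global frame.
    """
    return [T_global_local @ T for T in local_poses]


# Hypothetical alignment: 90 degree yaw about z plus a translation.
yaw = np.pi / 2
Rz = np.array([
    [np.cos(yaw), -np.sin(yaw), 0.0],
    [np.sin(yaw), np.cos(yaw), 0.0],
    [0.0, 0.0, 1.0],
])
T_global_local = make_se3(Rz, np.array([1.0, 2.0, 0.0]))

# One camera pose reported by the robot's on-device SLAM, in its local frame.
local_pose = make_se3(np.eye(3), np.array([1.0, 0.0, 0.0]))

global_pose = to_global(T_global_local, [local_pose])[0]
print(global_pose[:3, 3])  # camera position expressed in the global frame
```

With one such transform per robot, poses from differently initialized on-device estimators end up in a single coordinate frame, which is the precondition for training one shared map from all streams.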


Authors: Javier Yu, Timothy Chen, Mac Schwager
Published: 2025-01-24 00:21:10+00:00


Category: cs.RO