Fast Monocular Scene Reconstruction with Global-Sparse Local-Dense Grids

Summary

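Reconstructing indoor scenes from monocular images has long been a goal of augmented reality and robotics developers.
Recent advances in neural field representations and monocular priors have produced remarkable results in scene-level surface reconstruction.
However, the reliance on multilayer perceptrons (MLPs) significantly limits training and rendering speed.
In this work, we propose storing signed distance function (SDF) values directly in sparse voxel block grids, which enables fast and accurate scene reconstruction without MLPs.
The globally sparse, locally dense data structure exploits the spatial sparsity of surfaces, enables cache-friendly queries, and extends directly to multi-modal data such as color and semantic labels.
To apply this representation to monocular scene reconstruction, we develop a scale calibration algorithm for fast geometric initialization from monocular depth priors.
Starting from this initialization, we apply differentiable volume rendering to refine details with fast convergence.
We also introduce efficient high-dimensional Continuous Random Fields (CRFs) to further exploit the semantic-geometry consistency between scene objects.
Experiments show that our approach trains 10x faster and renders 100x faster than state-of-the-art neural implicit methods while achieving comparable accuracy.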
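
The globally sparse, locally dense layout described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `SparseBlockGrid`, its parameters (`voxel_size`, `block_res`, `truncation`), and the nearest-voxel lookup are all assumptions made for illustration. The idea shown is only that a hash map allocates blocks near surfaces (global sparsity), while each block holds a small contiguous array of SDF values (local density).

```python
import numpy as np


class SparseBlockGrid:
    """Hypothetical sketch: hash map of block coordinates -> dense SDF blocks."""

    def __init__(self, voxel_size=0.05, block_res=8, truncation=1.0):
        self.voxel_size = voxel_size    # edge length of one voxel (assumed units: meters)
        self.block_res = block_res      # voxels per block edge
        self.truncation = truncation    # value reported for unallocated space
        self.blocks = {}                # (bx, by, bz) -> (B, B, B) float32 array

    def _split(self, point):
        """Map a 3D point to (block coordinate, voxel index inside the block)."""
        voxel = np.floor(np.asarray(point, dtype=float) / self.voxel_size).astype(int)
        # Floor division and modulo keep negative coordinates consistent.
        block = tuple(int(c) for c in voxel // self.block_res)
        local = tuple(int(c) for c in voxel % self.block_res)
        return block, local

    def set_sdf(self, point, value):
        """Write an SDF value, allocating the enclosing block on first touch."""
        block, local = self._split(point)
        if block not in self.blocks:
            shape = (self.block_res,) * 3
            self.blocks[block] = np.full(shape, self.truncation, dtype=np.float32)
        self.blocks[block][local] = value

    def get_sdf(self, point):
        """Nearest-voxel lookup; unallocated space returns the truncation value."""
        block, local = self._split(point)
        dense = self.blocks.get(block)
        if dense is None:
            return self.truncation
        return float(dense[local])
```

Because each block is an ordinary array, additional per-voxel channels such as color or semantic labels could be stored by keeping further arrays under the same block key. A practical system would also interpolate between neighboring voxels rather than return the nearest one.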

Summary (original)

Indoor scene reconstruction from monocular images has long been sought after by augmented reality and robotics developers. Recent advances in neural field representations and monocular priors have led to remarkable results in scene-level surface reconstructions. The reliance on Multilayer Perceptrons (MLP), however, significantly limits speed in training and rendering. In this work, we propose to directly use signed distance function (SDF) in sparse voxel block grids for fast and accurate scene reconstruction without MLPs. Our globally sparse and locally dense data structure exploits surfaces’ spatial sparsity, enables cache-friendly queries, and allows direct extensions to multi-modal data such as color and semantic labels. To apply this representation to monocular scene reconstruction, we develop a scale calibration algorithm for fast geometric initialization from monocular depth priors. We apply differentiable volume rendering from this initialization to refine details with fast convergence. We also introduce efficient high-dimensional Continuous Random Fields (CRFs) to further exploit the semantic-geometry consistency between scene objects. Experiments show that our approach is 10x faster in training and 100x faster in rendering while achieving comparable accuracy to state-of-the-art neural implicit methods.

arXiv information

Authors	Wei Dong, Chris Choy, Charles Loop, Or Litany, Yuke Zhu, Anima Anandkumar
Published	2023-05-22 16:50:19+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
