ES-MVSNet: Efficient Framework for End-to-end Self-supervised Multi-View Stereo

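Summary

End-to-end (E2E) approaches to self-supervised multi-view stereo (MVS) have attracted more attention than multi-stage methods because their training pipeline is concise and efficient. Recent E2E self-supervised MVS methods integrate third-party models (such as optical flow, semantic segmentation, and NeRF models) to provide additional consistency constraints, but this increases GPU memory consumption and complicates both the model structure and the training pipeline. This work proposes ES-MVSNet, an efficient framework for end-to-end self-supervised MVS. To alleviate the high memory consumption of current E2E self-supervised MVS frameworks, it presents a memory-efficient architecture that reduces memory usage by 43% without compromising model performance. In addition, a novel asymmetric view selection policy and region-aware depth consistency allow it to achieve state-of-the-art performance among E2E self-supervised MVS methods without relying on third-party models for additional consistency signals. Extensive experiments on the DTU and Tanks&Temples benchmarks show that ES-MVSNet achieves state-of-the-art performance among E2E self-supervised MVS methods and is competitive with many supervised and multi-stage self-supervised methods.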

Summary (original)

Compared to the multi-stage self-supervised multi-view stereo (MVS) method, the end-to-end (E2E) approach has received more attention due to its concise and efficient training pipeline. Recent E2E self-supervised MVS approaches have integrated third-party models (such as optical flow models, semantic segmentation models, NeRF models, etc.) to provide additional consistency constraints, which grows GPU memory consumption and complicates the model’s structure and training pipeline. In this work, we propose an efficient framework for end-to-end self-supervised MVS, dubbed ES-MVSNet. To alleviate the high memory consumption of current E2E self-supervised MVS frameworks, we present a memory-efficient architecture that reduces memory usage by 43% without compromising model performance. Furthermore, with the novel design of asymmetric view selection policy and region-aware depth consistency, we achieve state-of-the-art performance among E2E self-supervised MVS methods, without relying on third-party models for additional consistency signals. Extensive experiments on DTU and Tanks&Temples benchmarks demonstrate that the proposed ES-MVSNet approach achieves state-of-the-art performance among E2E self-supervised MVS methods and competitive performance to many supervised and multi-stage self-supervised methods.

arXiv information

Authors	Qiang Zhou, Chaohui Yu, Jingliang Li, Yuang Liu, Jing Wang, Zhibin Wang
Published	2023-08-04 08:16:47+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, DeepL
