V4D: Voxel for 4D Novel View Synthesis

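Summary

Neural radiance fields have made remarkable progress in novel view synthesis for static 3D scenes.
For 4D settings such as dynamic scenes, however, the performance of existing methods is still limited by the capacity of the neural network, typically a multilayer perceptron (MLP).
This paper uses 3D voxels to model a 4D neural radiance field (V4D for short), with the voxels taking two forms.
The first models 3D space on a regular grid: local 3D features sampled from the grid are combined with a time index, and a tiny MLP maps them to the density field and the texture field.
The second is a set of look-up tables (LUTs) for pixel-level refinement, in which the pseudo-surface produced by volume rendering serves as guidance for learning a 2D pixel-level refinement mapping.
The proposed LUT-based refinement module improves performance at little computational cost and could serve as a plug-and-play module in novel view synthesis tasks.
The authors also propose a more effective conditional positional encoding for 4D data, which improves performance with negligible computational overhead.
Extensive experiments show that the proposed method achieves state-of-the-art performance at a low computational cost.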

Abstract (original)

Neural radiance fields have made a remarkable breakthrough in the novel view synthesis task at the 3D static scene. However, for the 4D circumstance (e.g., dynamic scene), the performance of the existing method is still limited by the capacity of the neural network, typically in a multilayer perceptron network (MLP). In this paper, we utilize 3D Voxel to model the 4D neural radiance field, short as V4D, where the 3D voxel has two formats. The first one is to regularly model the 3D space and then use the sampled local 3D feature with the time index to model the density field and the texture field by a tiny MLP. The second one is in look-up tables (LUTs) format that is for the pixel-level refinement, where the pseudo-surface produced by the volume rendering is utilized as the guidance information to learn a 2D pixel-level refinement mapping. The proposed LUTs-based refinement module achieves the performance gain with little computational cost and could serve as the plug-and-play module in the novel view synthesis task. Moreover, we propose a more effective conditional positional encoding toward the 4D data that achieves performance gain with negligible computational burdens. Extensive experiments demonstrate that the proposed method achieves state-of-the-art performance at a low computational cost.
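
The first voxel format in the abstract (sample a local feature from a regular 3D grid, attach the time index, and decode with a tiny MLP) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `trilinear_sample`, `tiny_mlp`, and `query_field`, the grid and layer sizes, the random weights, and the softplus/sigmoid output activations are all assumptions made for illustration, and the raw time value is simply concatenated to the feature instead of using the paper's conditional positional encoding.

```python
import numpy as np


def trilinear_sample(grid, pts):
    """Trilinearly interpolate a feature grid of shape (X, Y, Z, C).

    pts has shape (N, 3) in voxel units; values are clipped to the grid.
    Each grid dimension must be at least 2.
    """
    dims = np.array(grid.shape[:3])
    pts = np.clip(pts, 0.0, dims - 1.0)
    # Lower corner of the enclosing cell, kept inside the grid.
    lo = np.minimum(np.floor(pts).astype(int), dims - 2)
    frac = pts - lo  # offset inside the cell, in [0, 1]
    out = np.zeros((pts.shape[0], grid.shape[3]))
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                wx = frac[:, 0] if dx else 1.0 - frac[:, 0]
                wy = frac[:, 1] if dy else 1.0 - frac[:, 1]
                wz = frac[:, 2] if dz else 1.0 - frac[:, 2]
                corner = grid[lo[:, 0] + dx, lo[:, 1] + dy, lo[:, 2] + dz]
                out += (wx * wy * wz)[:, None] * corner
    return out


def tiny_mlp(x, params):
    """Two-layer MLP with a ReLU hidden layer; params = (W1, b1, W2, b2)."""
    W1, b1, W2, b2 = params
    hidden = np.maximum(x @ W1 + b1, 0.0)
    return hidden @ W2 + b2


def query_field(grid, params, pts, t):
    """Return (density, rgb) for points pts at time t.

    Assumption: the time index is appended as one extra input column.
    """
    feats = trilinear_sample(grid, pts)
    t_col = np.full((pts.shape[0], 1), float(t))
    raw = tiny_mlp(np.concatenate([feats, t_col], axis=1), params)
    density = np.logaddexp(0.0, raw[:, 0])      # softplus: non-negative density
    rgb = 1.0 / (1.0 + np.exp(-raw[:, 1:4]))    # sigmoid: colour in [0, 1]
    return density, rgb


# Hypothetical sizes: 16^3 grid, 8 feature channels, 16 hidden units.
rng = np.random.default_rng(0)
grid = rng.normal(size=(16, 16, 16, 8))
params = (
    rng.normal(scale=0.1, size=(9, 16)), np.zeros(16),
    rng.normal(scale=0.1, size=(16, 4)), np.zeros(4),
)
points = rng.uniform(0.0, 15.0, size=(5, 3))
density, rgb = query_field(grid, params, points, t=0.25)
```

The design point the sketch tries to convey is that most of the scene capacity sits in the voxel grid, so the MLP that decodes density and colour can stay small.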

arXiv information

Authors	Wanshui Gan, Hongbin Xu, Yi Huang, Shifeng Chen, Naoto Yokoya
Published	2024-08-13 15:43:26+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
