FeatureNeRF: Learning Generalizable NeRFs by Distilling Foundation Models

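Abstract

Recent work on generalizable NeRFs has shown promising results on novel view synthesis from a single image or a few images.
However, such models have rarely been applied to downstream tasks beyond synthesis, such as semantic understanding and parsing.
In this paper, we propose FeatureNeRF, a novel framework that learns generalizable NeRFs by distilling pre-trained vision foundation models (e.g., DINO, Latent Diffusion).
FeatureNeRF lifts 2D pre-trained foundation models into 3D space via neural rendering, and then extracts deep features for 3D query points from the NeRF MLPs.
As a result, it maps 2D images to continuous 3D semantic feature volumes, which can be used for a variety of downstream tasks.
We evaluate FeatureNeRF on 2D/3D semantic keypoint transfer and 2D/3D object part segmentation.
Our extensive experiments demonstrate the effectiveness of FeatureNeRF as a generalizable 3D semantic feature extractor.
Our project page is available at https://jianglongye.com/featurenerf/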
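
To make the idea of distilling 2D features through neural rendering more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify the loss or the rendering details, so the sketch assumes the standard NeRF compositing weights and a simple mean-squared-error distillation loss. The function names (`render_weights`, `render_feature`, `distillation_loss`) are hypothetical and chosen only for illustration.

```python
import math


def render_weights(sigmas, deltas):
    """Standard NeRF compositing weights w_i = T_i * alpha_i along one ray.

    sigmas: predicted densities at the sampled 3D points.
    deltas: distances between consecutive samples.
    """
    weights = []
    transmittance = 1.0  # T_i: fraction of light that reaches sample i
    for sigma, delta in zip(sigmas, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights


def render_feature(sigmas, deltas, point_features):
    """Composite per-point feature vectors into one per-pixel feature.

    Assumption: features are rendered with the same weights as color.
    """
    weights = render_weights(sigmas, deltas)
    dim = len(point_features[0])
    return [
        sum(w * f[d] for w, f in zip(weights, point_features))
        for d in range(dim)
    ]


def distillation_loss(rendered, teacher):
    """Mean squared error between the rendered feature and the 2D teacher
    feature at the same pixel (assumed loss, e.g. the teacher could be DINO)."""
    return sum((r - t) ** 2 for r, t in zip(rendered, teacher)) / len(rendered)


if __name__ == "__main__":
    # Toy ray with three samples and 2-dimensional features.
    sigmas = [0.1, 2.0, 0.5]
    deltas = [0.5, 0.5, 0.5]
    point_features = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
    teacher_feature = [0.8, 0.2]  # hypothetical 2D foundation-model feature

    rendered = render_feature(sigmas, deltas, point_features)
    print("rendered feature:", rendered)
    print("distillation loss:", distillation_loss(rendered, teacher_feature))
```

Under these assumptions, minimizing the loss over many pixels and views would push the per-point features toward a 3D-consistent version of the 2D teacher features, which is one way to read the "continuous 3D semantic feature volume" described above.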

Abstract (original)

Recent works on generalizable NeRFs have shown promising results on novel view synthesis from single or few images. However, such models have rarely been applied on other downstream tasks beyond synthesis such as semantic understanding and parsing. In this paper, we propose a novel framework named FeatureNeRF to learn generalizable NeRFs by distilling pre-trained vision foundation models (e.g., DINO, Latent Diffusion). FeatureNeRF leverages 2D pre-trained foundation models to 3D space via neural rendering, and then extract deep features for 3D query points from NeRF MLPs. Consequently, it allows to map 2D images to continuous 3D semantic feature volumes, which can be used for various downstream tasks. We evaluate FeatureNeRF on tasks of 2D/3D semantic keypoint transfer and 2D/3D object part segmentation. Our extensive experiments demonstrate the effectiveness of FeatureNeRF as a generalizable 3D semantic feature extractor. Our project page is available at https://jianglongye.com/featurenerf/ .

arXiv information

Authors	Jianglong Ye, Naiyan Wang, Xiaolong Wang
Published	2023-03-22 17:57:01+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
