Real3D: Scaling Up Large Reconstruction Models with Real-World Images

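Summary

The default strategy for training single-view Large Reconstruction Models (LRMs) is fully supervised, relying on large-scale datasets of synthetic 3D assets or multi-view captures.
These resources simplify training, but they are hard to scale beyond existing datasets and do not necessarily represent the real-world distribution of object shapes.
To address these limitations, this paper introduces Real3D, the first LRM system that can be trained on single-view real-world images.
Real3D introduces a novel self-training framework that benefits from both existing synthetic data and diverse single-view real images.
The authors propose two unsupervised losses that supervise LRMs at the pixel level and the semantic level, even for training examples that have no ground-truth 3D or novel views.
To further improve performance and scale up the image data, they develop an automatic data curation approach that collects high-quality examples from in-the-wild images.
Experiments show that Real3D consistently outperforms prior work in four diverse evaluation settings covering real and synthetic data as well as in-domain and out-of-domain shapes.
Code and model are available here: https://hwjiang1510.github.io/Real3D/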

Abstract (original)

The default strategy for training single-view Large Reconstruction Models (LRMs) follows the fully supervised route using large-scale datasets of synthetic 3D assets or multi-view captures. Although these resources simplify the training procedure, they are hard to scale up beyond the existing datasets and they are not necessarily representative of the real distribution of object shapes. To address these limitations, in this paper, we introduce Real3D, the first LRM system that can be trained using single-view real-world images. Real3D introduces a novel self-training framework that can benefit from both the existing synthetic data and diverse single-view real images. We propose two unsupervised losses that allow us to supervise LRMs at the pixel- and semantic-level, even for training examples without ground-truth 3D or novel views. To further improve performance and scale up the image data, we develop an automatic data curation approach to collect high-quality examples from in-the-wild images. Our experiments show that Real3D consistently outperforms prior work in four diverse evaluation settings that include real and synthetic data, as well as both in-domain and out-of-domain shapes. Code and model can be found here: https://hwjiang1510.github.io/Real3D/

arXiv information

Authors	Hanwen Jiang, Qixing Huang, Georgios Pavlakos
Published	2024-06-12 17:59:08+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
