Dora: Sampling and Benchmarking for 3D Shape Variational Auto-Encoders

Summary

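Recent 3D content generation pipelines generally use variational autoencoders (VAEs) to encode shapes into compact latent representations for diffusion-based generation.
However, the uniform point sampling strategy widely used in shape VAE training often causes a substantial loss of geometric detail, which limits the quality of both shape reconstruction and downstream generation tasks.
Dora-VAE is a new approach that improves VAE reconstruction through a proposed sharp edge sampling strategy and a dual cross-attention mechanism.
By identifying and prioritizing regions of high geometric complexity during training, the method significantly improves the preservation of fine-grained shape features.
Together, the sampling strategy and the dual attention mechanism let the VAE focus on important geometric details that uniform sampling approaches typically miss.
To evaluate VAE reconstruction quality systematically, the authors also propose Dora-bench, a benchmark that quantifies shape complexity through the density of sharp edges, and introduce a new metric focused on reconstruction accuracy at these salient geometric features.
Extensive experiments on Dora-bench show that Dora-VAE achieves reconstruction quality comparable to the state-of-the-art dense XCube-VAE while requiring a latent space at least 8$\times$ smaller (1,280 vs. > 10,000 codes).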

Abstract (original)

Recent 3D content generation pipelines commonly employ Variational Autoencoders (VAEs) to encode shapes into compact latent representations for diffusion-based generation. However, the widely adopted uniform point sampling strategy in Shape VAE training often leads to a significant loss of geometric details, limiting the quality of shape reconstruction and downstream generation tasks. We present Dora-VAE, a novel approach that enhances VAE reconstruction through our proposed sharp edge sampling strategy and a dual cross-attention mechanism. By identifying and prioritizing regions with high geometric complexity during training, our method significantly improves the preservation of fine-grained shape features. Such sampling strategy and the dual attention mechanism enable the VAE to focus on crucial geometric details that are typically missed by uniform sampling approaches. To systematically evaluate VAE reconstruction quality, we additionally propose Dora-bench, a benchmark that quantifies shape complexity through the density of sharp edges, introducing a new metric focused on reconstruction accuracy at these salient geometric features. Extensive experiments on the Dora-bench demonstrate that Dora-VAE achieves comparable reconstruction quality to the state-of-the-art dense XCube-VAE while requiring a latent space at least 8$\times$ smaller (1,280 vs. > 10,000 codes).
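
The abstract does not spell out how sharp edges are detected or how points are drawn from them. One common way to make the idea concrete is to flag mesh edges whose two adjacent faces meet at a large dihedral angle, then sample points along those edges. The sketch below does this with NumPy. The function names `sharp_edges` and `sample_on_edges`, the 30-degree threshold, and the length-proportional sampling are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def sharp_edges(vertices, faces, angle_deg=30.0):
    """Return an (E, 2) array of vertex-index pairs for edges whose two
    adjacent faces differ in normal direction by more than angle_deg.

    Assumes a triangle mesh with consistent winding and no degenerate
    (zero-area) faces. Boundary edges (one adjacent face) are skipped.
    """
    v = np.asarray(vertices, dtype=float)
    f = np.asarray(faces, dtype=int)

    # Unit normal of every triangle.
    n = np.cross(v[f[:, 1]] - v[f[:, 0]], v[f[:, 2]] - v[f[:, 0]])
    n /= np.linalg.norm(n, axis=1, keepdims=True)

    # Map each undirected edge to the faces that contain it.
    edge_faces = {}
    for fi, tri in enumerate(f):
        for a, b in ((tri[0], tri[1]), (tri[1], tri[2]), (tri[2], tri[0])):
            key = (int(min(a, b)), int(max(a, b)))
            edge_faces.setdefault(key, []).append(fi)

    # An edge is "sharp" when the angle between its face normals is large,
    # i.e. when their dot product falls below cos(threshold).
    cos_thresh = np.cos(np.radians(angle_deg))
    sharp = [
        e for e, fs in edge_faces.items()
        if len(fs) == 2 and np.dot(n[fs[0]], n[fs[1]]) < cos_thresh
    ]
    return np.array(sharp, dtype=int).reshape(-1, 2)


def sample_on_edges(vertices, edges, n_points, rng=None):
    """Sample n_points along the given edges, choosing edges with
    probability proportional to their length."""
    rng = np.random.default_rng() if rng is None else rng
    v = np.asarray(vertices, dtype=float)
    if len(edges) == 0:
        return np.empty((0, 3))
    p0, p1 = v[edges[:, 0]], v[edges[:, 1]]
    lengths = np.linalg.norm(p1 - p0, axis=1)
    idx = rng.choice(len(edges), size=n_points, p=lengths / lengths.sum())
    t = rng.random((n_points, 1))  # position along each chosen edge
    return p0[idx] + t * (p1[idx] - p0[idx])
```

In a training pipeline these edge samples would typically be mixed with ordinary uniform surface samples so that flat regions are still covered. The mixing ratio is a further design choice that the abstract does not specify.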

arXiv information

Authors	Rui Chen, Jianfeng Zhang, Yixun Liang, Guan Luo, Weiyu Li, Jiarui Liu, Xiu Li, Xiaoxiao Long, Jiashi Feng, Ping Tan
Published	2025-03-24 16:41:50+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
