T-Pixel2Mesh: Combining Global and Local Transformer for 3D Mesh Generation from a Single Image

Summary

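Pixel2Mesh (P2M) is a classical approach that reconstructs a 3D shape from a single color image by deforming a mesh from coarse to fine.
Although P2M can generate plausible global shapes, its Graph Convolution Network (GCN) often produces overly smooth results, so fine-grained geometric details are lost.
P2M also generates unreliable features for occluded regions and struggles with the domain gap between synthetic data and real-world images, a challenge common to single-view 3D reconstruction methods.
To address these challenges, we propose T-Pixel2Mesh, a new Transformer-boosted architecture inspired by P2M's coarse-to-fine approach.
Specifically, a global Transformer controls the holistic shape, and a local Transformer progressively refines local geometric details with graph-based point upsampling.
To improve real-world reconstruction, we introduce Linear Scale Search (LSS), a simple yet effective technique that serves as prompt tuning during input preprocessing.
Experiments on ShapeNet demonstrate state-of-the-art performance, and results on real-world data show the method's generalization capability.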

Summary (original)

Pixel2Mesh (P2M) is a classical approach for reconstructing 3D shapes from a single color image through coarse-to-fine mesh deformation. Although P2M is capable of generating plausible global shapes, its Graph Convolution Network (GCN) often produces overly smooth results, causing the loss of fine-grained geometry details. Moreover, P2M generates non-credible features for occluded regions and struggles with the domain gap from synthetic data to real-world images, which is a common challenge for single-view 3D reconstruction methods. To address these challenges, we propose a novel Transformer-boosted architecture, named T-Pixel2Mesh, inspired by the coarse-to-fine approach of P2M. Specifically, we use a global Transformer to control the holistic shape and a local Transformer to progressively refine the local geometry details with graph-based point upsampling. To enhance real-world reconstruction, we present the simple yet effective Linear Scale Search (LSS), which serves as prompt tuning during the input preprocessing. Our experiments on ShapeNet demonstrate state-of-the-art performance, while results on real-world data show the generalization capability.

arXiv information

Authors	Shijie Zhang, Boyan Jiang, Keke He, Junwei Zhu, Ying Tai, Chengjie Wang, Yinda Zhang, Yanwei Fu
Published	2024-03-20 15:14:22+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
