MeT: A Graph Transformer for Semantic Segmentation of 3D Meshes

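Summary

Polygonal meshes have become the standard for discretely approximating 3D shapes, thanks to their efficiency and high flexibility in capturing non-uniform shapes.
This non-uniformity, however, makes the mesh structure irregular, which makes tasks such as 3D mesh segmentation particularly challenging.
Semantic segmentation of 3D meshes has typically been addressed with CNN-based approaches, which achieve good accuracy.
Recently, transformers have gained momentum in both NLP and computer vision, achieving performance at least on par with CNN models and supporting the long-sought universality of architectures.
Following this trend, we propose a transformer-based method for semantic segmentation of 3D meshes, motivated by better modeling of the mesh graph structure through global attention mechanisms.
To address the limitations of standard transformer architectures in modeling the relative positions of non-sequential data such as 3D meshes, and in capturing local context, we perform positional encoding using the Laplacian eigenvectors of the adjacency matrix, replacing the traditional sinusoidal positional encodings, and introduce clustering-based features into the self-attention and cross-attention operators.
Experimental results on three sets of the Shape COSEG dataset, on the human segmentation dataset proposed in Maron et al., 2017, and on the ShapeNet benchmark show that the proposed approach yields state-of-the-art performance on semantic segmentation of 3D meshes.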

Abstract (original)

Polygonal meshes have become the standard for discretely approximating 3D shapes, thanks to their efficiency and high flexibility in capturing non-uniform shapes. This non-uniformity, however, leads to irregularity in the mesh structure, making tasks like segmentation of 3D meshes particularly challenging. Semantic segmentation of 3D meshes has typically been addressed through CNN-based approaches, leading to good accuracy. Recently, transformers have gained momentum in both the NLP and computer vision fields, achieving performance at least on par with CNN models and supporting the long-sought architecture universalism. Following this trend, we propose a transformer-based method for semantic segmentation of 3D meshes, motivated by a better modeling of the graph structure of meshes by means of global attention mechanisms. In order to address the limitations of standard transformer architectures in modeling relative positions of non-sequential data, as in the case of 3D meshes, as well as in capturing the local context, we perform positional encoding by means of the Laplacian eigenvectors of the adjacency matrix, replacing the traditional sinusoidal positional encodings, and we introduce clustering-based features into the self-attention and cross-attention operators. Experimental results, carried out on three sets of the Shape COSEG Dataset, on the human segmentation dataset proposed in Maron et al., 2017, and on the ShapeNet benchmark, show that the proposed approach yields state-of-the-art performance on semantic segmentation of 3D meshes.
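
The positional encoding described in the abstract can be illustrated with a small sketch. The abstract does not say which graph or which Laplacian variant is used, so the following assumes that graph nodes are mesh faces, that two faces are adjacent when they share an edge, and that the symmetric normalized Laplacian is used. The helper names face_adjacency and laplacian_positional_encoding are hypothetical and are not taken from the paper.

```python
import numpy as np

def face_adjacency(faces):
    """Build a dense face-adjacency matrix (assumption: faces sharing an edge are neighbors)."""
    edge_to_faces = {}
    for f_idx, (a, b, c) in enumerate(faces):
        for u, v in ((a, b), (b, c), (c, a)):
            # Store each edge with sorted endpoints so (u, v) and (v, u) match.
            edge_to_faces.setdefault((min(u, v), max(u, v)), []).append(f_idx)
    n = len(faces)
    adj = np.zeros((n, n))
    for shared in edge_to_faces.values():
        for i in shared:
            for j in shared:
                if i != j:
                    adj[i, j] = 1.0
    return adj

def laplacian_positional_encoding(adjacency, k):
    """Return k Laplacian eigenvectors (one row per node), skipping the first one."""
    a = np.asarray(adjacency, dtype=float)
    degree = a.sum(axis=1)
    inv_sqrt = np.zeros_like(degree)
    nonzero = degree > 0
    inv_sqrt[nonzero] = degree[nonzero] ** -0.5
    # Symmetric normalized Laplacian (assumed variant): L = I - D^{-1/2} A D^{-1/2}
    lap = np.eye(a.shape[0]) - inv_sqrt[:, None] * a * inv_sqrt[None, :]
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    _, eigenvectors = np.linalg.eigh(lap)
    # Column 0 belongs to the smallest eigenvalue (zero for a connected graph)
    # and carries no relative-position information, so it is dropped.
    return eigenvectors[:, 1:k + 1]

# A strip of four triangles; its face graph is a simple path of four nodes.
faces = [(0, 1, 2), (1, 3, 2), (2, 3, 4), (3, 5, 4)]
pe = laplacian_positional_encoding(face_adjacency(faces), k=2)
print(pe.shape)
```

Each row of pe would be added to, or concatenated with, the features of the corresponding face before the transformer layers. Eigenvectors are only defined up to sign, so implementations of this kind of encoding often randomize or otherwise fix the sign during training.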

arXiv information

Authors	Giuseppe Vecchio, Luca Prezzavento, Carmelo Pino, Francesco Rundo, Simone Palazzo, Concetto Spampinato
Published	2023-07-03 15:45:14+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
