3D Human Pose Lifting with Grid Convolution

Summary

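Existing lifting networks that regress 3D human poses from 2D single-view poses are typically built from linear layers on top of graph-structured representation learning.
In contrast, this paper introduces Grid Convolution (GridConv), which mimics the regular convolution operations used in image space.
GridConv builds on a novel Semantic Grid Transformation (SGT), which uses a binary assignment matrix to map the irregular, graph-structured human pose onto a regular, weave-like grid pose representation joint by joint, so that features can be learned layer by layer with GridConv operations.
The authors provide two ways to implement SGT: a handcrafted design and a learnable design.
Surprisingly, both designs achieve promising results, with the learnable one performing better, which demonstrates the potential of this new formulation of lifting representation learning.
To improve GridConv's ability to encode contextual cues, an attention module is introduced over the convolutional kernel, making the grid convolution operations input-dependent, spatially aware, and grid-specific.
The fully convolutional grid lifting network outperforms state-of-the-art methods by noticeable margins under (1) conventional evaluation on Human3.6M and (2) cross-evaluation on MPI-INF-3DHP.
Code is available at https://github.com/OSVAI/GridConv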
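
The idea of mapping a pose onto a grid with a binary assignment matrix can be illustrated with a small NumPy sketch. This is not the authors' implementation: the 5-joint toy skeleton, the 2x3 grid layout, and the helper names `make_assignment`, `to_grid`, and `from_grid` are all hypothetical, and the averaging used to map the grid back to joints is an assumption made only so the round trip can be checked. For the actual handcrafted and learnable designs, see the repository linked above.

```python
import numpy as np


def make_assignment(layout, num_joints):
    """Build a binary assignment matrix from a grid layout of joint indices.

    layout[r, c] is the index of the joint placed in grid cell (r, c).
    Returns S with shape (H*W, num_joints) and exactly one 1 per row.
    """
    flat = np.asarray(layout).reshape(-1)
    S = np.zeros((flat.size, num_joints), dtype=np.float32)
    S[np.arange(flat.size), flat] = 1.0
    return S


def to_grid(pose, S, grid_shape):
    """Map a (J, C) pose onto an (H, W, C) grid pose via S."""
    H, W = grid_shape
    return (S @ pose).reshape(H, W, pose.shape[1])


def from_grid(grid, S):
    """Map an (H, W, C) grid back to (J, C).

    Assumption for this sketch: cells that hold the same joint are averaged.
    Every joint must appear in at least one cell.
    """
    flat = grid.reshape(-1, grid.shape[-1])
    counts = S.sum(axis=0)[:, None]  # number of grid cells per joint
    return (S.T @ flat) / counts


# Hypothetical layout: joint 0 fills two cells, the other joints one each.
LAYOUT = np.array([[0, 1, 2],
                   [0, 3, 4]])

pose_2d = np.arange(10, dtype=np.float32).reshape(5, 2)  # toy (J=5, C=2) pose
S = make_assignment(LAYOUT, num_joints=5)
grid_pose = to_grid(pose_2d, S, LAYOUT.shape)  # shape (2, 3, 2)
recovered = from_grid(grid_pose, S)            # shape (5, 2)
```

Because the pose now lives on a regular grid, ordinary 2D convolution layers can be applied to `grid_pose`, which is the property the summary attributes to SGT. Repeating a joint in several cells, as joint 0 is here, is one simple way to fill a rectangular grid when the joint count does not match the number of cells.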

Summary (original)

Existing lifting networks for regressing 3D human poses from 2D single-view poses are typically constructed with linear layers based on graph-structured representation learning. In sharp contrast to them, this paper presents Grid Convolution (GridConv), mimicking the wisdom of regular convolution operations in image space. GridConv is based on a novel Semantic Grid Transformation (SGT) which leverages a binary assignment matrix to map the irregular graph-structured human pose onto a regular weave-like grid pose representation joint by joint, enabling layer-wise feature learning with GridConv operations. We provide two ways to implement SGT, including handcrafted and learnable designs. Surprisingly, both designs turn out to achieve promising results and the learnable one is better, demonstrating the great potential of this new lifting representation learning formulation. To improve the ability of GridConv to encode contextual cues, we introduce an attention module over the convolutional kernel, making grid convolution operations input-dependent, spatial-aware and grid-specific. We show that our fully convolutional grid lifting network outperforms state-of-the-art methods with noticeable margins under (1) conventional evaluation on Human3.6M and (2) cross-evaluation on MPI-INF-3DHP. Code is available at https://github.com/OSVAI/GridConv

arXiv information

Authors	Yangyuxuan Kang, Yuyang Liu, Anbang Yao, Shandong Wang, Enhua Wu
Published	2023-02-17 08:52:16+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
