A Unified Masked Autoencoder with Patchified Skeletons for Motion Synthesis

Summary

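Human motion synthesis has traditionally been handled by task-dependent models, each focused on a specific challenge such as predicting future motion or filling in intermediate poses conditioned on known key-poses.
This paper presents UNIMASK-M, a novel task-independent model that addresses these challenges effectively with a single unified architecture.
The model performs comparably to or better than the state of the art in each field.
Inspired by Vision Transformers (ViTs), UNIMASK-M decomposes a human pose into body parts to leverage the spatio-temporal relationships present in human motion.
In addition, the authors reformulate various pose-conditioned motion synthesis tasks as a reconstruction problem, with different masking patterns given as input.
Because the model is explicitly informed about which joints are masked, UNIMASK-M becomes more robust to occlusions.
Experimental results show that the model successfully forecasts human motion on the Human3.6M dataset.
It also achieves state-of-the-art results in motion inbetweening on the LaFAN1 dataset, particularly for long transition periods.
More information is available on the project website: https://sites.google.com/view/estevevallsmascaro/publications/unimask-m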
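
To illustrate how different synthesis tasks can be cast as reconstruction under different masking patterns, here is a minimal NumPy sketch. It is not the authors' implementation. The function names, the `(frames, joints, features)` array layout, the zero fill value, and the choice to tell the model about masked joints by appending the mask as an extra feature channel are all assumptions made for illustration only.

```python
import numpy as np


def forecasting_mask(num_frames, num_joints, num_observed):
    """Hypothetical helper: True marks a masked (unknown) joint.

    The first `num_observed` frames are visible; all later frames are masked.
    """
    mask = np.ones((num_frames, num_joints), dtype=bool)
    mask[:num_observed] = False
    return mask


def inbetweening_mask(num_frames, num_joints, num_context, num_target=1):
    """Hypothetical helper: context frames at the start and target
    key-pose frames at the end are visible; the transition is masked."""
    mask = np.ones((num_frames, num_joints), dtype=bool)
    mask[:num_context] = False
    mask[num_frames - num_target:] = False
    return mask


def apply_mask(motion, mask, fill_value=0.0):
    """Hide masked joints and append the mask as an extra channel.

    motion: array of shape (frames, joints, features).
    mask:   boolean array of shape (frames, joints).
    Appending the mask is one assumed way to inform a model about
    which joints are masked; the paper may do this differently.
    """
    hidden = np.where(mask[..., None], fill_value, motion)
    flag = mask[..., None].astype(motion.dtype)
    return np.concatenate([hidden, flag], axis=-1)


# Same reconstruction input format, two different tasks.
motion = np.random.default_rng(0).normal(size=(30, 22, 3))
forecast_input = apply_mask(motion, forecasting_mask(30, 22, num_observed=10))
inbetween_input = apply_mask(motion, inbetweening_mask(30, 22, num_context=10))
```

Under these assumptions, both tasks feed an array of the same shape to a single reconstruction model, and only the masking pattern changes. Occluded joints could be expressed the same way by setting individual `(frame, joint)` entries of the mask to `True`.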

Summary (original)

The synthesis of human motion has traditionally been addressed through task-dependent models that focus on specific challenges, such as predicting future motions or filling in intermediate poses conditioned on known key-poses. In this paper, we present a novel task-independent model called UNIMASK-M, which can effectively address these challenges using a unified architecture. Our model obtains comparable or better performance than the state-of-the-art in each field. Inspired by Vision Transformers (ViTs), our UNIMASK-M model decomposes a human pose into body parts to leverage the spatio-temporal relationships existing in human motion. Moreover, we reformulate various pose-conditioned motion synthesis tasks as a reconstruction problem with different masking patterns given as input. By explicitly informing our model about the masked joints, our UNIMASK-M becomes more robust to occlusions. Experimental results show that our model successfully forecasts human motion on the Human3.6M dataset. Moreover, it achieves state-of-the-art results in motion inbetweening on the LaFAN1 dataset, particularly in long transition periods. More information can be found on the project website https://sites.google.com/view/estevevallsmascaro/publications/unimask-m.

arXiv information

Authors	Esteve Valls Mascaro, Hyemin Ahn, Dongheui Lee
Published	2023-08-14 17:39:44+00:00

Source and services used

arxiv.jp, Google
