Reconstructing Signing Avatars From Video Using Linguistic Priors

Summary

Title: Reconstructing Signing Avatars From Video Using Linguistic Priors

Summary:

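– Sign language (SL) is the primary method of communication for the 70 million Deaf people around the world.
– Video dictionaries of isolated signs are a core tool for learning SL.
– Replacing these with 3D avatars can aid learning and enable AR/VR applications, improving access to technology and online media.
– However, little work has attempted to estimate expressive 3D avatars from SL video, because occlusion, noise, and motion blur make the task difficult.
– To address this, the work introduces novel linguistic priors that are universally applicable to SL and constrain 3D hand pose, helping to resolve ambiguities within isolated signs.
– The method, SGNify, captures fine-grained hand pose, facial expression, and body movement fully automatically from in-the-wild monocular SL videos.
– To evaluate SGNify quantitatively, a commercial motion-capture system is used to compute 3D avatars synchronized with monocular video.
– SGNify outperforms state-of-the-art 3D body-pose- and shape-estimation methods on SL videos.
– A perceptual study shows that SGNify's 3D reconstructions are significantly more comprehensible and natural than those of previous methods and are on par with the source videos.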

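– Replacing video sign dictionaries with 3D avatars can enable AR/VR applications.
– Linguistic priors constrain 3D hand pose and help resolve ambiguities within signs.
– SGNify automatically captures fine-grained hand pose, facial expression, and body movement from in-the-wild SL videos.
– SGNify outperforms state-of-the-art 3D body-pose- and shape-estimation methods, and its 3D reconstructions are shown to be more natural and comprehensible.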

Abstract (original)

Sign language (SL) is the primary method of communication for the 70 million Deaf people around the world. Video dictionaries of isolated signs are a core SL learning tool. Replacing these with 3D avatars can aid learning and enable AR/VR applications, improving access to technology and online media. However, little work has attempted to estimate expressive 3D avatars from SL video; occlusion, noise, and motion blur make this task difficult. We address this by introducing novel linguistic priors that are universally applicable to SL and provide constraints on 3D hand pose that help resolve ambiguities within isolated signs. Our method, SGNify, captures fine-grained hand pose, facial expression, and body movement fully automatically from in-the-wild monocular SL videos. We evaluate SGNify quantitatively by using a commercial motion-capture system to compute 3D avatars synchronized with monocular video. SGNify outperforms state-of-the-art 3D body-pose- and shape-estimation methods on SL videos. A perceptual study shows that SGNify’s 3D reconstructions are significantly more comprehensible and natural than those of previous methods and are on par with the source videos. Code and data are available at http://sgnify.is.tue.mpg.de.

arXiv information

Authors	Maria-Paola Forte, Peter Kulits, Chun-Hao Huang, Vasileios Choutas, Dimitrios Tzionas, Katherine J. Kuchenbecker, Michael J. Black
Published	2023-04-20 17:29:50+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, OpenAI
