Deep Semantic Manipulation of Facial Videos

Abstract

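Editing and manipulating facial features in videos is an important research area with many applications, from movie post-production and visual effects to realistic avatars for video games and virtual assistants.
The method supports semantic video manipulation based on neural rendering and 3D-based facial expression modelling.
It focuses on interactive manipulation of videos by altering and controlling facial expressions, and achieves promising photorealistic results.
The proposed method relies on a disentangled representation and estimation of 3D facial shape and activity, giving the user intuitive, easy-to-use control over the facial expressions in the input video.
The authors also introduce a user-friendly, interactive AI tool that processes human-readable semantic labels describing the desired expression manipulations in specific parts of the input video and synthesizes photorealistic manipulated videos.
This is achieved by mapping the emotion labels to points in the Valence-Arousal space (where Valence quantifies how positive or negative an emotion is and Arousal quantifies the strength of its activation), which are in turn mapped to disentangled 3D facial expressions by a specially designed and trained expression decoder network.
The paper presents detailed qualitative and quantitative experiments that demonstrate the effectiveness of the system and the promising results it achieves.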

Abstract (original)

Editing and manipulating facial features in videos is an interesting and important field of research with a plethora of applications, ranging from movie post-production and visual effects to realistic avatars for video games and virtual assistants. Our method supports semantic video manipulation based on neural rendering and 3D-based facial expression modelling. We focus on interactive manipulation of the videos by altering and controlling the facial expressions, achieving promising photorealistic results. The proposed method is based on a disentangled representation and estimation of the 3D facial shape and activity, providing the user with intuitive and easy-to-use control of the facial expressions in the input video. We also introduce a user-friendly, interactive AI tool that processes human-readable semantic labels about the desired expression manipulations in specific parts of the input video and synthesizes photorealistic manipulated videos. We achieve that by mapping the emotion labels to points on the Valence-Arousal space (where Valence quantifies how positive or negative is an emotion and Arousal quantifies the power of the emotion activation), which in turn are mapped to disentangled 3D facial expressions through an especially-designed and trained expression decoder network. The paper presents detailed qualitative and quantitative experiments, which demonstrate the effectiveness of our system and the promising results it achieves.
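The label-to-Valence-Arousal step described in the abstract can be pictured with a minimal Python sketch. Everything named here is an assumption made for illustration: the coordinate table `VA_POINTS`, the `intensity` parameter, and the helpers `label_to_va` and `plan_edits` are hypothetical and are not taken from the paper. The expression decoder network that turns each Valence-Arousal point into 3D expression parameters is outside the scope of this sketch.

```python
from typing import Dict, List, Tuple

# Hypothetical (valence, arousal) coordinates in [-1, 1] x [-1, 1].
# Illustrative values only; they are not taken from the paper.
VA_POINTS: Dict[str, Tuple[float, float]] = {
    "neutral": (0.0, 0.0),
    "happy": (0.8, 0.5),
    "sad": (-0.7, -0.4),
    "angry": (-0.6, 0.7),
}


def label_to_va(label: str, intensity: float = 1.0) -> Tuple[float, float]:
    """Map an emotion label to a point in Valence-Arousal space.

    `intensity` in [0, 1] is an assumed convenience that scales the point
    toward the neutral origin; it is not part of the described method.
    """
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must lie in [0, 1]")
    valence, arousal = VA_POINTS[label.lower()]
    return (valence * intensity, arousal * intensity)


def plan_edits(
    segments: List[Tuple[int, int, str]],
) -> List[Tuple[int, int, Tuple[float, float]]]:
    """Turn (start_frame, end_frame, label) requests into VA targets per segment."""
    return [(start, end, label_to_va(label)) for start, end, label in segments]
```

For example, `plan_edits([(0, 30, "sad"), (31, 90, "happy")])` would give one Valence-Arousal target per requested segment, which a trained decoder could then consume.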

arXiv information

Authors	Girish Kumar Solanki, Anastasios Roussos
Published	2022-10-17 14:34:07+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
