Leveraging Real Talking Faces via Self-Supervision for Robust Forgery Detection

Abstract

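One of the most pressing challenges in detecting face-manipulated videos is generalising to forgery methods not seen during training while remaining effective under common corruptions such as compression.
This paper examines whether the problem can be tackled by harnessing videos of real talking faces, which contain rich information on natural facial appearance and behaviour and are available in large quantities online.
Our method, called RealForensics, consists of two stages.
First, it exploits the natural correspondence between the visual and auditory modalities in real videos to learn, in a self-supervised cross-modal manner, temporally dense video representations that capture factors such as facial movements, expression, and identity.
Second, it uses these learned representations as targets to be predicted by the forgery detector, alongside the usual binary forgery classification task.
This encourages the detector to base its real/fake decision on those factors.
We show that the method achieves state-of-the-art performance in cross-manipulation generalisation and robustness experiments, and we examine the factors that contribute to that performance.
Our results suggest that leveraging natural, unlabelled videos is a promising direction for developing more robust face forgery detectors.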

Abstract (original)

One of the most pressing challenges for the detection of face-manipulated videos is generalising to forgery methods not seen during training while remaining effective under common corruptions such as compression. In this paper, we examine whether we can tackle this issue by harnessing videos of real talking faces, which contain rich information on natural facial appearance and behaviour and are readily available in large quantities online. Our method, termed RealForensics, consists of two stages. First, we exploit the natural correspondence between the visual and auditory modalities in real videos to learn, in a self-supervised cross-modal manner, temporally dense video representations that capture factors such as facial movements, expression, and identity. Second, we use these learned representations as targets to be predicted by our forgery detector along with the usual binary forgery classification task; this encourages it to base its real/fake decision on said factors. We show that our method achieves state-of-the-art performance on cross-manipulation generalisation and robustness experiments, and examine the factors that contribute to its performance. Our results suggest that leveraging natural and unlabelled videos is a promising direction for the development of more robust face forgery detectors.
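
The second stage described above combines two objectives: binary real/fake classification and prediction of the representations learned in the first stage. The following is a minimal NumPy sketch of how such a combined objective could be written. It is not the authors' implementation: the function names, the choice of cosine distance for the representation term, the `aux_weight` parameter, and the `(batch, time, dim)` shape convention are all assumptions made for illustration, and the abstract does not specify them.

```python
import numpy as np


def binary_cross_entropy(logits, labels):
    """Mean binary cross-entropy computed from raw logits (numerically stable form)."""
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels, dtype=float)
    per_sample = (
        np.maximum(logits, 0.0)
        - logits * labels
        + np.log1p(np.exp(-np.abs(logits)))
    )
    return float(np.mean(per_sample))


def cosine_distance(pred, target):
    """Mean (1 - cosine similarity) over every frame of every clip.

    Both arrays have the assumed shape (batch, time, dim), i.e. one vector
    per frame, mirroring the "temporally dense" representations.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    pred = pred / np.linalg.norm(pred, axis=-1, keepdims=True)
    target = target / np.linalg.norm(target, axis=-1, keepdims=True)
    return float(np.mean(1.0 - np.sum(pred * target, axis=-1)))


def combined_loss(logits, labels, pred_repr, target_repr, aux_weight=1.0):
    """Classification loss plus a weighted representation-prediction loss.

    target_repr stands in for the output of the frozen stage-one network;
    aux_weight is a hypothetical balancing factor.
    """
    cls_loss = binary_cross_entropy(logits, labels)
    aux_loss = cosine_distance(pred_repr, target_repr)
    return cls_loss + aux_weight * aux_loss


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.normal(size=4)              # detector outputs for 4 clips
    labels = np.array([1, 0, 1, 0])          # 1 = fake, 0 = real (assumed convention)
    pred = rng.normal(size=(4, 8, 16))       # predicted per-frame representations
    target = rng.normal(size=(4, 8, 16))     # stand-in for stage-one targets
    print(combined_loss(logits, labels, pred, target, aux_weight=0.5))
```

In this sketch the auxiliary term reaches zero only when every predicted frame vector points in the same direction as its target, so minimising it pushes the detector's features towards the factors captured by the stage-one representations.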

arXiv information

Authors	Alexandros Haliassos, Rodrigo Mira, Stavros Petridis, Maja Pantic
Published	2022-10-21 11:36:32+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
