Vulnerability-Aware Spatio-Temporal Learning for Generalizable and Interpretable Deepfake Video Detection

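Abstract

Detecting deepfake videos is highly challenging because spatial and temporal artifacts are intricately intertwined in forged sequences.
Most recent approaches rely on binary classifiers trained on both real and fake data.
However, such methods may struggle to focus on the important artifacts, which can hinder their generalization capability.
In addition, these models often lack interpretability, making it difficult to understand how their predictions are made.
To address these issues, we propose FakeSTormer, which offers two key contributions.
First, we introduce a multi-task learning framework with additional spatial and temporal branches that enable the model to focus on subtle spatio-temporal artifacts.
These branches also provide interpretability by highlighting the video regions that may contain artifacts.
Second, we propose a video-level data synthesis algorithm that generates pseudo-fake videos containing subtle artifacts, providing the model with high-quality samples and with ground-truth data for the spatial and temporal branches.
Extensive experiments on several challenging benchmarks demonstrate that our approach is competitive with recent state-of-the-art methods.
The code is available at https://github.com/10Ring/FakeSTormer.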

Abstract (original)

Detecting deepfake videos is highly challenging due to the complex intertwined spatial and temporal artifacts in forged sequences. Most recent approaches rely on binary classifiers trained on both real and fake data. However, such methods may struggle to focus on important artifacts, which can hinder their generalization capability. Additionally, these models often lack interpretability, making it difficult to understand how predictions are made. To address these issues, we propose FakeSTormer, offering two key contributions. First, we introduce a multi-task learning framework with additional spatial and temporal branches that enable the model to focus on subtle spatio-temporal artifacts. These branches also provide interpretability by highlighting video regions that may contain artifacts. Second, we propose a video-level data synthesis algorithm that generates pseudo-fake videos with subtle artifacts, providing the model with high-quality samples and ground truth data for our spatial and temporal branches. Extensive experiments on several challenging benchmarks demonstrate the competitiveness of our approach compared to recent state-of-the-art methods. The code is available at https://github.com/10Ring/FakeSTormer.

arXiv information

Authors	Dat Nguyen, Marcella Astrid, Anis Kacem, Enjie Ghorbel, Djamila Aouada
Published	2025-01-16 17:11:06+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
