Cross-Forgery Analysis of Vision Transformers and CNNs for Deepfake Image Detection

Abstract

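Deepfake generation techniques are evolving rapidly, making it possible to create realistic manipulated images and videos and threatening the serenity of modern society.
The continual emergence of new and varied techniques raises a further problem: deepfake detection models must be updated promptly so that they can identify manipulations produced with even the most recent methods.
This is an extremely hard problem, because training a model requires large amounts of data, which are difficult to obtain when the generation method is very recent.
Moreover, continuously retraining a network is not feasible.
In this paper, we ask whether any deep learning technique can generalize the concept of a deepfake well enough that it is not tied to the one or more specific generation methods used in the training set.
We compared a Vision Transformer with an EfficientNetV2 in a cross-forgery setting based on the ForgeryNet dataset.
Our experiments show that EfficientNetV2 has a greater tendency to specialize, often obtaining better results on the methods seen in training, while Vision Transformers exhibit superior generalization and are more capable even on images generated with new methods.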

Abstract (original)

Deepfake generation techniques are evolving at a rapid pace, making it possible to create realistic manipulated images and videos and endangering the serenity of modern society. The continual emergence of new and varied techniques brings with it a further problem to be faced, namely the ability of deepfake detection models to update themselves promptly in order to be able to identify manipulations carried out using even the most recent methods. This is an extremely complex problem to solve, as training a model requires large amounts of data, which are difficult to obtain if the deepfake generation method is too recent. Moreover, continuously retraining a network would be unfeasible. In this paper, we ask ourselves if, among the various deep learning techniques, there is one that is able to generalise the concept of deepfake to such an extent that it does not remain tied to one or more specific deepfake generation methods used in the training set. We compared a Vision Transformer with an EfficientNetV2 on a cross-forgery context based on the ForgeryNet dataset. From our experiments, it emerges that EfficientNetV2 has a greater tendency to specialize, often obtaining better results on training methods, while Vision Transformers exhibit a superior generalization ability that makes them more competent even on images generated with new methodologies.
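
The cross-forgery setting mentioned in the abstract can be illustrated with a minimal sketch: a detector is trained on pristine images plus a single forgery method, then scored separately on every method, including ones it never saw. Everything below is an assumption made for illustration (the `Sample` record, the `"pristine"` label, the `"train"`/`"test"` split names, and both helper functions); it is not the authors' code and does not reflect how ForgeryNet is actually organized.

```python
from collections import defaultdict
from typing import Callable, Dict, List, NamedTuple, Tuple


class Sample(NamedTuple):
    image_id: str  # hypothetical identifier for an image
    method: str    # forgery method name, or "pristine" for a real image
    split: str     # "train" or "test" (assumed split names)


def cross_forgery_splits(
    samples: List[Sample], train_method: str
) -> Tuple[List[Sample], Dict[str, List[Sample]]]:
    """Train on pristine images plus ONE forgery method; test on every method."""
    train = [
        s for s in samples
        if s.split == "train" and s.method in ("pristine", train_method)
    ]
    test_by_method: Dict[str, List[Sample]] = defaultdict(list)
    for s in samples:
        if s.split == "test":
            test_by_method[s.method].append(s)
    return train, dict(test_by_method)


def per_method_accuracy(
    predict_fake: Callable[[Sample], bool],
    test_by_method: Dict[str, List[Sample]],
) -> Dict[str, float]:
    """Accuracy of a binary real/fake predictor, reported per forgery method."""
    scores: Dict[str, float] = {}
    for method, items in test_by_method.items():
        # A sample counts as fake whenever its method is not "pristine".
        correct = sum(predict_fake(s) == (s.method != "pristine") for s in items)
        scores[method] = correct / len(items)
    return scores
```

Under these assumptions, a model that specializes would score well on the row for `train_method` and poorly on the others, while a model that generalizes would keep its accuracy closer across methods. Reporting accuracy per method rather than as one pooled number is what makes that difference visible.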

arXiv information

Authors	Davide Alessandro Coccomini, Roberto Caldelli, Fabrizio Falchi, Claudio Gennaro, Giuseppe Amato
Published	2022-06-28 08:50:22+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
