The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio

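Summary

With the proliferation of deepfake audio generated by Audio Language Models (ALMs), generalized detection methods are urgently needed.
ALM-based deepfake audio is now widespread, highly deceptive, and varied in type, which poses a significant challenge to current audio deepfake detection (ADD) models trained solely on vocoded data.
To detect ALM-based deepfake audio effectively, we focus on the mechanism behind ALM-based audio generation: the conversion from neural codec to waveform.
We first construct the Codecfake dataset, an open-source, large-scale dataset focused on ALM-based audio detection that covers 2 languages, more than 1M audio samples, and a variety of test conditions.
As a countermeasure, we propose the CSAM strategy, which learns a domain-balanced and generalized minimum, to achieve universal detection of deepfake audio and to address the domain ascent bias of the original SAM.
Our experiments first demonstrate that training an ADD model on the Codecfake dataset enables effective detection of ALM-based audio.
Furthermore, the proposed generalization countermeasure achieves the lowest average equal error rate (EER), 0.616%, across all test conditions compared with baseline models.
The dataset and associated code are available online.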
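
For readers unfamiliar with the metric reported above, the equal error rate is the operating point at which the false acceptance rate (spoofed audio accepted as genuine) equals the false rejection rate (genuine audio rejected). The following is a minimal sketch, not the authors' evaluation code: the function name `equal_error_rate`, the convention that higher scores mean "more likely genuine", and the simple threshold sweep are all assumptions made for illustration.

```python
import numpy as np

def equal_error_rate(bonafide_scores, spoof_scores):
    """Return (eer, threshold), assuming higher scores mean 'more likely genuine'.

    Hypothetical helper for illustration; not taken from the Codecfake code.
    """
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)
    # Candidate thresholds: every distinct score observed, in ascending order.
    thresholds = np.unique(np.concatenate([bonafide, spoof]))
    # False rejection rate: genuine audio scored below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: spoofed audio scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])
    # The EER sits where the two error rates are closest to each other.
    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2.0, thresholds[idx]

if __name__ == "__main__":
    eer, thr = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.1, 0.2, 0.3, 0.6])
    print(f"EER = {eer:.2%} at threshold {thr}")
```

An average EER across test conditions, as quoted in the summary, would then be the mean of the per-condition values returned by such a function.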

Abstract (original)

With the proliferation of Audio Language Model (ALM) based deepfake audio, there is an urgent need for generalized detection methods. ALM-based deepfake audio currently exhibits widespread, high deception, and type versatility, posing a significant challenge to current audio deepfake detection (ADD) models trained solely on vocoded data. To effectively detect ALM-based deepfake audio, we focus on the mechanism of the ALM-based audio generation method, the conversion from neural codec to waveform. We initially construct the Codecfake dataset, an open-source large-scale dataset, including 2 languages, over 1M audio samples, and various test conditions, focus on ALM-based audio detection. As countermeasure, to achieve universal detection of deepfake audio and tackle domain ascent bias issue of original SAM, we propose the CSAM strategy to learn a domain balanced and generalized minima. In our experiments, we first demonstrate that ADD model training with the Codecfake dataset can effectively detects ALM-based audio. Furthermore, our proposed generalization countermeasure yields the lowest average Equal Error Rate (EER) of 0.616% across all test conditions compared to baseline models. The dataset and associated code are available online.
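
The abstract does not describe how CSAM works, so the sketch below covers only the baseline it modifies, under the assumption that "SAM" refers to sharpness-aware minimization. In that method, each update first takes a small ascent step of radius `rho` toward higher loss and then descends using the gradient measured at the perturbed weights. The function name `sam_step`, the toy quadratic loss, and the hyperparameter values are illustrative assumptions, not the authors' code.

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One plain SAM update (not CSAM); hypothetical helper for illustration."""
    g = grad_fn(w)
    # Ascent step: move a distance rho along the normalized gradient.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Descent step: use the gradient evaluated at the perturbed weights.
    g_sharp = grad_fn(w + eps)
    return w - lr * g_sharp

if __name__ == "__main__":
    # Toy loss L(w) = 0.5 * ||w||^2, whose gradient is w itself.
    w = np.array([3.0, 4.0])
    for _ in range(3):
        w = sam_step(w, lambda v: v)
    print(w)
```

The "domain ascent bias" named in the abstract concerns this ascent step; how CSAM rebalances it across domains is described in the paper itself, not on this page.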

arXiv information

Authors	Yuankun Xie, Yi Lu, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Jianhua Tao, Xin Qi, Xiaopeng Wang, Yukun Liu, Haonan Cheng, Long Ye, Yi Sun
Published	2024-05-15 12:24:52+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
