Human Perception of Audio Deepfakes

Abstract

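The recent emergence of deepfakes has pushed manipulated and generated content to the forefront of machine learning research.
Many new machine learning techniques have been introduced for automatic deepfake detection, but human detection ability has received far less study.
In this paper, we present results comparing the ability of humans and machines to detect audio deepfakes used to imitate someone's voice.
To do so, we use a web-based application framework formulated as a game.
Participants were asked to distinguish real audio samples from fake ones.
In our experiment, 472 unique users competed against a state-of-the-art AI deepfake detection algorithm over a total of 14912 rounds of the game.
We find that humans and deepfake detection algorithms share similar strengths and weaknesses, with both struggling to detect certain types of attacks.
This contrasts with the superhuman performance of AI in many application areas, such as object detection and face recognition.
As for human success factors, we find that IT professionals have no advantage over non-professionals, but native speakers do have an advantage over non-native speakers.
In addition, we find that older participants tend to be more susceptible than younger ones.
These insights may help in designing future cybersecurity training for humans and in developing better detection algorithms.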

Abstract (original)

The recent emergence of deepfakes has brought manipulated and generated content to the forefront of machine learning research. Automatic detection of deepfakes has seen many new machine learning techniques, however, human detection capabilities are far less explored. In this paper, we present results from comparing the abilities of humans and machines for detecting audio deepfakes used to imitate someone’s voice. For this, we use a web-based application framework formulated as a game. Participants were asked to distinguish between real and fake audio samples. In our experiment, 472 unique users competed against a state-of-the-art AI deepfake detection algorithm for 14912 total of rounds of the game. We find that humans and deepfake detection algorithms share similar strengths and weaknesses, both struggling to detect certain types of attacks. This is in contrast to the superhuman performance of AI in many application areas such as object detection or face recognition. Concerning human success factors, we find that IT professionals have no advantage over non-professionals but native speakers have an advantage over non-native speakers. Additionally, we find that older participants tend to be more susceptible than younger ones. These insights may be helpful when designing future cybersecurity training for humans as well as developing better detection algorithms.

arXiv information

Authors	Nicolas M. Müller, Karla Pizzi, Jennifer Williams
Published	2024-08-27 15:19:45+00:00
arXiv site	arxiv_id(pdf)

Provider, services used

arxiv.jp, Google
