Adversarial Representation Learning for Robust Privacy Preservation in Audio

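Abstract

Sound event detection systems are widely used in applications such as surveillance and environmental monitoring, where data is automatically collected, processed, and sent to the cloud for sound recognition. This process can inadvertently reveal sensitive information about users or their surroundings, which raises privacy concerns. This study proposes a new adversarial training method for learning representations of audio recordings that effectively prevent speech activity from being detected in the recordings' latent features. The proposed method trains a model to generate invariant latent representations of speech-containing audio recordings that a speech classifier cannot distinguish from non-speech recordings. The novelty of the work lies in the optimization algorithm, in which the speech classifier's weights are regularly replaced with the weights of classifiers trained in a supervised manner. This constantly increases the discriminative power of the speech classifier during adversarial training and motivates the model to generate latent representations in which speech cannot be distinguished, even by new speech classifiers trained outside the adversarial training loop. The proposed method is evaluated against a baseline approach with no privacy measures and against a prior adversarial training method, and it shows a significant reduction in privacy violations compared with the baseline. The study also shows that the prior adversarial method is practically ineffective for this purpose.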

Abstract (original)

Sound event detection systems are widely used in various applications such as surveillance and environmental monitoring where data is automatically collected, processed, and sent to a cloud for sound recognition. However, this process may inadvertently reveal sensitive information about users or their surroundings, hence raising privacy concerns. In this study, we propose a novel adversarial training method for learning representations of audio recordings that effectively prevents the detection of speech activity from the latent features of the recordings. The proposed method trains a model to generate invariant latent representations of speech-containing audio recordings that cannot be distinguished from non-speech recordings by a speech classifier. The novelty of our work is in the optimization algorithm, where the speech classifier’s weights are regularly replaced with the weights of classifiers trained in a supervised manner. This increases the discrimination power of the speech classifier constantly during the adversarial training, motivating the model to generate latent representations in which speech is not distinguishable, even using new speech classifiers trained outside the adversarial training loop. The proposed method is evaluated against a baseline approach with no privacy measures and a prior adversarial training method, demonstrating a significant reduction in privacy violations compared to the baseline approach. Additionally, we show that the prior adversarial method is practically ineffective for this purpose.
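The weight-replacement idea described in the abstract can be illustrated with a small, self-contained toy. This is only a sketch under stated assumptions, not the authors' implementation: the two-dimensional synthetic data, the linear encoder with a scalar latent, the logistic speech classifier, the squared-error utility term, the confusion-style encoder loss, and every name and hyperparameter below (`make_data`, `train_supervised`, `adversarial_train`, `replace_every`, `lam`) are hypothetical choices made for illustration. The abstract specifies only that the adversarial speech classifier's weights are regularly replaced with those of a classifier trained in a supervised manner.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def make_data(n, seed=0):
    # Hypothetical toy data: x[0] is a sound-event cue, x[1] leaks speech activity.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        speech = rng.randint(0, 1)
        x = (rng.gauss(0.0, 1.0), (2.0 * speech - 1.0) + rng.gauss(0.0, 0.5))
        data.append((x, speech))
    return data


def encode(w, x):
    # Linear encoder producing a scalar latent feature.
    return w[0] * x[0] + w[1] * x[1]


def train_supervised(w, data, epochs=20, lr=0.1):
    # Train a fresh speech classifier from scratch on latents of the frozen encoder.
    a, b = 0.0, 0.0
    for _ in range(epochs):
        for x, s in data:
            z = encode(w, x)
            g = sigmoid(a * z + b) - s  # gradient of BCE w.r.t. the logit
            a -= lr * g * z
            b -= lr * g
    return [a, b]


def speech_accuracy(clf, w, data):
    # Fraction of recordings whose speech label the classifier recovers.
    hits = 0
    for x, s in data:
        pred = 1 if sigmoid(clf[0] * encode(w, x) + clf[1]) >= 0.5 else 0
        hits += int(pred == s)
    return hits / len(data)


def adversarial_train(data, epochs=30, replace_every=5, lr=0.02, lam=1.0):
    w = [1.0, 1.0]    # encoder weights
    clf = [0.0, 0.0]  # adversarial speech classifier (slope, bias)
    replacements = 0
    for epoch in range(1, epochs + 1):
        for x, s in data:
            z = encode(w, x)
            p = sigmoid(clf[0] * z + clf[1])
            # Encoder step (assumed losses): keep the event cue x[0] as a
            # utility stand-in while pushing the classifier output toward 0.5.
            dz = 2.0 * (z - x[0]) + lam * (p - 0.5) * clf[0]
            w[0] -= lr * dz * x[0]
            w[1] -= lr * dz * x[1]
            # Classifier step: ordinary BCE descent on the pre-update latent.
            clf[0] -= lr * (p - s) * z
            clf[1] -= lr * (p - s)
        if epoch % replace_every == 0:
            # Periodic replacement: swap in a classifier trained in a
            # supervised manner on the current latents.
            clf = train_supervised(w, data)
            replacements += 1
    return w, clf, replacements


if __name__ == "__main__":
    data = make_data(200)
    w, clf, n = adversarial_train(data)
    fresh = train_supervised(w, data)  # classifier trained outside the loop
    print("encoder weights:", w)
    print("replacements:", n)
    print("fresh-classifier speech accuracy:", speech_accuracy(fresh, w, data))
```

In this toy, the accuracy of a fresh classifier trained after the loop stands in for the privacy measure: the closer it is to chance, the less speech activity the latent reveals. The confusion loss is used here instead of directly maximizing the classifier's loss only because it keeps the toy encoder's objective bounded; the abstract does not say which adversarial objective the authors use.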

arXiv information

Authors	Shayan Gharib, Minh Tran, Diep Luong, Konstantinos Drossos, Tuomas Virtanen
Published	2024-01-03 13:51:05+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
