Electroencephalogram-based Multi-class Decoding of Attended Speakers’ Direction with Audio Spatial Spectrum

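Summary

Decoding a listener's directional focus of attention from electroencephalogram (EEG) signals is essential for developing brain-computer interfaces that improve the quality of life of individuals with hearing impairment.
Previous work has concentrated on binary directional focus decoding, that is, determining whether the attended speaker is to the listener's left or right.
However, effective speech processing requires decoding the attended speaker's direction more precisely.
In addition, audio spatial information has not been used effectively, which leaves decoding results suboptimal.
On the authors' recently presented dataset with 15-class directional focus, models that rely only on EEG input show significantly lower accuracy when decoding the directional focus in both the leave-one-subject-out and leave-one-trial-out scenarios.
Integrating audio spatial spectra with EEG features effectively improves decoding accuracy.
The authors use CNN, LSM-CNN, and EEG-Deformer models to decode the directional focus from listeners' EEG signals with auxiliary audio spatial spectra.
The proposed Sp-Aux-Deformer model achieves 15-class decoding accuracies of 57.48% and 61.83% in the leave-one-subject-out and leave-one-trial-out scenarios, respectively.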

Abstract (original)

Decoding the directional focus of an attended speaker from listeners’ electroencephalogram (EEG) signals is essential for developing brain-computer interfaces to improve the quality of life for individuals with hearing impairment. Previous works have concentrated on binary directional focus decoding, i.e., determining whether the attended speaker is on the left or right side of the listener. However, a more precise decoding of the exact direction of the attended speaker is necessary for effective speech processing. Additionally, audio spatial information has not been effectively leveraged, resulting in suboptimal decoding results. In this paper, we observe that, on our recently presented dataset with 15-class directional focus, models relying exclusively on EEG inputs exhibit significantly lower accuracy when decoding the directional focus in both leave-one-subject-out and leave-one-trial-out scenarios. By integrating audio spatial spectra with EEG features, the decoding accuracy can be effectively improved. We employ the CNN, LSM-CNN, and EEG-Deformer models to decode the directional focus from listeners’ EEG signals with the auxiliary audio spatial spectra. The proposed Sp-Aux-Deformer model achieves notable 15-class decoding accuracies of 57.48% and 61.83% in leave-one-subject-out and leave-one-trial-out scenarios, respectively.
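
The two evaluation scenarios named in the abstract can be illustrated with a small sketch. This is not the authors' code: the helper `leave_one_group_out`, the toy `windows` metadata, and the choice to identify a trial by a (subject, trial) pair are all assumptions made for illustration, and the paper may define its splits differently. The sketch also computes the chance level for a 15-class problem as a reference point for the reported accuracies.

```python
from collections import defaultdict


def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx) once per distinct group label.

    Hypothetical helper: `groups` holds one label per sample, and every
    sample sharing the held-out label goes to the test set.
    """
    by_group = defaultdict(list)
    for idx, label in enumerate(groups):
        by_group[label].append(idx)
    for held_out in sorted(by_group):
        test_idx = by_group[held_out]
        train_idx = sorted(
            i for label, idxs in by_group.items() if label != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


# Toy metadata (assumed): 3 subjects x 2 trials x 2 EEG windows per trial.
windows = [(s, t) for s in ("S1", "S2", "S3") for t in (1, 2) for _ in range(2)]

# Leave-one-subject-out: all windows of one subject are held out per fold.
subject_folds = list(leave_one_group_out([s for s, _ in windows]))

# Leave-one-trial-out (one possible reading): one (subject, trial) pair per fold.
trial_folds = list(leave_one_group_out(windows))

# Chance level for uniform guessing over 15 direction classes.
NUM_CLASSES = 15
chance_accuracy = 1.0 / NUM_CLASSES

if __name__ == "__main__":
    print(f"subject folds: {len(subject_folds)}, trial folds: {len(trial_folds)}")
    print(f"15-class chance accuracy: {chance_accuracy:.2%}")
```

In the leave-one-subject-out case the test subject contributes no data to training, so it measures generalization to unseen listeners; holding out a single trial still lets the model see other trials from the same subject.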

arXiv information

Authors	Yuanming Zhang, Jing Lu, Zhibin Lin, Fei Chen, Haoliang Du, Xia Gao
Published	2024-11-11 12:32:26+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
