CognitionCapturer: Decoding Visual Stimuli From Human EEG Signal With Multimodal Information

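Summary

Electroencephalogram (EEG) signals have attracted considerable research attention because they are non-invasive and offer high temporal sensitivity for decoding visual stimuli.
However, most recent studies focus only on the relationship between paired EEG and image data and neglect the valuable "beyond-image-modality" information embedded in EEG signals.
As a result, important multimodal information in the EEG is lost.
To address this limitation, we propose CognitionCapturer, a unified framework that fully leverages multimodal data to represent EEG signals.
Specifically, CognitionCapturer trains a Modality Expert Encoder for each modality to extract cross-modal information from the EEG modality.
It then introduces a diffusion prior that maps the EEG embedding space to the CLIP embedding space; combined with a pretrained generative model, this lets the framework reconstruct visual stimuli with high semantic and structural fidelity.
Notably, the framework requires no fine-tuning of the generative model and can be extended to incorporate more modalities.
Extensive experiments show that CognitionCapturer outperforms state-of-the-art methods both qualitatively and quantitatively.
Code: https://github.com/XiaoZhangYES/CognitionCapturer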
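
The abstract does not state the training objective used to align EEG embeddings with the embeddings of each other modality. As a rough illustration only, the sketch below assumes a CLIP-style symmetric contrastive (InfoNCE) loss, which is a common choice for this kind of alignment; it is not taken from the paper or its repository. The function names, the `temperature` value, and the toy vectors are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_info_nce(eeg_embs, target_embs, temperature=0.07):
    """Assumed objective: pull each EEG embedding toward its paired
    target-modality embedding and push it away from the other pairs.

    eeg_embs[i] and target_embs[i] are treated as a matched pair.
    The temperature of 0.07 is an illustrative default, not a value
    reported in the source.
    """
    eeg = [l2_normalize(v) for v in eeg_embs]
    tgt = [l2_normalize(v) for v in target_embs]
    # Cosine similarity between every EEG/target pair, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(e, t)) / temperature for t in tgt]
        for e in eeg
    ]
    logits_t = [list(col) for col in zip(*logits)]
    # Average the EEG-to-target and target-to-EEG directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits_t))


# Toy 2-D embeddings: matched pairs give a much lower loss than swapped pairs.
eeg_batch = [[1.0, 0.0], [0.0, 1.0]]
loss_matched = symmetric_info_nce(eeg_batch, [[1.0, 0.0], [0.0, 1.0]])
loss_swapped = symmetric_info_nce(eeg_batch, [[0.0, 1.0], [1.0, 0.0]])
```

Under this assumption, one such loss would be computed per modality, with a separate EEG-side encoder for each; how the paper actually combines the modalities is not described in the abstract.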

Abstract (original)

Electroencephalogram (EEG) signals have attracted significant attention from researchers due to their non-invasive nature and high temporal sensitivity in decoding visual stimuli. However, most recent studies have focused solely on the relationship between EEG and image data pairs, neglecting the valuable “beyond-image-modality” information embedded in EEG signals. This results in the loss of critical multimodal information in EEG. To address this limitation, we propose CognitionCapturer, a unified framework that fully leverages multimodal data to represent EEG signals. Specifically, CognitionCapturer trains Modality Expert Encoders for each modality to extract cross-modal information from the EEG modality. Then, it introduces a diffusion prior to map the EEG embedding space to the CLIP embedding space, followed by using a pretrained generative model, the proposed framework can reconstruct visual stimuli with high semantic and structural fidelity. Notably, the framework does not require any fine-tuning of the generative models and can be extended to incorporate more modalities. Through extensive experiments, we demonstrate that CognitionCapturer outperforms state-of-the-art methods both qualitatively and quantitatively. Code: https://github.com/XiaoZhangYES/CognitionCapturer.

arXiv information

Authors	Kaifan Zhang, Lihuo He, Xin Jiang, Wen Lu, Di Wang, Xinbo Gao
Published	2024-12-24 13:03:44+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
