Guess What I Think: Streamlined EEG-to-Image Generation with Latent Diffusion Models

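Summary

Generating images from brain waves is attracting growing attention, because understanding how brain signals encode visual cues could advance brain-computer interface (BCI) systems.
Most of the literature focuses on fMRI-to-image tasks, since fMRI offers high spatial resolution.
However, fMRI is an expensive neuroimaging modality and does not allow real-time BCI.
Electroencephalography (EEG), by contrast, is a low-cost, non-invasive, and portable neuroimaging technique, which makes it an attractive option for future real-time applications.
Nevertheless, EEG has inherent drawbacks: its low spatial resolution and its susceptibility to noise and artifacts make generating images from EEG more difficult.
This paper addresses these problems with a streamlined framework based on a ControlNet adapter that conditions a latent diffusion model (LDM) on EEG signals.
Experiments and ablation studies on popular benchmarks are conducted to demonstrate that the proposed method outperforms other state-of-the-art models.
Unlike those methods, which often require extensive preprocessing, pretraining, multiple losses, and captioning models, the approach is efficient and straightforward, requiring only minimal preprocessing and a few components.
The code is available at https://github.com/LuigiSigillo/GWIT.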

Abstract (original)

Generating images from brain waves is gaining increasing attention due to its potential to advance brain-computer interface (BCI) systems by understanding how brain signals encode visual cues. Most of the literature has focused on fMRI-to-Image tasks as fMRI is characterized by high spatial resolution. However, fMRI is an expensive neuroimaging modality and does not allow for real-time BCI. On the other hand, electroencephalography (EEG) is a low-cost, non-invasive, and portable neuroimaging technique, making it an attractive option for future real-time applications. Nevertheless, EEG presents inherent challenges due to its low spatial resolution and susceptibility to noise and artifacts, which makes generating images from EEG more difficult. In this paper, we address these problems with a streamlined framework based on the ControlNet adapter for conditioning a latent diffusion model (LDM) through EEG signals. We conduct experiments and ablation studies on popular benchmarks to demonstrate that the proposed method beats other state-of-the-art models. Unlike these methods, which often require extensive preprocessing, pretraining, different losses, and captioning models, our approach is efficient and straightforward, requiring only minimal preprocessing and a few components. The code is available at https://github.com/LuigiSigillo/GWIT.
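
The abstract does not describe the adapter's internals, so the following is only a rough illustration of the general ControlNet idea rather than the authors' implementation: a trainable conditioning branch feeds a frozen pretrained block through a zero-initialized projection, so that the conditioned model initially reproduces the frozen model exactly. All names here (`ToyAdapter`, `frozen_block`, `conditioned_block`) and the dimensions are hypothetical and do not come from the GWIT repository; plain Python lists stand in for tensors.

```python
import random


def linear(weights, x):
    """Apply a dense layer (given as a list of rows) to a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def zero_init(rows, cols):
    """Weight matrix of zeros, mimicking ControlNet's zero-initialized layers."""
    return [[0.0] * cols for _ in range(rows)]


def random_init(rows, cols, rng):
    """Small random weight matrix for the toy EEG encoder."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


class ToyAdapter:
    """Hypothetical EEG adapter: an encoder followed by a zero-initialized projection."""

    def __init__(self, eeg_dim, latent_dim, rng):
        self.encoder = random_init(latent_dim, eeg_dim, rng)
        self.zero_proj = zero_init(latent_dim, latent_dim)

    def __call__(self, eeg):
        hidden = linear(self.encoder, eeg)
        return linear(self.zero_proj, hidden)


def frozen_block(latent):
    """Stand-in for one frozen block of a pretrained diffusion model."""
    return [2.0 * v for v in latent]


def conditioned_block(latent, eeg, adapter):
    """Frozen output plus the adapter's residual, as in ControlNet-style conditioning."""
    base = frozen_block(latent)
    residual = adapter(eeg)
    return [b + r for b, r in zip(base, residual)]


rng = random.Random(0)
adapter = ToyAdapter(eeg_dim=8, latent_dim=4, rng=rng)
latent = [0.5, -1.0, 0.25, 2.0]
eeg = [rng.gauss(0.0, 1.0) for _ in range(8)]

# Before any training, the zero projection makes the adapter a no-op.
out_initial = conditioned_block(latent, eeg, adapter)

# Simulate a training update that moves one projection weight away from zero.
adapter.zero_proj[0][0] = 1.0
out_updated = conditioned_block(latent, eeg, adapter)
```

In this toy setup `out_initial` equals the frozen block's output, and the EEG signal starts to influence the result only once the projection weights move away from zero. That property is what lets such an adapter be attached to a pretrained model without disturbing it at the start of training.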

arXiv information

Authors	Eleonora Lopez, Luigi Sigillo, Federica Colonnese, Massimo Panella, Danilo Comminiello
Published	2025-01-10 18:14:56+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
