Second Sight: Using brain-optimized encoding models to align image distributions with human brain activity


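Two recent developments have accelerated progress in reconstructing images from human brain activity. The first is large-scale datasets that provide samples of brain activity in response to thousands of natural scenes; the second is the open-sourcing of powerful stochastic image generators that accept both low- and high-level guidance.
We introduce a new reconstruction procedure (Second Sight) that iteratively refines an image distribution to explicitly maximize the alignment between the predictions of a voxel-wise encoding model and the brain activity pattern evoked by a target image.
Second Sight therefore provides a succinct and novel way to explore the diversity of representations across visual brain areas.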


Two recent developments have accelerated progress in image reconstruction from human brain activity: large datasets that offer samples of brain activity in response to many thousands of natural scenes, and the open-sourcing of powerful stochastic image-generators that accept both low- and high-level guidance. Most work in this space has focused on obtaining point estimates of the target image, with the ultimate goal of approximating literal pixel-wise reconstructions of target images from the brain activity patterns they evoke. This emphasis belies the fact that there is always a family of images that are equally compatible with any evoked brain activity pattern, and the fact that many image-generators are inherently stochastic and do not by themselves offer a method for selecting the single best reconstruction from among the samples they generate. We introduce a novel reconstruction procedure (Second Sight) that iteratively refines an image distribution to explicitly maximize the alignment between the predictions of a voxel-wise encoding model and the brain activity patterns evoked by any target image. We show that our process converges on a distribution of high-quality reconstructions by refining both semantic content and low-level image details across iterations. Images sampled from these converged image distributions are competitive with state-of-the-art reconstruction algorithms. Interestingly, the time-to-convergence varies systematically across visual cortex, with earlier visual areas generally taking longer and converging on narrower image distributions, relative to higher-level brain areas. Second Sight thus offers a succinct and novel method for exploring the diversity of representations across visual brain areas.
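
The iterative refinement described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation. It assumes that "images" are plain vectors, that the encoding model is a random linear map (`W`, hypothetical), and that the image distribution is an isotropic Gaussian whose mean is updated from the best-scoring samples and whose width shrinks by a fixed factor each iteration. That update is a cross-entropy-style search swapped in for illustration; the abstract does not specify the sampler, the scoring rule, or the narrowing schedule, so every name and parameter below is an assumption.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation along the last axis (b is broadcast against a)."""
    a = a - a.mean(axis=-1, keepdims=True)
    b = b - b.mean(axis=-1, keepdims=True)
    denom = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1) + 1e-12
    return (a * b).sum(axis=-1) / denom

def refine_distribution(encode, target_activity, dim,
                        n_iter=20, n_samples=64, n_keep=8, seed=0):
    """Toy refinement loop (assumed, not from the paper):
    sample candidates, score them with the encoding model, keep the
    best-aligned ones, and re-center a narrower distribution on them."""
    rng = np.random.default_rng(seed)
    mean, scale = np.zeros(dim), 1.0
    history = []
    for _ in range(n_iter):
        # Draw candidate "images" from the current distribution.
        samples = mean + scale * rng.standard_normal((n_samples, dim))
        # Alignment between predicted and measured activity per candidate.
        scores = pearson(encode(samples), target_activity)
        top = np.argsort(scores)[-n_keep:]
        mean = samples[top].mean(axis=0)   # re-center on the best samples
        scale *= 0.9                       # fixed narrowing (an assumption)
        history.append(float(scores[top].mean()))
    return mean, scale, history

# Hypothetical stand-ins for the encoding model and the measured response.
rng = np.random.default_rng(1)
dim, n_voxels = 16, 200
W = rng.standard_normal((n_voxels, dim))

def encode(x):
    """Linear 'voxel-wise encoding model': image vectors -> voxel activity."""
    return x @ W.T

target_image = rng.standard_normal(dim)
target_activity = encode(target_image) + 0.1 * rng.standard_normal(n_voxels)

mean, scale, history = refine_distribution(encode, target_activity, dim)
final_alignment = float(pearson(encode(mean), target_activity))
```

Here `history` records the mean alignment score of the retained samples at each iteration, so plotting it gives a rough picture of convergence in this toy setting. In a real pipeline the Gaussian would presumably be replaced by a stochastic image generator and `encode` by a fitted voxel-wise encoding model.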


Authors: Reese Kneeland, Jordyn Ojeda, Ghislain St-Yves, Thomas Naselaris
Published: 2023-06-01 17:31:07+00:00

Categories: cs.CV, cs.LG, q-bio.NC