Meta-Learning an In-Context Transformer Model of Human Higher Visual Cortex

Abstract

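Understanding functional representations within higher visual cortex is a fundamental problem in computational neuroscience.
Artificial neural networks pretrained on large-scale datasets show striking representational alignment with human neural responses, but learning image-computable models of visual cortex depends on large-scale fMRI datasets collected at the individual level.
The need for expensive, time-consuming, and often impractical data collection limits how well encoders generalize to new subjects and stimuli.
BraInCoRL uses in-context learning to predict voxelwise neural responses from few-shot examples, with no additional finetuning for new subjects or stimuli.
It leverages a transformer architecture that can flexibly condition on a variable number of in-context image stimuli, learning an inductive bias across multiple subjects.
During training, the model is explicitly optimized for in-context learning.
By jointly conditioning on image features and voxel activations, the model learns to directly generate better-performing voxelwise models of higher visual cortex.
The authors demonstrate that BraInCoRL consistently outperforms existing voxelwise encoder designs in the low-data regime when evaluated on entirely novel images, while also exhibiting strong test-time scaling behavior.
The model also generalizes to an entirely new visual fMRI dataset that uses different subjects and different fMRI data acquisition parameters.
In addition, BraInCoRL facilitates better interpretability of neural signals in higher visual cortex by attending to semantically relevant stimuli.
Finally, the authors show that the framework enables interpretable mappings from natural language queries to voxel selectivity.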

Abstract (original)

Understanding functional representations within higher visual cortex is a fundamental question in computational neuroscience. While artificial neural networks pretrained on large-scale datasets exhibit striking representational alignment with human neural responses, learning image-computable models of visual cortex relies on individual-level, large-scale fMRI datasets. The necessity for expensive, time-intensive, and often impractical data acquisition limits the generalizability of encoders to new subjects and stimuli. BraInCoRL uses in-context learning to predict voxelwise neural responses from few-shot examples without any additional finetuning for novel subjects and stimuli. We leverage a transformer architecture that can flexibly condition on a variable number of in-context image stimuli, learning an inductive bias over multiple subjects. During training, we explicitly optimize the model for in-context learning. By jointly conditioning on image features and voxel activations, our model learns to directly generate better performing voxelwise models of higher visual cortex. We demonstrate that BraInCoRL consistently outperforms existing voxelwise encoder designs in a low-data regime when evaluated on entirely novel images, while also exhibiting strong test-time scaling behavior. The model also generalizes to an entirely new visual fMRI dataset, which uses different subjects and fMRI data acquisition parameters. Further, BraInCoRL facilitates better interpretability of neural signals in higher visual cortex by attending to semantically relevant stimuli. Finally, we show that our framework enables interpretable mappings from natural language queries to voxel selectivity.
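The interface the abstract describes can be sketched in a few lines: a function takes a variable-size context set of (image feature, voxel activation) pairs and returns the weights of a voxelwise encoder, which is then applied to unseen images. This is only an illustration under stated assumptions. Closed-form ridge regression is swapped in for the meta-learned transformer, and the names `fit_voxel_encoder`, `predict`, and `sample`, the 4-dimensional features, and the synthetic data are all hypothetical and do not come from the paper.

```python
import random


def fit_voxel_encoder(context_feats, context_resps, ridge=1.0):
    """Map a context set of any size to linear encoder weights.

    Stand-in for the learned in-context model: solves the ridge normal
    equations (X^T X + ridge * I) w = X^T y for a single voxel.
    """
    d = len(context_feats[0])
    A = [[sum(x[i] * x[j] for x in context_feats) + (ridge if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(x[i] * y for x, y in zip(context_feats, context_resps))
         for i in range(d)]
    # Gaussian elimination with partial pivoting.
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, d):
            factor = A[r][col] / A[col][col]
            for c in range(col, d):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    w = [0.0] * d
    for i in reversed(range(d)):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, d))) / A[i][i]
    return w


def predict(weights, feats):
    """Apply a voxelwise linear encoder to a list of image feature vectors."""
    return [sum(w * f for w, f in zip(weights, x)) for x in feats]


rng = random.Random(0)
true_w = [0.8, -0.5, 0.3, 0.1]  # hypothetical tuning of one synthetic voxel


def sample(n, noise=0.1):
    """Draw n synthetic (image feature, voxel response) pairs."""
    feats = [[rng.gauss(0, 1) for _ in true_w] for _ in range(n)]
    resps = [y + rng.gauss(0, noise) for y in predict(true_w, feats)]
    return feats, resps


# Evaluate on held-out "images" while varying the number of context examples.
test_feats, test_resps = sample(200)
for n_context in (5, 20, 100):
    ctx_feats, ctx_resps = sample(n_context)
    w = fit_voxel_encoder(ctx_feats, ctx_resps)
    preds = predict(w, test_feats)
    mse = sum((p - t) ** 2 for p, t in zip(preds, test_resps)) / len(test_resps)
    print(f"context size {n_context:3d}: held-out MSE {mse:.3f}")
```

The point of the sketch is the shape of the problem, not the method: the encoder for a new voxel is produced from context examples alone, with no gradient-based finetuning at test time. In the paper this mapping is learned across subjects by a transformer rather than fixed in closed form.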

arXiv information

Authors	Muquan Yu, Mu Nan, Hossein Adeli, Jacob S. Prince, John A. Pyles, Leila Wehbe, Margaret M. Henderson, Michael J. Tarr, Andrew F. Luo
Published	2025-05-21 17:59:41+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
