Do Large Language Models Mirror Cognitive Language Processing?

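Summary

Large language models (LLMs) have demonstrated remarkable abilities in text comprehension and logical reasoning, which indicates that the text representations they learn can facilitate their language processing capabilities.
In neuroscience, brain cognitive processing signals are typically used to study human language processing.
It is therefore natural to ask how well text embeddings from LLMs align with brain cognitive processing signals, and how training strategies affect this LLM-brain alignment.
This paper uses Representational Similarity Analysis (RSA) to measure the alignment between 23 mainstream LLMs and fMRI signals of the brain, in order to evaluate how effectively LLMs simulate cognitive language processing.
The authors empirically investigate how various factors (e.g., pre-training data size, model scaling, alignment training, and prompts) affect this LLM-brain alignment.
The experimental results indicate that pre-training data size and model scaling are positively correlated with LLM-brain similarity, and that alignment training can significantly improve LLM-brain similarity.
Explicit prompts contribute to the consistency of LLMs with brain cognitive language processing, whereas nonsensical noisy prompts may attenuate that alignment.
In addition, performance on a wide range of LLM evaluations (e.g., MMLU, Chatbot Arena) is highly correlated with LLM-brain similarity.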

Abstract (original)

Large Language Models (LLMs) have demonstrated remarkable abilities in text comprehension and logical reasoning, indicating that the text representations learned by LLMs can facilitate their language processing capabilities. In neuroscience, brain cognitive processing signals are typically utilized to study human language processing. Therefore, it is natural to ask how well the text embeddings from LLMs align with the brain cognitive processing signals, and how training strategies affect the LLM-brain alignment. In this paper, we employ Representational Similarity Analysis (RSA) to measure the alignment between 23 mainstream LLMs and fMRI signals of the brain to evaluate how effectively LLMs simulate cognitive language processing. We empirically investigate the impact of various factors (e.g., pre-training data size, model scaling, alignment training, and prompts) on such LLM-brain alignment. Experimental results indicate that pre-training data size and model scaling are positively correlated with LLM-brain similarity, and alignment training can significantly improve LLM-brain similarity. Explicit prompts contribute to the consistency of LLMs with brain cognitive language processing, while nonsensical noisy prompts may attenuate such alignment. Additionally, performance on a wide range of LLM evaluations (e.g., MMLU, Chatbot Arena) is highly correlated with the LLM-brain similarity.
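
The abstract names Representational Similarity Analysis but does not spell out the procedure. As a rough illustration only, the sketch below shows one common form of RSA: build a dissimilarity matrix over the same stimuli in each representation space, then rank-correlate the two matrices. The choice of correlation distance (1 - Pearson r) and Spearman correlation, the function names, and the synthetic data are all assumptions made for this example; they are not taken from the paper.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes non-constant inputs)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rdm_upper(features):
    """Upper triangle of the dissimilarity matrix (1 - Pearson r) over all stimulus pairs."""
    n = len(features)
    return [1.0 - pearson(features[i], features[j])
            for i in range(n) for j in range(i + 1, n)]


def ranks(values):
    """Ranks starting at 1; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def rsa_score(model_features, brain_features):
    """Spearman correlation between the two dissimilarity matrices.

    Both inputs hold one feature vector per stimulus, in the same stimulus order;
    the two spaces may have different dimensionality.
    """
    return pearson(ranks(rdm_upper(model_features)),
                   ranks(rdm_upper(brain_features)))


# Synthetic stand-ins (hypothetical): 8 stimuli, 16-d "embeddings", 32-d "voxel" responses.
rng = random.Random(0)
embeddings = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(8)]
voxels = [[rng.gauss(0, 1) for _ in range(32)] for _ in range(8)]

print("similarity of embeddings with themselves:", rsa_score(embeddings, embeddings))
print("similarity of embeddings with unrelated voxels:", rsa_score(embeddings, voxels))
```

Because the comparison happens between dissimilarity matrices rather than raw features, the two spaces need not share a dimensionality, which is what makes this kind of analysis usable for comparing text embeddings with fMRI responses.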

arXiv information

Authors	Yuqi Ren,Renren Jin,Tongxuan Zhang,Deyi Xiong
Published	2025-01-15 04:47:36+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
