Decoding the Enigma: Benchmarking Humans and AIs on the Many Facets of Working Memory

Summary

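Working memory (WM) is a fundamental cognitive process that facilitates the temporary storage, integration, manipulation, and retrieval of information, and it plays a vital role in reasoning and decision-making tasks.
Robust benchmark datasets that capture the multifaceted nature of WM are essential for the effective development and evaluation of AI WM models.
Here, we introduce a comprehensive Working Memory (WorM) benchmark dataset for this purpose.
WorM consists of 10 tasks and a total of 1 million trials, and assesses 4 functionalities, 3 domains, and 11 behavioral and neural characteristics of WM.
We jointly trained and tested state-of-the-art recurrent neural networks and transformers on all of these tasks.
We also include human behavioral benchmarks as an upper bound for comparison.
Our results suggest that AI models replicate some characteristics of WM in the brain, most notably primacy and recency effects, as well as neural clusters and correlates specialized for different domains and functionalities of WM.
The experiments also reveal some limitations of existing models in approximating human behavior.
This dataset serves as a valuable resource for the cognitive psychology, neuroscience, and AI communities, offering a standardized framework for comparing and enhancing WM models, investigating the neural underpinnings of WM, and developing WM models with human-like capabilities.
Our source code and data are available at https://github.com/ZhangLab-DeepNeuroCogLab/WorM.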

Abstract (original)

Working memory (WM), a fundamental cognitive process facilitating the temporary storage, integration, manipulation, and retrieval of information, plays a vital role in reasoning and decision-making tasks. Robust benchmark datasets that capture the multifaceted nature of WM are crucial for the effective development and evaluation of AI WM models. Here, we introduce a comprehensive Working Memory (WorM) benchmark dataset for this purpose. WorM comprises 10 tasks and a total of 1 million trials, assessing 4 functionalities, 3 domains, and 11 behavioral and neural characteristics of WM. We jointly trained and tested state-of-the-art recurrent neural networks and transformers on all these tasks. We also include human behavioral benchmarks as an upper bound for comparison. Our results suggest that AI models replicate some characteristics of WM in the brain, most notably primacy and recency effects, and neural clusters and correlates specialized for different domains and functionalities of WM. In the experiments, we also reveal some limitations in existing models to approximate human behavior. This dataset serves as a valuable resource for communities in cognitive psychology, neuroscience, and AI, offering a standardized framework to compare and enhance WM models, investigate WM’s neural underpinnings, and develop WM models with human-like capabilities. Our source code and data are available at https://github.com/ZhangLab-DeepNeuroCogLab/WorM.

arXiv information

Authors	Ankur Sikarwar, Mengmi Zhang
Published	2023-07-20 10:57:02+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
