Embodied World Models Emerge from Navigational Task in Open-Ended Environments

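Summary

How artificial systems can develop spatial awareness and spatial reasoning has long been an open problem in AI research.
Conventional models often rely on passive observation, whereas embodied cognition theory suggests that deeper understanding emerges from active interaction with the environment.
This study investigates whether neural networks can autonomously internalize spatial concepts through interaction, with a focus on planar navigation tasks.
By combining Gated Recurrent Units (GRUs) with Meta-Reinforcement Learning (Meta-RL), the authors show that agents can learn to encode spatial properties such as direction, distance, and obstacle avoidance.
They introduce Hybrid Dynamical Systems (HDS) to model the agent-environment interaction as a closed dynamical system, revealing stable limit cycles that correspond to optimal navigation strategies.
Ridge Representation maps navigation paths into a fixed-dimensional behavioral space, which makes them comparable with neural states.
Canonical Correlation Analysis (CCA) confirms strong alignment between these two representations, suggesting that the agent's neural states actively encode spatial knowledge.
Intervention experiments further show that specific neural dimensions are causally linked to navigation performance.
The work offers an approach to bridging the gap between action and perception in AI, along with new insights into building adaptive, interpretable models that can generalize across complex environments.
The causal validation of neural representations also opens new avenues for understanding and controlling the internal mechanisms of AI systems, pushing the boundaries of how machines learn and reason in dynamic, real-world scenarios.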

Summary (original)

Understanding how artificial systems can develop spatial awareness and reasoning has long been a challenge in AI research. Traditional models often rely on passive observation, but embodied cognition theory suggests that deeper understanding emerges from active interaction with the environment. This study investigates whether neural networks can autonomously internalize spatial concepts through interaction, focusing on planar navigation tasks. Using Gated Recurrent Units (GRUs) combined with Meta-Reinforcement Learning (Meta-RL), we show that agents can learn to encode spatial properties like direction, distance, and obstacle avoidance. We introduce Hybrid Dynamical Systems (HDS) to model the agent-environment interaction as a closed dynamical system, revealing stable limit cycles that correspond to optimal navigation strategies. Ridge Representation allows us to map navigation paths into a fixed-dimensional behavioral space, enabling comparison with neural states. Canonical Correlation Analysis (CCA) confirms strong alignment between these representations, suggesting that the agent’s neural states actively encode spatial knowledge. Intervention experiments further show that specific neural dimensions are causally linked to navigation performance. This work provides an approach to bridging the gap between action and perception in AI, offering new insights into building adaptive, interpretable models that can generalize across complex environments. The causal validation of neural representations also opens new avenues for understanding and controlling the internal mechanisms of AI systems, pushing the boundaries of how machines learn and reason in dynamic, real-world scenarios.
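The abstract names GRUs and intervention experiments but gives no architecture or procedure, so the following is only a minimal sketch of the general idea: one GRU update written out explicitly, plus a hypothetical `clamp` argument that forces chosen hidden units to a fixed value at every step. Clamping is one simple form of intervention and is an assumption here, not necessarily what the authors did. All names (`init_params`, `gru_step`, `rollout`), the weight sizes, and the toy observations are illustrative and do not come from the paper.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def init_params(input_size, hidden_size, scale=0.5, seed=0):
    """Random GRU weights; names and sizes are hypothetical, not from the paper."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-scale, scale) for _ in range(cols)]
                for _ in range(rows)]

    params = {}
    for gate in ("z", "r", "h"):
        params["W" + gate] = mat(hidden_size, input_size)   # input weights
        params["U" + gate] = mat(hidden_size, hidden_size)  # recurrent weights
        params["b" + gate] = [0.0] * hidden_size            # biases
    return params


def gru_step(params, x, h, clamp=None):
    """One GRU update. `clamp` maps hidden index -> forced value (assumed intervention)."""
    def affine(gate, hidden):
        wx = matvec(params["W" + gate], x)
        uh = matvec(params["U" + gate], hidden)
        return [a + b + c for a, b, c in zip(wx, uh, params["b" + gate])]

    z = [sigmoid(v) for v in affine("z", h)]           # update gate
    r = [sigmoid(v) for v in affine("r", h)]           # reset gate
    gated = [ri * hi for ri, hi in zip(r, h)]          # reset-gated previous state
    cand = [math.tanh(v) for v in affine("h", gated)]  # candidate state
    # Convention used here: z weights the candidate; some libraries swap the roles.
    new_h = [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]
    if clamp:
        for index, value in clamp.items():
            new_h[index] = value                       # overwrite the chosen unit
    return new_h


def rollout(params, inputs, hidden_size, clamp=None):
    """Run the GRU over a sequence from a zero state; return every hidden state."""
    h = [0.0] * hidden_size
    states = []
    for x in inputs:
        h = gru_step(params, x, h, clamp)
        states.append(h)
    return states


if __name__ == "__main__":
    params = init_params(input_size=2, hidden_size=4)
    observations = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]  # toy inputs
    baseline = rollout(params, observations, hidden_size=4)
    ablated = rollout(params, observations, hidden_size=4, clamp={2: 0.0})
    for step, (a, b) in enumerate(zip(baseline, ablated)):
        drift = max(abs(p - q) for p, q in zip(a, b))
        print(step, round(drift, 4))
```

Comparing the baseline and clamped rollouts shows how a change to a single hidden dimension propagates through the recurrent state. In a real study the quantity of interest would be the change in navigation performance, which this sketch does not model.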

arXiv information

Authors	Li Jin, Liu Jia
Published	2025-04-15 17:35:13+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
