AGITB: A Signal-Level Benchmark for Evaluating Artificial General Intelligence

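Summary

Despite remarkable progress in machine learning, current AI systems still fall short of true human-like intelligence.
Large Language Models (LLMs) excel at pattern recognition and response generation, but they lack genuine understanding, an essential hallmark of Artificial General Intelligence (AGI).
Existing AGI evaluation methods do not offer a practical, gradual, and informative metric.
This paper introduces the Artificial General Intelligence Test Bed (AGITB), which comprises twelve rigorous tests that form a signal-processing-level foundation for the potential emergence of cognitive capabilities.
AGITB evaluates intelligence through a model's ability to predict binary signals across time, without relying on symbolic representations or pretraining.
Unlike high-level tests grounded in language or perception, AGITB focuses on core computational invariants that reflect biological intelligence, such as determinism, sensitivity, and generalisation.
The test bed assumes no prior bias, operates independently of semantic meaning, and is designed so that it cannot be solved through brute force or memorization.
Humans pass AGITB by design, whereas no current AI system has met its criteria, which makes AGITB a compelling benchmark for guiding and recognizing progress toward AGI.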

Abstract (original)

Despite remarkable progress in machine learning, current AI systems continue to fall short of true human-like intelligence. While Large Language Models (LLMs) excel in pattern recognition and response generation, they lack genuine understanding – an essential hallmark of Artificial General Intelligence (AGI). Existing AGI evaluation methods fail to offer a practical, gradual, and informative metric. This paper introduces the Artificial General Intelligence Test Bed (AGITB), comprising twelve rigorous tests that form a signal-processing-level foundation for the potential emergence of cognitive capabilities. AGITB evaluates intelligence through a model’s ability to predict binary signals across time without relying on symbolic representations or pretraining. Unlike high-level tests grounded in language or perception, AGITB focuses on core computational invariants reflective of biological intelligence, such as determinism, sensitivity, and generalisation. The test bed assumes no prior bias, operates independently of semantic meaning, and ensures unsolvability through brute force or memorization. While humans pass AGITB by design, no current AI system has met its criteria, making AGITB a compelling benchmark for guiding and recognizing progress toward AGI.
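To make "predicting binary signals across time" concrete, the following is a minimal sketch of what a signal-level check could look like. It is not one of the twelve AGITB tests, which the abstract does not describe. The class `LastSuccessorPredictor` and the helpers `run`, `is_deterministic`, and `prediction_accuracy` are hypothetical names introduced here for illustration only. The toy model simply memorizes which input followed which, so it only illustrates the interface and the determinism property named in the abstract; by the abstract's own claim, memorization of this kind would not be enough to pass the benchmark.

```python
from typing import Dict, List, Optional, Tuple

Bits = Tuple[int, ...]  # one binary input vector at a single time step


class LastSuccessorPredictor:
    """Hypothetical toy model: remembers which input most recently followed each input."""

    def __init__(self, width: int) -> None:
        self.width = width
        self.successor: Dict[Bits, Bits] = {}
        self.previous: Optional[Bits] = None

    def step(self, observed: Bits) -> Bits:
        """Consume one input and return the prediction for the next input."""
        if self.previous is not None:
            self.successor[self.previous] = observed
        self.previous = observed
        # With no experience of this input yet, fall back to an all-zero guess.
        return self.successor.get(observed, (0,) * self.width)


def run(model: LastSuccessorPredictor, signal: List[Bits]) -> List[Bits]:
    """Feed the whole signal to the model and collect its predictions."""
    return [model.step(x) for x in signal]


def is_deterministic(width: int, signal: List[Bits]) -> bool:
    """Two fresh models given the same signal should predict identically."""
    first = run(LastSuccessorPredictor(width), signal)
    second = run(LastSuccessorPredictor(width), signal)
    return first == second


def prediction_accuracy(width: int, signal: List[Bits]) -> float:
    """Fraction of time steps where the prediction matched the next input."""
    predictions = run(LastSuccessorPredictor(width), signal)
    # predictions[t] is the guess for signal[t + 1].
    hits = sum(p == nxt for p, nxt in zip(predictions, signal[1:]))
    return hits / (len(signal) - 1)


# A repeating 3-bit pattern with no semantic meaning attached to the bits.
pattern: List[Bits] = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
signal = pattern * 4

print(is_deterministic(3, signal))
print(prediction_accuracy(3, signal))
```

The model starts with no stored transitions, which loosely mirrors the "no prior bias" and "no pretraining" conditions in the abstract; everything it predicts comes from the signal it has seen so far.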

arXiv information

Author	Matej Šprogar
Published	2025-05-09 11:25:57+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
