jarxiv | Japanese arxiv

Understand the Implication: Learning to Think for Pragmatic Understanding

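Posted: June 17, 2025 | Author: jarxiv

Summary

Pragmatics, the ability to infer meaning beyond the literal interpretation, is essential to social cognition and communication.
LLMs have been benchmarked on pragmatic understanding, but how to improve their performance remains underexplored.
Existing methods rely on annotated labels and overlook the reasoning process that humans naturally use to interpret implicit meaning.
To bridge this gap, we introduce ImpliedMeaningPreference, a new pragmatics dataset that includes explicit reasoning (thoughts) for both correct and incorrect interpretations.
Through preference-tuning and supervised fine-tuning, we show that thought-based learning significantly enhances LLMs' pragmatic understanding, improving accuracy by 11.12% across model families.
We also describe a transfer-learning study that evaluates thought-based training on other pragmatics tasks not seen during training (presupposition, deixis) and observes a 16.10% improvement over label-trained models.

Abstract (original)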

Pragmatics, the ability to infer meaning beyond literal interpretation, is crucial for social cognition and communication. While LLMs have been benchmarked for their pragmatic understanding, improving their performance remains underexplored. Existing methods rely on annotated labels but overlook the reasoning process humans naturally use to interpret implicit meaning. To bridge this gap, we introduce a novel pragmatic dataset, ImpliedMeaningPreference, that includes explicit reasoning (thoughts) for both correct and incorrect interpretations. Through preference-tuning and supervised fine-tuning, we demonstrate that thought-based learning significantly enhances LLMs’ pragmatic understanding, improving accuracy by 11.12% across model families. We further discuss a transfer-learning study where we evaluate the performance of thought-based training for the other tasks of pragmatics (presupposition, deixis) that are not seen during the training time and observe an improvement of 16.10% compared to label-trained models.
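
The abstract describes records that pair explicit reasoning with both a correct and an incorrect interpretation. A minimal sketch of how one such record could be turned into a prompt / chosen / rejected triple for preference-tuning follows. The field names, the prompt wording, and the example dialogue are illustrative assumptions, not the actual ImpliedMeaningPreference schema or data.

```python
from dataclasses import dataclass


@dataclass
class ThoughtExample:
    """One candidate interpretation plus the reasoning that leads to it."""
    thought: str         # explicit reasoning chain
    interpretation: str  # implied meaning the reasoning arrives at


def build_preference_pair(utterance: str,
                          correct: ThoughtExample,
                          incorrect: ThoughtExample) -> dict:
    """Format one record as prompt / chosen / rejected (hypothetical layout)."""
    def render(ex: ThoughtExample) -> str:
        return f"Thought: {ex.thought}\nInterpretation: {ex.interpretation}"

    return {
        "prompt": f"What does the speaker imply?\nUtterance: {utterance}",
        "chosen": render(correct),      # reasoning + correct interpretation
        "rejected": render(incorrect),  # reasoning + incorrect interpretation
    }


# Made-up example, not taken from the dataset.
pair = build_preference_pair(
    "A: Are you coming to the party? B: I have an exam tomorrow.",
    ThoughtExample("B gives a reason that conflicts with attending.",
                   "B is not coming to the party."),
    ThoughtExample("B only states a fact about tomorrow.",
                   "B is talking about the exam, not the party."),
)
```

Keeping the thought inside both the chosen and the rejected text means the tuning signal covers the reasoning as well as the final label, which is the contrast with label-only training that the abstract draws.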

arXiv information

Authors	Settaluri Lakshmi Sravanthi, Kishan Maharaj, Sravani Gunnu, Abhijit Mishra, Pushpak Bhattacharyya
Published	2025-06-16 14:45:08+00:00
arXiv site	arxiv_id (pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.AI, cs.CL

A Production Scheduling Framework for Reinforcement Learning Under Real-World Constraints

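Posted: June 17, 2025 | Author: jarxiv

Summary

The classical Job Shop Scheduling Problem (JSSP) focuses on optimizing makespan under deterministic constraints.
Real-world production environments add complexities that make traditional scheduling approaches less effective.
Reinforcement learning (RL) has the potential to address these challenges because it lets agents learn adaptive scheduling strategies.
However, a comprehensive, general-purpose framework for effectively training and evaluating RL agents under real-world constraints is lacking.
To address this gap, we propose a modular framework that extends classical JSSP formulations with key real-world constraints inherent to the shop floor, including transport logistics, buffer management, machine breakdowns, setup times, and stochastic processing conditions, and that also supports multi-objective optimization.
The framework is a customizable solution that offers flexibility in defining problem instances and configuring simulation parameters, so it can be adapted to diverse production scenarios.
A standardized interface ensures compatibility with various RL approaches, provides a robust environment for training RL agents, and facilitates standardized comparison of different scheduling methods under dynamic and uncertain conditions.
JobShopLab is released as an open-source tool for both research and industrial applications, accessible at https://github.com/proto-lab-ro/jobshoplab

Abstract (original)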

The classical Job Shop Scheduling Problem (JSSP) focuses on optimizing makespan under deterministic constraints. Real-world production environments introduce additional complexities that cause traditional scheduling approaches to be less effective. Reinforcement learning (RL) holds potential in addressing these challenges, as it allows agents to learn adaptive scheduling strategies. However, there is a lack of comprehensive, general-purpose frameworks for effectively training and evaluating RL agents under real-world constraints. To address this gap, we propose a modular framework that extends classical JSSP formulations by incorporating key real-world constraints inherent to the shopfloor, including transport logistics, buffer management, machine breakdowns, setup times, and stochastic processing conditions, while also supporting multi-objective optimization. The framework is a customizable solution that offers flexibility in defining problem instances and configuring simulation parameters, enabling adaptation to diverse production scenarios. A standardized interface ensures compatibility with various RL approaches, providing a robust environment for training RL agents and facilitating the standardized comparison of different scheduling methods under dynamic and uncertain conditions. We release JobShopLab as an open-source tool for both research and industrial applications, accessible at: https://github.com/proto-lab-ro/jobshoplab
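
To make the classical objective concrete, here is a minimal sketch of computing the makespan of a deterministic JSSP instance from a dispatching order. It is plain Python and does not use or imitate the JobShopLab API; the instance encoding (`jobs` as lists of `(machine, duration)` pairs, `order` as a sequence of job ids) is an assumption chosen for illustration, and none of the real-world constraints listed above are modeled.

```python
def makespan(jobs, order):
    """Return the completion time of the last operation.

    jobs[j] is the ordered list of (machine, duration) operations of job j.
    order lists job ids; the k-th occurrence of j schedules job j's k-th operation.
    """
    next_op = [0] * len(jobs)    # index of each job's next unscheduled operation
    job_ready = [0] * len(jobs)  # time at which each job's previous operation ends
    machine_ready = {}           # time at which each machine becomes free
    for j in order:
        machine, duration = jobs[j][next_op[j]]
        # An operation starts once both its job and its machine are available.
        start = max(job_ready[j], machine_ready.get(machine, 0))
        end = start + duration
        job_ready[j] = end
        machine_ready[machine] = end
        next_op[j] += 1
    return max(job_ready)


jobs = [
    [(0, 3), (1, 2)],  # job 0: machine 0 for 3 time units, then machine 1 for 2
    [(1, 4), (0, 1)],  # job 1: machine 1 for 4 time units, then machine 0 for 1
]
print(makespan(jobs, [0, 1, 0, 1]))  # 6
print(makespan(jobs, [1, 1, 0, 0]))  # 10
```

The two orders give different makespans for the same instance, which is the quantity a scheduling agent would try to minimize in the deterministic setting.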

arXiv information

Authors	Jonathan Hoss, Felix Schelling, Noah Klarmann
Published	2025-06-16 14:50:26+00:00
arXiv site	arxiv_id (pdf)

Categories: cs.AI, cs.LG

Can you see how I learn? Human observers’ inferences about Reinforcement Learning agents’ learning processes

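Posted: June 17, 2025 | Author: jarxiv

Summary

Reinforcement Learning (RL) agents often exhibit learning behaviors that human observers cannot interpret intuitively, which can lead to suboptimal feedback in collaborative teaching settings.
Yet how humans perceive and interpret the learning behavior of RL agents is largely unknown.
Taking a bottom-up approach with two experiments, this work provides a data-driven account of the factors that shape human observers' understanding of an agent's learning process.
A novel observation-based paradigm was developed to directly assess human inferences about agent learning.
In an exploratory interview study (N=9), we identify four core themes in human interpretations: Agent Goals, Knowledge, Decision Making, and Learning Mechanisms.
A second, confirmatory study (N=34) applied an expanded version of the paradigm across two tasks (navigation/manipulation) and two RL algorithms (tabular/function approximation).
Analysis of 816 responses confirmed the reliability of the paradigm, refined the thematic framework, and revealed how these themes evolve over time and interrelate.
Our findings provide a human-centered understanding of how people make sense of agent learning and offer actionable insights for designing interpretable RL systems and improving transparency in human-robot interaction.

Abstract (original)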

Reinforcement Learning (RL) agents often exhibit learning behaviors that are not intuitively interpretable by human observers, which can result in suboptimal feedback in collaborative teaching settings. Yet, how humans perceive and interpret RL agents’ learning behavior is largely unknown. In a bottom-up approach with two experiments, this work provides a data-driven understanding of the factors of human observers’ understanding of the agent’s learning process. A novel, observation-based paradigm to directly assess human inferences about agent learning was developed. In an exploratory interview study (N=9), we identify four core themes in human interpretations: Agent Goals, Knowledge, Decision Making, and Learning Mechanisms. A second confirmatory study (N=34) applied an expanded version of the paradigm across two tasks (navigation/manipulation) and two RL algorithms (tabular/function approximation). Analyses of 816 responses confirmed the reliability of the paradigm and refined the thematic framework, revealing how these themes evolve over time and interrelate. Our findings provide a human-centered understanding of how people make sense of agent learning, offering actionable insights for designing interpretable RL systems and improving transparency in Human-Robot Interaction.

arXiv information

Authors	Bernhard Hilpert, Muhan Hou, Kim Baraka, Joost Broekens
Published	2025-06-16 15:04:27+00:00
arXiv site	arxiv_id (pdf)

Categories: cs.AI, cs.HC, cs.RO

From Data-Driven to Purpose-Driven Artificial Intelligence: Systems Thinking for Data-Analytic Automation of Patient Care

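Posted: June 17, 2025 | Author: jarxiv

Summary

In this work, we reflect on the data-driven modeling paradigm that is gaining ground in AI-driven automation of patient care.
We argue that repurposing existing real-world patient datasets for machine learning may not always be the best approach to model development, because it could lead to undesirable outcomes in patient care.
We look back at the history of data analysis to explain how the data-driven paradigm rose to popularity, and we envision ways in which systems thinking and clinical domain theory could complement existing model development approaches in reaching human-centric outcomes.
We call for a purpose-driven machine learning paradigm that is grounded in clinical theory and in the sociotechnical realities of real-world operational contexts.
We argue that understanding the utility of existing patient datasets requires looking in two directions: upstream towards data generation, and downstream towards the automation objectives.
This purpose-driven perspective on AI system development opens up new methodological opportunities and holds promise for AI automation of patient care.

Abstract (original)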

In this work, we reflect on the data-driven modeling paradigm that is gaining ground in AI-driven automation of patient care. We argue that the repurposing of existing real-world patient datasets for machine learning may not always represent an optimal approach to model development as it could lead to undesirable outcomes in patient care. We reflect on the history of data analysis to explain how the data-driven paradigm rose to popularity, and we envision ways in which systems thinking and clinical domain theory could complement the existing model development approaches in reaching human-centric outcomes. We call for a purpose-driven machine learning paradigm that is grounded in clinical theory and the sociotechnical realities of real-world operational contexts. We argue that understanding the utility of existing patient datasets requires looking in two directions: upstream towards the data generation, and downstream towards the automation objectives. This purpose-driven perspective to AI system development opens up new methodological opportunities and holds promise for AI automation of patient care.

arXiv information

Authors	Daniel Anadria, Roel Dobbe, Anastasia Giachanou, Ruurd Kuiper, Richard Bartels, Íñigo Martínez de Rituerto de Troya, Carmen Zürcher, Daniel Oberski
Published	2025-06-16 15:07:44+00:00
arXiv site	arxiv_id (pdf)

Categories: cs.AI, cs.LG, cs.SY, eess.SY, math.ST, stat.ME, stat.TH

Consistency of Neural Causal Partial Identification

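Posted: June 17, 2025 | Author: jarxiv

Summary

Recent progress in Neural Causal Models (NCMs) has shown how identification and partial identification of causal effects can be carried out automatically by training neural generative models that respect the constraints encoded in a given causal graph [Xia et al. 2022, Balazadeh et al. 2022].
However, formal consistency of these methods has been proven only for discrete variables or only for linear causal models.
In this work, we prove the consistency of partial identification via NCMs in a general setting with both continuous and categorical variables.
Our results also highlight the impact of the design of the underlying neural network architecture, in terms of depth and connectivity, as well as the importance of applying Lipschitz regularization in the training phase.
In particular, we provide a counterexample showing that without Lipschitz regularization the method may not be asymptotically consistent.
Our results are enabled by new results on the approximability of Structural Causal Models (SCMs) via neural generative models, together with an analysis of the sample complexity of the resulting architectures and of how that translates into an error in the constrained optimization problem that defines the partial identification bounds.

Abstract (original)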

Recent progress in Neural Causal Models (NCMs) showcased how identification and partial identification of causal effects can be automatically carried out via training of neural generative models that respect the constraints encoded in a given causal graph [Xia et al. 2022, Balazadeh et al. 2022]. However, formal consistency of these methods has only been proven for the case of discrete variables or only for linear causal models. In this work, we prove the consistency of partial identification via NCMs in a general setting with both continuous and categorical variables. Further, our results highlight the impact of the design of the underlying neural network architecture in terms of depth and connectivity as well as the importance of applying Lipschitz regularization in the training phase. In particular, we provide a counterexample showing that without Lipschitz regularization this method may not be asymptotically consistent. Our results are enabled by new results on the approximability of Structural Causal Models (SCMs) via neural generative models, together with an analysis of the sample complexity of the resulting architectures and how that translates into an error in the constrained optimization problem that defines the partial identification bounds.
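
The abstract stresses Lipschitz regularization during training but does not say how it is implemented. As one generic illustration (an assumption, not the authors' method), the Lipschitz constant of a feed-forward network with 1-Lipschitz activations such as ReLU is bounded above by the product of the operator norms of its weight matrices. The sketch below uses the max-norm, whose induced operator norm is the largest absolute row sum, and turns the bound into a hinge penalty. The function names and the `target` parameter are hypothetical.

```python
def inf_norm(weight):
    """Operator norm induced by the max-norm: largest absolute row sum."""
    return max(sum(abs(w) for w in row) for row in weight)


def lipschitz_upper_bound(weights):
    """Product of per-layer norms; an upper bound for 1-Lipschitz activations."""
    bound = 1.0
    for weight in weights:
        bound *= inf_norm(weight)
    return bound


def lipschitz_penalty(weights, target=1.0):
    """Hinge penalty that is zero once the bound is at or below `target`."""
    return max(0.0, lipschitz_upper_bound(weights) - target)


layers = [
    [[0.5, -0.5], [1.0, 0.25]],  # 2x2 layer, absolute row sums 1.0 and 1.25
    [[2.0, -1.0]],               # 1x2 layer, absolute row sum 3.0
]
print(lipschitz_upper_bound(layers))  # 3.75
print(lipschitz_penalty(layers))      # 2.75
```

In training, a term like `lipschitz_penalty` would be added to the loss with some weight. The bound is usually loose, so it constrains the network conservatively.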

arXiv information

Authors	Jiyuan Tan, Jose Blanchet, Vasilis Syrgkanis
Published	2025-06-16 15:15:21+00:00
arXiv site	arxiv_id (pdf)

Categories: cs.AI, cs.LG, stat.ML

Agent Capability Negotiation and Binding Protocol (ACNBP)

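Posted: June 17, 2025 | Author: jarxiv

Summary

As multi-agent systems evolve to encompass increasingly diverse and specialized agents, enabling effective collaboration between heterogeneous agents has become a paramount challenge.
This paper presents the Agent Capability Negotiation and Binding Protocol (ACNBP), a novel framework designed to facilitate secure, efficient, and verifiable interactions between agents in heterogeneous multi-agent systems through integration with an Agent Name Service (ANS) infrastructure that provides comprehensive discovery, negotiation, and binding mechanisms.
The protocol introduces a structured 10-step process covering capability discovery, candidate pre-screening and selection, secure negotiation phases, and binding commitment, with built-in security measures including digital signatures, capability attestation, and comprehensive threat mitigation strategies.
Its protocolExtension mechanism enables backward-compatible protocol evolution and supports diverse agent architectures while maintaining security and interoperability.
We demonstrate ACNBP's effectiveness through a comprehensive security analysis using the MAESTRO threat modeling framework, practical implementation considerations, and a detailed example that applies the protocol to a document translation scenario, showing that the protocol addresses critical challenges in agent autonomy, capability verification, secure communication, and scalable agent ecosystem management.

Abstract (original)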

As multi-agent systems evolve to encompass increasingly diverse and specialized agents, the challenge of enabling effective collaboration between heterogeneous agents has become paramount, with traditional agent communication protocols often assuming homogeneous environments or predefined interaction patterns that limit their applicability in dynamic, open-world scenarios. This paper presents the Agent Capability Negotiation and Binding Protocol (ACNBP), a novel framework designed to facilitate secure, efficient, and verifiable interactions between agents in heterogeneous multi-agent systems through integration with an Agent Name Service (ANS) infrastructure that provides comprehensive discovery, negotiation, and binding mechanisms. The protocol introduces a structured 10-step process encompassing capability discovery, candidate pre-screening and selection, secure negotiation phases, and binding commitment with built-in security measures including digital signatures, capability attestation, and comprehensive threat mitigation strategies, while a key innovation of ACNBP is its protocolExtension mechanism that enables backward-compatible protocol evolution and supports diverse agent architectures while maintaining security and interoperability. We demonstrate ACNBP’s effectiveness through a comprehensive security analysis using the MAESTRO threat modeling framework, practical implementation considerations, and a detailed example showcasing the protocol’s application in a document translation scenario, with the protocol addressing critical challenges in agent autonomy, capability verification, secure communication, and scalable agent ecosystem management.
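
The abstract names four stages that the 10-step process passes through but does not list the individual steps or message formats. The sketch below only illustrates the idea of an ordered, stage-gated session: the `Phase` members mirror the stage names from the abstract, while the `Session` class, its methods, and the strict in-order rule are hypothetical and not taken from the ACNBP specification. Signatures and attestation are deliberately left out.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The four stages named in the abstract (the full protocol has 10 steps)."""
    CAPABILITY_DISCOVERY = 1
    CANDIDATE_SELECTION = 2
    NEGOTIATION = 3
    BINDING_COMMITMENT = 4


class Session:
    """Hypothetical session object that only accepts stages in order."""

    def __init__(self):
        self.completed = []

    def advance(self, phase: Phase) -> None:
        # The next admissible stage is the one after the last completed stage.
        expected = Phase(len(self.completed) + 1)
        if phase is not expected:
            raise ValueError(f"expected {expected.name}, got {phase.name}")
        self.completed.append(phase)

    @property
    def bound(self) -> bool:
        """True once the binding commitment stage has been completed."""
        return self.completed[-1:] == [Phase.BINDING_COMMITMENT]


session = Session()
for phase in Phase:  # IntEnum iterates in definition order
    session.advance(phase)
print(session.bound)  # True
```

Rejecting out-of-order stages is one simple way to make sure that a binding commitment can only follow a completed discovery and negotiation.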

arXiv information

Authors	Ken Huang, Akram Sheriff, Vineeth Sai Narajala, Idan Habler
Published	2025-06-16 15:18:24+00:00
arXiv site	arxiv_id (pdf)

Categories: cs.AI, cs.CR, cs.MA

CAMS: A CityGPT-Powered Agentic Framework for Urban Human Mobility Simulation

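Posted: June 17, 2025 | Author: jarxiv

Summary

Human mobility simulation plays a crucial role in various real-world applications.
Recently, to address the limitations of traditional data-driven approaches, researchers have explored using the commonsense knowledge and reasoning capabilities of large language models (LLMs) to accelerate human mobility simulation.
However, these methods suffer from several critical shortcomings, including inadequate modeling of urban spaces and poor integration with both individual mobility patterns and collective mobility distributions.
To address these challenges, we propose the CityGPT-Powered Agentic framework for Mobility Simulation (CAMS), an agentic framework that uses a language-based urban foundation model to simulate human mobility in urban space.
CAMS comprises three core modules: MobExtractor, which extracts template mobility patterns and synthesizes new ones based on user profiles; GeoGenerator, which generates anchor points that take collective knowledge into account and generates candidate urban geospatial knowledge using an enhanced version of CityGPT; and TrajEnhancer, which retrieves spatial knowledge based on mobility patterns and generates trajectories with real-trajectory preference alignment via DPO.
Experiments on real-world datasets show that CAMS achieves superior performance without relying on externally provided geospatial information.
Moreover, by holistically modeling both individual mobility patterns and collective mobility constraints, CAMS generates more realistic and plausible trajectories.
In general, CAMS establishes a new paradigm that integrates an agentic framework with urban-knowledgeable LLMs for human mobility simulation.

Abstract (original)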

Human mobility simulation plays a crucial role in various real-world applications. Recently, to address the limitations of traditional data-driven approaches, researchers have explored leveraging the commonsense knowledge and reasoning capabilities of large language models (LLMs) to accelerate human mobility simulation. However, these methods suffer from several critical shortcomings, including inadequate modeling of urban spaces and poor integration with both individual mobility patterns and collective mobility distributions. To address these challenges, we propose CityGPT-Powered Agentic framework for Mobility Simulation (CAMS), an agentic framework that leverages the language based urban foundation model to simulate human mobility in urban space. CAMS comprises three core modules, including MobExtractor to extract template mobility patterns and synthesize new ones based on user profiles, GeoGenerator to generate anchor points considering collective knowledge and generate candidate urban geospatial knowledge using an enhanced version of CityGPT, TrajEnhancer to retrieve spatial knowledge based on mobility patterns and generate trajectories with real trajectory preference alignment via DPO. Experiments on real-world datasets show that CAMS achieves superior performance without relying on externally provided geospatial information. Moreover, by holistically modeling both individual mobility patterns and collective mobility constraints, CAMS generates more realistic and plausible trajectories. In general, CAMS establishes a new paradigm that integrates the agentic framework with urban-knowledgeable LLMs for human mobility simulation.
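
The abstract mentions preference alignment via DPO (Direct Preference Optimization) without giving details. For reference, here is a minimal sketch of the standard DPO loss for a single preference pair, written over sequence log-probabilities. How CAMS builds its pairs is not stated here; treating a real trajectory as the preferred sample is an assumption, and `beta=0.1` is just an illustrative default.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Each argument is the total log-probability that the trainable policy or the
    frozen reference model assigns to the chosen or rejected sequence.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)), written as log(1 + exp(-logits))
    return math.log1p(math.exp(-logits))


# Policy identical to the reference: the loss equals log(2).
print(dpo_loss(-4.0, -6.0, -4.0, -6.0))
# Policy favors the chosen sequence more than the reference does: lower loss.
print(dpo_loss(-3.0, -8.0, -4.0, -6.0))
```

The loss decreases as the policy raises the chosen sequence's likelihood relative to the reference faster than it raises the rejected one's, which is what pulls generated trajectories toward the preferred samples.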

arXiv information

Authors	Yuwei Du, Jie Feng, Jian Yuan, Yong Li
Published	2025-06-16 15:24:07+00:00
arXiv site	arxiv_id (pdf)

Categories: cs.AI, cs.CL

The ASP-based Nurse Scheduling System at the University of Yamanashi Hospital

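Posted: June 17, 2025 | Author: jarxiv

Summary

We present the design principles of a nurse scheduling system built using Answer Set Programming (ASP) and successfully deployed at the University of Yamanashi Hospital.
Nurse scheduling is a complex optimization problem that requires reconciling individual nurse preferences with hospital staffing needs across various wards.
This involves balancing hard and soft constraints and allowing the flexibility of interactive adjustments.
Although it has been studied extensively in academia, real-world nurse scheduling presents unique challenges that go beyond typical benchmark problems and competitions.
This paper details the practical application of ASP to these challenges at the University of Yamanashi Hospital, focusing on the insights gained and on the advances in ASP technology needed to manage the complexities of real-world deployment effectively.

Abstract (original)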

We present the design principles of a nurse scheduling system built using Answer Set Programming (ASP) and successfully deployed at the University of Yamanashi Hospital. Nurse scheduling is a complex optimization problem requiring the reconciliation of individual nurse preferences with hospital staffing needs across various wards. This involves balancing hard and soft constraints and the flexibility of interactive adjustments. While extensively studied in academia, real-world nurse scheduling presents unique challenges that go beyond typical benchmark problems and competitions. This paper details the practical application of ASP to address these challenges at the University of Yamanashi Hospital, focusing on the insights gained and the advancements in ASP technology necessary to effectively manage the complexities of real-world deployment.
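
To illustrate the distinction between hard and soft constraints mentioned above, here is a toy sketch in plain Python. It is a brute-force enumeration, not ASP, and it has no connection to the deployed system: the nurses, the day-off requests, and the two constraints are made up for illustration. Hard constraints filter out infeasible rosters; soft constraints only add penalty points that the search minimizes.

```python
from itertools import product

NURSES = ["A", "B", "C"]
DAYS = 4
# Made-up day-off requests (day indices), used as soft constraints.
DAYS_OFF_REQUESTS = {"A": {0}, "B": {1, 2}, "C": set()}


def satisfies_hard(roster):
    """Hard constraint: nobody works two consecutive days."""
    return all(roster[d] != roster[d + 1] for d in range(DAYS - 1))


def soft_penalty(roster):
    """Soft constraint: one penalty point per violated day-off request."""
    return sum(1 for day, nurse in enumerate(roster)
               if day in DAYS_OFF_REQUESTS[nurse])


def best_roster():
    """Enumerate rosters (one nurse per day) and keep the cheapest feasible one."""
    feasible = (r for r in product(NURSES, repeat=DAYS) if satisfies_hard(r))
    return min(feasible, key=soft_penalty)


roster = best_roster()
print(roster, soft_penalty(roster))
```

Enumeration only works for toy sizes. The point of a declarative solver such as ASP is to state the same kinds of constraints and let the solver search the space efficiently.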

arXiv information

Authors	Hidetomo Nabeshima, Mutsunori Banbara, Torsten Schaub, Takehide Soh
Published	2025-06-16 15:25:06+00:00
arXiv site	arxiv_id (pdf)

Categories: 68T30, cs.AI

An Investigation into Value Misalignment in LLM-Generated Texts for Cultural Heritage

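Posted: June 17, 2025 | Author: jarxiv

Summary

As Large Language Models (LLMs) become increasingly prevalent in tasks related to cultural heritage, such as generating descriptions of historical monuments, translating ancient texts, preserving oral traditions, and creating educational content, users and researchers increasingly rely on their ability to produce accurate and culturally aligned texts.
However, generated texts may contain cultural value misalignments, such as the misrepresentation of historical facts, the erosion of cultural identity, and the oversimplification of complex cultural narratives, which may lead to severe consequences.
Investigating value misalignment in the context of LLMs for cultural heritage is therefore crucial for mitigating these risks, yet systematic and comprehensive study in this area has been significantly lacking.
To fill this gap, we systematically assess the reliability of LLMs in generating culturally aligned texts for cultural heritage-related tasks.
We conduct a comprehensive evaluation by compiling an extensive set of 1066 query tasks covering 5 widely recognized categories with 17 aspects within the knowledge framework of cultural heritage, run across 5 open-source LLMs, and we examine both the type and the rate of cultural value misalignments in the generated texts.
Using both automated and manual approaches, we detect and analyze the cultural value misalignments in LLM-generated texts.
Our findings are concerning: over 65% of the generated texts exhibit notable cultural misalignments, and certain tasks show almost complete misalignment with key cultural values.
Beyond these findings, the paper introduces a benchmark dataset and a comprehensive evaluation workflow that can serve as a valuable resource for future research aimed at enhancing the cultural sensitivity and reliability of LLMs.

Abstract (original)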

As Large Language Models (LLMs) become increasingly prevalent in tasks related to cultural heritage, such as generating descriptions of historical monuments, translating ancient texts, preserving oral traditions, and creating educational content, their ability to produce accurate and culturally aligned texts is being increasingly relied upon by users and researchers. However, cultural value misalignments may exist in generated texts, such as the misrepresentation of historical facts, the erosion of cultural identity, and the oversimplification of complex cultural narratives, which may lead to severe consequences. Therefore, investigating value misalignment in the context of LLM for cultural heritage is crucial for mitigating these risks, yet there has been a significant lack of systematic and comprehensive study and investigation in this area. To fill this gap, we systematically assess the reliability of LLMs in generating culturally aligned texts for cultural heritage-related tasks. We conduct a comprehensive evaluation by compiling an extensive set of 1066 query tasks covering 5 widely recognized categories with 17 aspects within the knowledge framework of cultural heritage across 5 open-source LLMs, and examine both the type and rate of cultural value misalignments in the generated texts. Using both automated and manual approaches, we effectively detect and analyze the cultural value misalignments in LLM-generated texts. Our findings are concerning: over 65% of the generated texts exhibit notable cultural misalignments, with certain tasks demonstrating almost complete misalignment with key cultural values. Beyond these findings, this paper introduces a benchmark dataset and a comprehensive evaluation workflow that can serve as a valuable resource for future research aimed at enhancing the cultural sensitivity and reliability of LLMs.

arXiv information

Authors	Fan Bu, Zheng Wang, Siyi Wang, Ziyao Liu
Published	2025-06-16 15:37:06+00:00
arXiv site	arxiv_id (pdf)

Categories: cs.AI, cs.CL

Avoiding Obfuscation with Prover-Estimator Debate

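Posted: June 17, 2025 | Author: jarxiv

Summary

Training powerful AI systems to exhibit desired behaviors hinges on the ability to provide accurate human supervision on increasingly complex tasks.
A promising approach to this problem is to amplify human judgement by harnessing two competing AIs in a debate about the correct solution to a given problem.
Prior theoretical work provided a complexity-theoretic formalization of AI debate and posed the problem of designing debate protocols that guarantee the correctness of human judgements for as complex a class of problems as possible.
Recursive debates, in which debaters decompose a complex problem into simpler subproblems, hold promise for enlarging the class of problems that can be judged accurately in a debate.
However, existing protocols for recursive debate run into the obfuscated arguments problem: a dishonest debater can use a computationally efficient strategy that forces an honest opponent to solve a computationally intractable problem in order to win.
We mitigate this problem with a new recursive debate protocol that, under certain stability assumptions, ensures that an honest debater can win with a strategy whose computational efficiency is comparable to that of their opponent.

Abstract (original)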

Training powerful AI systems to exhibit desired behaviors hinges on the ability to provide accurate human supervision on increasingly complex tasks. A promising approach to this problem is to amplify human judgement by leveraging the power of two competing AIs in a debate about the correct solution to a given problem. Prior theoretical work has provided a complexity-theoretic formalization of AI debate, and posed the problem of designing protocols for AI debate that guarantee the correctness of human judgements for as complex a class of problems as possible. Recursive debates, in which debaters decompose a complex problem into simpler subproblems, hold promise for growing the class of problems that can be accurately judged in a debate. However, existing protocols for recursive debate run into the obfuscated arguments problem: a dishonest debater can use a computationally efficient strategy that forces an honest opponent to solve a computationally intractable problem to win. We mitigate this problem with a new recursive debate protocol that, under certain stability assumptions, ensures that an honest debater can win with a strategy requiring computational efficiency comparable to their opponent.

arXiv information

Authors	Jonah Brown-Cohen, Geoffrey Irving, Georgios Piliouras
Published	2025-06-16 15:37:33+00:00
arXiv site	arxiv_id (pdf)

Categories: cs.AI, cs.CC, cs.DS
