"cs.AI" Category Archive

Equivariant Representation Learning for Symmetry-Aware Inference with Guarantees

Posted: May 27, 2025 | Author: jarxiv

Abstract: In many real-world applications of regression, conditional probability estimation, and uncertainty quantification, …

Categories: 43-06, cs.AI, cs.LG, cs.RO, I.2.6

DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning

Posted: May 27, 2025 | Author: jarxiv

Abstract: Sparse-reward reinforcement learning (RL) can model a wide range of highly complex tasks …

Categories: cs.AI, cs.LG, cs.RO

SafeDPO: A Simple Approach to Direct Preference Optimization with Enhanced Safety

Posted: May 27, 2025 | Author: jarxiv

Abstract: Large language models (LLMs) continue to advance and, in a growing number of fields, …

Categories: cs.AI, cs.LG

Grammars of Formal Uncertainty: When to Trust LLMs in Automated Reasoning Tasks

Posted: May 27, 2025 | Author: jarxiv

Abstract: Large language models (LLMs), by generating formal specifications, …

Categories: cs.AI, cs.CL, cs.LO, cs.SE

SAEs Are Good for Steering — If You Select the Right Features

Posted: May 27, 2025 | Author: jarxiv

Abstract: Sparse autoencoders (SAEs), used to learn a decomposition of a model's latent space, …

Categories: cs.AI, cs.CL, cs.LG

Incentivizing Reasoning from Weak Supervision

Posted: May 27, 2025 | Author: jarxiv

Abstract: Large language models (LLMs) have impressive performance on reasoning-intensive tasks …

Categories: cs.AI, cs.CL

Inference-time Alignment in Continuous Space

Posted: May 27, 2025 | Author: jarxiv

Abstract: By aligning large language models with human feedback at inference time, flexibility …

Categories: cs.AI, cs.CL

Revealing the Intrinsic Ethical Vulnerability of Aligned Large Language Models

Posted: May 27, 2025 | Author: jarxiv

Abstract: Large language models (LLMs) are a foundational exploration toward artificial general intelligence, but instruction …

Categories: cs.AI, cs.CL

Safety Through Reasoning: An Empirical Study of Reasoning Guardrail Models

Posted: May 27, 2025 | Author: jarxiv

Abstract: Reasoning-based language models have demonstrated strong performance across a variety of domains …

Categories: cs.AI, cs.CL

Homophily Enhanced Graph Domain Adaptation

Posted: May 27, 2025 | Author: jarxiv

Abstract: Graph domain adaptation (GDA), from labeled source graphs to label-scarce …

Categories: cs.AI, cs.SI
