"cs.CL" Category Archive

DeduCE: Deductive Consistency as a Framework to Evaluate LLM Reasoning

Posted: April 10, 2025 | Author: jarxiv

Abstract: Despite strong performance on Olympiad-level reasoning problems, …

Categories: cs.AI, cs.CL, cs.LG

Self-Steering Language Models

Posted: April 10, 2025 | Author: jarxiv

Abstract: Test-time reasoning enables language models to tackle complex tasks, but …

Categories: cs.AI, cs.CL

KG-LLM-Bench: A Scalable Benchmark for Evaluating LLM Reasoning on Textualized Knowledge Graphs

Posted: April 10, 2025 | Author: jarxiv

Abstract: To inject up-to-date factual knowledge into large language models (LLMs), knowledge graphs …

Categories: cs.AI, cs.CL, cs.IR

Sculpting Subspaces: Constrained Full Fine-Tuning in LLMs for Continual Learning

Posted: April 10, 2025 | Author: jarxiv

Abstract: Continual learning in large language models (LLMs) is prone to catastrophic forgetting, …

Categories: 68T50, cs.AI, cs.CL, cs.LG, G.3, math.PR, stat.ML

Dolphin: Moving Towards Closed-loop Auto-research through Thinking, Practice, and Feedback

Posted: April 10, 2025 | Author: jarxiv

Abstract: The paradigm of scientific research is undergoing a profound transformation owing to the development of artificial intelligence (AI) …

Categories: cs.AI, cs.CL, cs.CV

A Unified Agentic Framework for Evaluating Conditional Image Generation

Posted: April 10, 2025 | Author: jarxiv

Abstract: Conditional image generation has attracted considerable attention for its ability to personalize content …

Categories: cs.CL, cs.CV

Unsolvable Problem Detection: Robust Understanding Evaluation for Large Multimodal Models

Posted: April 10, 2025 | Author: jarxiv

Abstract: In this paper, the robust understanding capability of large multimodal models (LMMs) …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation

Posted: April 10, 2025 | Author: jarxiv

Abstract: The rapid development of vision language models (VLMs) requires rigorous and reliable evaluation …

Categories: cs.AI, cs.CL, cs.CV, cs.CY, cs.LG

Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation

Posted: April 10, 2025 | Author: jarxiv

Abstract: The evaluation of vision language models (VLMs) relies mainly on English-language benchmarks, …

Categories: cs.CL, cs.CV

SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills

Posted: April 10, 2025 | Author: jarxiv

Abstract: To survive and thrive in complex environments, humans, through environment exploration, hierarchical abstraction of experience, …

Categories: cs.AI, cs.CL, cs.CV
