"cs.CL" Category Archive

Agent-X: Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks

Posted on June 2, 2025 by jarxiv

Abstract: Deep reasoning, particularly in vision-centric scenarios that require sequential multimodal understanding, …

Categories: cs.CL, cs.CV

Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents

Posted on June 2, 2025 by jarxiv

Abstract: For deploying web agents in real-world applications, CAPTCHAs …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Automatic classification of stop realisation with wav2vec2.0

Posted on June 2, 2025 by jarxiv

Abstract: Modern phonetic research regularly uses automatic tools to annotate speech data …

Categories: cs.CL, cs.SD, eess.AS

Fast Large Language Model Collaborative Decoding via Speculation

Posted on May 30, 2025 by jarxiv

Abstract: Large language model (LLM) collaborative decoding methods, at each generation step, …

Categories: cs.AI, cs.CL, cs.LG

Probability-Consistent Preference Optimization for Enhanced LLM Reasoning

Posted on May 30, 2025 by jarxiv

Abstract: Recent advances in preference optimization improve the mathematical reasoning capabilities of large language models (LLMs) …

Categories: cs.CL

Translation in the Wild

Posted on May 30, 2025 by jarxiv

Abstract: Large language models (LLMs) excel at translation, among other tasks, and in zero- and few- …

Categories: cs.CL

Enhancing Automated Interpretability with Output-Centric Feature Descriptions

Posted on May 30, 2025 by jarxiv

Abstract: Automated interpretability pipelines, for concepts such as plants or the first word of a sentence, in large language …

Categories: cs.CL

Understanding Refusal in Language Models with Sparse Autoencoders

Posted on May 30, 2025 by jarxiv

Abstract: Refusal is a key safety behavior of aligned language models, but the internal mechanisms that drive refusal …

Categories: cs.CL

LEXam: Benchmarking Legal Reasoning on 340 Law Exams

Posted on May 30, 2025 by jarxiv

Abstract: Despite recent advances in test-time scaling, long-form legal reasoning still …

Categories: 68T50, cs.AI, cs.CL, cs.LG, I.2

Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models

Posted on May 30, 2025 by jarxiv

Abstract: Effectively enhancing the reasoning capabilities of large language models using reinforcement learning (RL) …

Categories: cs.AI, cs.CL, cs.LG
