Category archive: cs.LG

Entropy Controllable Direct Preference Optimization

Posted: June 16, 2025 | Author: jarxiv

Summary: In the post-training of large language models (LLMs), reinforcement learning from human feedback (RLHF) …

Categories: cs.AI, cs.CL, cs.LG

Long-Short Alignment for Effective Long-Context Modeling in LLMs

Posted: June 16, 2025 | Author: jarxiv

Summary: Large language models (LLMs) exhibit impressive performance and surprising emergent properties …

Categories: cs.CL, cs.LG

Persona-driven Simulation of Voting Behavior in the European Parliament with Large Language Models

Posted: June 16, 2025 | Author: jarxiv

Summary: Large language models (LLMs) can understand and even produce political discourse …

Categories: cs.AI, cs.CL, cs.LG

Word Sense Detection Leveraging Maximum Mean Discrepancy

Posted: June 16, 2025 | Author: jarxiv

Summary: Word sense analysis is an essential analysis task for interpreting linguistic and social backgrounds. …

Categories: cs.CL, cs.LG, stat.ML

On the Performance of LLMs for Real Estate Appraisal

Posted: June 16, 2025 | Author: jarxiv

Summary: The real estate market is vital to the global economy but suffers from significant information asymmetry …

Categories: cs.AI, cs.CL, cs.LG

TreeRL: LLM Reinforcement Learning with On-Policy Tree Search

Posted: June 16, 2025 | Author: jarxiv

Summary: Reinforcement learning (RL) with tree search has shown superior performance on traditional reasoning tasks …

Categories: cs.CL, cs.LG

T1: Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling

Posted: June 16, 2025 | Author: jarxiv

Summary: Large language models (LLMs) have demonstrated remarkable capabilities in complex reasoning tasks …

Categories: cs.CL, cs.LG

Factual Knowledge in Language Models: Robustness and Anomalies under Simple Temporal Context Variations

Posted: June 16, 2025 | Author: jarxiv

Summary: This paper examines, for factual knowledge under variations in temporal context, language models (L …

Categories: cs.CL, cs.LG

e3: Learning to Explore Enables Extrapolation of Test-Time Compute for LLMs

Posted: June 16, 2025 | Author: jarxiv

Summary: Test-time scaling, by using more compute at inference time, LL …

Categories: cs.CL, cs.LG

V-Max: A Reinforcement Learning Framework for Autonomous Driving

Posted: June 16, 2025 | Author: jarxiv

Summary: Learning-based decision-making, enabling generalizable autonomous driving (AD) policies …

Categories: cs.AI, cs.LG, cs.RO
