"cs.LG" Category Archive

TreeRPO: Tree Relative Policy Optimization

Posted: June 6, 2025 | Author: jarxiv

Summary: Large language models (LLMs), through reinforcement learning with verifiable rewards (RLVR) methods, …

Categories: cs.AI, cs.LG

Counterfactual reasoning: an analysis of in-context emergence

Posted: June 6, 2025 | Author: jarxiv

Summary: Large-scale neural language models (LMs), in in-context learning, remarkable …

Categories: cs.AI, cs.CL, cs.LG, math.ST, stat.TH

One Wave To Explain Them All: A Unifying Perspective On Feature Attribution

Posted: June 6, 2025 | Author: jarxiv

Summary: Feature attribution methods, by identifying the input features that influence a model's decisions, …

Categories: cs.AI, cs.LG, stat.ML

Unleashing The Power of Pre-Trained Language Models for Irregularly Sampled Time Series

Posted: June 6, 2025 | Author: jarxiv

Summary: Pre-trained language models (PLMs) such as ChatGPT, natural language processing …

Categories: cs.AI, cs.LG, stat.AP

Mitigating Degree Bias Adaptively with Hard-to-Learn Nodes in Graph Contrastive Learning

Posted: June 6, 2025 | Author: jarxiv

Summary: Graph neural networks (GNNs) often, in node classification tasks, …

Categories: cs.AI, cs.CL, cs.LG

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Posted: June 6, 2025 | Author: jarxiv

Summary: Process reward models (PRMs), identifying and mitigating intermediate errors in the reasoning process, …

Categories: cs.AI, cs.CL, cs.LG

MesaNet: Sequence Modeling by Locally Optimal Test-Time Training

Posted: June 6, 2025 | Author: jarxiv

Summary: Sequence modeling, currently with causal Transformers that use softmax self-attention …

Categories: cs.AI, cs.CL, cs.LG

Rethinking LLM Advancement: Compute-Dependent and Independent Paths to Progress

Posted: June 6, 2025 | Author: jarxiv

Summary: Regulatory efforts governing large language model (LLM) development, primarily high-performance comput…

Categories: cs.AI, cs.LG, I.2

Just Enough Thinking: Efficient Reasoning with Adaptive Length Penalties Reinforcement Learning

Posted: June 6, 2025 | Author: jarxiv

Summary: Large reasoning models (LRMs), by generating more tokens at inference time, …

Categories: cs.AI, cs.LG

Context is Key: A Benchmark for Forecasting with Essential Textual Information

Posted: June 6, 2025 | Author: jarxiv

Summary: Forecasting is a critical task in decision-making across many domains. Historical numerical …

Categories: cs.AI, cs.LG, stat.ML
