"cs.LG" Category Archive

Latent Action Pretraining from Videos

Posted on May 16, 2025 by jarxiv

Abstract: Latent action … for General Action Models (LAPA) … Continue reading →

Categories: cs.CL, cs.CV, cs.LG, cs.RO | Comments off

ComplexFormer: Disruptively Advancing Transformer Inference Ability via Head-Specific Complex Vector Attention

Posted on May 16, 2025 by jarxiv

Abstract: Transformer models rely on self-attention to capture token dependencies … Continue reading →

Categories: cs.CL, cs.LG | Comments off

TensorLLM: Tensorising Multi-Head Attention for Enhanced Reasoning and Compression in LLMs

Posted on May 16, 2025 by jarxiv

Abstract: The reasoning ability of large language models (LLMs) can be improved by structurally removing weights … Continue reading →

Categories: cs.CL, cs.LG | Comments off

J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning

Posted on May 16, 2025 by jarxiv

Abstract: Progress in AI is bottlenecked by the quality of evaluation, and powerful LLM-as-a … Continue reading →

Categories: cs.AI, cs.CL, cs.LG | Comments off

Simple and Provable Scaling Laws for the Test-Time Compute of Large Language Models

Posted on May 16, 2025 by jarxiv

Abstract: For the test-time compute of large language models (LLMs), provable scaling … Continue reading →

Categories: cs.AI, cs.CL, cs.LG | Comments off

FitCF: A Framework for Automatic Feature Importance-guided Counterfactual Example Generation

Posted on May 16, 2025 by jarxiv

Abstract: Counterfactual examples, both as valuable data for improving models and as … of model … Continue reading →

Categories: cs.CL, cs.LG | Comments off

Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information

Posted on May 16, 2025 by jarxiv

Abstract: As large language models (LLMs) become widely accessible, real-world … Continue reading →

Categories: 68T50, cs.CL, cs.LG | Comments off

Parallel Scaling Law for Language Models

Posted on May 16, 2025 by jarxiv

Abstract: Parameters (parameter scaling) or output tokens (inference-time scal … Continue reading →

Categories: cs.CL, cs.LG | Comments off

SceneGenAgent: Precise Industrial Scene Generation with Coding Agent

Posted on May 16, 2025 by jarxiv

Abstract: Modeling industrial scenes is essential for simulating industrial manufacturing. Large … Continue reading →

Categories: cs.CL, cs.LG, cs.SE | Comments off

RouteNator: A Router-Based Multi-Modal Architecture for Generating Synthetic Training Data for Function Calling LLMs

Posted on May 16, 2025 by jarxiv

Abstract: In this paper, when real user interaction data is unavailable, … Continue reading →

Categories: cs.CL, cs.LG | Comments off
