"cs.LG" Category Archive

Demystifying the Paradox of Importance Sampling with an Estimated History-Dependent Behavior Policy in Off-Policy Evaluation

Posted: May 29, 2025 | Author: jarxiv

Abstract: This paper focuses on estimating the behavior policy for importance sampling …

Categories: cs.AI, cs.LG, stat.ML

Novelty Detection in Reinforcement Learning with World Models

Posted: May 29, 2025 | Author: jarxiv

Abstract: Reinforcement learning (RL) using world models has recently seen significant success. …

Categories: cs.AI, cs.LG, cs.SY, eess.SY

Evaluating Supervised Learning Models for Fraud Detection: A Comparative Study of Classical and Deep Architectures on Imbalanced Transaction Data

Posted: May 29, 2025 | Author: jarxiv

Abstract: Fraud detection is a critical task in high-stakes domains such as finance and e-commerce …

Categories: cs.AI, cs.LG

Training RL Agents for Multi-Objective Network Defense Tasks

Posted: May 29, 2025 | Author: jarxiv

Abstract: Emphasizing the training of agents that achieve broad rather than narrow capabilities, open …

Categories: cs.AI, cs.CR, cs.LG

TabularQGAN: A Quantum Generative Model for Tabular Data

Posted: May 29, 2025 | Author: jarxiv

Abstract: This paper introduces a new quantum generative model for synthesizing tabular data …

Categories: cs.AI, cs.LG, quant-ph

VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use

Posted: May 29, 2025 | Author: jarxiv

Abstract: With Reinforcement Learning Finetuning (RFT), long thinking, self-correction, and effective …

Categories: cs.AI, cs.LG

Self-Error-Instruct: Generalizing from Errors for LLMs Mathematical Reasoning

Posted: May 29, 2025 | Author: jarxiv

Abstract: Large language models have demonstrated strong performance across a variety of domains …

Categories: cs.AI, cs.CL, cs.LG

HDDLGym: A Tool for Studying Multi-Agent Hierarchical Problems Defined in HDDL with OpenAI Gym

Posted: May 29, 2025 | Author: jarxiv

Abstract: In recent years, using tools such as OpenAI Gym, Reinforcement L …

Categories: cs.AI, cs.LG, cs.MA

On the performance of machine-learning assisted Monte Carlo in sampling from simple statistical physics models

Posted: May 29, 2025 | Author: jarxiv

Abstract: In recent years, the simulation of hard-to-sample systems that cannot be studied with conventional methods …

Categories: cond-mat.dis-nn, cond-mat.stat-mech, cs.AI, cs.LG, physics.comp-ph

Machine Unlearning under Overparameterization

Posted: May 29, 2025 | Author: jarxiv

Abstract: Machine unlearning algorithms remove the influence of specific training samples …

Categories: cs.AI, cs.LG
