"cs.AI" Category Archive

Adaptive Reinforcement Learning for Unobservable Random Delays

Posted on June 18, 2025 by jarxiv

Summary: In the standard reinforcement learning (RL) setting, the interaction between the agent and the environment is typically …

Categories: cs.AI, cs.LG, cs.RO

AssistantX: An LLM-Powered Proactive Assistant in Collaborative Human-Populated Environment

Posted on June 18, 2025 by jarxiv

Summary: In current service robots, limited natural language communication abilities, predefined …

Categories: cs.AI, cs.MA, cs.RO

AlphaDecay: Module-wise Weight Decay for Heavy-Tailed Balancing in LLMs

Posted on June 18, 2025 by jarxiv

Summary: For training large language models (LLMs), weight decay is a standard …

Categories: cs.AI, cs.CL, cs.LG

TGDPO: Harnessing Token-Level Reward Guidance for Enhancing Direct Preference Optimization

Posted on June 18, 2025 by jarxiv

Summary: Owing to recent advances in reinforcement learning from human feedback, fine-grained …

Categories: cs.AI, cs.CL, cs.LG

PredictaBoard: Benchmarking LLM Score Predictability

Posted on June 18, 2025 by jarxiv

Summary: Despite their impressive skills, large language models (LLMs) …

Categories: cs.AI, cs.CL, stat.ML

Object-Centric Neuro-Argumentative Learning

Posted on June 18, 2025 by jarxiv

Summary: Over the past decade, as we have come to rely more on deep learning techniques to make critical decisions, …

Categories: cs.AI, cs.LG

GenerationPrograms: Fine-grained Attribution with Executable Programs

Posted on June 18, 2025 by jarxiv

Summary: Recent large language models (LLMs), in source-conditioned …

Categories: cs.AI, cs.CL

From tools to thieves: Measuring and understanding public perceptions of AI through crowdsourced metaphors

Posted on June 18, 2025 by jarxiv

Summary: How has the increasing prevalence of artificial intelligence (AI)-based technologies been responded to …

Categories: cs.AI, cs.CL, cs.CY, cs.HC

No-Regret Learning Under Adversarial Resource Constraints: A Spending Plan Is All You Need!

Posted on June 18, 2025 by jarxiv

Summary: We study online decision-making problems under resource constraints. …

Categories: cs.AI, cs.LG, stat.ML

Vul-RAG: Enhancing LLM-based Vulnerability Detection via Knowledge-level RAG

Posted on June 18, 2025 by jarxiv

Summary: Although LLMs have shown promising potential in vulnerability detection, this study …

Categories: cs.AI, cs.SE
