Monthly Archives: February 2024

Universal Jailbreak Backdoors from Poisoned Human Feedback

Posted on February 8, 2024 by jarxiv

Abstract: Reinforcement learning from human feedback (RLHF), large language models …

Categories: cs.AI, cs.CL, cs.CR, cs.LG

PaDeLLM-NER: Parallel Decoding in Large Language Models for Named Entity Recognition

Posted on February 8, 2024 by jarxiv

Abstract: In this study, using large language models (LLMs), named entity recognition (NE …

Categories: cs.AI, cs.CL

Explaining Learned Reward Functions with Counterfactual Trajectories

Posted on February 8, 2024 by jarxiv

Abstract: Learning rewards from human behavior or feedback, AI systems to human …

Categories: cs.AI, cs.LG

CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay

Posted on February 8, 2024 by jarxiv

Abstract: Large language models, on tasks generally thought to require human-level reasoning ability, …

Categories: cs.AI, cs.CL, cs.LG

Learning by Doing: An Online Causal Reinforcement Learning Framework with Causal-Aware Policy

Posted on February 8, 2024 by jarxiv

Abstract: As a key component of intuitive cognition and reasoning solutions in human intelligence, …

Categories: cs.AI, cs.LG

DS-MS-TCN: Otago Exercises Recognition with a Dual-Scale Multi-Stage Temporal Convolutional Network

Posted on February 8, 2024 by jarxiv

Abstract: The Otago Exercise Program (OEP), aimed at improving balance and muscle strength, …

Categories: cs.AI, cs.LG, eess.SP

A Unified Framework for Probabilistic Verification of AI Systems via Weighted Model Integration

Posted on February 8, 2024 by jarxiv

Abstract: Probabilistic formal verification (PFV) of AI systems is still in its infancy. …

Categories: cs.AI, cs.LG

The Strain of Success: A Predictive Model for Injury Risk Mitigation and Team Success in Soccer

Posted on February 8, 2024 by jarxiv

Abstract: This paper introduces a novel sequential team selection model for soccer. …

Categories: cs.AI, cs.LG

Imitation Learning from Observation with Automatic Discount Scheduling

Posted on February 8, 2024 by jarxiv

Abstract: Humans often acquire new skills through observation and imitation. Robot …

Categories: cs.AI, cs.LG, cs.RO

Prompting Implicit Discourse Relation Annotation

Posted on February 8, 2024 by jarxiv

Abstract: Pre-trained large language models such as ChatGPT, supervised training …

Categories: cs.AI, cs.CL
