"cs.CL" Category Archive

Loss-to-Loss Prediction: Scaling Laws for All Datasets

Posted: November 21, 2024 | Author: jarxiv

Abstract: Scaling laws predict training loss across compute scales for a single data distribution …

Categories: cs.AI, cs.CL, cs.LG, stat.ML

Literature Meets Data: A Synergistic Approach to Hypothesis Generation

Posted: November 21, 2024 | Author: jarxiv

Abstract: AI has the potential to transform scientific processes, including hypothesis generation. Hypothesis …

Categories: cs.AI, cs.CL, cs.CY, cs.LG

A Flexible Large Language Models Guardrail Development Methodology Applied to Off-Topic Prompt Detection

Posted: November 21, 2024 | Author: jarxiv

Abstract: Large language models are prone to off-topic misuse, where users of these models …

Categories: 68T50, cs.CL, cs.LG, I.2.7

Demystifying Large Language Models for Medicine: A Primer

Posted: November 21, 2024 | Author: jarxiv

Abstract: Large language models (LLMs), across a wide range of contexts, human-like …

Categories: cs.AI, cs.CL

Keep the Cost Down: A Review on Methods to Optimize LLM's KV-Cache Consumption

Posted: November 21, 2024 | Author: jarxiv

Abstract: Large language models, epitomized by the release of ChatGPT in late 2022, …

Categories: cs.CL

Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models

Posted: November 21, 2024 | Author: jarxiv

Abstract: Large language models (LLMs) have advanced rapidly and demonstrated impressive capabilities. …

Categories: cs.CL

MindForge: Empowering Embodied Agents with Theory of Mind for Lifelong Collaborative Learning

Posted: November 21, 2024 | Author: jarxiv

Abstract: Contemporary embodied agents, such as Voyager in Minecraft, …

Categories: cs.AI, cs.CL

Does Unlearning Truly Unlearn? A Black Box Evaluation of LLM Unlearning Methods

Posted: November 21, 2024 | Author: jarxiv

Abstract: Large language model unlearning removes harmful information that LLMs have learned, so that malicious …

Categories: cs.CL, cs.LG

Training Bilingual LMs with Data Constraints in the Targeted Language

Posted: November 21, 2024 | Author: jarxiv

Abstract: Large language models, as demanded by current scaling laws, … large-scale Web …

Categories: cs.CL, cs.LG

MemoryFormer: Minimize Transformer Computation by Removing Fully-Connected Layers

Posted: November 21, 2024 | Author: jarxiv

Abstract: To reduce the computational complexity of large language models, linear attention and …

Categories: cs.CL
