"cs.LG" Category Archive

Finite State Automata Inside Transformers with Chain-of-Thought: A Mechanistic Study on State Tracking

Posted on February 28, 2025 by jarxiv

Abstract: Chain-of-thought (CoT), across a wide range of tasks, large language models (L … Continue reading →

Categories: cs.CL, cs.LG | Comments Off

Similarity-Distance-Magnitude Universal Verification

Posted on February 28, 2025 by jarxiv

Abstract: By adding similarity (i.e., correctly predicted depth-matches into training) … Continue reading →

Categories: cs.CL, cs.LG | Comments Off

Re-evaluating Open-ended Evaluation of Large Language Models

Posted on February 28, 2025 by jarxiv

Abstract: Evaluation has traditionally focused on ranking candidates for a specific skill. … Continue reading →

Categories: cs.CL, cs.GT, cs.LG, stat.ML | Comments Off

The Impact of Unstated Norms in Bias Analysis of Language Models

Posted on February 28, 2025 by jarxiv

Abstract: Biases in large language models (LLMs), from overt discrimination to implicit stereoty … Continue reading →

Categories: 68T50, cs.CL, cs.CY, cs.LG | Comments Off

Improving Neuron-level Interpretability with White-box Language Models

Posted on February 28, 2025 by jarxiv

Abstract: For neurons in autoregressive language models such as GPT-2, their activation patterns … Continue reading →

Categories: cs.CL, cs.LG | Comments Off

Layer-Aware Task Arithmetic: Disentangling Task-Specific and Instruction-Following Knowledge

Posted on February 28, 2025 by jarxiv

Abstract: Large language models (LLMs) exhibit strong task-specific capabilities through fine-tuning … Continue reading →

Categories: cs.CL, cs.LG | Comments Off

Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis

Posted on February 28, 2025 by jarxiv

Abstract: With recent advances in Web AI agents, complex web navigation tas … Continue reading →

Categories: cs.CL, cs.LG | Comments Off

Remove Symmetries to Control Model Expressivity and Improve Optimization

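Posted on February 28, 2025 by jarxiv

Abstract: When symmetry is present in the loss function, the model, in what is sometimes known as "collapse," a low … Continue reading →

Categories: cs.AI, cs.LG, stat.ML | Comments Off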

Teasing Apart Architecture and Initial Weights as Sources of Inductive Bias in Neural Networks

Posted on February 28, 2025 by jarxiv

Abstract: Artificial neural networks can acquire many aspects of human knowledge from data … Continue reading →

Categories: cs.AI, cs.LG | Comments Off

A Polynomial-Time Approximation for Pairwise Fair $k$-Median Clustering

Posted on February 28, 2025 by jarxiv

Abstract: In this work, with $\ell \ge 2$ groups, pairwise … Continue reading →

Categories: cs.AI, cs.DS, cs.LG | Comments Off
