Monthly Archives: February 2025

What is the Relationship between Tensor Factorizations and Circuits (and How Can We Exploit it)?

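Posted on: February 4, 2025 | Author: jarxiv

Summary: In this paper, circuit representations and tensor factorizations, two seemingly different yet fundamentally related … Continue reading →

Categories: cs.LG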

How Do the Architecture and Optimizer Affect Representation Learning? On the Training Dynamics of Representations in Deep Neural Networks

Posted on: February 4, 2025 | Author: jarxiv

Summary: In this paper, how the representations of deep neural networks (DNNs) during training … Continue reading →

Categories: cs.LG

Mind the Gap: a Spectral Analysis of Rank Collapse and Signal Propagation in Attention Layers

Posted on: February 4, 2025 | Author: jarxiv

Summary: Attention layers, in what is the current state-of-the-art neural network architecture, … Continue reading →

Categories: cs.LG, stat.ML

Can sparse autoencoders make sense of latent representations?

Posted on: February 4, 2025 | Author: jarxiv

Summary: Sparse autoencoders (SAEs) have recently, in large language models, interpretable … Continue reading →

Categories: cs.LG

E2Former: A Linear-time Efficient and Equivariant Transformer for Scalable Molecular Modeling

Posted on: February 4, 2025 | Author: jarxiv

Summary: Equivariant graph neural networks (EGNNs), in chemistry, biology, materials science, … Continue reading →

Categories: cs.LG

CodeMonkeys: Scaling Test-Time Compute for Software Engineering

Posted on: February 4, 2025 | Author: jarxiv

Summary: Scaling test-time compute is a promising axis for improving LLM capabilities. … Continue reading →

Categories: cs.LG

Unmasking Conversational Bias in AI Multiagent Systems

Posted on: February 4, 2025 | Author: jarxiv

Summary: Detecting biases in the outputs produced by generative models, in critical … Continue reading →

Categories: cs.AI, cs.CL, cs.MA

Large Language Models as Markov Chains

Posted on: February 4, 2025 | Author: jarxiv

Summary: Large language models (LLMs), across a wide range of natural language processing tasks and … Continue reading →

Categories: cs.AI, cs.CL, cs.LG, stat.ML

WikiHint: A Human-Annotated Dataset for Hint Ranking and Generation

Posted on: February 4, 2025 | Author: jarxiv

Summary: As users increasingly turn to chatbots with their questions, large language models (LL … Continue reading →

Categories: cs.CL, cs.IR

IncogniText: Privacy-enhancing Conditional Text Anonymization via LLM-based Private Attribute Randomization

Posted on: February 4, 2025 | Author: jarxiv

Summary: This work addresses the problem of text anonymization. Its goal is, regarding the utility of the text, … Continue reading →

Categories: cs.AI, cs.CL, cs.CR, cs.LG
