Monthly Archives: June 2024

Separations in the Representational Capabilities of Transformers and Recurrent Architectures

Posted on: June 14, 2024 | Author: jarxiv

Abstract: Transformer architectures have been widely adopted in foundation models. Due to their high inference costs, …

Categories: cs.LG, stat.ML

On the Expressibility of the Reconstructional Color Refinement

Posted on: June 14, 2024 | Author: jarxiv

Abstract: One of the most basic facts related to the famous Ulam reconstruction conjecture is that the connectedness of a graph …

Categories: cs.CC, cs.DM, cs.LG, math.CO

Advancing Graph Generation through Beta Diffusion

Posted on: June 14, 2024 | Author: jarxiv

Abstract: Diffusion models have been shown to be effective at generating natural images, and graphs and other …

Categories: cs.LG, stat.ML

Understanding Hallucinations in Diffusion Models through Mode Interpolation

Posted on: June 14, 2024 | Author: jarxiv

Abstract: Colloquially speaking, image generation models based on diffusion processes … training data …

Categories: cs.LG

Unichain and Aperiodicity are Sufficient for Asymptotic Optimality of Average-Reward Restless Bandits

Posted on: June 14, 2024 | Author: jarxiv

Abstract: We consider the infinite-horizon, average-reward restless bandit problem in discrete time. …

Categories: 90C40, cs.LG, G.3, math.OC, math.PR

Data-dependent and Oracle Bounds on Forgetting in Continual Learning

Posted on: June 14, 2024 | Author: jarxiv

Abstract: In continual learning, knowledge is preserved and reused across tasks, with good … to future tasks …

Categories: cs.LG

Efficient Discrepancy Testing for Learning with Distribution Shift

Posted on: June 14, 2024 | Author: jarxiv

Abstract: A fundamental notion of distance between train and test distributions from the field of domain adaptation …

Categories: cs.DS, cs.LG

Learning conditional distributions on continuous spaces

Posted on: June 14, 2024 | Author: jarxiv

Abstract: Allowing for different dimensions of the feature and target spaces, we … multidimensional unit …

Categories: cs.LG, math.ST, stat.ML, stat.TH

Why Warmup the Learning Rate? Underlying Mechanisms and Improvements

Posted on: June 14, 2024 | Author: jarxiv

Abstract: In deep learning, it is common to warm up the learning rate $\eta$, …

Categories: cond-mat.dis-nn, cs.LG, stat.ML

SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models

Posted on: June 14, 2024 | Author: jarxiv

Abstract: Because the use of large language models (LLMs) in scientific research is growing rapidly, …

Categories: cs.CL
