Monthly Archives: June 2024

When Parts are Greater Than Sums: Individual LLM Components Can Outperform Full Models

Posted: June 25, 2024 | Author: jarxiv

Abstract: In this paper, the output of large language models, attention heads and MLPs ( …

Categories: cs.CL

Venturing into Uncharted Waters: The Navigation Compass from Transformer to Mamba

Posted: June 25, 2024 | Author: jarxiv

Abstract: The deep neural network architecture Transfor …

Categories: cs.CL

Children’s Speech Recognition through Discrete Token Enhancement

Posted: June 25, 2024 | Author: jarxiv

Abstract: Children's speech recognition, mainly because publicly available data is scarce, resources …

Categories: cs.CL, cs.SD, eess.AS

CLIMATELI: Evaluating Entity Linking on Climate Change Data

Posted: June 25, 2024 | Author: jarxiv

Abstract: Climate change (CC) is a pressing topic of global importance, and from the social sciences to …

Categories: cs.CL

Adversarial Contrastive Decoding: Boosting Safety Alignment of Large Language Models via Opposite Prompt Optimization

Posted: June 25, 2024 | Author: jarxiv

Abstract: As large language models (LLMs) have come to be widely applied, their safety …

Categories: cs.CL

Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers

Posted: June 25, 2024 | Author: jarxiv

Abstract: Autoregressive Transformers, especially within extended context windows, long sequences …

Categories: cs.CL, cs.LG

OCALM: Object-Centric Assessment with Language Models

Posted: June 25, 2024 | Author: jarxiv

Abstract: Properly defining reward signals to efficiently train reinforcement learning (RL) agents …

Categories: cs.CL, cs.LG

Towards Zero-Shot Text-To-Speech for Arabic Dialects

Posted: June 25, 2024 | Author: jarxiv

Abstract: Zero-shot multi-speaker text-to-speech (ZS-TTS) systems …

Categories: cs.CL, cs.SD, eess.AS

Can Many-Shot In-Context Learning Help Long-Context LLM Judges? See More, Judge Better!

Posted: June 25, 2024 | Author: jarxiv

Abstract: As judges for evaluating the performance of large language models (LLMs) …

Categories: cs.CL

Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters

Posted: June 25, 2024 | Author: jarxiv

Abstract: Large language models (LLMs) have revolutionized natural language processing, and their range of applications …

Categories: cs.CL
