"cs.CL" Category Archive

DualToken: Towards Unifying Visual Understanding and Generation with Dual Visual Vocabularies

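Posted: March 20, 2025 | Author: jarxiv

Abstract: The different representation spaces required for visual understanding and generation, in the autoregressive … of large language models … Continue reading →

Categories: cs.CL, cs.CV | Comments closed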

MoonCast: High-Quality Zero-Shot Podcast Generation

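Posted: March 20, 2025 | Author: jarxiv

Abstract: Recent advances in text-to-speech synthesis produce high-quality short utterances for individual speakers … Continue reading →

Categories: cs.AI, cs.CL, cs.LG, cs.SD, eess.AS | Comments closed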

AdaST: Dynamically Adapting Encoder States in the Decoder for End-to-End Speech-to-Text Translation

Posted: March 19, 2025 | Author: jarxiv

Abstract: In end-to-end speech translation, the acoustic representations learned by the encoder are usually … Continue reading →

Categories: cs.CL, cs.SD, eess.AS | Comments closed

Towards Harmless Multimodal Assistants with Blind Preference Optimization

Posted: March 19, 2025 | Author: jarxiv

Abstract: Multimodal large language models (MLLMs), in multimodal understanding, reasoning, and … Continue reading →

Categories: cs.CL, cs.CV | Comments closed

Implicit Reasoning in Transformers is Reasoning through Shortcuts

Posted: March 19, 2025 | Author: jarxiv

Abstract: As shown by the success of OpenAI's o1 and o3 and of DeepSeek's R1 … Continue reading →

Categories: cs.CL | Comments closed

Benchmarking Failures in Tool-Augmented Language Models

Posted: March 19, 2025 | Author: jarxiv

Abstract: With tool integration, the capabilities of language models (LMs), beyond vanilla text generation, … Continue reading →

Categories: cs.CL, cs.SE | Comments closed

Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues

Posted: March 19, 2025 | Author: jarxiv

Abstract: Mamba, RWKV, GLA, mLSTM, DeltaNet, and other linear recurrent … Continue reading →

Categories: cs.CL, cs.FL, cs.LG | Comments closed

Zero-Shot Action Recognition in Surveillance Videos

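Posted: March 19, 2025 | Author: jarxiv

Abstract: The growing demand for surveillance in public spaces is, owing to a shortage of human resources, a major challenge … Continue reading →

Categories: cs.CL, cs.CV | Comments closed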

JuDGE: Benchmarking Judgment Document Generation for Chinese Legal System

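Posted: March 19, 2025 | Author: jarxiv

Abstract: In this paper, to evaluate the performance of judgment document generation in the Chinese legal system … Continue reading →

Categories: cs.AI, cs.CL, cs.IR | Comments closed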

The Problem of Coherence in Natural Language Explanations of Recommendations

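Posted: March 19, 2025 | Author: jarxiv

Abstract: Providing natural language explanations for recommendations is particularly useful from the perspective of non-expert users … Continue reading →

Categories: cs.AI, cs.CL, cs.IR, cs.LG | Comments closed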
