"cs.SD" Category Archive

Improving the Inclusivity of Dutch Speech Recognition by Fine-tuning Whisper on the JASMIN-CGN Corpus

Posted: February 25, 2025 | Author: jarxiv

Summary: For the speech of children, older adults, and non-native Dutch speakers in the JASMIN-CGN corpus …

Categories: cs.CL, cs.SD, eess.AS

ML-SUPERB: Multilingual Speech Universal PERformance Benchmark

Posted: February 25, 2025 | Author: jarxiv

Summary: The Speech processing Universal PERformance Benchmark (S …

Categories: cs.CL, cs.SD, eess.AS

Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond

Posted: February 25, 2025 | Author: jarxiv

Summary: The 2023 Multilingual Speech Universal Performance Benchmark (ML-S …

Categories: cs.CL, cs.SD, eess.AS

Zero-shot Musical Stem Retrieval with Joint-Embedding Predictive Architectures

Posted: February 25, 2025 | Author: jarxiv

Summary: This paper tackles the task of musical stem retrieval. …

Categories: cs.AI, cs.SD, eess.AS

Low-Rank and Sparse Model Merging for Multi-Lingual Speech Recognition and Translation

Posted: February 25, 2025 | Author: jarxiv

Summary: For speech-to-text (S2T) tasks such as automatic speech recognition and translation, language diversity …

Categories: cs.AI, cs.CL, cs.SD, eess.AS

Reverb: Open-Source ASR and Diarization from Rev

Posted: February 24, 2025 | Author: jarxiv

Summary: Today, we are open-sourcing our core speech recognition and diarization models for non-commercial use …

Categories: cs.CL, cs.SD, eess.AS

Everyday Speech in the Indian Subcontinent

Posted: February 24, 2025 | Author: jarxiv

Summary: India has 1369 languages, of which 22 are official. …

Categories: cs.CL, cs.SD, eess.AS, I.2.7

KAD: No More FAD! An Effective and Efficient Evaluation Metric for Audio Generation

Posted: February 24, 2025 | Author: jarxiv

Summary: Although widely adopted for evaluating generated audio signals, the Fréchet …

Categories: cs.AI, cs.LG, cs.SD, eess.AS

WavRAG: Audio-Integrated Retrieval Augmented Generation for Spoken Dialogue Models

Posted: February 21, 2025 | Author: jarxiv

Summary: Retrieval-augmented generation (RAG) can integrate external knowledge into large language models (LLMs) …

Categories: cs.AI, cs.SD, eess.AS

Rethinking Audio-Visual Adversarial Vulnerability from Temporal and Modality Perspectives

Posted: February 20, 2025 | Author: jarxiv

Summary: By leveraging multiple sensory modalities, audio-visual learning gives a richer … of the real world

Categories: cs.CV, cs.SD
