"eess.AS" Category Archive

Detecting the Severity of Major Depressive Disorder from Speech: A Novel HARD-Training Methodology

Posted: May 26, 2023 | Author: jarxiv

Abstract: Major depressive disorder (MDD) is a globally common, socioeconomically costly …

Categories: cs.LG, cs.SD, eess.AS, q-bio.QM

ASR and Emotional Speech: A Word-Level Investigation of the Mutual Impact of Speech and Emotion Recognition

Posted: May 26, 2023 | Author: jarxiv

Abstract: In speech emotion recognition (SER), to cope with the inherent variability of speech signals, …

Categories: cs.CL, cs.SD, eess.AS

VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation

Posted: May 26, 2023 | Author: jarxiv

Abstract: In recent research, across a variety of tasks in a variety of modalities, model …

Categories: cs.CL, cs.SD, eess.AS

End-to-End Simultaneous Speech Translation with Differentiable Segmentation

Posted: May 26, 2023 | Author: jarxiv

Abstract: In end-to-end simultaneous speech translation (SimulST), streaming speech …

Categories: cs.AI, cs.CL, cs.SD, eess.AS

Lattice-Free Sequence Discriminative Training for Phoneme-Based Neural Transducers

Posted: May 26, 2023 | Author: jarxiv

Abstract: Recently, on a variety of automatic speech recognition tasks, RNN transducers have shown remarkable …

Categories: cs.AI, cs.CL, cs.LG, cs.SD, eess.AS

Unified Modeling of Multi-Talker Overlapped Speech Recognition and Diarization with a Sidecar Separator

Posted: May 26, 2023 | Author: jarxiv

Abstract: Overlapped speech from multiple speakers poses significant challenges for speech recognition and diarization. …

Categories: cs.AI, cs.CL, cs.LG, cs.SD, eess.AS

LMs with a Voice: Spoken Language Modeling beyond Speech Tokens

Posted: May 25, 2023 | Author: jarxiv

Abstract: We adapt a pretrained language model (LM) to perform speech continuation …

Categories: cs.CL, cs.LG, cs.SD, eess.AS

AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment

Posted: May 25, 2023 | Author: jarxiv

Abstract: In the speech-to-singing (STS) voice conversion task, speech recordings …

Categories: cs.CL, cs.MM, cs.SD, eess.AS

Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR

Posted: May 25, 2023 | Author: jarxiv

Abstract: To make new LLM-based use cases available to people around the world, …

Categories: cs.CL, cs.SD, eess.AS

AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation

Posted: May 25, 2023 | Author: jarxiv

Abstract: Direct speech-to-speech translation (S2ST), converting speech from one language into another, …

Categories: cs.CL, cs.SD, eess.AS
