Category archive: cs.SD

Emphasizing Unseen Words: New Vocabulary Acquisition for End-to-End Speech Recognition

Posted on February 21, 2023 by jarxiv

Summary: Owing to the dynamic nature of human language, automatic speech recognition (ASR) systems … new words …

Categories: cs.CL, cs.SD, eess.AS

Knowledge-aware Bayesian Co-attention for Multimodal Emotion Recognition

Posted on February 21, 2023 by jarxiv

Summary: Multimodal emotion recognition, which fuses different modalities to predict human emotions, …

Categories: cs.CL, cs.SD, eess.AS

Towards Measuring and Scoring Speaker Diarization Fairness

Posted on February 21, 2023 by jarxiv

Summary: Speaker diarization, that is, the task of finding "who spoke when", …

Categories: cs.CL, cs.SD, eess.AS

A Sidecar Separator Can Convert a Single-Speaker Speech Recognition System to a Multi-Speaker One

Posted on February 21, 2023 by jarxiv

Summary: Automatic speech recognition (ASR) works well in typical non-overlapping environments, but …

Categories: cs.AI, cs.CL, cs.LG, cs.SD, eess.AS

JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition

Posted on February 20, 2023 by jarxiv

Summary: End-to-end (E2E) model and internal language model (ILM) joint …

Categories: cs.AI, cs.CL, cs.LG, cs.SD, eess.AS

Modular Hybrid Autoregressive Transducer

Posted on February 20, 2023 by jarxiv

Summary: In transducers, a clearly separated acoustic model (AM), language model (L …

Categories: cs.AI, cs.CL, cs.LG, cs.SD, eess.AS

Towards Building Text-To-Speech Systems for the Next Billion Users

Posted on February 20, 2023 by jarxiv

Summary: Deep-learning-based text-to-speech (TTS) systems …

Categories: cs.CL, cs.LG, cs.SD, eess.AS

Deep Implicit Distribution Alignment Networks for Cross-Corpus Speech Emotion Recognition

Posted on February 20, 2023 by jarxiv

Summary: In this paper, to address the cross-corpus speech emotion recognition (SER) problem, …

Categories: cs.CL, cs.SD, eess.AS

Handling the Alignment for Wake Word Detection: A Comparison Between Alignment-Based, Alignment-Free and Hybrid Approaches

Posted on February 20, 2023 by jarxiv

Summary: Wake word detection, in most intelligent homes and portable …

Categories: cs.CL, cs.SD, eess.AS

Hypernetworks build Implicit Neural Representations of Sounds

Posted on February 20, 2023 by jarxiv

Summary: Implicit neural representations (INRs) are currently, for image super-resolution, image compression, 3D …

Categories: cs.AI, cs.LG, cs.SD, eess.AS
