Category archive: eess.AS

Investigating the Sensitivity of Automatic Speech Recognition Systems to Phonetic Variation in L2 Englishes

Posted: May 15, 2023 | Author: jarxiv

Abstract: On speech similar to the speech they were trained on, automatic speech recognition (ASR) systems achieve the best …

Categories: cs.CL, cs.SD, eess.AS

Improving Cascaded Unsupervised Speech Translation with Denoising Back-translation

Posted: May 15, 2023 | Author: jarxiv

Abstract: Most speech translation models rely heavily on parallel data, and especially for low-resource languages …

Categories: cs.CL, cs.SD, eess.AS

Streaming Joint Speech Recognition and Disfluency Detection

Posted: May 12, 2023 | Author: jarxiv

Abstract: Disfluency detection has mainly been solved with a pipeline approach, as post-processing of speech recognition …

Categories: cs.CL, cs.SD, eess.AS

The Pipeline System of ASR and NLU with MLM-based Data Augmentation toward STOP Low-resource Challenge

Posted: May 12, 2023 | Author: jarxiv

Abstract: In this paper, ICASSP Signal Processing Grand …

Categories: cs.CL, cs.SD, eess.AS

Speaker Diaphragm Excursion Prediction: deep attention and online adaptation

Posted: May 12, 2023 | Author: jarxiv

Abstract: Speaker protection algorithms exploit the characteristics of the playback signal, especially with small loudspeakers …

Categories: cs.AI, cs.IT, cs.SD, eess.AS, math.IT

Knowledge Transfer For On-Device Speech Emotion Recognition with Neural Structured Learning

Posted: May 12, 2023 | Author: jarxiv

Abstract: Speech emotion recognition (SER), human-computer interaction (HCI) …

Categories: cs.AI, cs.SD, eess.AS

CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model

Posted: May 12, 2023 | Author: jarxiv

Abstract: Denoising diffusion probabilistic models (DDPMs) have shown promising performance in speech synthesis …

Categories: cs.AI, cs.CL, cs.LG, cs.MM, cs.SD, eess.AS

V2Meow: Meowing to the Visual Beat via Music Generation

Posted: May 12, 2023 | Author: jarxiv

Abstract: …

Categories: cs.CV, cs.LG, cs.MM, cs.SD, eess.AS

Speech Driven Video Editing via an Audio-Conditioned Diffusion Model

Posted: May 12, 2023 | Author: jarxiv

Abstract: …

Categories: cs.CV, cs.LG, cs.SD, eess.AS

Modelling black-box audio effects with time-varying feature modulation

Posted: May 11, 2023 | Author: jarxiv

Abstract: …

Categories: cs.AI, cs.LG, cs.SD, eess.AS
