Category archive: eess.AS

AVSegFormer: Audio-Visual Segmentation with Transformer

Posted on July 5, 2023 by jarxiv

Abstract: In the multimodal community, the combination of audio and vision has long …

Categories: cs.CV, cs.LG, cs.SD, eess.AS

Language-agnostic Code-Switching in Sequence-To-Sequence Speech Recognition

Posted on July 4, 2023 by jarxiv

Abstract: Code-switching (CS) is the alternating use of words and phrases from different languages …

Categories: cs.CL, cs.SD, eess.AS

Beyond Neural-on-Neural Approaches to Speaker Gender Protection

Posted on July 3, 2023 by jarxiv

Abstract: Recent research has proposed approaches that modify speech to defend against gender inference attacks …

Categories: cs.CL, cs.CR, cs.LG, cs.SD, eess.AS

Towards Improving the Performance of Pre-Trained Speech Models for Low-Resource Languages Through Lateral Inhibition

Posted on July 3, 2023 by jarxiv

Abstract: In natural language processing, Transformer-based bidirectional encoder …

Categories: cs.CL, cs.SD, eess.AS

Empirical Interpretation of the Relationship Between Speech Acoustic Context and Emotion Recognition

Posted on July 3, 2023 by jarxiv

Abstract: Speech emotion recognition (SER) obtains emotional intelligence and, for the contextual meaning of speech, …

Categories: cs.AI, cs.LG, cs.SD, eess.AS

Automatic Speech Recognition of Non-Native Child Speech for Language Learning Applications

Posted on June 30, 2023 by jarxiv

Abstract: Voicebots, particularly in the context of second-language learning, support the development of language skills …

Categories: cs.CL, cs.SD, eess.AS

MooseNet: A Trainable Metric for Synthesized Speech with a PLDA Module

Posted on June 30, 2023 by jarxiv

Abstract: Our trainable speech metric, which predicts the listeners' mean opinion score (MOS), …

Categories: cs.CL, cs.SD, eess.AS

Leveraging Cross-Utterance Context For ASR Decoding

Posted on June 30, 2023 by jarxiv

Abstract: External language models (LMs) are incorporated into the decoding stage of automatic speech recognition systems …

Categories: cs.CL, cs.SD, eess.AS

Predicting Music Hierarchies with a Graph-Based Neural Decoder

Posted on June 30, 2023 by jarxiv

Abstract: In this paper, for parsing music sequences into dependency trees, a data-driven …

Categories: cs.CL, cs.SD, eess.AS

High-Quality Automatic Voice Over with Accurate Alignment: Supervision through Self-Supervised Discrete Speech Units

Posted on June 30, 2023 by jarxiv

Abstract: The goal of automatic voice-over (AVO) is, for a given text script, …

Categories: cs.CL, cs.SD, eess.AS
