Category archive: eess.AS

DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding

Posted: June 14, 2024 | Author: jarxiv

Abstract: Pre-trained text-based large language models (LLMs) and speech …

Categories: cs.CL, cs.SD, eess.AS

Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos

Posted: June 14, 2024 | Author: jarxiv

Abstract: Generating realistic audio for human interactions, for films and virtual reality games …

Categories: cs.AI, cs.CV, cs.SD, eess.AS

PianoMotion10M: Dataset and Benchmark for Hand Motion Generation in Piano Performance

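Posted: June 14, 2024 | Author: jarxiv

Abstract: Artificial intelligence techniques for education have drawn growing attention in recent years, but effective musical instrument instruction …

Categories: cs.AI, cs.CV, cs.MM, cs.SD, eess.AS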

Broadband MEMS Microphone Arrays with Reduced Aperture Through 3D-Printed Waveguides

Posted: June 13, 2024 | Author: jarxiv

Abstract: In this paper, when using beamforming techniques, ultrasonic MEMS mi …

Categories: cs.RO, cs.SD, cs.SY, eess.AS, eess.SY

SVSNet+: Enhancing Speaker Voice Similarity Assessment Models with Representations from Speech Foundation Models

Posted: June 13, 2024 | Author: jarxiv

Abstract: Representations from pre-trained speech foundation models (SFMs), in many downstream …

Categories: cs.LG, cs.SD, eess.AS

tinyCLAP: Distilling Constrastive Language-Audio Pretrained Models

Posted: June 13, 2024 | Author: jarxiv

Abstract: Contrastive Language-Audio Pretrainin …

Categories: cs.CL, cs.LG, cs.SD, eess.AS

Transformer-based Model for ASR N-Best Rescoring and Rewriting

Posted: June 13, 2024 | Author: jarxiv

Abstract: To ensure speed and privacy, voice assistants use on-device …

Categories: cs.CL, cs.LG, cs.SD, eess.AS

The VoicePrivacy 2024 Challenge Evaluation Plan

Posted: June 13, 2024 | Author: jarxiv

Abstract: The task of the challenge is, while protecting linguistic content and emotional states, the speaker's voice identi …

Categories: cs.CL, cs.CR, eess.AS

Speech Emotion Recognition with ASR Transcripts: A Comprehensive Study on Word Error Rate and Fusion Techniques

Posted: June 13, 2024 | Author: jarxiv

Abstract: Text data is commonly, for speech emotion recognition (SER), performance and reliab …

Categories: cs.CL, cs.MM, cs.SD, eess.AS

Towards Unsupervised Speech Recognition Without Pronunciation Models

Posted: June 13, 2024 | Author: jarxiv

Abstract: Recent advances in supervised automatic speech recognition (ASR) are mainly, large-scale transcribed …

Categories: cs.CL, cs.SD, eess.AS
