"eess.AS" Category Archive

Naturalistic Music Decoding from EEG Data via Latent Diffusion Models

Posted: July 4, 2024 | Author: jarxiv

Summary: In this paper, for the task of reconstructing naturalistic music from EEG recordings, powerful generative …

Categories: cs.LG, cs.SD, eess.AS

Naturalistic Music Decoding from EEG Data via Latent Diffusion Models

Posted: July 3, 2024 | Author: jarxiv

Summary: In this article, for the task of reconstructing natural music from electroencephalography (EEG) recordings, strong …

Categories: cs.LG, cs.SD, eess.AS

Open-Source Conversational AI with SpeechBrain 1.0

Posted: July 3, 2024 | Author: jarxiv

Summary: SpeechBrain is a PyTorch-based open-source conversational …

Categories: cs.AI, cs.CL, cs.HC, cs.LG, eess.AS

Robust Zero-Shot Text-to-Speech Synthesis with Reverse Inference Optimization

Posted: July 3, 2024 | Author: jarxiv

Summary: In this paper, using reinforcement learning from human feedback (RLHF), …

Categories: cs.CL, cs.SD, eess.AS

Towards Robust Speech Representation Learning for Thousands of Languages

Posted: July 3, 2024 | Author: jarxiv

Summary: Self-supervised learning (SSL) reduces the need for labeled data, and speech …

Categories: cs.AI, cs.CL, cs.SD, eess.AS

SlideAVSR: A Dataset of Paper Explanation Videos for Audio-Visual Speech Recognition

Posted: July 3, 2024 | Author: jarxiv

Summary: Audio-visual speech recognition (AVSR), automatic speech recognition (ASR) …

Categories: cs.CV, cs.MM, cs.SD, eess.AS

Is one brick enough to break the wall of spoken dialogue state tracking?

Posted: July 2, 2024 | Author: jarxiv

Summary: In task-oriented dialogue (TOD) systems, with respect to user requests, the system's …

Categories: cs.AI, cs.CL, eess.AS, eess.SP

Proceedings of The second international workshop on eXplainable AI for the Arts (XAIxArts)

Posted: July 2, 2024 | Author: jarxiv

Summary: Explainable AI for the Arts (XAIxArts …

Categories: cs.AI, cs.HC, cs.MM, cs.SD, eess.AS

Deep Active Audio Feature Learning in Resource-Constrained Environments

Posted: July 2, 2024 | Author: jarxiv

Summary: Because labeled data is scarce, in bioacoustic applications, deep …

Categories: cs.CV, cs.SD, eess.AS

Training-Free Deepfake Voice Recognition by Leveraging Large-Scale Pre-Trained Models

Posted: July 2, 2024 | Author: jarxiv

Summary: Generalization is a main problem for current audio deepfake detectors, and out-of-distribution …

Categories: cs.CV, cs.SD, eess.AS
