"eess.AS" Category Archive

Fill in the Gap! Combining Self-supervised Representation Learning with Neural Audio Synthesis for Speech Inpainting

Posted: May 31, 2024 | Author: jarxiv

Abstract: In most speech self-supervised learning (SSL) models, missing parts of the input signal …

Categories: cs.CL, cs.SD, eess.AS

Iterative Feature Boosting for Explainable Speech Emotion Recognition

Posted: May 31, 2024 | Author: jarxiv

Abstract: In speech emotion recognition (SER), features predefined without considering their actual importance …

Categories: cs.AI, cs.CL, cs.LG, cs.SD, eess.AS, I.2.1

RapVerse: Coherent Vocals and Whole-Body Motions Generations from Text

Posted: May 31, 2024 | Author: jarxiv

Abstract: In this work, 3D holistic body motions are generated directly from textual lyrics input, and …

Categories: cs.CV, cs.SD, eess.AS

BLSP-KD: Bootstrapping Language-Speech Pre-training via Knowledge Distillation

Posted: May 30, 2024 | Author: jarxiv

Abstract: Recent end-to-end approaches bring large language models (LLMs) to speech …

Categories: cs.CL, cs.SD, eess.AS

Continual Contrastive Spoken Language Understanding

Posted: May 30, 2024 | Author: jarxiv

Abstract: Recently, neural networks have made remarkable progress in a variety of fields, and …

Categories: cs.AI, eess.AS

TRAMBA: A Hybrid Transformer and Mamba Architecture for Practical Audio and Bone Conduction Speech Super Resolution and Enhancement on Mobile and Wearable Platforms

Posted: May 30, 2024 | Author: jarxiv

Abstract: Suited to mobile and wearable platforms, our acoustic and bone …

Categories: cs.AI, cs.LG, cs.SD, eess.AS

Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning

Posted: May 30, 2024 | Author: jarxiv

Abstract: Recent advances in text-to-music editing, which use text queries to …

Categories: cs.AI, cs.LG, cs.MM, cs.SD, eess.AS

BLSP: Bootstrapping Language-Speech Pre-training via Behavior Alignment of Continuation Writing

Posted: May 29, 2024 | Author: jarxiv

Abstract: With the emergence of large language models (LLMs), extending their strong language capabilities to speech …

Categories: cs.CL, cs.SD, eess.AS

MF-AED-AEC: Speech Emotion Recognition by Leveraging Multimodal Fusion, ASR Error Detection, and ASR Error Correction

Posted: May 29, 2024 | Author: jarxiv

Abstract: Common approaches in speech emotion recognition (SER) involve audio information and text …

Categories: cs.CL, cs.MM, cs.SD, eess.AS

REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR

Posted: May 29, 2024 | Author: jarxiv

Abstract: Unsupervised automatic speech recognition (ASR), without supervision from paired speech-text data, …

Categories: cs.CL, cs.SD, eess.AS
