"cs.SD" Category Archive

RoDia: A New Dataset for Romanian Dialect Identification from Speech

Posted: September 13, 2023 | Author: jarxiv

Summary: Dialect identification is an important task in speech processing and language technology, and speech …

Categories: cs.CL, cs.SD, eess.AS

Leveraging Large Language Models for Exploiting ASR Uncertainty

Posted: September 13, 2023 | Author: jarxiv

Summary: Large language models excel at a variety of natural language processing (NLP) tasks …

Categories: cs.CL, cs.HC, cs.SD, eess.AS

Diffusion-Based Co-Speech Gesture Generation Using Joint Text and Audio Representation

Posted: September 12, 2023 | Author: jarxiv

Summary: In this paper, GENEA (Generation and Evaluation of Non-verbal Behaviour for Embodied Agents …

Categories: 68T42, cs.HC, cs.LG, cs.SD, eess.AS, I.2.6

GRASS: Unified Generation Model for Speech-to-Semantic Tasks

Posted: September 12, 2023 | Author: jarxiv

Summary: In this paper, target text conditioned on task-related prompts for speech data …

Categories: cs.CL, cs.SD, eess.AS

Addressing Feature Imbalance in Sound Source Separation

Posted: September 12, 2023 | Author: jarxiv

Summary: Neural networks rely excessively on specific features to solve a task …

Categories: cs.AI, cs.SD, eess.AS

Multi-Modal Automatic Prosody Annotation with Contrastive Pretraining of SSWP

Posted: September 12, 2023 | Author: jarxiv

Summary: In the realm of expressive Text-to-Speech (TTS), explicit …

Categories: cs.AI, cs.CL, cs.SD, eess.AS

LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech

Posted: September 12, 2023 | Author: jarxiv

Summary: Self-supervised learning (SSL), in fields such as computer vision and natural language processing, …

Categories: cs.AI, cs.CL, cs.SD, eess.AS

Parallel and Limited Data Voice Conversion Using Stochastic Variational Deep Kernel Learning

Posted: September 11, 2023 | Author: jarxiv

Summary: Voice conversion is typically an engineering problem with limited training data …

Categories: cs.LG, cs.MM, cs.SD, eess.AS

LanSER: Language-Model Supported Speech Emotion Recognition

Posted: September 11, 2023 | Author: jarxiv

Summary: Speech emotion recognition (SER) models typically depend, for training, on costly human …

Categories: cs.CL, cs.LG, cs.SD, eess.AS

Multiple Representation Transfer from Large Language Models to End-to-End ASR Systems

Posted: September 11, 2023 | Author: jarxiv

Summary: Transferring knowledge from large language models (LLMs) carries linguistic knowledge into end-to-end …

Categories: cs.CL, cs.SD, eess.AS
