Category archive: eess.AS

LLaSM: Large Language and Speech Model

Posted: September 13, 2023 | Author: jarxiv

Abstract: Multimodal large language models have recently attracted significant interest. However, …

Categories: cs.CL, cs.LG, cs.SD, eess.AS

Speech Separation based on Contrastive Learning and Deep Modularization

Posted: September 13, 2023 | Author: jarxiv

Abstract: Current state-of-the-art monaural speech separation tools rely on supervised learning. …

Categories: cs.CL, cs.SD, eess.AS

RoDia: A New Dataset for Romanian Dialect Identification from Speech

Posted: September 13, 2023 | Author: jarxiv

Abstract: Dialect identification is an important task in speech processing and language technology, …

Categories: cs.CL, cs.SD, eess.AS

Leveraging Large Language Models for Exploiting ASR Uncertainty

Posted: September 13, 2023 | Author: jarxiv

Abstract: Large language models excel at a variety of natural language processing (NLP) tasks …

Categories: cs.CL, cs.HC, cs.SD, eess.AS

Diffusion-Based Co-Speech Gesture Generation Using Joint Text and Audio Representation

Posted: September 12, 2023 | Author: jarxiv

Abstract: In this paper, the GENEA (Generation and Evaluation of Non-verbal Behaviour for Embodied Agents …

Categories: 68T42, cs.HC, cs.LG, cs.SD, eess.AS, I.2.6

GRASS: Unified Generation Model for Speech-to-Semantic Tasks

Posted: September 12, 2023 | Author: jarxiv

Abstract: In this paper, target text conditioned on task-related prompts for audio data …

Categories: cs.CL, cs.SD, eess.AS

Addressing Feature Imbalance in Sound Source Separation

Posted: September 12, 2023 | Author: jarxiv

Abstract: Neural networks rely excessively on specific features to solve a task …

Categories: cs.AI, cs.SD, eess.AS

Multi-Modal Automatic Prosody Annotation with Contrastive Pretraining of SSWP

Posted: September 12, 2023 | Author: jarxiv

Abstract: In the field of expressive text-to-speech (TTS), explicit …

Categories: cs.AI, cs.CL, cs.SD, eess.AS

LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech

Posted: September 12, 2023 | Author: jarxiv

Abstract: Self-supervised learning (SSL) has, in areas such as computer vision and natural language processing, …

Categories: cs.AI, cs.CL, cs.SD, eess.AS

Parallel and Limited Data Voice Conversion Using Stochastic Variational Deep Kernel Learning

Posted: September 11, 2023 | Author: jarxiv

Abstract: Voice conversion is typically an engineering problem with limited training data …

Categories: cs.LG, cs.MM, cs.SD, eess.AS
