"eess.AS" Category Archive

Matching Latent Encoding for Audio-Text based Keyword Spotting

Posted: June 9, 2023 | Author: jarxiv

Abstract: The joint use of audio and text embeddings in keyword spotting (KWS) …

Categories: cs.LG, cs.SD, eess.AS

Assessing Phrase Break of ESL Speech with Pre-trained Language Models and Large Language Models

Posted: June 9, 2023 | Author: jarxiv

Abstract: In this study, pre-trained language models (PLMs) and large language models …

Categories: cs.CL, cs.SD, eess.AS

The ART of Conversation: Measuring Phonetic Convergence and Deliberate Imitation in L2-Speech with a Siamese RNN

Posted: June 9, 2023 | Author: jarxiv

Abstract: Phonetic convergence refers to the automatic and unconscious speech adaptation of two interlocutors in a conversation …

Categories: cs.CL, cs.LG, cs.SD, eess.AS

BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models

Posted: June 9, 2023 | Author: jarxiv

Abstract: Self-supervised techniques for learning speech representations, without requiring human labels, …

Categories: cs.CL, eess.AS, stat.ML

Two Stage Contextual Word Filtering for Context bias in Unified Streaming and Non-streaming Transducer

Posted: June 9, 2023 | Author: jarxiv

Abstract: In E2E ASR systems, entities that rarely appear in the training data …

Categories: cs.CL, cs.SD, eess.AS

Simple and Controllable Music Generation

Posted: June 9, 2023 | Author: jarxiv

Abstract: We tackle the task of conditional music generation. Compressed discrete music representations …

Categories: cs.AI, cs.LG, cs.SD, eess.AS

Label Aware Speech Representation Learning For Language Identification

Posted: June 8, 2023 | Author: jarxiv

Abstract: In speech representation learning approaches for non-semantic tasks such as language recognition, a classifier model …

Categories: cs.CL, cs.LG, cs.SD, eess.AS

Zambezi Voice: A Multilingual Speech Corpus for Zambian Languages

Posted: June 8, 2023 | Author: jarxiv

Abstract: In this work, an open-source multilingual speech resource for Zambian languages, Zamb …

Categories: cs.CL, cs.SD, eess.AS

Handling the Alignment for Wake Word Detection: A Comparison Between Alignment-Based, Alignment-Free and Hybrid Approaches

Posted: June 8, 2023 | Author: jarxiv

Abstract: Wake word detection, in most intelligent home and portable …

Categories: cs.CL, cs.SD, eess.AS

Topological Data Analysis for Speech Processing

Posted: June 7, 2023 | Author: jarxiv

Abstract: Topological data analysis (TDA) for speech classification problems and pre-trained speech models …

Categories: cs.CL, cs.LG, cs.SD, eess.AS, math.AT
