Category archive: eess.AS

Improved Cross-Lingual Transfer Learning For Automatic Speech Translation

Posted: June 2, 2023 | Author: jarxiv

Abstract: Research on multilingual speech-to-text translation is a topical area. Multiple …

Categories: cs.AI, cs.CL, eess.AS, eess.SP

Iterative autoregression: a novel trick to improve your low-latency speech enhancement model

Posted: June 2, 2023 | Author: jarxiv

Abstract: Streaming models are an essential component of real-time speech enhancement tools …

Categories: cs.AI, cs.SD, eess.AS

VOCALExplore: Pay-as-You-Go Video Data Exploration and Model Building [Technical Report]

Posted: June 2, 2023 | Author: jarxiv

Abstract: To let users build domain-specific models over video datasets …

Categories: cs.CV, cs.DB, cs.SD, eess.AS

UNSSOR: Unsupervised Neural Speech Separation by Leveraging Over-determined Training Mixtures

Posted: June 1, 2023 | Author: jarxiv

Abstract: In reverberant conditions with multiple concurrent speakers, each microphone, at a different location, …

Categories: cs.LG, cs.SD, eess.AS

Text-to-Speech Pipeline for Swiss German — A comparison

Posted: June 1, 2023 | Author: jarxiv

Abstract: In this study, various Text-to-Speech (TTS) models …

Categories: cs.CL, cs.SD, eess.AS

Simple yet Effective Code-Switching Language Identification with Multitask Pre-Training and Transfer Learning

Posted: June 1, 2023 | Author: jarxiv

Abstract: Code-switching (also called code-mixing) is, in casual settings, …

Categories: cs.CL, eess.AS

MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets

Posted: June 1, 2023 | Author: jarxiv

Abstract: In this paper, from the perspective of how training targets are obtained, …

Categories: cs.AI, cs.CL, cs.SD, eess.AS

Attention-Based Methods For Audio Question Answering

Posted: June 1, 2023 | Author: jarxiv

Abstract: Audio question answering (AQA) is a task in which a system is given audio and a natural-language question …

Categories: cs.CL, cs.LG, cs.SD, eess.AS

ViLaS: Integrating Vision and Language into Automatic Speech Recognition

Posted: June 1, 2023 | Author: jarxiv

Abstract: Using additional multimodal information, the performance of automatic speech recognition (ASR) …

Categories: cs.AI, cs.CL, eess.AS

Leveraging Semantic Information for Efficient Self-Supervised Emotion Recognition with Audio-Textual Distilled Models

Posted: May 31, 2023 | Author: jarxiv

Abstract: Largely owing to implicit semantic modeling, self-supervised learning (SSL …

Categories: cs.LG, cs.SD, eess.AS
