"eess.AS" Category Archive

Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels

Posted: June 19, 2023 | Author: jarxiv

Abstract: Audio-visual speech recognition, owing to its robustness to acoustic noise, has drawn much attention …

Categories: cs.CV, cs.SD, eess.AS | Comments are closed

RealImpact: A Dataset of Impact Sound Fields for Real Objects

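Posted: June 19, 2023 | Author: jarxiv

Abstract: Objects make unique sounds under different perturbations, environmental conditions, and poses relative to the listener …

Categories: cs.CV, cs.GR, cs.SD, eess.AS | Comments are closed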

Few-shot bioacoustic event detection at the DCASE 2023 challenge

Posted: June 16, 2023 | Author: jarxiv

Abstract: In few-shot bioacoustic event detection, with access to only a few examples of the target class …

Categories: cs.LG, cs.SD, eess.AS | Comments are closed

Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation

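Posted: June 16, 2023 | Author: jarxiv

Abstract: The excellent generalization ability of self-supervised learning (SSL) for speech foundation models has drawn great attention …

Categories: cs.CL, cs.SD, eess.AS | Comments are closed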

Inconsistency Ranking-based Noisy Label Detection for High-quality Data

Posted: June 16, 2023 | Author: jarxiv

Abstract: The success of deep learning requires large amounts of high-quality annotated data …

Categories: cs.CL, cs.SD, eess.AS | Comments are closed

Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language

Posted: June 16, 2023 | Author: jarxiv

Abstract: Current self-supervised learning algorithms are often modality-specific, and large amounts of …

Categories: cs.CL, cs.LG, cs.SD, eess.AS | Comments are closed

Audio Tagging on an Embedded Hardware Platform

Posted: June 16, 2023 | Author: jarxiv

Abstract: Convolutional neural networks (CNNs), across various audio classification tasks, …

Categories: cs.AI, cs.SD, cs.SY, eess.AS, eess.SY | Comments are closed

ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications

Posted: June 16, 2023 | Author: jarxiv

Abstract: Personal assistants, automatic speech recognizers, and dialogue understanding systems, in an interconnected …

Categories: cs.AI, cs.CL, cs.SD, eess.AS | Comments are closed

SpeechLM: Enhanced Speech Pre-Training with Unpaired Textual Data

Posted: June 16, 2023 | Author: jarxiv

Abstract: Because speech and text are very different modalities with distinct characteristics, text …

Categories: cs.AI, cs.CL, eess.AS | Comments are closed

Lexical Speaker Error Correction: Leveraging Language Models for Speaker Diarization Error Correction

Posted: June 16, 2023 | Author: jarxiv

Abstract: Speaker diarization (SD) is typically, with an automatic speech recognition (ASR) system, …

Categories: cs.AI, cs.CL, cs.LG, eess.AS | Comments are closed
