"cs.SD" Category Archive

AV-SAM: Segment Anything Model Meets Audio-Visual Localization and Segmentation

Posted: May 4, 2023 | Author: jarxiv

Summary. Title: AV-SAM: Segment Anything Model …

Categories: cs.CV, cs.LG, cs.MM, cs.SD, eess.AS

CryCeleb: A Speaker Verification Dataset Based on Infant Cry Sounds

Posted: May 4, 2023 | Author: jarxiv

Summary. Title: CryCeleb: A Speaker Verification Dataset Based on Infant Cry Sounds. Summary: …

Categories: cs.AI, cs.CL, cs.SD, eess.AS

AQ-GT: a Temporally Aligned and Quantized GRU-Transformer for Co-Speech Gesture Synthesis

Posted: May 3, 2023 | Author: jarxiv

Summary. Title: AQ-GT: A Temporally Aligned and Quantized GRU-Transform …

Categories: cs.GR, cs.HC, cs.LG, cs.SD, eess.AS

Going In Style: Audio Backdoors Through Stylistic Transformations

Posted: May 3, 2023 | Author: jarxiv

Summary. Title: Going In Style: Audio Backdoors …

Categories: cs.CR, cs.LG, cs.SD, eess.AS

Lessons Learned in ATCO2: 5000 hours of Air Traffic Control Communications for Robust Automatic Speech Recognition and Understanding

Posted: May 3, 2023 | Author: jarxiv

Summary. Title: Lessons Learned in ATCO2: For Robust Automatic Speech Recognition and Understanding, 50 …

Categories: cs.CL, cs.HC, cs.SD, eess.AS

The Pipeline System of ASR and NLU with MLM-based Data Augmentation toward STOP Low-resource Challenge

Posted: May 3, 2023 | Author: jarxiv

Summary. Title: MLM-Based Data Augmentation for an ASR and NLU Pipeline …

Categories: cs.CL, cs.SD, eess.AS

A Study on the Integration of Pipeline and E2E SLU systems for Spoken Semantic Parsing toward STOP Quality Challenge

Posted: May 3, 2023 | Author: jarxiv

Summary. Title: For the STOP Quality Challenge: Spoken Se …

Categories: cs.CL, cs.SD, eess.AS

Self-supervised learning for infant cry analysis

Posted: May 3, 2023 | Author: jarxiv

Summary. Title: Self-Supervised Learning for Infant Cry Analysis. Summary: …

Categories: cs.AI, cs.CL, cs.SD, eess.AS

Long-Term Rhythmic Video Soundtracker

Posted: May 3, 2023 | Author: jarxiv

Summary. Title: Long-Term Rhythmic Video Soundtracker. Summary: …

Categories: cs.CV, cs.MM, cs.SD, eess.AS

Clinical BERTScore: An Improved Measure of Automatic Speech Recognition Performance in Clinical Settings

Posted: May 2, 2023 | Author: jarxiv

Summary. Title: Toward Improved Automatic Speech Recognition Performance in Clinical Settings: Clin …

Categories: cs.CL, cs.LG, cs.SD, eess.AS
