"eess.AS" Category Archive

Considerations for Ethical Speech Recognition Datasets

Posted: May 4, 2023 | Author: jarxiv

Summary title: Considerations for Fair Speech Recognition Datasets. Summary: Speech AI …

Categories: cs.CL, cs.CY, cs.SD, eess.AS

M2-CTTS: End-to-End Multi-scale Multi-modal Conversational Text-to-Speech Synthesis

Posted: May 4, 2023 | Author: jarxiv

Summary title: M2-CTTS: supporting multi-layered, diverse language and speech modalities …

Categories: cs.CL, cs.SD, eess.AS

Low-Resource Music Genre Classification with Cross-Modal Neural Model Reprogramming

Posted: May 4, 2023 | Author: jarxiv

Summary title: Cross-modal neural model reprogramming for low-resource …

Categories: cs.AI, cs.LG, cs.NE, cs.SD, eess.AS

AV-SAM: Segment Anything Model Meets Audio-Visual Localization and Segmentation

Posted: May 4, 2023 | Author: jarxiv

Summary title: AV-SAM: Segment Anything Model …

Categories: cs.CV, cs.LG, cs.MM, cs.SD, eess.AS

CryCeleb: A Speaker Verification Dataset Based on Infant Cry Sounds

Posted: May 4, 2023 | Author: jarxiv

Summary title: CryCeleb: A Speaker Verification Dataset Based on Infant Cry Sounds. Summary: …

Categories: cs.AI, cs.CL, cs.SD, eess.AS

AQ-GT: a Temporally Aligned and Quantized GRU-Transformer for Co-Speech Gesture Synthesis

Posted: May 3, 2023 | Author: jarxiv

Summary title: AQ-GT: a temporally aligned and quantized GRU-Transform …

Categories: cs.GR, cs.HC, cs.LG, cs.SD, eess.AS

Going In Style: Audio Backdoors Through Stylistic Transformations

Posted: May 3, 2023 | Author: jarxiv

Summary title: Going In Style: Audio Backdoors …

Categories: cs.CR, cs.LG, cs.SD, eess.AS

Lessons Learned in ATCO2: 5000 hours of Air Traffic Control Communications for Robust Automatic Speech Recognition and Understanding

Posted: May 3, 2023 | Author: jarxiv

Summary title: Lessons Learned in ATCO2: for robust automatic speech recognition and understanding, 50 …

Categories: cs.CL, cs.HC, cs.SD, eess.AS

The Pipeline System of ASR and NLU with MLM-based Data Augmentation toward STOP Low-resource Challenge

Posted: May 3, 2023 | Author: jarxiv

Summary title: ASR and NLU pipeline with MLM-based data augmentation …

Categories: cs.CL, cs.SD, eess.AS

A Study on the Integration of Pipeline and E2E SLU systems for Spoken Semantic Parsing toward STOP Quality Challenge

Posted: May 3, 2023 | Author: jarxiv

Summary title: For the STOP Quality Challenge, spoken …

Categories: cs.CL, cs.SD, eess.AS
