Category archive: cs.SD

Detection and classification of vocal productions in large scale audio recordings

Posted: August 14, 2023 | Author: jarxiv

Abstract: We extract vocal productions from large-scale natural audio recordings and classify these vocal productions …

Categories: cs.LG, cs.SD, eess.AS, stat.AP

There is more than one kind of robustness: Fooling Whisper with adversarial examples

Posted: August 14, 2023 | Author: jarxiv

Abstract: Whisper's impressive robustness to both out-of-distribution inputs and random noise …

Categories: cs.AI, cs.CL, cs.LG, cs.SD, eess.AS

A Compact End-to-End Model with Local and Global Context for Spoken Language Identification

Posted: August 14, 2023 | Author: jarxiv

Abstract: For spoken language identification (LID), based on the ContextNet architecture …

Categories: cs.CL, cs.SD, eess.AS

Lip2Vec: Efficient and Robust Visual Speech Recognition via Latent-to-Latent Visual to Audio Representation Mapping

Posted: August 14, 2023 | Author: jarxiv

Abstract: In visual speech recognition (VSR), even for human experts, video sequences …

Categories: cs.CL, cs.SD, eess.AS

ML-SUPERB: Multilingual Speech Universal PERformance Benchmark

Posted: August 14, 2023 | Author: jarxiv

Abstract: The Speech processing Universal PERformance Benchmark (SUPERB) …

Categories: cs.CL, cs.SD, eess.AS

An Autoethnographic Exploration of XAI in Algorithmic Composition

Posted: August 14, 2023 | Author: jarxiv

Abstract: Machine learning models, across a wide range of genres from folk music to classical music, …

Categories: cs.AI, cs.HC, cs.SD

Improving Joint Speech-Text Representations Without Alignment

Posted: August 14, 2023 | Author: jarxiv

Abstract: Over the last year, a cross-modal representation space in which the text and image domains are represented jointly …

Categories: cs.AI, cs.CL, cs.SD, eess.AS

Deep Learning for Diverse Data Types Steganalysis: A Review

Posted: August 14, 2023 | Author: jarxiv

Abstract: In the field of information security, steganography and steganalysis are two mutually …

Categories: cs.AI, cs.CR, cs.LG, cs.MM, cs.SD, eess.AS, eess.IV

A Novel Self-training Approach for Low-resource Speech Recognition

Posted: August 11, 2023 | Author: jarxiv

Abstract: In this paper, a self-training approach for automatic speech recognition (ASR) in low-resource settings …

Categories: cs.CL, cs.SD, eess.AS

EXPRESSO: A Benchmark and Analysis of Discrete Expressive Speech Resynthesis

Posted: August 11, 2023 | Author: jarxiv

Abstract: In recent work, instead of text, low-bitrate, self-supervised …

Categories: cs.CL, cs.LG, cs.SD, eess.AS
