"eess.AS" Category Archive

‘Alexa, can you forget me?’ Machine Unlearning Benchmark in Spoken Language Understanding

Posted on May 22, 2025 by jarxiv

Summary: Machine unlearning, the process of efficiently removing specific information from machine learning models, …

Categories: cs.CL, eess.AS

Granary: Speech Recognition and Translation Dataset in 25 European Languages

Posted on May 22, 2025 by jarxiv

Summary: Multi-task and multilingual approaches benefit large models, but low …

Categories: cs.CL, eess.AS

Mitigating Subgroup Disparities in Multi-Label Speech Emotion Recognition: A Pseudo-Labeling and Unsupervised Learning Approach

Posted on May 22, 2025 by jarxiv

Summary: Subgroup disparities and performance bias are increasingly studied in computational research …

Categories: cs.CL, cs.SD, eess.AS

MIKU-PAL: An Automated and Standardized Multi-Modal Method for Speech Paralinguistic and Affect Labeling

Posted on May 22, 2025 by jarxiv

Summary: Acquiring large-scale emotional speech data with strong consistency is a challenge for speech synthesis …

Categories: cs.CL, cs.SD, eess.AS

ToxicTone: A Mandarin Audio Dataset Annotated for Toxicity and Toxic Utterance Tonality

Posted on May 22, 2025 by jarxiv

Summary: Despite extensive research on toxic speech detection in text, spoken Mandarin …

Categories: cs.CL, eess.AS

LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec

Posted on May 22, 2025 by jarxiv

Summary: Discrete speech tokens have shown strong potential for language-model-based speech generation …

Categories: cs.AI, cs.SD, eess.AS

dMel: Speech Tokenization made Simple

Posted on May 22, 2025 by jarxiv

Summary: Large language models, by leveraging self-supervised pretraining on vast text data, …

Categories: cs.AI, cs.CL, cs.SD, eess.AS

CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment

Posted on May 22, 2025 by jarxiv

Summary: Recent advances in audio-visual learning have shown promising results in learning representations across modalities …

Categories: cs.CV, cs.MM, cs.SD, eess.AS

Scaling and Enhancing LLM-based AVSR: A Sparse Mixture of Projectors Approach

Posted on May 22, 2025 by jarxiv

Summary: By integrating visual cues in noisy environments, audio-visual speech recognition (AVS …

Categories: cs.CV, cs.MM, cs.SD, eess.AS

Self-Supervised Frameworks for Speaker Verification via Bootstrapped Positive Sampling

Posted on May 21, 2025 by jarxiv

Summary: Recent developments in self-supervised learning (SSL) show significant potential for speaker verification (SV) …

Categories: cs.LG, cs.SD, eess.AS
