Category archive: "eess.AS"

Teaching Audio-Aware Large Language Models What Does Not Hear: Mitigating Hallucinations through Synthesized Negative Samples

Posted: May 21, 2025 | Author: jarxiv

Summary: With recent advances in audio-aware large language models (ALLMs), … Continue reading →

Categories: cs.CL, cs.SD, eess.AS | Comments are closed

SSPS: Self-Supervised Positive Sampling for Robust Self-Supervised Speaker Verification

Posted: May 21, 2025 | Author: jarxiv

Summary: Self-supervised learning (SSL) has brought considerable progress to speaker verification (SV) … Continue reading →

Categories: cs.AI, cs.LG, cs.SD, eess.AS | Comments are closed

SAKURA: On the Multi-hop Reasoning of Large Audio-Language Models Based on Speech and Audio Information

Posted: May 20, 2025 | Author: jarxiv

Summary: Large audio-language models (LALMs), with speech, audio, and other … Continue reading →

Categories: cs.CL, cs.SD, eess.AS | Comments are closed

Granary: Speech Recognition and Translation Dataset in 25 European Languages

Posted: May 20, 2025 | Author: jarxiv

Summary: Multi-task and multilingual approaches benefit large models, but low … Continue reading →

Categories: cs.CL, eess.AS | Comments are closed

Contextual Paralinguistic Data Creation for Multi-Modal Speech-LLM: Data Condensation and Spoken QA Generation

Posted: May 20, 2025 | Author: jarxiv

Summary: For current speech-LLMs, primarily question-answering (QA) datasets covering both aspects … Continue reading →

Categories: cs.AI, cs.CL, eess.AS | Comments are closed

Anti-aliasing of neural distortion effects via model fine tuning

Posted: May 19, 2025 | Author: jarxiv

Summary: Neural networks have become ubiquitous in guitar distortion effect modeling in recent years. … Continue reading →

Categories: cs.LG, eess.AS, eess.SP | Comments are closed

Machine Learning Approaches to Vocal Register Classification in Contemporary Male Pop Music

Posted: May 19, 2025 | Author: jarxiv

Summary: For singers of all experience levels, when learning technical repertoire, the most difficult … Continue reading →

Categories: cs.LG, cs.SD, eess.AS | Comments are closed

Audio Turing Test: Benchmarking the Human-likeness of Large Language Model-based Text-to-Speech Systems in Chinese

Posted: May 19, 2025 | Author: jarxiv

Summary: With recent advances in large language models (LLMs), text-to-speech (T … Continue reading →

Categories: cs.AI, cs.CL, cs.HC, cs.LG, cs.SD, eess.AS | Comments are closed

LegoSLM: Connecting LLM with Speech Encoder using CTC Posteriors

Posted: May 19, 2025 | Author: jarxiv

Summary: Recently, large-scale pre-trained speech encoders and large language models (LLMs) … Continue reading →

Categories: cs.CL, cs.SD, eess.AS | Comments are closed

ImprovNet — Generating Controllable Musical Improvisations with Iterative Corruption Refinement

Posted: May 19, 2025 | Author: jarxiv

Summary: Deep learning, in style transfer spanning various domains, … Continue reading →

Categories: cs.AI, cs.SD, eess.AS | Comments are closed
