Category archive: eess.AS

Can We Trust Explainable AI Methods on ASR? An Evaluation on Phoneme Recognition

Posted: May 30, 2023 | Author: jarxiv

Summary: Explainable AI (XAI) techniques, in image classification, natural language processing …

Categories: cs.CL, cs.SD, eess.AS

Improving Textless Spoken Language Understanding with Discrete Units as Intermediate Target

Posted: May 30, 2023 | Author: jarxiv

Summary: Spoken language understanding (SLU) concerns extracting semantic information from spoken utterances …

Categories: cs.CL, eess.AS

InterFormer: Interactive Local and Global Features Fusion for Automatic Speech Recognition

Posted: May 30, 2023 | Author: jarxiv

Summary: Local and global features are both essential for automatic speech recognition (ASR) …

Categories: cs.AI, cs.CL, cs.SD, eess.AS

ADAPTERMIX: Exploring the Efficacy of Mixture of Adapters for Low-Resource TTS Adaptation

Posted: May 30, 2023 | Author: jarxiv

Summary: Languages that are not widely spoken, or not well represented in the training data …

Categories: cs.AI, cs.CL, cs.SD, eess.AS

Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model

Posted: May 30, 2023 | Author: jarxiv

Summary: Owing to the immense scale of recent large language models (LLMs), instruction-based and …

Categories: cs.AI, cs.CL, cs.SD, eess.AS

HyperConformer: Multi-head HyperMixer for Efficient Speech Recognition

Posted: May 30, 2023 | Author: jarxiv

Summary: State-of-the-art ASR systems model local and global interactions separately …

Categories: cs.AI, cs.CL, cs.LG, eess.AS

CommonAccent: Exploring Large Acoustic Pretrained Models for Accent Classification Based on Common Voice

Posted: May 30, 2023 | Author: jarxiv

Summary: Despite recent advances in automatic speech recognition (ASR), accented speech …

Categories: cs.AI, cs.CL, cs.LG, eess.AS

Leveraging characteristics of the output probability distribution for identifying adversarial audio examples

Posted: May 29, 2023 | Author: jarxiv

Summary: Adversarial attacks on machine-learning-based automatic speech recognition (ASR) systems …

Categories: cs.CR, cs.LG, cs.SD, eess.AS

DisfluencyFixer: A tool to enhance Language Learning through Speech To Speech Disfluency Correction

Posted: May 29, 2023 | Author: jarxiv

Summary: Conversational speech often consists of deviations from the speech plan, producing disfluent utterances, …

Categories: cs.CL, cs.LG, cs.SD, eess.AS

UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units

Posted: May 29, 2023 | Author: jarxiv

Summary: Direct speech-to-speech translation (S2ST), in which all components can be optimized jointly …

Categories: cs.CL, cs.SD, eess.AS
