"eess.AS" Category Archive

Learn and Don’t Forget: Adding a New Language to ASR Foundation Models

Posted: July 10, 2024 | Author: jarxiv

Summary: Foundation ASR models often support many languages …

Categories: cs.CL, cs.LG, cs.SD, eess.AS

Performance Analysis of Speech Encoders for Low-Resource SLU and ASR in Tunisian Dialect

Posted: July 10, 2024 | Author: jarxiv

Summary: Speech encoders pretrained through self-supervised learning (SSL) …

Categories: cs.CL, cs.SD, eess.AS

Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models

Posted: July 10, 2024 | Author: jarxiv

Summary: In speech-integrated large language models (SILLMs), large language models and speech recognition …

Categories: cs.CL, cs.CY, eess.AS

Spontaneous Speech-Based Suicide Risk Detection Using Whisper and Large Language Models

Posted: July 10, 2024 | Author: jarxiv

Summary: Because it enables intervention to prevent potential suicide attempts, early detection of suicide risk is …

Categories: cs.CL, cs.SD, eess.AS

Do Prompts Really Prompt? Exploring the Prompt Understanding Capability of Whisper

Posted: July 10, 2024 | Author: jarxiv

Summary: In this study, prompt information and the high-performance speech recognition model Whisper …

Categories: cs.CL, cs.SD, eess.AS

Proceedings of The second international workshop on eXplainable AI for the Arts (XAIxArts)

Posted: July 10, 2024 | Author: jarxiv

Summary: Explainable AI for the Arts (XAIxArts …

Categories: cs.AI, cs.HC, cs.MM, cs.SD, eess.AS

Frieren: Efficient Video-to-Audio Generation with Rectified Flow Matching

Posted: July 10, 2024 | Author: jarxiv

Summary: In video-to-audio (V2A) generation, silent video …

Categories: cs.CV, cs.MM, cs.SD, eess.AS

Towards Multimodal Prediction of Spontaneous Humour: A Novel Dataset and First Results

Posted: July 9, 2024 | Author: jarxiv

Summary: Humour is an important element of human social behaviour, emotion, and cognition. Through its automatic understanding …

Categories: cs.CL, cs.CV, cs.LG, cs.MM, cs.SD, eess.AS

Multitaper mel-spectrograms for keyword spotting

Posted: July 8, 2024 | Author: jarxiv

Summary: Keyword spotting (KWS), most sensitive to the quality of feature representations among speech recognition …

Categories: cs.LG, eess.AS

Romanization Encoding For Multilingual ASR

Posted: July 8, 2024 | Author: jarxiv

Summary: To optimize multilingual and code-switching automatic speech recognition (ASR) systems …

Categories: cs.CL, cs.SD, eess.AS
