Category archive: eess.AS

Expressive Acoustic Guitar Sound Synthesis with an Instrument-Specific Input Representation and Diffusion Outpainting

Posted: January 25, 2024 | Author: jarxiv

Summary: Synthesizing performance guitar sound, with its high polyphony and wide expressive variability, …

Categories: cs.AI, cs.LG, cs.SD, eess.AS

Non-Intrusive Speech Intelligibility Prediction for Hearing-Impaired Users using Intermediate ASR Features and Human Memory Models

Posted: January 25, 2024 | Author: jarxiv

Summary: Neural networks have been used successfully for non-intrusive speech intelligibility prediction …

Categories: cs.AI, cs.SD, eess.AS

Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study

Posted: January 24, 2024 | Author: jarxiv

Summary: In the era of large models, the autoregressive nature of decoding makes latency a major …

Categories: cs.CL, cs.SD, eess.AS

Multilingual acoustic word embeddings for zero-resource languages

Posted: January 24, 2024 | Author: jarxiv

Summary: This study, on speech applications for zero-resource languages that lack labeled data, …

Categories: cs.CL, cs.SD, eess.AS

Overlap-aware End-to-End Supervised Hierarchical Graph Clustering for Speaker Diarization

Posted: January 24, 2024 | Author: jarxiv

Summary: Speaker diarization, which handles audio recordings according to speaker identity, …

Categories: cs.AI, cs.SD, eess.AS

Segment Beyond View: Handling Partially Missing Modality for Audio-Visual Semantic Segmentation

Posted: January 24, 2024 | Author: jarxiv

Summary: Augmented reality (AR) devices, as prominent mobile interaction platforms, …

Categories: cs.CV, cs.SD, eess.AS

NEUROSEC: FPGA-Based Neuromorphic Audio Security

Posted: January 23, 2024 | Author: jarxiv

Summary: Neuromorphic systems, inspired by the complexity and function of the human brain, …

Categories: cs.CR, cs.ET, cs.LG, cs.NE, cs.SD, eess.AS

Resource-constrained stereo singing voice cancellation

Posted: January 23, 2024 | Author: jarxiv

Summary: We study the problem of stereo singing voice cancellation, a subtask of music source separation …

Categories: cs.LG, cs.SD, eess.AS

DiarizationLM: Speaker Diarization Post-Processing with Large Language Models

Posted: January 23, 2024 | Author: jarxiv

Summary: In this paper, large language models (LLMs) are leveraged for speaker diarization …

Categories: cs.LG, cs.SD, eess.AS

Streaming Bilingual End-to-End ASR model using Attention over Multiple Softmax

Posted: January 23, 2024 | Author: jarxiv

Summary: Even with several advances in multilingual modeling, without knowing the input language, a single neural …

Categories: cs.CL, cs.SD, eess.AS

