"cs.SD" Category Archive

REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR

Posted: November 18, 2024 | Author: jarxiv

Summary: Unsupervised automatic speech recognition (ASR), without supervision from paired speech-text data, …

Categories: cs.CL, cs.SD, eess.AS

Local deployment of large-scale music AI models on commodity hardware

Posted: November 15, 2024 | Author: jarxiv

Summary: Using large-scale generative AI models locally on commodity hardware, we …

Categories: cs.LG, cs.SD, eess.AS

Re-Parameterization of Lightweight Transformer for On-Device Speech Emotion Recognition

Posted: November 15, 2024 | Author: jarxiv

Summary: Machine learning models on edge or Internet of Things (IoT) devices …

Categories: cs.CL, cs.SD, eess.AS

A Comparative Study of Discrete Speech Tokens for Semantic-Related Tasks with Large Language Models

Posted: November 14, 2024 | Author: jarxiv

Summary: Speech Large Language Model (Speech L …

Categories: cs.CL, cs.SD, eess.AS

Investigating the Effectiveness of Explainability Methods in Parkinson’s Detection from Speech

Posted: November 14, 2024 | Author: jarxiv

Summary: Speech impairments in Parkinson’s disease (PD) are an important early indicator for diagnosis …

Categories: cs.AI, cs.LG, cs.SD, eess.AS

Automatic Album Sequencing

Posted: November 13, 2024 | Author: jarxiv

Summary: Album sequencing is an important part of the album production process. Recently, …

Categories: 68T07, cs.AI, cs.CL, cs.LG, cs.MM, cs.SD, I.2.6

Investigating the Effectiveness of Explainability Methods in Parkinson’s Detection from Speech

Posted: November 13, 2024 | Author: jarxiv

Summary: Speech impairments in Parkinson’s disease (PD) are an important early indicator for diagnosis …

Categories: cs.AI, cs.LG, cs.SD, eess.AS

SAV-SE: Scene-aware Audio-Visual Speech Enhancement with Selective State Space Model

Posted: November 13, 2024 | Author: jarxiv

Summary: Speech enhancement plays an important role in a variety of applications, and visual information …

Categories: cs.AI, cs.CV, cs.MM, cs.SD, eess.AS

Diffusion Models for Audio Restoration

Posted: November 12, 2024 | Author: jarxiv

Summary: With the development of audio playback devices and high-speed data transmission, entertainment and …

Categories: cs.LG, cs.SD, eess.AS

Electroencephalogram-based Multi-class Decoding of Attended Speakers’ Direction with Audio Spatial Spectrum

Posted: November 12, 2024 | Author: jarxiv

Summary: Decoding the direction of a listener’s focus from the listener’s electroencephalogram (EEG) signals …

Categories: cs.AI, cs.CL, cs.SD, eess.AS
