"cs.SD" Category Archive

Reinforcement Learning Outperforms Supervised Fine-Tuning: A Case Study on Audio Question Answering

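Posted on May 15, 2025 by jarxiv

Abstract: Recently, reinforcement learning (RL) has greatly strengthened the reasoning capabilities of large language models (LLMs) … Continue reading →

Categories: cs.AI, cs.CL, cs.SD, eess.AS | Comments are closed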

The Voice Timbre Attribute Detection 2025 Challenge Evaluation Plan

Posted on May 15, 2025 by jarxiv

Abstract: Voice timbre, as perceived by human hearing, is what sets apart from other people a person's … Continue reading →

Categories: cs.AI, cs.SD, eess.AS | Comments are closed

WavReward: Spoken Dialogue Models With Generalist Reward Evaluators

Posted on May 15, 2025 by jarxiv

Abstract: End-to-end spoken dialogue models such as GPT-4o-audio have recently … Continue reading →

Categories: cs.AI, cs.LG, cs.MM, cs.SD, eess.AS | Comments are closed

UWAV: Uncertainty-weighted Weakly-supervised Audio-Visual Video Parsing

Posted on May 15, 2025 by jarxiv

Abstract: Audio-visual video parsing (AVVP) concerns both unimodal events … Continue reading →

Categories: cs.CV, cs.SD, eess.AS | Comments are closed

A Mamba-based Network for Semi-supervised Singing Melody Extraction Using Confidence Binary Regularization

Posted on May 14, 2025 by jarxiv

Abstract: Singing melody extraction (SME), within music information … Continue reading →

Categories: cs.AI, cs.SD, eess.AS | Comments are closed

ISAC: An Invertible and Stable Auditory Filter Bank with Customizable Kernels for ML Integration

Posted on May 13, 2025 by jarxiv

Abstract: In this paper, designed specifically to be integrated into machine learning paradigms, … Continue reading →

Categories: cs.LG, cs.SD | Comments are closed

Multi-Domain Audio Question Answering Toward Acoustic Content Reasoning in The DCASE 2025 Challenge

Posted on May 13, 2025 by jarxiv

Abstract: We introduce Task 5 of the DCASE 2025 Challenge, an audio question answering (AQA) … Continue reading →

Categories: cs.AI, cs.CL, cs.MM, cs.SD, eess.AS | Comments are closed

Diffused Responsibility: Analyzing the Energy Consumption of Generative Text-to-Audio Diffusion Models

Posted on May 13, 2025 by jarxiv

Abstract: For generating sound from text descriptions, text-to-audio models have recently … Continue reading →

Categories: cs.AI, cs.LG, cs.SD, eess.AS | Comments are closed

Lightweight End-to-end Text-to-speech Synthesis for low resource on-device applications

Posted on May 13, 2025 by jarxiv

Abstract: Recent works, in an end-to-end (E2E) fashion, go from text to raw wave … Continue reading →

Categories: cs.AI, cs.SD, eess.AS | Comments are closed

Learning Music Audio Representations With Limited Data

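Posted on May 12, 2025 by jarxiv

Abstract: Large learning models for music, including those focused on learning general-purpose music audio representations, … Continue reading →

Categories: cs.LG, cs.SD, eess.AS | Comments are closed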
