"eess.AS" Category Archive

Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model

Posted: June 11, 2025 | Author: jarxiv

Abstract: Large audio language models (LALMs), intelligent human …

Categories: cs.CL, cs.SD, eess.AS

W4S4: WaLRUS Meets S4 for Long-Range Sequence Modeling

Posted: June 10, 2025 | Author: jarxiv

Abstract: State space models (SSMs), as powerful components for sequence modeling, …

Categories: cs.LG, eess.AS, eess.IV, eess.SP

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

Posted: June 10, 2025 | Author: jarxiv

Abstract: Multimodal foundation models such as Gemini and ChatGPT …

Categories: cs.CL, eess.AS

CO-VADA: A Confidence-Oriented Voice Augmentation Debiasing Approach for Fair Speech Emotion Recognition

Posted: June 9, 2025 | Author: jarxiv

Abstract: Bias in speech emotion recognition (SER) systems, in many cases, speaker characteristics and …

Categories: cs.CL, eess.AS

Label-Context-Dependent Internal Language Model Estimation for CTC

Posted: June 9, 2025 | Author: jarxiv

Abstract: In connectionist temporal classification (CTC), the label context independence assumption …

Categories: cs.CL, cs.LG, cs.SD, eess.AS

Efficient and Direct Duplex Modeling for Speech-to-Speech Language Model

Posted: June 9, 2025 | Author: jarxiv

Abstract: Spoken dialogue is an intuitive form of human-computer interaction, but current …

Categories: cs.CL, cs.SD, eess.AS

Efficient Fine-Grained Guidance for Diffusion Model Based Symbolic Music Generation

Posted: June 9, 2025 | Author: jarxiv

Abstract: Developing generative models to create, or conditionally create, symbolic music …

Categories: cs.AI, cs.LG, cs.MM, cs.SD, eess.AS

The NTNU System at the S&I Challenge 2025 SLA Open Track

Posted: June 6, 2025 | Author: jarxiv

Abstract: A recent line of research on spoken language assessment (SLA), BERT and wav2vec …

Categories: cs.CL, cs.SD, eess.AS

AudioLens: A Closer Look at Auditory Attribute Perception of Large Audio-Language Models

Posted: June 6, 2025 | Author: jarxiv

Abstract: Understanding the internal mechanisms of large audio language models (LALMs) …

Categories: cs.AI, cs.CL, cs.SD, eess.AS

Multimodal Biomarkers for Schizophrenia: Towards Individual Symptom Severity Estimation

Posted: June 5, 2025 | Author: jarxiv

Abstract: Deep-learning research on schizophrenia assessment typically detects the presence or absence of the disorder …

Categories: cs.LG, eess.AS, eess.IV, eess.SP
