"eess.AS" Category Archive

Whose Emotion Matters? Speaking Activity Localisation without Prior Knowledge

Posted on August 16, 2023 by jarxiv

Summary: The task of emotion recognition in conversations (ERC), for example the video-based Mult …

Categories: 68T20, cs.CV, cs.LG, cs.NE, cs.SD, eess.AS, I.2.0

AudioFormer: Audio Transformer learns audio feature representations from discrete acoustic codes

Posted on August 16, 2023 by jarxiv

Summary: We propose a method named AudioFormer. This method …

Categories: cs.LG, cs.SD, eess.AS

iSTFTNet2: Faster and More Lightweight iSTFT-Based Neural Vocoder Using 1D-2D CNN

Posted on August 15, 2023 by jarxiv

Summary: The inverse short-time Fourier transform network (iSTFTNet) is fast, lightweight, and high …

Categories: cs.LG, cs.SD, eess.AS, stat.ML

Active Bird2Vec: Towards End-to-End Bird Sound Monitoring with Transformers

Posted on August 15, 2023 by jarxiv

Summary: We … self-supervised learning (SSL) and deep active learning (DA …

Categories: cs.HC, cs.LG, cs.SD, eess.AS

PitchNet: A Fully Convolutional Neural Network for Pitch Estimation

Posted on August 15, 2023 by jarxiv

Summary: In music and sound processing, pitch extraction plays a crucial role. …

Categories: cs.LG, cs.SD, eess.AS

AudioFormer: Audio Transformer learns audio feature representations from discrete acoustic codes

Posted on August 15, 2023 by jarxiv

Summary: We propose a method named AudioFormer. This method …

Categories: cs.LG, cs.SD, eess.AS

DiffSED: Sound Event Detection with Denoising Diffusion

Posted on August 15, 2023 by jarxiv

Summary: Sound event detection (SED), given an unconstrained audio sample, …

Categories: cs.LG, cs.SD, eess.AS

SpeechX: Neural Codec Language Model as a Versatile Speech Transformer

Posted on August 15, 2023 by jarxiv

Summary: Recent advances in generative speech models based on audio-text prompts have led to high-quality …

Categories: cs.CL, cs.LG, cs.SD, eess.AS

Temporal Modeling Matters: A Novel Temporal Emotional Modeling Approach for Speech Emotion Recognition

Posted on August 15, 2023 by jarxiv

Summary: Speech emotion recognition (SER) infers human emotions and affective states from speech signals …

Categories: cs.CL, cs.SD, eess.AS

Pretraining Respiratory Sound Representations using Metadata and Contrastive Learning

Posted on August 14, 2023 by jarxiv

Summary: Methods based on supervised learning that use annotations in an end-to-end fashion …

Categories: cs.LG, cs.SD, eess.AS
