"eess.AS" Category Archive

Developing a Multi-variate Prediction Model For COVID-19 From Crowd-sourced Respiratory Voice Data

Posted: February 13, 2024 | Author: jarxiv

Abstract: COVID-19 has affected more than 223 countries worldwide, and pos …

Categories: cs.AI, cs.SD, eess.AS

Sheet Music Transformer: End-To-End Optical Music Recognition Beyond Monophonic Transcription

Posted: February 13, 2024 | Author: jarxiv

Abstract: To date, state-of-the-art end-to-end optical music recognition (OMR) has mainly mono …

Categories: cs.CV, cs.SD, eess.AS

Multimodal Attention Merging for Improved Speech Recognition and Audio Event Classification

Posted: February 12, 2024 | Author: jarxiv

Abstract: With self-supervised objectives on unlabeled data, large foundation models …

Categories: cs.LG, cs.SD, eess.AS

Self-consistent context aware conformer transducer for speech recognition

Posted: February 12, 2024 | Author: jarxiv

Abstract: We … a conformer transducer that adds a contextual information flow to ASR systems …

Categories: cs.CL, cs.SD, eess.AS

A Multi-Perspective Machine Learning Approach to Evaluate Police-Driver Interaction in Los Angeles

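Posted: February 12, 2024 | Author: jarxiv

Abstract: Interactions between government officials and civilians, the state's … needed for public welfare and the functioning of a democratic society …

Categories: cs.AI, cs.CY, cs.LG, eess.AS, I.2.0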

Establishing degrees of closeness between audio recordings along different dimensions using large-scale cross-lingual models

Posted: February 9, 2024 | Author: jarxiv

Abstract: In the highly constrained context of low-resource language research, pre …

Categories: cs.CL, cs.SD, eess.AS

Unified Speech-Text Pretraining for Spoken Dialog Modeling

Posted: February 9, 2024 | Author: jarxiv

Abstract: In recent work, for directly understanding and synthesizing speech, large language models (LLM …

Categories: cs.CL, cs.SD, eess.AS

SpiRit-LM: Interleaved Spoken and Written Language Model

Posted: February 9, 2024 | Author: jarxiv

Abstract: A foundation multimodal language model that freely mixes text and speech, SPIR …

Categories: cs.CL, cs.SD, eess.AS

Integrating Self-supervised Speech Model with Pseudo Word-level Targets from Visually-grounded Speech Model

Posted: February 9, 2024 | Author: jarxiv

Abstract: Thanks to recent advances in self-supervised speech models, significant improvements on many downstream tasks …

Categories: cs.CL, cs.LG, eess.AS

A Multi-Perspective Machine Learning Approach to Evaluate Police-Driver Interaction in Los Angeles

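Posted: February 9, 2024 | Author: jarxiv

Abstract: Interactions between government officials and civilians, the state's … needed for public welfare and the functioning of a democratic society …

Categories: cs.AI, cs.CY, cs.LG, eess.AS, I.2.0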
