"eess.AS" Category Archive

The mutual exclusivity bias of bilingual visually grounded speech models

Posted: June 5, 2025 | Author: jarxiv

Summary: Mutual exclusivity (ME), a bias away from the familiar word, facilitates language learning in children …

Categories: cs.CL, eess.AS

Acoustically Precise Hesitation Tagging Is Essential for End-to-End Verbatim Transcription Systems

Posted: June 5, 2025 | Author: jarxiv

Summary: Verbatim transcription for automatic speaking assessment, for purposes such as error analysis and feedback …

Categories: cs.CL, cs.SD, eess.AS

A Novel Data Augmentation Approach for Automatic Speaking Assessment on Opinion Expressions

Posted: June 5, 2025 | Author: jarxiv

Summary: For automatic speaking assessment (ASA) on opinion expressions, the scarcity of labeled recordings …

Categories: cs.CL, cs.SD, eess.AS

UniCUE: Unified Recognition and Generation Framework for Chinese Cued Speech Video-to-Speech Generation

Posted: June 5, 2025 | Author: jarxiv

Summary: Cued Speech (CS) uses hand coding in support of lipreading …

Categories: cs.CV, cs.SD, eess.AS

Sounding that Object: Interactive Object-Aware Image to Audio Generation

Posted: June 5, 2025 | Author: jarxiv

Summary: Generating accurate sounds for complex audio-visual scenes is, particularly …

Categories: cs.CV, cs.LG, cs.MM, cs.SD, eess.AS

Towards a Japanese Full-duplex Spoken Dialogue System

Posted: June 4, 2025 | Author: jarxiv

Summary: Full-duplex spoken dialogue systems, which deal with features of human conversation such as speech overlaps and backchannels …

Categories: cs.CL, eess.AS

Improving Multilingual Speech Models on ML-SUPERB 2.0: Fine-tuning with Data Augmentation and LID-Aware CTC

Posted: June 4, 2025 | Author: jarxiv

Summary: Using speech foundation models (SFMs) pre-trained with self-supervised or supervised learning, multi …

Categories: cs.CL, cs.SD, eess.AS

Egocentric Speaker Classification in Child-Adult Dyadic Interactions: From Sensing to Computational Modeling

Posted: June 3, 2025 | Author: jarxiv

Summary: Autism spectrum disorder (ASD) affects social communication, repetitive behaviors, and …

Categories: cs.AI, cs.SD, eess.AS

Bemba Speech Translation: Exploring a Low-Resource African Language

Posted: June 3, 2025 | Author: jarxiv

Summary: This paper concerns the International Conference on Spoken Language Translation (IWSLT …

Categories: cs.CL, cs.SD, eess.AS

Efficient Speech Translation through Model Compression and Knowledge Distillation

Posted: June 3, 2025 | Author: jarxiv

Summary: For the efficient deployment of large audio-language models for speech translation, significant computational …

Categories: cs.CL, cs.SD, eess.AS
