Wiki-En-ASR-Adapt: Large-scale synthetic dataset for English ASR Customization

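Summary

This paper introduces the first large-scale public synthetic dataset for contextual spellchecking customization of automatic speech recognition (ASR), focused on diverse rare and out-of-vocabulary (OOV) phrases such as proper names and terms.
The proposed approach can generate millions of realistic examples of corrupted ASR hypotheses and simulate non-trivial biasing lists for the customization task.
The paper also proposes injecting two types of "hard negatives" into the simulated biasing lists of training examples, and describes procedures for mining them automatically.
Experiments that train an open-source customization model on the proposed dataset show that injecting hard negative biasing phrases reduces both WER and the number of false alarms.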

Summary (original)

We present the first large-scale public synthetic dataset for contextual spellchecking customization of automatic speech recognition (ASR), with a focus on diverse rare and out-of-vocabulary (OOV) phrases such as proper names or terms. The proposed approach makes it possible to create millions of realistic examples of corrupted ASR hypotheses and to simulate non-trivial biasing lists for the customization task. Furthermore, we propose injecting two types of “hard negatives” into the simulated biasing lists in training examples and describe our procedures for mining them automatically. We report experiments with training an open-source customization model on the proposed dataset and show that the injection of hard negative biasing phrases decreases WER and the number of false alarms.
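
To make the idea of a simulated biasing list with hard negatives concrete, here is a minimal sketch. It is not the paper's procedure: the abstract does not say what the two types of hard negatives are or how they are mined, so this example assumes a single stand-in criterion (character-level similarity to the target phrase, computed with Python's `difflib`). The function names, the phrase pool, and the list sizes are all hypothetical.

```python
import difflib
import random


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]. A stand-in metric, not from the paper."""
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()


def build_biasing_list(target, phrase_pool, n_hard=2, n_random=2, seed=0):
    """Return (biasing_list, hard_negatives) for one simulated training example.

    The list holds the target phrase, the n_hard pool phrases most similar
    to it (assumed "hard negatives"), and n_random other phrases.
    """
    rng = random.Random(seed)
    others = [p for p in phrase_pool if p != target]
    # Rank the remaining phrases from most to least similar to the target.
    ranked = sorted(others, key=lambda p: similarity(target, p), reverse=True)
    hard = ranked[:n_hard]
    # Draw easy negatives at random from what is left.
    rest = ranked[n_hard:]
    easy = rng.sample(rest, min(n_random, len(rest)))
    biasing_list = [target] + hard + easy
    # Shuffle so the position of the target carries no information.
    rng.shuffle(biasing_list)
    return biasing_list, hard


# Hypothetical phrase pool; a real pool would come from the dataset itself.
pool = ["Antonova", "Antonov", "Antonio", "Barcelona", "Kubernetes", "Madagascar"]
biasing_list, hard_negatives = build_biasing_list("Antonova", pool)
print(biasing_list)
print(hard_negatives)
```

Under this assumption, the near-matches compete directly with the correct phrase, so a customization model trained on such lists has to learn to reject them rather than accept any similar-looking candidate.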

arXiv information

Author	Alexandra Antonova
Published	2023-09-29 14:18:59+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
