Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization

Summary

Title: Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization

Summary:

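– Automatic speech recognition (ASR) has recently become an important challenge for deep learning (DL).
– DL techniques and machine learning (ML) approaches assume that training and test data come from the same domain, with the same input feature space and data distribution, but this assumption does not always hold in real-world AI applications.
– In practice, real data can be difficult or expensive to collect, or may occur only rarely, so the data requirements of DL models cannot always be met.
– Deep transfer learning (DTL) has been introduced to overcome this problem; it helps develop high-performing models using real datasets that are small, or slightly different from but related to the training data.
– This paper presents a comprehensive survey of DTL-based ASR frameworks, clarifying the latest developments and helping academics and professionals understand current challenges.
– Specifically, after presenting the DTL background, it adopts a well-designed taxonomy to report the state of the art.
– A critical analysis is then conducted to identify the limitations and advantages of each framework.
– Finally, a comparative study is introduced to highlight current challenges and derive opportunities for future research.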

Abstract (original)

Automatic speech recognition (ASR) has recently become an important challenge when using deep learning (DL). It requires large-scale training datasets and high computational and storage resources. Moreover, DL techniques, and machine learning (ML) approaches in general, hypothesize that training and testing data come from the same domain, with the same input feature space and data distribution characteristics. This assumption, however, is not applicable in some real-world artificial intelligence (AI) applications. Moreover, there are situations where gathering real data is challenging, expensive, or rarely possible, so the data requirements of DL models cannot be met. Deep transfer learning (DTL) has been introduced to overcome these issues; it helps develop high-performing models using real datasets that are small or slightly different but related to the training data. This paper presents a comprehensive survey of DTL-based ASR frameworks to shed light on the latest developments and help academics and professionals understand current challenges. Specifically, after presenting the DTL background, a well-designed taxonomy is adopted to inform the state-of-the-art. A critical analysis is then conducted to identify the limitations and advantages of each framework. Moving on, a comparative study is introduced to highlight the current challenges before deriving opportunities for future research.

arXiv information

Authors	Hamza Kheddar, Yassine Himeur, Somaya Al-Maadeed, Abbes Amira, Faycal Bensaali
Published	2023-04-27 21:08:05+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, OpenAI
