MiniRBT: A Two-stage Distilled Small Chinese Pre-trained Model

Summary

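[Title] MiniRBT: A Two-stage Distilled Small Chinese Pre-trained Model

[Summary]
– In natural language processing, pre-trained language models have become essential infrastructure.
– However, these models often suffer from large size, long inference time, and difficult deployment.
– Moreover, mainstream pre-trained models focus on English, and small Chinese pre-trained models remain understudied.
– To advance Chinese natural language processing research, this paper introduces MiniRBT, a small Chinese pre-trained model that uses a narrow and deep student model and incorporates whole word masking and two-stage distillation into pre-training.
– Experiments on machine reading comprehension and text classification tasks show that MiniRBT achieves 94% of RoBERTa's performance while delivering a 6.8x speedup, demonstrating its effectiveness and efficiency.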
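
Whole word masking, mentioned above, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the sentence has already been segmented into words by some external tool, and the names `whole_word_mask`, `mask_ratio`, and `MASK` are hypothetical. It also simplifies real masking schemes by always substituting `[MASK]` rather than mixing in random or unchanged tokens. The point it shows is only that all characters of a selected word are masked together, never a single character in isolation.

```python
import random

MASK = "[MASK]"  # hypothetical mask token string, chosen for illustration


def whole_word_mask(words, mask_ratio=0.15, rng=None):
    """Mask whole words in a pre-segmented sentence.

    words: list of word strings (segmentation is assumed to be done elsewhere).
    Returns (tokens, labels), both character-level lists of equal length.
    labels[i] holds the original character where tokens[i] is masked, else None.
    """
    rng = rng or random.Random()
    n_chars = sum(len(w) for w in words)
    # Target number of masked characters; at least one for a non-empty input.
    budget = max(1, round(n_chars * mask_ratio))

    # Visit words in random order and select whole words until the budget is met.
    order = list(range(len(words)))
    rng.shuffle(order)
    chosen, masked = set(), 0
    for i in order:
        if masked >= budget:
            break
        chosen.add(i)
        masked += len(words[i])

    tokens, labels = [], []
    for i, word in enumerate(words):
        for ch in word:
            if i in chosen:
                tokens.append(MASK)  # every character of a chosen word is masked
                labels.append(ch)
            else:
                tokens.append(ch)
                labels.append(None)
    return tokens, labels


if __name__ == "__main__":
    sentence = ["使用", "语言", "模型", "来", "预测"]
    tokens, labels = whole_word_mask(sentence, mask_ratio=0.3, rng=random.Random(0))
    print(tokens)
```

Because selection happens at the word level, the number of masked characters may slightly exceed the target when the last selected word is long; a production pipeline would likely handle that budget more carefully.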

Abstract (original)

In natural language processing, pre-trained language models have become essential infrastructures. However, these models often suffer from issues such as large size, long inference time, and challenging deployment. Moreover, most mainstream pre-trained models focus on English, and there are insufficient studies on small Chinese pre-trained models. In this paper, we introduce MiniRBT, a small Chinese pre-trained model that aims to advance research in Chinese natural language processing. MiniRBT employs a narrow and deep student model and incorporates whole word masking and two-stage distillation during pre-training to make it well-suited for most downstream tasks. Our experiments on machine reading comprehension and text classification tasks reveal that MiniRBT achieves 94% performance relative to RoBERTa, while providing a 6.8x speedup, demonstrating its effectiveness and efficiency.

arXiv information

Authors	Xin Yao, Ziqing Yang, Yiming Cui, Shijin Wang
Published	2023-04-03 04:45:57+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, OpenAI
