A Comprehensive Comparison of Pre-training Language Models

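Summary

Pre-trained language models have recently pushed natural language processing (NLP) tasks to a new state of the art.
This paper examines the efficiency of various pre-trained language models.
A set of transformer-based models is pre-trained on the same amount of text for the same number of training steps.
The experiments show that the largest improvement over the original BERT comes from adding an RNN layer, which captures more contextual information for short-text understanding.
The overall conclusion, however, is that models with BERT-like structures show no remarkable improvement on short-text understanding.
Data-centric methods [12] can achieve better performance.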
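
To make the "RNN layer" idea concrete, here is a minimal sketch of a simple recurrent layer run over per-token vectors. The abstract does not say where the RNN layer sits in the model or which RNN variant is used, so the placement on top of encoder outputs, the Elman-style cell, the sizes, and the random weights are all assumptions made for illustration. The names `rnn_layer`, `random_matrix`, and `encoder_outputs` are hypothetical.

```python
import math
import random


def rnn_layer(token_vectors, w_in, w_rec, bias):
    """Run a simple (Elman) RNN over a sequence of token vectors.

    Returns one hidden state per token. Each state depends on the current
    token and, through the recurrent weights, on every earlier token.
    """
    hidden_size = len(bias)
    h = [0.0] * hidden_size  # initial hidden state
    states = []
    for x in token_vectors:
        # New state: tanh(W_in x + W_rec h_prev + b), computed per hidden unit.
        h = [
            math.tanh(
                sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][k] * h[k] for k in range(hidden_size))
                + bias[j]
            )
            for j in range(hidden_size)
        ]
        states.append(h)
    return states


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
embed_size, hidden_size, seq_len = 4, 3, 5  # toy sizes, chosen arbitrarily

# Stand-in for per-token encoder outputs (assumed; not real BERT output).
encoder_outputs = random_matrix(seq_len, embed_size, rng)

w_in = random_matrix(hidden_size, embed_size, rng)
w_rec = random_matrix(hidden_size, hidden_size, rng)
bias = [0.0] * hidden_size

states = rnn_layer(encoder_outputs, w_in, w_rec, bias)
sentence_vector = states[-1]  # last state summarizes the whole short text
```

In this sketch the last hidden state serves as a single vector for the whole input, which is one common way to feed a short-text classifier. Whether the paper does the same is not stated in the abstract.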

Summary (original)

Recently, the development of pre-trained language models has brought natural language processing (NLP) tasks to the new state-of-the-art. In this paper we explore the efficiency of various pre-trained language models. We pre-train a list of transformer-based models with the same amount of text and the same training steps. The experimental results shows that the most improvement upon the origin BERT is adding the RNN-layer to capture more contextual information for short text understanding. But the conclusion is: There are no remarkable improvement for short text understanding for similar BERT structures. Data-centric method[12] can achieve better performance.

arXiv information

Author	Tong Guo
Published	2023-07-26 01:56:20+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
