Integrating Pre-trained Language Model into Neural Machine Translation

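Summary

Neural machine translation (NMT) has become a key technology in natural language processing through extensive research and development.
However, the shortage of high-quality bilingual parallel data remains a major obstacle to improving NMT performance.
Recent studies have explored using contextual information from pre-trained language models (PLMs) to address this problem.
Yet the incompatibility between PLMs and NMT models remains unresolved.
This study proposes the PLM-integrated NMT (PiNMT) model to overcome these problems.
The PiNMT model consists of three critical components, PLM Multi Layer Converter, Embedding Fusion, and Cosine Alignment, each of which plays an important role in supplying effective PLM information to NMT.
The paper also introduces two training strategies, Separate Learning Rates and Dual Step Training.
With the proposed PiNMT model and training strategies, the approach achieves state-of-the-art performance on the IWSLT’14 En$\leftrightarrow$De dataset.
The results are noteworthy because they demonstrate a new approach to efficiently integrating PLMs with NMT that overcomes the incompatibility and improves performance.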
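
The abstract names Cosine Alignment as a component but does not define it. As a rough illustration only, the sketch below assumes it refers to a loss that pulls NMT hidden vectors toward the corresponding PLM vectors by penalizing `1 - cosine similarity`. The function names `cosine_similarity` and `cosine_alignment_loss`, and the token-by-token pairing of the two vector sequences, are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def cosine_alignment_loss(nmt_states, plm_states):
    """Mean of (1 - cosine similarity) over paired token vectors.

    Assumption: both inputs are lists of per-token vectors with the same
    number of tokens, already projected to the same dimension.
    """
    if not nmt_states or len(nmt_states) != len(plm_states):
        raise ValueError("inputs must be non-empty and equally long")
    total = sum(
        1.0 - cosine_similarity(h, p) for h, p in zip(nmt_states, plm_states)
    )
    return total / len(nmt_states)


if __name__ == "__main__":
    # Toy vectors standing in for NMT and PLM hidden states.
    nmt = [[1.0, 0.0], [0.0, 2.0]]
    plm = [[2.0, 0.0], [1.0, 0.0]]
    print(cosine_alignment_loss(nmt, plm))
```

Under this assumed definition the loss is 0 when every pair of vectors points in the same direction and grows to 2 when they point in opposite directions, so minimizing it aligns directions without constraining vector magnitudes.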

Abstract (original)

Neural Machine Translation (NMT) has become a significant technology in natural language processing through extensive research and development. However, the deficiency of high-quality bilingual language pair data still poses a major challenge to improving NMT performance. Recent studies have been exploring the use of contextual information from pre-trained language model (PLM) to address this problem. Yet, the issue of incompatibility between PLM and NMT model remains unresolved. This study proposes PLM-integrated NMT (PiNMT) model to overcome the identified problems. PiNMT model consists of three critical components, PLM Multi Layer Converter, Embedding Fusion, and Cosine Alignment, each playing a vital role in providing effective PLM information to NMT. Furthermore, two training strategies, Separate Learning Rates and Dual Step Training, are also introduced in this paper. By implementing the proposed PiNMT model and training strategy, we achieve state-of-the-art performance on the IWSLT’14 En$\leftrightarrow$De dataset. This study’s outcomes are noteworthy as they demonstrate a novel approach for efficiently integrating PLM with NMT to overcome incompatibility and enhance performance.

arXiv information

Authors	Soon-Jae Hwang, Chang-Sung Jeong
Published	2023-11-22 16:12:39+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
