Epi-Curriculum: Episodic Curriculum Learning for Low-Resource Domain Adaptation in Neural Machine Translation

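Summary

Neural machine translation (NMT) models have been successful, but their performance remains poor when translating in new domains where only limited data is available.
This paper introduces Epi-Curriculum, a new approach to low-resource domain adaptation (DA) that combines a new episodic training framework with denoised curriculum learning.
The episodic training framework strengthens the model's robustness to domain shift by episodically exposing the encoder/decoder to an inexperienced decoder/encoder.
Denoised curriculum learning filters out noisy data and further improves the model's adaptability by gradually guiding the learning process from easy tasks to more difficult ones.
Experiments on English-German and English-Romanian translation show that (i) Epi-Curriculum improves both the robustness and the adaptability of the model in seen and unseen domains, and (ii) the episodic training framework strengthens the robustness of the encoder and decoder to domain shift.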
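
The general idea of denoised curriculum learning (filter out noisy data, then train from easy to hard) can be illustrated with a minimal sketch. This is not the paper's implementation: the `Sample` fields, the `difficulty` and `noise` scores, the threshold, and the `build_curriculum` helper are all hypothetical names assumed here for illustration, and the abstract does not say how the paper scores difficulty or noise.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    source: str
    target: str
    difficulty: float  # hypothetical score: higher means harder
    noise: float       # hypothetical score: higher means noisier


def build_curriculum(samples, noise_threshold, num_stages):
    """Drop noisy samples, then split the rest into easy-to-hard stages."""
    # Denoising step: keep only samples at or below the assumed noise threshold.
    clean = [s for s in samples if s.noise <= noise_threshold]
    # Curriculum step: order the remaining samples from easiest to hardest.
    ordered = sorted(clean, key=lambda s: s.difficulty)
    # Ceiling division so the samples fit into at most num_stages stages.
    stage_size = max(1, -(-len(ordered) // num_stages))
    return [ordered[i:i + stage_size] for i in range(0, len(ordered), stage_size)]


samples = [
    Sample("a", "A", difficulty=0.9, noise=0.1),
    Sample("b", "B", difficulty=0.2, noise=0.0),
    Sample("c", "C", difficulty=0.5, noise=0.8),  # filtered out as noisy
    Sample("d", "D", difficulty=0.4, noise=0.2),
]
stages = build_curriculum(samples, noise_threshold=0.5, num_stages=2)
```

A training loop would then iterate over `stages` in order, so the model sees the easier samples before the harder ones.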

Abstract (original)

Neural Machine Translation (NMT) models have become successful, but their performance remains poor when translating on new domains with a limited number of data. In this paper, we present a novel approach Epi-Curriculum to address low-resource domain adaptation (DA), which contains a new episodic training framework along with denoised curriculum learning. Our episodic training framework enhances the model’s robustness to domain shift by episodically exposing the encoder/decoder to an inexperienced decoder/encoder. The denoised curriculum learning filters the noised data and further improves the model’s adaptability by gradually guiding the learning process from easy to more difficult tasks. Experiments on English-German and English-Romanian translation show that: (i) Epi-Curriculum improves both model’s robustness and adaptability in seen and unseen domains; (ii) Our episodic training framework enhances the encoder and decoder’s robustness to domain shift.

arXiv information

Authors	Keyu Chen, Di Zhuang, Mingchen Li, J. Morris Chang
Published	2023-09-06 00:59:27+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
