Optimizing the Training Schedule of Multilingual NMT using Reinforcement Learning

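Summary

Multilingual NMT is a viable solution for translating low-resource languages (LRLs) when data from high-resource languages (HRLs) of the same language family is available.
However, the training schedule, that is, the order in which languages are presented, affects the quality of such systems.
For a many-to-one translation setting, the authors propose applying two algorithms that use reinforcement learning to optimize the NMT training schedule: (1) Teacher-Student Curriculum Learning and (2) Deep Q Network.
The former keeps an exponentially smoothed estimate of the return of each action, based on the loss on monolingual or multilingual development subsets. The latter estimates rewards with an additional neural network trained on the history of actions selected in different states of the system, together with the rewards received.
On an 8-to-1 translation dataset containing LRLs and HRLs, the second method improves BLEU and COMET scores over both random selection of monolingual batches and shuffled multilingual batches, by adjusting how often LRL batches are presented relative to HRL batches.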

Summary (original)

Multilingual NMT is a viable solution for translating low-resource languages (LRLs) when data from high-resource languages (HRLs) from the same language family is available. However, the training schedule, i.e. the order of presentation of languages, has an impact on the quality of such systems. Here, in a many-to-one translation setting, we propose to apply two algorithms that use reinforcement learning to optimize the training schedule of NMT: (1) Teacher-Student Curriculum Learning and (2) Deep Q Network. The former uses an exponentially smoothed estimate of the returns of each action based on the loss on monolingual or multilingual development subsets, while the latter estimates rewards using an additional neural network trained from the history of actions selected in different states of the system, together with the rewards received. On a 8-to-1 translation dataset with LRLs and HRLs, our second method improves BLEU and COMET scores with respect to both random selection of monolingual batches and shuffled multilingual batches, by adjusting the number of presentations of LRL vs. HRL batches.
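
The exponentially smoothed estimate of per-action returns mentioned above can be illustrated with a small scheduler. This is a minimal sketch, not the authors' implementation: the class name `SmoothedScheduler`, the epsilon-greedy selection rule, the default values of `alpha` and `epsilon`, and the choice of reward (for example, the decrease in development loss after training on a batch) are all assumptions made for illustration.

```python
import random


class SmoothedScheduler:
    """Hypothetical scheduler: picks the next training language from
    exponentially smoothed reward estimates (one estimate per language)."""

    def __init__(self, languages, alpha=0.1, epsilon=0.1, seed=0):
        # One smoothed return estimate per action (here: per language).
        self.q = {lang: 0.0 for lang in languages}
        self.alpha = alpha      # smoothing factor (assumed value)
        self.epsilon = epsilon  # exploration rate (assumed value)
        self.rng = random.Random(seed)

    def choose(self):
        # Explore with probability epsilon, otherwise pick the language
        # with the highest smoothed estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.q))
        return max(self.q, key=self.q.get)

    def update(self, lang, reward):
        # Exponential smoothing: new = alpha * reward + (1 - alpha) * old.
        self.q[lang] = self.alpha * reward + (1 - self.alpha) * self.q[lang]
```

In a training loop, one would call `choose()` to select the language of the next batch, train on that batch, measure a reward on a development subset, and pass it to `update()`. How the reward is defined and how exploration is handled are design choices that this sketch does not take from the paper.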

arXiv information

Authors	Alexis Allemann,Àlex R. Atrio,Andrei Popescu-Belis
Published	2025-06-02 07:35:16+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
