ACT-MNMT Auto-Constriction Turning for Multilingual Neural Machine Translation

Abstract

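Large language models (LLMs) have achieved promising performance on multilingual machine translation tasks through zero/few-shot prompting or prompt-tuning.
However, because multilingual data is mixed during LLM pre-training, LLM-based translation models face the off-target issue under both prompt-based methods, which covers a series of phenomena: instruction misunderstanding, translation into the wrong language, and over-generation.
To address this issue, the paper introduces an Auto-Constriction Turning mechanism for Multilingual Neural Machine Translation (ACT-MNMT), a novel supervised fine-tuning mechanism that is orthogonal to the traditional prompt-based methods.
In this method, ACT-MNMT automatically constructs a constrained template on the target side by adding trigger tokens ahead of the ground truth.
Furthermore, the trigger tokens can be freely arranged and combined to represent different task semantics, and they can be iteratively updated to maximize the label likelihood.
Experiments are performed on WMT test sets with multiple metrics, and the results show that ACT-MNMT achieves substantially improved performance across multiple translation directions and reduces off-target phenomena in translation.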
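
The target-side template described in the abstract can be pictured with a minimal sketch. It is only an illustration under stated assumptions, not the authors' implementation: the function names, the `<en2de_0>`-style token naming, and the whitespace joining are all hypothetical, and the iterative update of the trigger tokens during fine-tuning is omitted entirely.

```python
from typing import List, Optional


def build_trigger_tokens(task: str, count: int) -> List[str]:
    """Return `count` placeholder trigger tokens for one task.

    The "<task_i>" naming scheme is hypothetical; the abstract only says
    that trigger tokens can be arranged and combined to represent
    different task semantics.
    """
    return [f"<{task}_{i}>" for i in range(count)]


def build_constrained_target(trigger_tokens: List[str], ground_truth: str) -> str:
    """Prepend the trigger tokens to the reference translation.

    This mirrors "adding trigger tokens ahead of the ground truth" on the
    target side; joining with single spaces is an assumption.
    """
    return " ".join(trigger_tokens) + " " + ground_truth


def strip_triggers(output: str, trigger_tokens: List[str]) -> Optional[str]:
    """Remove the expected trigger prefix from a generated output.

    Returns None when the output does not start with the expected prefix,
    which a caller could treat as a sign that the template was not followed.
    """
    prefix = " ".join(trigger_tokens) + " "
    if output.startswith(prefix):
        return output[len(prefix):]
    return None


if __name__ == "__main__":
    triggers = build_trigger_tokens("en2de", 3)  # hypothetical direction tag
    target = build_constrained_target(triggers, "Guten Morgen")
    print(target)
    print(strip_triggers(target, triggers))
```

In this sketch the fine-tuning label would be the string returned by `build_constrained_target`, and `strip_triggers` recovers the plain translation at inference time. How the real method represents and updates the trigger tokens is not specified in the abstract.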

Abstract (original)

Large language model (LLM) has achieved promising performance in multilingual machine translation tasks through zero/few-shot prompts or prompt-tuning. However, due to the mixture of multilingual data during the pre-training of LLM, the LLM-based translation models face the off-target issue in both prompt-based methods, including a series of phenomena, namely instruction misunderstanding, translation with wrong language and over-generation. For this issue, this paper introduces an \textbf{\underline{A}}uto-\textbf{\underline{C}}onstriction \textbf{\underline{T}}urning mechanism for \textbf{\underline{M}}ultilingual \textbf{\underline{N}}eural \textbf{\underline{M}}achine \textbf{\underline{T}}ranslation (\model), which is a novel supervised fine-tuning mechanism and orthogonal to the traditional prompt-based methods. In this method, \model automatically constructs a constrained template in the target side by adding trigger tokens ahead of the ground truth. Furthermore, trigger tokens can be arranged and combined freely to represent different task semantics, and they can be iteratively updated to maximize the label likelihood. Experiments are performed on WMT test sets with multiple metrics, and the experimental results demonstrate that \model achieves substantially improved performance across multiple translation directions and reduce the off-target phenomena in the translation.

arXiv information

Authors	Shaojie Dai, Xin Liu, Ping Luo, Yue Yu
Published	2024-03-11 14:10:57+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
