Adapting Large Language Models for Document-Level Machine Translation

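Summary

Large language models (LLMs) have made significant progress across a variety of natural language processing (NLP) tasks.
Recent research shows that moderately sized LLMs often outperform larger LLMs after task-specific fine-tuning.
In this work, we examine in detail the process of adapting LLMs to specialize in document-level machine translation (DocMT) for a specific language pair.
First, we investigate how prompting strategies affect downstream translation performance.
Next, we run extensive experiments using two fine-tuning methods, three LLM backbones, and 18 translation tasks across nine language pairs.
Our findings show that these specialized models surpass GPT-4 in translation performance in some cases, while in other cases they still suffer significantly from off-target translation, even when fine-tuned exclusively on bilingual parallel documents.
We also provide an in-depth analysis of these DocMT-tailored LLMs, covering translation errors, the scaling law of parallel documents, out-of-domain generalization, and the impact of zero-shot cross-lingual transfer.
The results not only clarify the strengths and limitations of LLM-based DocMT models but also provide a foundation for future DocMT research.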

Abstract (original)

Large language models (LLMs) have made significant strides in various natural language processing (NLP) tasks. Recent research shows that the moderately-sized LLMs often outperform their larger counterparts after task-specific fine-tuning. In this work, we delve into the process of adapting LLMs to specialize in document-level machine translation (DocMT) for a specific language pair. Firstly, we explore how prompt strategies affect downstream translation performance. Then, we conduct extensive experiments with two fine-tuning methods, three LLM backbones, and 18 translation tasks across nine language pairs. Our findings indicate that in some cases, these specialized models even surpass GPT-4 in translation performance, while they still significantly suffer from the off-target translation issue in others, even if they are exclusively fine-tuned on bilingual parallel documents. Furthermore, we provide an in-depth analysis of these LLMs tailored for DocMT, exploring aspects such as translation errors, the scaling law of parallel documents, out-of-domain generalization, and the impact of zero-shot crosslingual transfer. The findings of this research not only shed light on the strengths and limitations of LLM-based DocMT models but also provide a foundation for future research in DocMT.

arXiv information

Authors	Minghao Wu, Thuy-Trang Vu, Lizhen Qu, George Foster, Gholamreza Haffari
Published	2024-01-12 09:29:13+00:00

Source and services used

arxiv.jp, Google
