Self-Augmented In-Context Learning for Unsupervised Word Translation

Summary

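Recent work has shown that large language models (LLMs) have strong word translation, or bilingual lexicon induction (BLI), capabilities in few-shot setups, yet in the unsupervised scenario they still cannot match the performance of "traditional" mapping-based approaches.
In that scenario no seed translation pairs are available, and the gap is largest for lower-resource languages.
To address this challenge with LLMs, the authors propose self-augmented in-context learning (SAIL) for unsupervised BLI. Starting from a zero-shot prompt, SAIL iteratively induces a set of high-confidence word translation pairs from an LLM and then reapplies those pairs to the same LLM as in-context learning (ICL) examples.
On two established BLI benchmarks covering a wide range of language pairs, the method shows substantial gains over zero-shot prompting of LLMs and also outperforms mapping-based baselines across the board.
Beyond achieving state-of-the-art unsupervised BLI performance, the paper presents comprehensive analyses of SAIL and discusses its limitations.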

Summary (original)

Recent work has shown that, while large language models (LLMs) demonstrate strong word translation or bilingual lexicon induction (BLI) capabilities in few-shot setups, they still cannot match the performance of ‘traditional’ mapping-based approaches in the unsupervised scenario where no seed translation pairs are available, especially for lower-resource languages. To address this challenge with LLMs, we propose self-augmented in-context learning (SAIL) for unsupervised BLI: starting from a zero-shot prompt, SAIL iteratively induces a set of high-confidence word translation pairs for in-context learning (ICL) from an LLM, which it then reapplies to the same LLM in the ICL fashion. Our method shows substantial gains over zero-shot prompting of LLMs on two established BLI benchmarks spanning a wide range of language pairs, also outperforming mapping-based baselines across the board. In addition to achieving state-of-the-art unsupervised BLI performance, we also conduct comprehensive analyses on SAIL and discuss its limitations.
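The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how "high-confidence" pairs are selected, so the sketch assumes a round-trip (back-translation) consistency check as one plausible criterion. The `Translate` interface, the function names, and the toy English-German "model" are all hypothetical stand-ins for real LLM calls.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]
# Hypothetical stand-in for an LLM call: (word, in-context pairs) -> translation.
Translate = Callable[[str, List[Pair]], str]


def high_confidence_pairs(
    words: List[str], forward: Translate, backward: Translate, examples: List[Pair]
) -> List[Pair]:
    """Keep (w, t) only if translating t back yields w (assumed criterion)."""
    reversed_examples = [(t, s) for s, t in examples]
    pairs = []
    for w in words:
        t = forward(w, examples)
        if backward(t, reversed_examples) == w:
            pairs.append((w, t))
    return pairs


def sail(
    words: List[str], forward: Translate, backward: Translate, iterations: int = 2
) -> List[Pair]:
    """Iteratively grow an in-context dictionary, starting from zero-shot."""
    examples: List[Pair] = []  # first pass is zero-shot: no in-context examples
    for _ in range(iterations):
        # Pairs induced in this pass become the ICL examples of the next pass.
        examples = high_confidence_pairs(words, forward, backward, examples)
    return examples


# Toy "model" (hypothetical): it gets "house" wrong until it has seen
# at least two in-context examples.
TOY_FWD = {"dog": "Hund", "cat": "Katze", "house": "Haus"}
TOY_BWD = {target: source for source, target in TOY_FWD.items()}


def toy_forward(word: str, examples: List[Pair]) -> str:
    if word == "house" and len(examples) < 2:
        return "Heim"
    return TOY_FWD.get(word, word)


def toy_backward(word: str, examples: List[Pair]) -> str:
    return TOY_BWD.get(word, word)


WORDS = ["dog", "cat", "house"]
print(sail(WORDS, toy_forward, toy_backward, iterations=1))
print(sail(WORDS, toy_forward, toy_backward, iterations=2))
```

With the toy model, the first pass keeps only the two pairs that survive the round trip; the second pass, now prompted with those two pairs, also recovers the third. In a real setting `forward` and `backward` would wrap prompts to the same LLM in the two translation directions.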

arXiv information

Authors	Yaoyiran Li, Anna Korhonen, Ivan Vulić
Published	2024-02-15 15:43:05+00:00
arXiv page	arxiv_id(pdf)

Provider, services used

arxiv.jp, Google
