Multilingual Retrieval Augmented Generation for Culturally-Sensitive Tasks: A Benchmark for Cross-lingual Robustness

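Summary

The retrieval-augmented generation (RAG) paradigm helps mitigate hallucinations in large language models (LLMs).
However, RAG also introduces the biases contained in the retrieved documents.
These biases can be amplified in multilingual, culturally sensitive scenarios such as territorial disputes.
In this paper, we introduce BordIRLines, a benchmark consisting of 720 territorial dispute queries paired with 14k Wikipedia documents across 49 languages.
To evaluate the cross-lingual robustness of LLMs on this task, we formalize several modes of multilingual retrieval.
Our experiments on several LLMs show that retrieving multilingual documents improves response consistency the most and lowers geopolitical bias compared with using purely in-language documents, which shows how incorporating diverse perspectives improves robustness.
In addition, queries in low-resource languages show a much wider variance in the linguistic distribution of response citations.
Our further experiments and case studies investigate how cross-lingual RAG is affected by aspects ranging from IR to document contents.
We release our benchmark and code at https://huggingface.co/datasets/borderlines/bordirlines to support further research toward ensuring equitable information access across languages.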

Abstract (original)

The paradigm of retrieval-augmented generation (RAG) helps mitigate hallucinations of large language models (LLMs). However, RAG also introduces biases contained within the retrieved documents. These biases can be amplified in scenarios that are multilingual and culturally sensitive, such as territorial disputes. In this paper, we introduce BordIRLines, a benchmark consisting of 720 territorial dispute queries paired with 14k Wikipedia documents across 49 languages. To evaluate LLMs’ cross-lingual robustness for this task, we formalize several modes for multilingual retrieval. Our experiments on several LLMs reveal that retrieving multilingual documents best improves response consistency and decreases geopolitical bias over using purely in-language documents, showing how incorporating diverse perspectives improves robustness. Also, querying in low-resource languages displays a much wider variance in the linguistic distribution of response citations. Our further experiments and case studies investigate how cross-lingual RAG is affected by aspects from IR to document contents. We release our benchmark and code to support further research towards ensuring equitable information access across languages at https://huggingface.co/datasets/borderlines/bordirlines.

arXiv information

Authors	Bryan Li, Fiona Luo, Samar Haider, Adwait Agashe, Tammy Li, Runqi Liu, Muqing Miao, Shriya Ramakrishnan, Yuan Yuan, Chris Callison-Burch
Published	2025-02-18 18:32:25+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
