Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse

Summary

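LLMs are an integral component of retrieval-augmented generation (RAG) systems.
While many studies focus on evaluating the overall quality of end-to-end RAG systems, there is a gap in understanding how well suited LLMs are to the RAG task.
To address this, we introduce Trust-Score, a holistic metric that evaluates the trustworthiness of LLMs within the RAG framework.
Our results show that various prompting methods, such as in-context learning, fail to adapt LLMs effectively to the RAG task as measured by Trust-Score.
We therefore propose Trust-Align, a method for aligning LLMs to improve their Trust-Score performance.
26 of the 27 models aligned with Trust-Align substantially outperform competitive baselines on ASQA, QAMPARI, and ELI5.
Specifically, on LLaMA-3-8b, Trust-Align outperforms FRONT on ASQA (up 12.56), QAMPARI (up 36.04), and ELI5 (up 17.69).
Trust-Align also significantly improves models' ability to refuse correctly and to provide high-quality citations.
We further demonstrate the effectiveness of Trust-Align across a range of open-weight models, including the LLaMA series (1b to 8b), the Qwen-2.5 series (0.5b to 7b), and Phi3.5 (3.8b).
We release our code at https://github.com/declare-lab/trust-align.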

Summary (original)

LLMs are an integral component of retrieval-augmented generation (RAG) systems. While many studies focus on evaluating the overall quality of end-to-end RAG systems, there is a gap in understanding the appropriateness of LLMs for the RAG task. To address this, we introduce Trust-Score, a holistic metric that evaluates the trustworthiness of LLMs within the RAG framework. Our results show that various prompting methods, such as in-context learning, fail to effectively adapt LLMs to the RAG task as measured by Trust-Score. Consequently, we propose Trust-Align, a method to align LLMs for improved Trust-Score performance. 26 out of 27 models aligned using Trust-Align substantially outperform competitive baselines on ASQA, QAMPARI, and ELI5. Specifically, in LLaMA-3-8b, Trust-Align outperforms FRONT on ASQA (up 12.56), QAMPARI (up 36.04), and ELI5 (up 17.69). Trust-Align also significantly enhances models’ ability to correctly refuse and provide quality citations. We also demonstrate the effectiveness of Trust-Align across different open-weight models, including the LLaMA series (1b to 8b), Qwen-2.5 series (0.5b to 7b), and Phi3.5 (3.8b). We release our code at https://github.com/declare-lab/trust-align.

arXiv information

Authors	Maojia Song, Shang Hong Sim, Rishabh Bhardwaj, Hai Leong Chieu, Navonil Majumder, Soujanya Poria
Published	2025-04-24 14:58:40+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
