Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval

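Abstract

Hallucinations in large language models (LLMs) are increasingly mitigated by letting the models search for information and ground their answers in real sources.
Unfortunately, LLMs often struggle to pose the right search queries, especially for complex or indirect topics.
Observing that LLMs can learn to search for relevant facts by $\textit{trying}$ different queries and learning to up-weight the queries that return relevant results, we introduce $\underline{Le}$arning to $\underline{Re}$trieve by $\underline{T}$rying (LeReT), a reinforcement learning framework that explores search queries and uses preference-based optimization to improve their quality.
LeReT can improve absolute retrieval accuracy by up to 29% and downstream generator evaluations by 17%.
Because it is simple and flexible, LeReT can be applied to arbitrary off-the-shelf retrievers, which makes it a promising technique for improving general LLM pipelines.
Project website: http://sherylhsu.com/LeReT/.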
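
The idea of trying several queries and up-weighting the ones that retrieve relevant results can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the reward or the pairing rule, so the recall-based reward, the pairing of every higher-reward query with every lower-reward one, the toy keyword retriever, and all names (`retrieval_reward`, `build_preference_pairs`, `toy_retrieve`, `TOY_INDEX`) are assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical toy corpus standing in for an off-the-shelf retriever's index.
TOY_INDEX = {
    "doc_a": "marie curie won the nobel prize in physics in 1903",
    "doc_b": "pierre curie was a french physicist",
    "doc_c": "the eiffel tower is in paris",
}


def toy_retrieve(query, k=2):
    """Rank documents by word overlap with the query and return up to k doc ids."""
    words = set(query.lower().split())
    scores = {doc_id: len(words & set(text.split()))
              for doc_id, text in TOY_INDEX.items()}
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    # Drop documents that share no word with the query.
    return [d for d in ranked[:k] if scores[d] > 0]


def retrieval_reward(retrieved, gold):
    """Assumed reward: fraction of gold documents present in the retrieved set."""
    gold_set = set(gold)
    if not gold_set:
        return 0.0
    return len(set(retrieved) & gold_set) / len(gold_set)


def build_preference_pairs(candidates, gold, retrieve, margin=0.0):
    """Try every candidate query, score it, and pair better queries with worse ones."""
    scored = [(q, retrieval_reward(retrieve(q), gold)) for q in candidates]
    pairs = []
    for (qa, ra), (qb, rb) in combinations(scored, 2):
        if abs(ra - rb) > margin:  # skip ties: they carry no preference signal
            chosen, rejected = (qa, qb) if ra > rb else (qb, qa)
            pairs.append({"chosen": chosen, "rejected": rejected})
    return pairs
```

Under these assumptions, calling `build_preference_pairs(["curie nobel prize physics", "paris tower"], ["doc_a", "doc_b"], toy_retrieve)` yields one pair that prefers the first query, because it retrieves both gold documents while the second retrieves neither. Pairs of this shape are the kind of input a preference-based optimizer consumes; in practice the candidate queries would be sampled from the LLM rather than written by hand.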

Abstract (original)

The hallucinations of large language models (LLMs) are increasingly mitigated by allowing LLMs to search for information and to ground their answers in real sources. Unfortunately, LLMs often struggle with posing the right search queries, especially when dealing with complex or otherwise indirect topics. Observing that LLMs can learn to search for relevant facts by $\textit{trying}$ different queries and learning to up-weight queries that successfully produce relevant results, we introduce $\underline{Le}$arning to $\underline{Re}$trieve by $\underline{T}$rying (LeReT), a reinforcement learning framework that explores search queries and uses preference-based optimization to improve their quality. LeReT can improve the absolute retrieval accuracy by up to 29% and the downstream generator evaluations by 17%. The simplicity and flexibility of LeReT allows it to be applied to arbitrary off-the-shelf retrievers and makes it a promising technique for improving general LLM pipelines. Project website: http://sherylhsu.com/LeReT/.

arXiv information

Authors	Sheryl Hsu, Omar Khattab, Chelsea Finn, Archit Sharma
Published	2024-10-31 01:34:16+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
