Recall Them All: Retrieval-Augmented Language Models for Long Object List Extraction from Long Documents

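Summary

Methods for relation extraction from text mostly focus on high precision, at the cost of limited recall.
High recall is crucial, however, for populating long lists of object entities that stand in a specific relation to a given subject.
Cues for relevant objects can be spread across many passages of a long text.
This poses the challenge of extracting long lists from long texts.
We present the L3X method, which tackles the problem in two stages: (1) recall-oriented generation using a large language model (LLM) with judicious techniques for retrieval augmentation, and (2) precision-oriented scrutinization to validate or prune candidates.
The L3X method outperforms LLM-only generations by a substantial margin.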

Summary (original)

Methods for relation extraction from text mostly focus on high precision, at the cost of limited recall. High recall is crucial, though, to populate long lists of object entities that stand in a specific relation with a given subject. Cues for relevant objects can be spread across many passages in long texts. This poses the challenge of extracting long lists from long texts. We present the L3X method which tackles the problem in two stages: (1) recall-oriented generation using a large language model (LLM) with judicious techniques for retrieval augmentation, and (2) precision-oriented scrutinization to validate or prune candidates. Our L3X method outperforms LLM-only generations by a substantial margin.
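
The two stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' L3X implementation: the abstract does not specify the retriever, the prompts, or the scrutinization criterion, so everything below is an assumption made for illustration. The term-overlap retriever, the support-count threshold used for pruning, and the names `retrieve`, `generate_candidates`, `scrutinize`, and `toy_generator` are all hypothetical, and `toy_generator` merely stands in for an LLM call.

```python
import re
from typing import Callable

# Hypothetical interface: any callable mapping (subject, relation, passage)
# to a list of candidate object strings. A real system would call an LLM here.
Generator = Callable[[str, str, str], list[str]]


def retrieve(passages: list[str], query: str, k: int) -> list[str]:
    """Rank passages by naive term overlap with the query (assumed retriever)."""
    terms = set(query.lower().split())

    def overlap(passage: str) -> int:
        return len(terms & set(re.findall(r"\w+", passage.lower())))

    # sorted() is stable, so ties keep their original document order.
    return sorted(passages, key=overlap, reverse=True)[:k]


def generate_candidates(subject: str, relation: str, passages: list[str],
                        generator: Generator) -> dict[str, int]:
    """Stage 1 (recall-oriented): pool candidates over all retrieved passages.

    Returns each candidate with the number of passages that produced it.
    """
    support: dict[str, int] = {}
    for passage in passages:
        for obj in set(generator(subject, relation, passage)):
            support[obj] = support.get(obj, 0) + 1
    return support


def scrutinize(support: dict[str, int], min_support: int = 2) -> list[str]:
    """Stage 2 (precision-oriented): prune weakly supported candidates.

    The support threshold is an assumed criterion, chosen only for illustration.
    """
    return sorted(obj for obj, n in support.items() if n >= min_support)


def toy_generator(subject: str, relation: str, passage: str) -> list[str]:
    """Stand-in for an LLM: returns capitalized tokens other than the subject."""
    return [w for w in re.findall(r"\b[A-Z][a-z]+\b", passage) if w != subject]


passages = [
    "Harry met Ron at school",
    "Ron and Harry are friends",
    "Harry saw Draco once",
]
top = retrieve(passages, "Harry friends", k=3)
support = generate_candidates("Harry", "friend", top, toy_generator)
print(scrutinize(support, min_support=2))
```

Lowering `min_support` keeps more candidates (higher recall, lower precision), while raising it prunes more aggressively. That is the trade-off the two-stage design is meant to manage.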

arXiv information

Authors	Sneha Singhania, Simon Razniewski, Gerhard Weikum
Published	2025-03-19 11:31:31+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
