Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models

Summary

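Large language models (LLMs) show positional bias in how they use context, which makes listwise ranking especially difficult.
To address this, the authors propose permutation self-consistency, a form of self-consistency applied to the ranked-list outputs of black-box LLMs.
The key idea is to marginalize out the different list orders in the prompt, producing an order-independent ranking with less positional bias.
First, given an input prompt, the list in the prompt is shuffled repeatedly and passed to the LLM while the instructions are held fixed.
Next, the resulting sample of rankings is aggregated by computing the central ranking closest in distance to all of them, which marginalizes out prompt-order bias in the process.
Theoretically, the method is proven robust: it converges to the true ranking in the presence of random perturbations.
Empirically, on five list-ranking datasets covering sorting and passage reranking, the approach improves scores over conventional inference by up to 7-18% for GPT-3.5 and 8-16% for LLaMA v2 (70B), surpassing the previous state of the art in passage reranking.
The code is available at https://github.com/castorini/perm-sc.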

Abstract (original)

Large language models (LLMs) exhibit positional bias in how they use context, which especially complicates listwise ranking. To address this, we propose permutation self-consistency, a form of self-consistency over ranking list outputs of black-box LLMs. Our key idea is to marginalize out different list orders in the prompt to produce an order-independent ranking with less positional bias. First, given some input prompt, we repeatedly shuffle the list in the prompt and pass it through the LLM while holding the instructions the same. Next, we aggregate the resulting sample of rankings by computing the central ranking closest in distance to all of them, marginalizing out prompt order biases in the process. Theoretically, we prove the robustness of our method, showing convergence to the true ranking in the presence of random perturbations. Empirically, on five list-ranking datasets in sorting and passage reranking, our approach improves scores from conventional inference by up to 7-18% for GPT-3.5 and 8-16% for LLaMA v2 (70B), surpassing the previous state of the art in passage reranking. Our code is at https://github.com/castorini/perm-sc.
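The shuffle-then-aggregate procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation (see their repository for that). It assumes Kendall tau distance as the distance between rankings, which the abstract does not specify, and it finds the central ranking by brute force, which is only feasible for very short lists. The names `kendall_tau_distance`, `central_ranking`, `permutation_self_consistency`, and `biased_ranker` are hypothetical, and `biased_ranker` is a toy stand-in for an LLM call.

```python
import itertools
import random
from typing import Callable, List, Sequence

Ranking = List[str]


def kendall_tau_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Count the item pairs that rankings a and b order differently."""
    pos_b = {item: i for i, item in enumerate(b)}
    # combinations(a, 2) yields (x, y) with x ranked before y in a.
    return sum(1 for x, y in itertools.combinations(a, 2) if pos_b[x] > pos_b[y])


def central_ranking(rankings: Sequence[Sequence[str]]) -> Ranking:
    """Brute-force the ranking with the smallest total distance to all samples.

    Assumption: every sample ranks the same set of items. The search is
    factorial in the list length, so this is for illustration only.
    """
    items = sorted(rankings[0])
    best = min(
        itertools.permutations(items),
        key=lambda cand: sum(kendall_tau_distance(cand, r) for r in rankings),
    )
    return list(best)


def permutation_self_consistency(
    items: Sequence[str],
    rank_fn: Callable[[List[str]], Ranking],
    num_shuffles: int = 20,
    seed: int = 0,
) -> Ranking:
    """Shuffle the input list repeatedly, rank each shuffle, then aggregate."""
    rng = random.Random(seed)
    samples = []
    for _ in range(num_shuffles):
        shuffled = list(items)
        rng.shuffle(shuffled)              # vary only the list order in the "prompt"
        samples.append(rank_fn(shuffled))  # the instructions (rank_fn) stay the same
    return central_ranking(samples)


def biased_ranker(prompt_items: List[str]) -> Ranking:
    """Hypothetical stand-in for an LLM: sorts correctly, except that whatever
    appeared first in the prompt is always placed at the top (positional bias)."""
    first, rest = prompt_items[0], prompt_items[1:]
    return [first] + sorted(rest)


if __name__ == "__main__":
    items = ["3", "5", "1", "4", "2"]
    print("single pass:", biased_ranker(list(items)))
    print("aggregated: ", permutation_self_consistency(items, biased_ranker))
```

In this toy setup a single pass always promotes the first prompt item, while the aggregate depends on each item being promoted in only a minority of shuffles. For realistic list lengths the brute-force search would need to be replaced by an efficient rank-aggregation method.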

arXiv information

Authors	Raphael Tang, Xinyu Zhang, Xueguang Ma, Jimmy Lin, Ferhan Ture
Published	2024-04-22 17:53:48+00:00

Source and services used

arxiv.jp, Google
