Benchmarking and Improving Generator-Validator Consistency of Language Models

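Summary

As of September 2023, ChatGPT correctly answers ‘what is 7+8’ with 15, but when asked ‘7+8=15, True or False’ it answers ‘False’. This kind of inconsistency between generating an answer and validating it is widespread in language models (LMs) and undermines trust in them. In this paper, we propose a framework for measuring the consistency between generation and validation (which we call generator-validator consistency, or GV-consistency), and find that even GPT-4, a state-of-the-art LM, is GV-consistent only 76% of the time. To improve the consistency of LMs, we propose fine-tuning on filtered generator and validator responses that are GV-consistent, and call this approach consistency fine-tuning. This approach improves the GV-consistency of Alpaca-30B from 60% to 93%, and the improvement extrapolates to unseen tasks and domains (for example, GV-consistency for positive style transfer extrapolates to unseen styles such as humor). Beyond improving consistency, consistency fine-tuning improves both generator quality and validator accuracy without using any labeled data. Evaluated on 6 tasks, including math questions, knowledge-intensive QA, and instruction following, our method improves generator quality by 16% and validator accuracy by 6.3% across all tasks.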

Summary (original)

As of September 2023, ChatGPT correctly answers ‘what is 7+8’ with 15, but when asked ‘7+8=15, True or False’ it responds with ‘False’. This inconsistency between generating and validating an answer is prevalent in language models (LMs) and erodes trust. In this paper, we propose a framework for measuring the consistency between generation and validation (which we call generator-validator consistency, or GV-consistency), finding that even GPT-4, a state-of-the-art LM, is GV-consistent only 76% of the time. To improve the consistency of LMs, we propose to finetune on the filtered generator and validator responses that are GV-consistent, and call this approach consistency fine-tuning. We find that this approach improves GV-consistency of Alpaca-30B from 60% to 93%, and the improvement extrapolates to unseen tasks and domains (e.g., GV-consistency for positive style transfers extrapolates to unseen styles like humor). In addition to improving consistency, consistency fine-tuning improves both generator quality and validator accuracy without using any labeled data. Evaluated across 6 tasks, including math questions, knowledge-intensive QA, and instruction following, our method improves the generator quality by 16% and the validator accuracy by 6.3% across all tasks.
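The measurement and filtering steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the simplest reading of the 7+8 example, in which a record counts as GV-consistent when the validator affirms the generator's own answer. The names `GVRecord`, `collect_records`, `gv_consistency`, and `filter_consistent`, and the `generate` and `validate` callables, are hypothetical stand-ins for model queries.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class GVRecord:
    question: str
    answer: str    # what the generator produced for the question
    verdict: bool  # validator's True/False judgement of that same answer


def collect_records(
    questions: List[str],
    generate: Callable[[str], str],        # hypothetical generator query
    validate: Callable[[str, str], bool],  # hypothetical validator query
) -> List[GVRecord]:
    """Ask the generator for an answer, then ask the validator about it."""
    records = []
    for question in questions:
        answer = generate(question)
        verdict = validate(question, answer)
        records.append(GVRecord(question, answer, verdict))
    return records


def gv_consistency(records: List[GVRecord]) -> float:
    """Fraction of records where the validator agrees with the generator.

    Assumption: agreement means the validator says True to the
    generator's own answer.
    """
    if not records:
        raise ValueError("need at least one record")
    return sum(r.verdict for r in records) / len(records)


def filter_consistent(records: List[GVRecord]) -> List[GVRecord]:
    """Keep only the consistent records, as candidate fine-tuning data."""
    return [r for r in records if r.verdict]
```

In this sketch, the kept records would then be formatted as generator and validator training examples for fine-tuning; that step depends on the model and training setup and is left out here.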

arXiv information

Authors	Xiang Lisa Li, Vaishnavi Shrivastava, Siyan Li, Tatsunori Hashimoto, Percy Liang
Published	2023-10-03 07:23:22+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
