Prompting GPT-3 To Be Reliable

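Summary

Large language models (LLMs) show impressive capabilities through few-shot prompting.
Commercialized APIs such as OpenAI GPT-3 further increase their use in real-world language applications.
However, the crucial question of how to improve GPT-3's reliability remains under-explored.
Reliability is a broad and vaguely defined term, so we decompose it into four main facets that correspond to the existing ML safety framework and are widely recognized as important: generalizability, social biases, calibration, and factuality.
Our core contribution is to establish simple and effective prompts that improve GPT-3's reliability as it 1) generalizes out-of-distribution, 2) balances the demographic distribution and uses natural language instructions to reduce social biases, 3) calibrates output probabilities, and 4) updates the LLM's factual knowledge and reasoning chains.
With appropriate prompts, GPT-3 is more reliable than smaller-scale supervised models on all of these facets.
We release all processed datasets, evaluation scripts, and model predictions.
Our systematic empirical study not only offers new insights into the reliability of prompting LLMs; more importantly, our prompting strategies can help practitioners use LLMs like GPT-3 more reliably.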

Abstract (original)

Large language models (LLMs) show impressive abilities via few-shot prompting. Commercialized APIs such as OpenAI GPT-3 further increase their use in real-world language applications. However, the crucial problem of how to improve the reliability of GPT-3 is still under-explored. While reliability is a broad and vaguely defined term, we decompose reliability into four main facets that correspond to the existing framework of ML safety and are well-recognized to be important: generalizability, social biases, calibration, and factuality. Our core contribution is to establish simple and effective prompts that improve GPT-3’s reliability as it: 1) generalizes out-of-distribution, 2) balances demographic distribution and uses natural language instructions to reduce social biases, 3) calibrates output probabilities, and 4) updates the LLM’s factual knowledge and reasoning chains. With appropriate prompts, GPT-3 is more reliable than smaller-scale supervised models on all these facets. We release all processed datasets, evaluation scripts, and model predictions. Our systematic empirical study not only sheds new insights on the reliability of prompting LLMs, but more importantly, our prompting strategies can help practitioners more reliably use LLMs like GPT-3.
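
As an illustration of the calibration facet, the sketch below computes expected calibration error (ECE), one common way to measure how well output probabilities match observed accuracy. The abstract does not say which metric the authors use, so the choice of ECE, the function name `expected_calibration_error`, and the equal-width binning are all assumptions made for this example.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Illustrative ECE with equal-width confidence bins (an assumed metric,
    not necessarily the one used in the paper)."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Group predictions by confidence; a confidence of 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    # Weighted average of |mean confidence - accuracy| over non-empty bins.
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_conf = sum(conf for conf, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / total) * abs(mean_conf - accuracy)
    return ece


if __name__ == "__main__":
    # Hypothetical model confidences and whether each answer was right.
    print(expected_calibration_error([0.9, 0.9, 0.6, 0.6], [True, True, True, False]))
```

A lower value means the stated confidences track actual accuracy more closely; a model that is always right and always fully confident scores 0.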

arXiv information

Authors	Chenglei Si, Zhe Gan, Zhengyuan Yang, Shuohang Wang, Jianfeng Wang, Jordan Boyd-Graber, Lijuan Wang
Published	2023-02-15 02:24:43+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
