Explanation-based Finetuning Makes Models More Robust to Spurious Cues

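Summary

Large language models (LLMs) are powerful enough that they sometimes learn correlations between labels and task-irrelevant features, which leads to poor generalization on out-of-distribution data.
We propose explanation-based finetuning as a general approach to reducing LLMs' reliance on spurious correlations.
Unlike standard finetuning, in which the model only predicts the answer for a given input, we finetune the model to also generate a free-text explanation supporting its answer.
To evaluate the method, we finetune models on artificially constructed training sets containing different types of spurious cues and test them on a test set without those cues.
Compared with standard finetuning, our method makes GPT-3 (davinci) markedly more robust to spurious cues, as measured by accuracy drop, across four classification tasks: ComVE (+1.2), CREAK (+9.1), e-SNLI (+15.4), and SBIC (+6.5).
The benefit generalizes across multiple model families and scales, with larger gains for larger models.
Finally, the method also works well with model-generated explanations, which suggests it can be applied to more datasets that lack human-written explanations.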

Summary (original)

Large Language Models (LLMs) are so powerful that they sometimes learn correlations between labels and features that are irrelevant to the task, leading to poor generalization on out-of-distribution data. We propose explanation-based finetuning as a general approach to mitigate LLMs’ reliance on spurious correlations. Unlike standard finetuning where the model only predicts the answer given the input, we finetune the model to additionally generate a free-text explanation supporting its answer. To evaluate our method, we finetune the model on artificially constructed training sets containing different types of spurious cues, and test it on a test set without these cues. Compared to standard finetuning, our method makes GPT-3 (davinci) remarkably more robust against spurious cues in terms of accuracy drop across four classification tasks: ComVE (+1.2), CREAK (+9.1), e-SNLI (+15.4), and SBIC (+6.5). The efficacy generalizes across multiple model families and scales, with greater gains for larger models. Finally, our method also works well with explanations generated by the model, implying its applicability to more datasets without human-written explanations.
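
The contrast between the two finetuning targets, and the idea of a training set with an injected spurious cue, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names, the prompt/completion layout, the choice to place the explanation before the answer, and the `@@ ` cue token.

```python
import random


def add_spurious_cue(text, label, cue="@@ ", strength=1.0, rng=None):
    """Prepend a hypothetical cue token so that it correlates with the label.

    Positive examples (label == 1) receive the cue with probability
    `strength`, all other examples with probability 1 - strength.
    The cue token itself is an illustrative assumption.
    """
    rng = rng or random.Random(0)
    p = strength if label == 1 else 1.0 - strength
    return cue + text if rng.random() < p else text


def standard_example(text, answer):
    """Standard finetuning target: the model emits only the answer."""
    return {"prompt": f"{text}\nOutput:", "completion": f" Answer: {answer}"}


def explanation_example(text, answer, explanation):
    """Explanation-based target: a free-text explanation plus the answer.

    Putting the explanation first is an assumption of this sketch.
    """
    return {
        "prompt": f"{text}\nOutput:",
        "completion": f" Explanation: {explanation}\nAnswer: {answer}",
    }


if __name__ == "__main__":
    # Hypothetical training instance, written for this sketch.
    text = add_spurious_cue("He put an elephant into the fridge.", label=1)
    print(standard_example(text, "nonsensical"))
    print(explanation_example(text, "nonsensical",
                              "An elephant is far too large to fit in a fridge."))
```

Both builders share the same prompt, so the only difference between the two training regimes in this sketch is what the model is asked to generate. A cue-free test set would be built by skipping `add_spurious_cue` altogether.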

arXiv information

Authors	Josh Magnus Ludan, Yixuan Meng, Tai Nguyen, Saurabh Shah, Qing Lyu, Marianna Apidianaki, Chris Callison-Burch
Published	2023-06-06 15:31:33+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
