Dancing in Chains: Reconciling Instruction Following and Faithfulness in Language Models

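Summary

Modern language models (LMs) need to follow human instructions while remaining faithful.
However, they often fail to achieve both.
Here, we provide concrete evidence of a trade-off between instruction following (i.e., following open-ended instructions) and faithfulness (i.e., grounding responses in the given context) when training LMs with these objectives.
For example, fine-tuning LLaMA-7B on instruction-following datasets makes it less faithful.
Conversely, the instruction-tuned Vicuna-7B performs worse at following instructions when it is further optimized on tasks that require contextual grounding.
One common remedy is multi-task learning (MTL) with data mixing, yet it remains far from achieving a synergistic outcome.
We propose a simple yet effective method that relies on Rejection Sampling for Continued Self-instruction Tuning (ReSet), which significantly outperforms vanilla MTL.
Surprisingly, we find that less is more: training ReSet with high-quality yet substantially smaller data (three-fold less) yields superior results.
Our findings offer a better understanding of objective discrepancies in alignment training of LMs.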

Abstract (original)

Modern language models (LMs) need to follow human instructions while being faithful; yet, they often fail to achieve both. Here, we provide concrete evidence of a trade-off between instruction following (i.e., follow open-ended instructions) and faithfulness (i.e., ground responses in given context) when training LMs with these objectives. For instance, fine-tuning LLaMA-7B on instruction following datasets renders it less faithful. Conversely, instruction-tuned Vicuna-7B shows degraded performance at following instructions when further optimized on tasks that require contextual grounding. One common remedy is multi-task learning (MTL) with data mixing, yet it remains far from achieving a synergic outcome. We propose a simple yet effective method that relies on Rejection Sampling for Continued Self-instruction Tuning (ReSet), which significantly outperforms vanilla MTL. Surprisingly, we find that less is more, as training ReSet with high-quality, yet substantially smaller data (three-fold less) yields superior results. Our findings offer a better understanding of objective discrepancies in alignment training of LMs.
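The abstract names rejection sampling as the core of ReSet but gives no procedural detail, so the following is only a minimal sketch of generic rejection sampling for building a tuning set, not the authors' implementation. It assumes several candidates are sampled per prompt, each is scored for both instruction following and faithfulness, and only the best candidate above a threshold is kept. The names `rejection_sample`, `generate`, `score_instruction`, `score_faithfulness`, `num_candidates`, and `threshold` are all hypothetical, as is the choice to combine the two scores with `min`.

```python
import random
from typing import Callable, List, Tuple


def rejection_sample(
    prompts: List[str],
    generate: Callable[[str, random.Random], str],
    score_instruction: Callable[[str, str], float],
    score_faithfulness: Callable[[str, str], float],
    num_candidates: int = 4,
    threshold: float = 0.5,
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Keep at most one (prompt, response) pair per prompt.

    All callables are hypothetical stand-ins: `generate` samples one response
    from the model being tuned, and the two scorers return higher-is-better
    values for instruction following and faithfulness respectively.
    """
    rng = random.Random(seed)
    kept: List[Tuple[str, str]] = []
    for prompt in prompts:
        # Sample several candidate responses for the same prompt.
        candidates = [generate(prompt, rng) for _ in range(num_candidates)]
        # Assumption: a candidate is only as good as its weaker score,
        # so both objectives must be satisfied for it to rank highly.
        scored = [
            (min(score_instruction(prompt, c), score_faithfulness(prompt, c)), c)
            for c in candidates
        ]
        best_score, best = max(scored, key=lambda pair: pair[0])
        # Reject the prompt entirely if even the best candidate is too weak.
        if best_score >= threshold:
            kept.append((prompt, best))
    return kept
```

Under these assumptions, the returned pairs would form the smaller, filtered dataset used for continued tuning; raising `threshold` trades dataset size for quality, which is one way to read the abstract's "less is more" observation.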

arXiv information

Authors	Zhengxuan Wu, Yuhao Zhang, Peng Qi, Yumo Xu, Rujun Han, Yian Zhang, Jifan Chen, Bonan Min, Zhiheng Huang
Published	2024-07-31 08:05:04+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
