Distilled Self-Critique of LLMs with Synthetic Data: a Bayesian Perspective

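Summary

This paper proposes interpreting RLAIF as Bayesian inference by introducing distilled Self-Critique (dSC). dSC refines the outputs of an LLM through a Gibbs sampler, which is later distilled into a fine-tuned model.
Requiring only synthetic data, dSC is evaluated in experiments on safety, sentiment, and privacy control, which show that it can be a viable and inexpensive alternative for aligning LLMs.
The code is released at \url{https://github.com/vicgalle/distilled-self-critique}.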

Abstract (original)

This paper proposes an interpretation of RLAIF as Bayesian inference by introducing distilled Self-Critique (dSC), which refines the outputs of a LLM through a Gibbs sampler that is later distilled into a fine-tuned model. Only requiring synthetic data, dSC is exercised in experiments regarding safety, sentiment, and privacy control, showing it can be a viable and cheap alternative to align LLMs. Code released at \url{https://github.com/vicgalle/distilled-self-critique}.
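
The abstract describes two stages: a Gibbs-style refinement of model outputs, followed by distillation into a fine-tuned model. The sketch below is one possible reading of that pipeline, not the author's implementation. All function names (`gibbs_self_critique`, `build_distillation_set`, and the `toy_*` stand-ins) are hypothetical, and the toy stand-ins are deterministic string functions used in place of real LLM calls, which would sample a critique given the current response and then a revised response given that critique.

```python
from typing import Callable, List, Tuple

# Hypothetical signatures: in practice each of these would wrap an LLM call.
Generate = Callable[[str], str]
Critique = Callable[[str, str], str]
Revise = Callable[[str, str, str], str]


def gibbs_self_critique(prompt: str, generate: Generate, critique: Critique,
                        revise: Revise, steps: int = 2) -> str:
    """Alternate critique | response and response | critique for `steps` sweeps."""
    response = generate(prompt)
    for _ in range(steps):
        critique_text = critique(prompt, response)
        response = revise(prompt, response, critique_text)
    return response


def build_distillation_set(prompts: List[str], generate: Generate,
                           critique: Critique, revise: Revise,
                           steps: int = 2) -> List[Tuple[str, str]]:
    """Collect (prompt, refined response) pairs as synthetic fine-tuning data."""
    return [(p, gibbs_self_critique(p, generate, critique, revise, steps))
            for p in prompts]


# Toy stand-ins (assumptions for illustration only, not real model behavior).
def toy_generate(prompt: str) -> str:
    return f"draft: {prompt}"


def toy_critique(prompt: str, response: str) -> str:
    return "contains the word 'draft'" if "draft" in response else "no issues"


def toy_revise(prompt: str, response: str, critique_text: str) -> str:
    if critique_text == "no issues":
        return response
    return response.replace("draft", "revised")


if __name__ == "__main__":
    pairs = build_distillation_set(["q1", "q2"], toy_generate,
                                   toy_critique, toy_revise)
    for prompt, refined in pairs:
        print(prompt, "->", refined)
```

Under this reading, the resulting pairs would serve as the synthetic dataset for supervised fine-tuning, so that the fine-tuned model produces the refined response directly without running the critique loop at inference time.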

arXiv information

Author	Victor Gallego
Published	2024-02-23 17:03:19+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
