MAUVE Scores for Generative Models: Theory and Practice

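Summary

Generative artificial intelligence has advanced significantly, producing text that is indistinguishable from human prose and strikingly photorealistic images.
Automatically measuring how closely the distribution of generated data approaches the target distribution is central to diagnosing existing models and developing better ones.
We introduce MAUVE, a family of comparison measures between pairs of distributions such as those encountered in generative modeling of text and images.
These scores are statistical summaries of divergence frontiers that capture two types of errors in generative modeling.
We examine three approaches to estimating these scores statistically: vector quantization, nonparametric estimation, and classifier-based estimation.
We provide statistical bounds for the vector quantization approach.
Empirically, we find that the proposed scores, combined with a range of $f$-divergences and statistical estimation methods, can quantify the gap between the distribution of human-written text and that of modern neural language models by correlating with human judgments and identifying known properties of generated text.
In the vision domain, we demonstrate that MAUVE identifies known properties of generated images as well as or better than existing metrics.
Finally, we give practical recommendations for using MAUVE effectively in the language and image modalities.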

Abstract (original)

Generative artificial intelligence has made significant strides, producing text indistinguishable from human prose and remarkably photorealistic images. Automatically measuring how close the generated data distribution is to the target distribution is central to diagnosing existing models and developing better ones. We present MAUVE, a family of comparison measures between pairs of distributions such as those encountered in the generative modeling of text or images. These scores are statistical summaries of divergence frontiers capturing two types of errors in generative modeling. We explore three approaches to statistically estimate these scores: vector quantization, non-parametric estimation, and classifier-based estimation. We provide statistical bounds for the vector quantization approach. Empirically, we find that the proposed scores paired with a range of $f$-divergences and statistical estimation methods can quantify the gaps between the distributions of human-written text and those of modern neural language models by correlating with human judgments and identifying known properties of the generated texts. We demonstrate in the vision domain that MAUVE can identify known properties of generated images on par with or better than existing metrics. In conclusion, we present practical recommendations for using MAUVE effectively with language and image modalities.
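
The abstract describes the scores as summaries of a divergence frontier but gives no formula. As a rough illustration only, the sketch below assumes the commonly described KL-based construction: for mixtures R = lam * P + (1 - lam) * Q, plot the points (exp(-c * KL(Q || R)), exp(-c * KL(P || R))) and take the area under the resulting curve. The function names, the scaling constant `c = 5.0`, the number of mixture weights, and the use of plain histograms as inputs (as one might obtain after vector quantization) are all assumptions made for this example, not details taken from the abstract.

```python
import math


def kl(p, q):
    """KL(p || q) for discrete distributions given as equal-length lists.

    Terms with p_i == 0 contribute 0. Assumes q_i > 0 wherever p_i > 0,
    which holds for the strict mixtures built below.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def divergence_frontier(p, q, num_points=25, c=5.0):
    """Return frontier points ordered by increasing x.

    Each interior point is (exp(-c * KL(q || r)), exp(-c * KL(p || r)))
    with r = lam * p + (1 - lam) * q. The value c = 5.0 is an assumed
    scaling constant chosen for illustration.
    """
    points = [(0.0, 1.0)]  # limiting endpoint where KL(q || r) is infinite
    for i in range(num_points, 0, -1):  # lam runs from near 1 down to near 0
        lam = i / (num_points + 1)
        r = [lam * pi + (1 - lam) * qi for pi, qi in zip(p, q)]
        points.append((math.exp(-c * kl(q, r)), math.exp(-c * kl(p, r))))
    points.append((1.0, 0.0))  # limiting endpoint where KL(p || r) is infinite
    return points


def mauve_like_score(p, q, num_points=25, c=5.0):
    """Area under the frontier via the trapezoid rule; a value in [0, 1]."""
    pts = divergence_frontier(p, q, num_points, c)
    return sum(0.5 * (x1 - x0) * (y0 + y1)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


if __name__ == "__main__":
    human = [0.5, 0.3, 0.2]   # hypothetical histogram of human text clusters
    model = [0.4, 0.4, 0.2]   # hypothetical histogram of model text clusters
    print(mauve_like_score(human, model))
```

Under these assumptions, identical histograms score close to 1 and histograms with disjoint support score close to 0. The two axes correspond to the two error types mentioned in the abstract: mass the model places where the target has little, and target mass the model fails to cover.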

arXiv information

Authors	Krishna Pillutla, Lang Liu, John Thickstun, Sean Welleck, Swabha Swayamdipta, Rowan Zellers, Sewoong Oh, Yejin Choi, Zaid Harchaoui
Published	2023-12-07 06:38:10+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
