How Does Diffusion Influence Pretrained Language Models on Out-of-Distribution Data?

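Summary

Transformer-based pretrained language models (PLMs) have achieved great success in modern NLP.
An important advantage of PLMs is their good robustness on out-of-distribution (OOD) data.
Recently, diffusion models have prompted a large body of work that applies diffusion to PLMs.
How diffusion influences PLMs on OOD data, however, remains under-explored.
At their core, diffusion models consist of a forward diffusion process, which gradually applies Gaussian noise to the inputs, and a reverse denoising process, which removes the noise.
Reconstructing noised inputs is a fundamental ability of diffusion models.
The authors analyze OOD robustness directly by measuring the reconstruction loss, testing both the ability to reconstruct OOD data and the ability to detect OOD samples.
The experiments analyze different training parameters and statistical features of the data across eight datasets.
They show that finetuning PLMs with diffusion degrades the ability to reconstruct OOD data.
The comparison also shows that diffusion models can detect OOD samples effectively, achieving state-of-the-art performance on most of the datasets, with an absolute accuracy improvement of up to 18%.
These results indicate that diffusion reduces the OOD robustness of PLMs.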

Abstract (original)

Transformer-based pretrained language models (PLMs) have achieved great success in modern NLP. An important advantage of PLMs is good out-of-distribution (OOD) robustness. Recently, diffusion models have attracted a lot of work to apply diffusion to PLMs. It remains under-explored how diffusion influences PLMs on OOD data. The core of diffusion models is a forward diffusion process which gradually applies Gaussian noise to inputs, and a reverse denoising process which removes noise. The noised input reconstruction is a fundamental ability of diffusion models. We directly analyze OOD robustness by measuring the reconstruction loss, including testing the abilities to reconstruct OOD data, and to detect OOD samples. Experiments are conducted by analyzing different training parameters and data statistical features on eight datasets. It shows that finetuning PLMs with diffusion degrades the reconstruction ability on OOD data. The comparison also shows that diffusion models can effectively detect OOD samples, achieving state-of-the-art performance in most of the datasets with an absolute accuracy improvement up to 18%. These results indicate that diffusion reduces OOD robustness of PLMs.
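The abstract describes a forward process that gradually adds Gaussian noise and an OOD analysis based on reconstruction loss. The sketch below illustrates that idea under stated assumptions and is not the authors' implementation: it assumes the standard closed-form forward process with a linear noise schedule, mean squared error as the reconstruction loss, and a simple fixed threshold for detection. All function names, the schedule constants, and the `denoise` callable are hypothetical and do not come from the paper.

```python
import math
import random


def linear_alpha_bar(t, num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_s) for s = 1..t (assumed linear schedule)."""
    alpha_bar = 1.0
    for s in range(1, t + 1):
        beta = beta_start + (beta_end - beta_start) * (s - 1) / max(num_steps - 1, 1)
        alpha_bar *= 1.0 - beta
    return alpha_bar


def forward_diffuse(x0, t, num_steps, rng):
    """Sample x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps, eps ~ N(0, I)."""
    a = linear_alpha_bar(t, num_steps)
    return [math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for v in x0]


def reconstruction_loss(x0, x_hat):
    """Mean squared error between the clean input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x0, x_hat)) / len(x0)


def ood_score(x0, denoise, t, num_steps, rng):
    """Noise the input, reconstruct it with a caller-supplied denoiser, return the loss."""
    x_t = forward_diffuse(x0, t, num_steps, rng)
    return reconstruction_loss(x0, denoise(x_t, t))


def is_ood(score, threshold):
    """Flag a sample as OOD when its reconstruction loss exceeds the threshold."""
    return score > threshold


if __name__ == "__main__":
    rng = random.Random(0)
    embedding = [0.5, -1.0, 0.25, 2.0]  # toy stand-in for a token embedding

    def identity_denoiser(x_t, t):
        # Placeholder: a real model would predict the clean input from x_t.
        return x_t

    score = ood_score(embedding, identity_denoiser, t=500, num_steps=1000, rng=rng)
    print("reconstruction loss:", score, "flagged as OOD:", is_ood(score, threshold=0.5))
```

In practice the threshold would be chosen on held-out in-distribution data, and the denoiser would be the diffusion-finetuned language model rather than the identity placeholder used here.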

arXiv information

Authors	Huazheng Wang, Daixuan Cheng, Haifeng Sun, Jingyu Wang, Qi Qi, Jianxin Liao, Jing Wang, Cong Liu
Published	2023-07-26 04:03:25+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
