Lossy Image Compression with Conditional Diffusion Models

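Abstract

Denoising diffusion models have recently reached a milestone in high-quality image generation.
It is therefore natural to ask whether they are suitable for neural image compression.
This paper outlines an end-to-end optimized image compression framework based on a conditional diffusion model, building on the transform-coding paradigm.
In addition to the latent variables inherent to the diffusion process, the paper introduces an additional discrete "content" latent variable that conditions the denoising process.
This variable is equipped with a hierarchical prior for entropy coding.
The remaining "texture" latent variables, which characterize the diffusion process, are synthesized at decoding time, either stochastically or deterministically.
The authors further show that performance can be tuned toward perceptual metrics of interest.
Extensive experiments on five datasets with sixteen image quality assessment metrics show that the approach compares favorably in rate-perceptual quality and also achieves distortion performance close to that of state-of-the-art models.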

Abstract (original)

Denoising diffusion models have recently marked a milestone in high-quality image generation. One may thus wonder if they are suitable for neural image compression. This paper outlines an end-to-end optimized image compression framework based on a conditional diffusion model, drawing on the transform-coding paradigm. Besides the latent variables inherent to the diffusion process, this paper introduces an additional discrete “content” latent variable to condition the denoising process. This variable is equipped with a hierarchical prior for entropy coding. The remaining “texture” latent variables characterizing the diffusion process are synthesized (either stochastically or deterministically) at decoding time. We furthermore show that the performance can be tuned toward perceptual metrics of interest. Our extensive experiments involving five datasets and sixteen image quality assessment metrics show that our approach not only compares favorably in rate-perceptual quality but also shows close distortion performance with state-of-the-art models.
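
The abstract states that the discrete "content" latent is entropy-coded under a hierarchical prior but gives no formula. As a rough illustration of how such a prior translates into a bit rate, the sketch below estimates the cost of integer-valued latents under a discretized Gaussian, where the per-element mean and scale stand in for values a hierarchical prior would predict from side information. The Gaussian form, the function names `gaussian_cdf` and `rate_bits`, and all numeric values are assumptions made for illustration, not details taken from the paper.

```python
import math


def gaussian_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def rate_bits(z_hat, mu, sigma):
    """Estimate the bits needed to entropy-code integer latents.

    Assumption: each latent follows a Gaussian with mean mu[i] and scale
    sigma[i], discretized to unit-width bins centered on the integers.
    In a hierarchical setup, mu and sigma would be predicted from side
    information; here they are simply passed in.
    """
    total = 0.0
    for z, m, s in zip(z_hat, mu, sigma):
        # Probability mass of the bin [z - 0.5, z + 0.5].
        p = gaussian_cdf((z + 0.5 - m) / s) - gaussian_cdf((z - 0.5 - m) / s)
        # Clamp to avoid log(0) for values far in the tails.
        total += -math.log2(max(p, 1e-12))
    return total


# Hypothetical quantized content latent.
z_hat = [3, -1, 0, 2]

# A prior that predicts the latent well costs fewer bits than a generic one.
matched = rate_bits(z_hat, mu=[3.0, -1.0, 0.0, 2.0], sigma=[0.5] * 4)
generic = rate_bits(z_hat, mu=[0.0] * 4, sigma=[2.0] * 4)
print(f"matched prior: {matched:.2f} bits, generic prior: {generic:.2f} bits")
```

The comparison at the end is the point of the sketch: the closer the predicted mean and scale are to the actual latent values, the more probability mass falls on the coded symbols and the lower the estimated rate.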

arXiv information

Authors	Ruihan Yang, Stephan Mandt
Published	2022-12-09 14:17:51+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
