Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models

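Summary

Diffusion models (DMs) deliver excellent performance in generating high-quality, diverse images.
However, this performance comes at the cost of an expensive architectural design, chiefly because of the attention modules used heavily in leading models.
Existing work mainly relies on retraining to improve DM efficiency.
Retraining is computationally expensive and does not scale well.
To address this, we introduce the Attention-driven Training-free Efficient Diffusion Model (AT-EDM) framework, which uses attention maps to prune redundant tokens at run time without any retraining.
Specifically, for pruning within a single denoising step, we develop Generalized Weighted Page Rank (G-WPR), a new ranking algorithm that identifies redundant tokens, and a similarity-based recovery method that restores tokens for the convolution operation.
In addition, we propose Denoising-Steps-Aware Pruning (DSAP), which adjusts the pruning budget across denoising timesteps to improve generation quality.
Extensive evaluations show that AT-EDM compares favorably with prior art in efficiency (e.g., 38.8% FLOPs saving and up to 1.53x speed-up over Stable Diffusion XL) while maintaining nearly the same FID and CLIP scores as the full model.
Project webpage: https://atedm.github.io.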
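
The abstract names the building blocks but does not spell them out, so the following is only a rough sketch of the general idea of attention-based token pruning with recovery, not the paper's G-WPR or recovery method. It assumes a row-normalized attention matrix `attn[i][j]` (how much token `i` attends to token `j`), scores each token by the attention it receives weighted by the score of the token giving it, keeps the top-scoring fraction, and then refills pruned positions with a copy of the kept token they attend to most. The function names, the `iters` and `keep_ratio` parameters, and the use of attention weights as the similarity measure are all assumptions made for illustration. DSAP's per-timestep budget is not modeled here.

```python
def rank_tokens(attn, iters=1):
    """Score tokens by received attention, weighted by the senders' scores.

    Assumption: attn is a square, row-normalized list of lists.
    This is a PageRank-style iteration, not the paper's exact G-WPR.
    """
    n = len(attn)
    scores = [1.0 / n] * n  # start from a uniform score
    for _ in range(iters):
        new = [sum(attn[i][j] * scores[i] for i in range(n)) for j in range(n)]
        total = sum(new)
        scores = [s / total for s in new]  # renormalize to sum to 1
    return scores


def prune(scores, keep_ratio):
    """Return the sorted indices of the highest-scoring tokens."""
    n = len(scores)
    k = max(1, round(n * keep_ratio))
    order = sorted(range(n), key=lambda j: scores[j], reverse=True)
    return sorted(order[:k])


def recover(tokens, kept, attn):
    """Rebuild a full-length sequence so a later convolution sees every position.

    Assumption: a pruned token is replaced by the kept token it attends to most.
    """
    kept_set = set(kept)
    out = []
    for i in range(len(tokens)):
        if i in kept_set:
            out.append(tokens[i])
        else:
            src = max(kept, key=lambda j: attn[i][j])
            out.append(tokens[src])
    return out


# Toy example: four tokens, token 0 receives most of the attention.
attn = [
    [0.7, 0.1, 0.1, 0.1],
    [0.6, 0.2, 0.1, 0.1],
    [0.5, 0.1, 0.3, 0.1],
    [0.6, 0.1, 0.1, 0.2],
]
tokens = ["a", "b", "c", "d"]
scores = rank_tokens(attn)
kept = prune(scores, keep_ratio=0.5)
recovered = recover(tokens, kept, attn)
```

In this toy case tokens 0 and 2 are kept, and the two pruned positions are filled with a copy of token 0. In a real model the expensive layers would run only on the kept tokens, and the recovered full-length sequence would be needed only where an operation such as a convolution requires the complete spatial layout.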

Summary (original)

Diffusion Models (DMs) have exhibited superior performance in generating high-quality and diverse images. However, this exceptional performance comes at the cost of expensive architectural design, particularly due to the attention module heavily used in leading models. Existing works mainly adopt a retraining process to enhance DM efficiency. This is computationally expensive and not very scalable. To this end, we introduce the Attention-driven Training-free Efficient Diffusion Model (AT-EDM) framework that leverages attention maps to perform run-time pruning of redundant tokens, without the need for any retraining. Specifically, for single-denoising-step pruning, we develop a novel ranking algorithm, Generalized Weighted Page Rank (G-WPR), to identify redundant tokens, and a similarity-based recovery method to restore tokens for the convolution operation. In addition, we propose a Denoising-Steps-Aware Pruning (DSAP) approach to adjust the pruning budget across different denoising timesteps for better generation quality. Extensive evaluations show that AT-EDM performs favorably against prior art in terms of efficiency (e.g., 38.8% FLOPs saving and up to 1.53x speed-up over Stable Diffusion XL) while maintaining nearly the same FID and CLIP scores as the full model. Project webpage: https://atedm.github.io.

arXiv information

Authors	Hongjie Wang, Difan Liu, Yan Kang, Yijun Li, Zhe Lin, Niraj K. Jha, Yuchen Liu
Published	2024-05-08 17:56:47+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
