Accelerated Diffusion Models via Speculative Sampling

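Summary

Speculative sampling is a widely used technique for accelerating inference in large language models: a fast draft model generates candidate tokens, which are then accepted or rejected according to the target model's distribution.
Speculative sampling was previously limited to discrete sequences; we extend it to diffusion models, which generate samples via continuous, vector-valued Markov chains.
In this setting, the target model is a high-quality but computationally expensive diffusion model.
We propose several drafting strategies, including a simple and effective approach that requires no draft-model training and can be applied out of the box to any diffusion model.
Our experiments show substantial generation speedups across various diffusion models, halving the number of function evaluations while still producing exact samples from the target model.

Abstract (original)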
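
To make the accept/reject idea concrete, here is a minimal sketch of the standard speculative sampling step in the discrete (language-model) setting that the summary starts from. It is not the paper's diffusion-model algorithm, which works on continuous, vector-valued states. The function names `speculative_step` and `empirical_distribution`, and the toy distributions in the usage note, are hypothetical and chosen only for illustration.

```python
import numpy as np


def speculative_step(p, q, rng):
    """One accept/reject step for a single discrete token.

    p: target distribution, q: draft distribution (1-D, each sums to 1).
    Returns (token, accepted). Illustrative sketch, not the paper's method.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)

    # The cheap draft model proposes a candidate token.
    x = rng.choice(len(q), p=q)

    # Accept the candidate with probability min(1, p[x] / q[x]).
    if rng.random() < min(1.0, p[x] / q[x]):
        return int(x), True

    # On rejection, resample from the normalized residual max(p - q, 0).
    # A rejection implies p[x] < q[x], so the residual has positive mass.
    residual = np.maximum(p - q, 0.0)
    residual /= residual.sum()
    return int(rng.choice(len(p), p=residual)), False


def empirical_distribution(p, q, n, seed=0):
    """Run n independent steps and return the observed token frequencies."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(len(p))
    for _ in range(n):
        token, _ = speculative_step(p, q, rng)
        counts[token] += 1
    return counts / n
```

For example, `empirical_distribution([0.5, 0.3, 0.2], [0.2, 0.2, 0.6], n=20000)` should return frequencies close to the target `[0.5, 0.3, 0.2]` even though every proposal comes from the draft distribution. This is the sense in which the output is an exact sample from the target model.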

Speculative sampling is a popular technique for accelerating inference in Large Language Models by generating candidate tokens using a fast draft model and accepting or rejecting them based on the target model’s distribution. While speculative sampling was previously limited to discrete sequences, we extend it to diffusion models, which generate samples via continuous, vector-valued Markov chains. In this context, the target model is a high-quality but computationally expensive diffusion model. We propose various drafting strategies, including a simple and effective approach that does not require training a draft model and is applicable out of the box to any diffusion model. Our experiments demonstrate significant generation speedup on various diffusion models, halving the number of function evaluations, while generating exact samples from the target model.

arXiv information

Authors	Valentin De Bortoli, Alexandre Galashov, Arthur Gretton, Arnaud Doucet
Published	2025-01-09 16:50:16+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
