GIVT: Generative Infinite-Vocabulary Transformers

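Summary

We introduce Generative Infinite-Vocabulary Transformers (GIVT), which generate vector sequences with real-valued entries instead of discrete tokens from a finite vocabulary.
To this end, we propose two surprisingly simple modifications to decoder-only transformers: 1) at the input, we replace the finite-vocabulary lookup table with a linear projection of the input vectors;
2) at the output, we replace the logits prediction (usually mapped to a categorical distribution) with the parameters of a multivariate Gaussian mixture model.
Inspired by the image-generation paradigm of VQ-GAN and MaskGIT, where transformers model the discrete latent sequences of a VQ-VAE, we use GIVT to model the unquantized, real-valued latent sequences of a VAE.
Applied to class-conditional image generation with iterative masked modeling, GIVT achieves results competitive with MaskGIT; when used for causal modeling, our approach outperforms both VQ-GAN and MaskGIT.
Finally, we obtain competitive results outside of image generation when applying our approach to panoptic segmentation and depth estimation with a VAE-based variant of the UViM framework.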
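
The two modifications can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the transformer blocks are omitted, the weights are random, and the names (`linear`, `split_gmm_params`, `gmm_log_prob`, `gmm_sample`) are hypothetical. The diagonal covariance and the softplus used to keep scales positive are assumptions made for the example; the abstract only specifies a multivariate Gaussian mixture model.

```python
import math
import random


def linear(x, weight, bias):
    """Affine map y = W x + b for one vector x (plain lists of floats)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def softplus(v):
    """Numerically stable log(1 + exp(v)); the result is always positive."""
    return max(v, 0.0) + math.log1p(math.exp(-abs(v)))


def log_softmax(logits):
    m = max(logits)
    lse = m + math.log(sum(math.exp(l - m) for l in logits))
    return [l - lse for l in logits]


def split_gmm_params(raw, k, d):
    """Split a raw output of size k + 2*k*d into log-weights, means, and scales."""
    assert len(raw) == k + 2 * k * d
    log_pi = log_softmax(raw[:k])
    means = [raw[k + i * d: k + (i + 1) * d] for i in range(k)]
    off = k + k * d
    # Assumption: softplus plus a small floor keeps the standard deviations positive.
    scales = [[softplus(v) + 1e-6 for v in raw[off + i * d: off + (i + 1) * d]]
              for i in range(k)]
    return log_pi, means, scales


def gmm_log_prob(z, log_pi, means, scales):
    """Log-density of z under a diagonal-covariance Gaussian mixture."""
    comps = []
    for lp, mu, sd in zip(log_pi, means, scales):
        ll = sum(-0.5 * ((zi - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2 * math.pi)
                 for zi, m, s in zip(z, mu, sd))
        comps.append(lp + ll)
    top = max(comps)
    return top + math.log(sum(math.exp(c - top) for c in comps))


def gmm_sample(log_pi, means, scales, rng):
    """Pick a mixture component by weight, then sample from that Gaussian."""
    weights = [math.exp(lp) for lp in log_pi]
    i = rng.choices(range(len(log_pi)), weights=weights)[0]
    return [rng.gauss(m, s) for m, s in zip(means[i], scales[i])]


rng = random.Random(0)
d, hidden, k = 4, 8, 3  # latent size, model width, mixture components (all illustrative)


def rand_matrix(rows, cols):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


w_in, b_in = rand_matrix(hidden, d), [0.0] * hidden
w_out, b_out = rand_matrix(k + 2 * k * d, hidden), [0.0] * (k + 2 * k * d)

z_prev = [rng.gauss(0.0, 1.0) for _ in range(d)]  # a real-valued VAE latent "token"
h = linear(z_prev, w_in, b_in)    # 1) linear projection instead of an embedding lookup
# ... transformer blocks would transform h here (omitted in this sketch) ...
raw = linear(h, w_out, b_out)     # 2) GMM parameters instead of vocabulary logits
log_pi, means, scales = split_gmm_params(raw, k, d)
z_next = gmm_sample(log_pi, means, scales, rng)          # sample the next latent vector
nll = -gmm_log_prob(z_next, log_pi, means, scales)       # per-token training loss
```

In this sketch, the negative log-likelihood under the predicted mixture stands in for the cross-entropy loss a discrete-token transformer would use, and sampling from the mixture replaces sampling from a categorical distribution.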

Summary (original)

We introduce generative infinite-vocabulary transformers (GIVT) which generate vector sequences with real-valued entries, instead of discrete tokens from a finite vocabulary. To this end, we propose two surprisingly simple modifications to decoder-only transformers: 1) at the input, we replace the finite-vocabulary lookup table with a linear projection of the input vectors; and 2) at the output, we replace the logits prediction (usually mapped to a categorical distribution) with the parameters of a multivariate Gaussian mixture model. Inspired by the image-generation paradigm of VQ-GAN and MaskGIT, where transformers are used to model the discrete latent sequences of a VQ-VAE, we use GIVT to model the unquantized real-valued latent sequences of a VAE. When applying GIVT to class-conditional image generation with iterative masked modeling, we show competitive results with MaskGIT, while our approach outperforms both VQ-GAN and MaskGIT when using it for causal modeling. Finally, we obtain competitive results outside of image generation when applying our approach to panoptic segmentation and depth estimation with a VAE-based variant of the UViM framework.

arXiv information

Authors	Michael Tschannen, Cian Eastwood, Fabian Mentzer
Published	2024-01-18 15:47:07+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
