A Low-Resolution Image is Worth 1×1 Words: Enabling Fine Image Super-Resolution with Transformers and TaylorShift

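Summary

Transformer-based super-resolution (SR) models have recently improved image reconstruction quality, but challenges remain: their computational complexity and over-reliance on large patch sizes limit the enhancement of fine-grained detail.
This work proposes TaylorIR, which addresses these limitations by using a 1×1 patch size, enabling pixel-level processing in transformer-based SR models.
To handle the heavy computational demands of conventional self-attention, it adopts the TaylorShift attention mechanism, a memory-efficient alternative based on Taylor series expansion that achieves full token-to-token interaction with linear complexity.
Experimental results show that the approach achieves new state-of-the-art SR performance while reducing memory consumption by up to 60% compared with conventional self-attention-based transformers.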

Summary (original)

Transformer-based Super-Resolution (SR) models have recently advanced image reconstruction quality, yet challenges remain due to computational complexity and an over-reliance on large patch sizes, which constrain fine-grained detail enhancement. In this work, we propose TaylorIR to address these limitations by utilizing a patch size of 1×1, enabling pixel-level processing in any transformer-based SR model. To address the significant computational demands under the traditional self-attention mechanism, we employ the TaylorShift attention mechanism, a memory-efficient alternative based on Taylor series expansion, achieving full token-to-token interactions with linear complexity. Experimental results demonstrate that our approach achieves new state-of-the-art SR performance while reducing memory consumption by up to 60% compared to traditional self-attention-based transformers.
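The abstract states that attention based on a Taylor series expansion can keep full token-to-token interaction at linear complexity. The following is a minimal sketch of how that can work, not the authors' implementation: it assumes a second-order approximation exp(x) ≈ 1 + x + x²/2 (the abstract does not state the order), omits any scaling or normalization of queries and keys, and uses hypothetical function names. Because (q·k)² = Σᵢ Σⱼ qᵢqⱼkᵢkⱼ, the keys and values can be summarized once, so the cost grows linearly with the number of tokens N (here, one token per pixel with 1×1 patches) rather than with N².

```python
import random

def taylor_exp(x):
    # Second-order Taylor approximation of exp(x); positive for every real x.
    return 1.0 + x + 0.5 * x * x

def direct_attention(Q, K, V):
    """Quadratic-cost reference: evaluates all N x N query-key scores."""
    out = []
    for q in Q:
        weights = [taylor_exp(sum(a * b for a, b in zip(q, k))) for k in K]
        total = sum(weights)
        out.append([sum(w * v[m] for w, v in zip(weights, V)) / total
                    for m in range(len(V[0]))])
    return out

def linear_attention(Q, K, V):
    """Same output, but keys and values are summarized once for all queries."""
    d, c = len(K[0]), len(V[0])
    # Append a constant 1 to each value so the normalizer falls out of the same sums.
    Vx = [list(v) + [1.0] for v in V]
    # Zeroth-, first-, and second-order summaries over all tokens (cost linear in N).
    s0 = [sum(v[m] for v in Vx) for m in range(c + 1)]
    s1 = [[sum(k[i] * v[m] for k, v in zip(K, Vx)) for m in range(c + 1)]
          for i in range(d)]
    s2 = [[[sum(k[i] * k[j] * v[m] for k, v in zip(K, Vx)) for m in range(c + 1)]
           for j in range(d)] for i in range(d)]
    out = []
    for q in Q:
        acc = []
        for m in range(c + 1):
            lin = sum(q[i] * s1[i][m] for i in range(d))
            quad = sum(q[i] * q[j] * s2[i][j][m]
                       for i in range(d) for j in range(d))
            acc.append(s0[m] + lin + 0.5 * quad)
        # The last entry is the normalizer (sum of all attention weights).
        out.append([acc[m] / acc[c] for m in range(c)])
    return out

# Small demonstration on random data (sizes are arbitrary).
random.seed(0)
N, d = 8, 3
Q = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(N)]
K = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(N)]
V = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(N)]
ref, fast = direct_attention(Q, K, V), linear_attention(Q, K, V)
max_diff = max(abs(a - b) for ra, rb in zip(ref, fast) for a, b in zip(ra, rb))
print("max difference between the two formulations:", max_diff)
```

The trade-off in this sketch is that the second-order summary has d² entries per output channel, so the linear form pays off only when the token count N is large relative to the embedding dimension d, which is the regime a 1×1 patch size produces.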

arXiv information

Authors	Sanath Budakegowdanadoddi Nagaraju, Brian Bernhard Moser, Tobias Christian Nauen, Stanislav Frolov, Federico Raue, Andreas Dengel
Published	2024-11-15 14:43:58+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
