ViFOR: A Fourier-Enhanced Vision Transformer for Multi-Image Super-Resolution in Earth System

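Summary

Super-resolution (SR) techniques are essential for improving the spatial resolution of Earth System Model (ESM) data, which helps to better understand complex environmental processes.
This paper introduces ViFOR, a new algorithm that combines Vision Transformers (ViT) with Fourier-based Implicit Neural Representation networks (INRs) to generate high-resolution (HR) images from low-resolution (LR) inputs.
ViFOR integrates Fourier-based activation functions into the Vision Transformer architecture, allowing it to capture both the global context and the high-frequency details that are critical for accurate SR reconstruction.
The results show that ViFOR outperforms state-of-the-art methods such as ViT, Sinusoidal Representation Networks (SIREN), and SR Generative Adversarial Networks (SRGANs).
For full images, ViFOR improves PSNR over ViT by up to 4.18 dB, 1.56 dB, and 1.73 dB for Source Temperature, Shortwave Flux, and Longwave Flux, respectively.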

Summary (original)

Super-resolution (SR) techniques are essential for improving the spatial resolution of Earth System Model (ESM) data, which helps to better understand complex environmental processes. This paper presents a new algorithm, ViFOR, which combines Vision Transformers (ViT) and Fourier-based Implicit Neural Representation Networks (INRs) to generate High-Resolution (HR) images from Low-Resolution (LR) inputs. ViFOR introduces a novel integration of Fourier-based activation functions within the Vision Transformer architecture, enabling it to effectively capture the global context and high-frequency details critical for accurate SR reconstruction. The results show that ViFOR outperforms state-of-the-art methods such as ViT, Sinusoidal Representation Networks (SIREN), and SR Generative Adversarial Networks (SRGANs) on metrics such as Peak Signal-to-Noise Ratio (PSNR) and Mean Squared Error (MSE), for both global and local imagery. ViFOR improves PSNR over ViT by up to 4.18 dB, 1.56 dB, and 1.73 dB for full images of Source Temperature, Shortwave Flux, and Longwave Flux, respectively.
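For readers less familiar with the two metrics named above, the following is a minimal sketch of how MSE and PSNR are conventionally computed, and of how a PSNR gain in dB translates into an MSE reduction at a fixed peak value. It is not the authors' evaluation code: the function names, the `peak=1.0` normalization, and the sample pixel values are all assumptions made for illustration.

```python
import math

def mse(reference, estimate):
    """Mean squared error between two equally sized sequences of pixel values."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same length")
    if not reference:
        raise ValueError("inputs must be non-empty")
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)

def psnr(reference, estimate, peak=1.0):
    """PSNR in dB: 10 * log10(peak^2 / MSE). Infinite when the inputs are identical."""
    err = mse(reference, estimate)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)

def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain, assuming the same peak value."""
    return 10.0 ** (gain_db / 10.0)

# Hypothetical flattened HR reference and two reconstructions (made-up values,
# assumed to be normalized to the range [0, 1]).
hr = [0.0, 0.25, 0.5, 0.75, 1.0]
sr_a = [0.1, 0.25, 0.5, 0.75, 0.9]
sr_b = [0.05, 0.25, 0.5, 0.75, 0.95]

gain = psnr(hr, sr_b) - psnr(hr, sr_a)
print(f"PSNR gain of sr_b over sr_a: {gain:.2f} dB")
print(f"MSE ratio for a 4.18 dB gain: {mse_ratio_for_gain(4.18):.2f}")
```

Because PSNR is logarithmic in MSE, a fixed dB gain corresponds to a fixed multiplicative reduction in MSE whatever the starting quality, provided both PSNR values use the same peak value.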

arXiv information

Authors	Ehsan Zeraatkar, Salah A Faroughi, Jelena Tešić
Published	2025-05-23 17:03:00+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
