Mapping using Transformers for Volumes — Network for Super-Resolution with Long-Range Interactions

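Summary

Until now, volumetric super-resolution has struggled to exploit the recent advances in transformer-based models seen in 2D super-resolution.
The memory required for self-attention over 3D volumes limits the receptive field.
As a result, long-range interactions are not used in 3D to the extent they are in 2D, and the strength of transformers goes unrealized.
To overcome this, we propose a multi-scale transformer-based model built on hierarchical attention blocks combined with carrier tokens at multiple scales.
Information from larger regions at coarse resolution is carried sequentially to finer-resolution regions to predict the super-resolved image.
By using transformer layers at each resolution, this coarse-to-fine modeling limits the number of tokens at each scale and enables attention over larger regions than was previously possible.
We experimentally compare our method, MTVNet, against state-of-the-art volumetric super-resolution models on five 3D datasets and demonstrate the advantage of an increased receptive field.
The advantage is especially pronounced for images larger than those found in commonly used 3D datasets.
Our code is available at https://github.com/AugustHoeg/MTVNet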
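
The memory argument in the summary can be illustrated with a rough token count. The sketch below is not MTVNet's configuration: the region sizes, patch sizes, and the three-level schedule are hypothetical values chosen for illustration, and it counts only the entries of a dense attention matrix (tokens × tokens), ignoring heads, channels, and carrier tokens.

```python
def num_tokens(side: int, patch: int, dims: int) -> int:
    """Tokens from splitting a square (dims=2) or cube (dims=3) of edge
    length `side` into non-overlapping patches of edge length `patch`."""
    return (side // patch) ** dims


def attention_entries(tokens: int) -> int:
    """Entries in one dense self-attention matrix, which is tokens x tokens."""
    return tokens * tokens


# Hypothetical sizes: the same edge length and patch size in 2D and 3D.
SIDE, PATCH = 64, 4
tokens_2d = num_tokens(SIDE, PATCH, dims=2)  # 16 * 16 = 256 tokens
tokens_3d = num_tokens(SIDE, PATCH, dims=3)  # 16 * 16 * 16 = 4096 tokens

# Hypothetical coarse-to-fine schedule as (region edge, patch edge) per level.
# Coarser levels see a larger region but use larger patches, so every level
# has the same number of tokens (8^3 = 512).
LEVELS = [(128, 16), (64, 8), (32, 4)]
multi_scale_entries = sum(
    attention_entries(num_tokens(side, patch, dims=3)) for side, patch in LEVELS
)

# Single-scale alternative: attend over the largest region at the finest patch size.
single_scale_entries = attention_entries(num_tokens(128, 4, dims=3))

if __name__ == "__main__":
    print("2D tokens:", tokens_2d, "3D tokens:", tokens_3d)
    print("3D/2D attention-matrix ratio:",
          attention_entries(tokens_3d) // attention_entries(tokens_2d))
    print("multi-scale entries:", multi_scale_entries)
    print("single-scale entries:", single_scale_entries)
```

Under these assumed numbers, moving from 2D to 3D at the same edge length multiplies the attention matrix by 256, while the three-level schedule covers a 128-voxel region with about 0.8 million attention entries instead of roughly 1.07 billion for a single fine-scale pass.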

Summary (original)

Until now, it has been difficult for volumetric super-resolution to utilize the recent advances in transformer-based models seen in 2D super-resolution. The memory required for self-attention in 3D volumes limits the receptive field. Therefore, long-range interactions are not used in 3D to the extent done in 2D and the strength of transformers is not realized. We propose a multi-scale transformer-based model based on hierarchical attention blocks combined with carrier tokens at multiple scales to overcome this. Here information from larger regions at coarse resolution is sequentially carried on to finer-resolution regions to predict the super-resolved image. Using transformer layers at each resolution, our coarse-to-fine modeling limits the number of tokens at each scale and enables attention over larger regions than what has previously been possible. We experimentally compare our method, MTVNet, against state-of-the-art volumetric super-resolution models on five 3D datasets demonstrating the advantage of an increased receptive field. This advantage is especially pronounced for images that are larger than what is seen in popularly used 3D datasets. Our code is available at https://github.com/AugustHoeg/MTVNet

arXiv information

Authors	August Leander Høeg, Sophia W. Bardenfleth, Hans Martin Kjer, Tim B. Dyrby, Vedrana Andersen Dahl, Anders Dahl
Published	2024-12-04 15:06:39+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
