Hundred-Kilobyte Lookup Tables for Efficient Single-Image Super-Resolution

Summary

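Conventional super-resolution (SR) schemes rely heavily on convolutional neural networks (CNNs), which involve intensive multiply-accumulate (MAC) operations and require specialized hardware such as graphics processing units.
This conflicts with the edge-AI regime, which often runs on devices constrained in power, compute, and storage.
This challenge has motivated a series of lookup table (LUT)-based SR schemes that use simple LUT readout and largely avoid CNN computation.
Nonetheless, the multi-megabyte LUTs in existing methods still rule out on-chip storage and necessitate off-chip memory transfers.
This work tackles the storage hurdle and introduces hundred-kilobyte LUT (HKLUT) models that fit in on-chip cache.
By combining an asymmetric two-branch multistage network with a suite of specialized kernel patterns, HKLUT demonstrates uncompromised performance and superior hardware efficiency over existing LUT schemes.
The implementation is publicly available at https://github.com/jasonli0707/hklut.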

Abstract (original)

Conventional super-resolution (SR) schemes make heavy use of convolutional neural networks (CNNs), which involve intensive multiply-accumulate (MAC) operations, and require specialized hardware such as graphics processing units. This contradicts the regime of edge AI that often runs on devices strained by power, computing, and storage resources. Such a challenge has motivated a series of lookup table (LUT)-based SR schemes that employ simple LUT readout and largely elude CNN computation. Nonetheless, the multi-megabyte LUTs in existing methods still prohibit on-chip storage and necessitate off-chip memory transport. This work tackles this storage hurdle and innovates hundred-kilobyte LUT (HKLUT) models amenable to on-chip cache. Utilizing an asymmetric two-branch multistage network coupled with a suite of specialized kernel patterns, HKLUT demonstrates an uncompromising performance and superior hardware efficiency over existing LUT schemes. Our implementation is publicly available at: https://github.com/jasonli0707/hklut.
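As a rough illustration of why LUT storage is the bottleneck the abstract describes, the sketch below computes the size of a full lookup table and performs a single readout. It is not the HKLUT implementation: the function names, the 2-pixel kernel, the 4-bit quantization, and the averaging used to fill the table are all assumptions chosen for illustration. In a real LUT-based SR scheme the entries would be precomputed from a trained network.

```python
import numpy as np

def lut_size_bytes(num_pixels, bits_per_pixel, outputs_per_entry, bytes_per_output=1):
    """Storage for a full LUT indexed by `num_pixels` quantized pixel values."""
    entries = (2 ** bits_per_pixel) ** num_pixels
    return entries * outputs_per_entry * bytes_per_output

def build_toy_lut(bits_per_pixel, scale):
    """Hypothetical 2-pixel LUT; each entry holds scale*scale output values.

    The entries are a plain average of the two quantized inputs, standing in
    for the outputs a trained network would provide.
    """
    levels = 2 ** bits_per_pixel
    step = 256 // levels
    a, b = np.meshgrid(np.arange(levels), np.arange(levels), indexing="ij")
    mean = ((a + b) * step) // 2
    return np.repeat(mean[..., None], scale * scale, axis=-1).astype(np.uint8)

def lut_readout(lut, p0, p1, bits_per_pixel):
    """Replace network inference by one table read: quantize, then index."""
    shift = 8 - bits_per_pixel
    return lut[p0 >> shift, p1 >> shift]

# Table size grows exponentially with the number of indexed pixels.
print(lut_size_bytes(num_pixels=4, bits_per_pixel=4, outputs_per_entry=16))  # 1048576
print(lut_size_bytes(num_pixels=2, bits_per_pixel=4, outputs_per_entry=16))  # 4096

lut = build_toy_lut(bits_per_pixel=4, scale=2)
print(lut_readout(lut, 128, 64, bits_per_pixel=4))  # four output values for one input pair
```

Under these assumed settings, indexing four pixels costs 1 MiB per table while indexing two pixels costs 4 KiB, which shows how much the number of pixels per table drives the storage footprint.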

arXiv information

Authors	Binxiao Huang, Jason Chun Lok Li, Jie Ran, Boyu Li, Jiajun Zhou, Dahai Yu, Ngai Wong
Published	2024-05-08 12:36:49+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
