In-Loop Filtering via Trained Look-Up Tables

Summary

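In-loop filtering (ILF) is a key technology for removing artifacts in image and video coding standards.
Recently, neural network-based in-loop filtering methods have achieved remarkable coding gains beyond the capability of advanced video coding standards, making them strong coding tool candidates for future video coding standards.
However, deep neural networks (DNNs) bring heavy time and computational complexity and demand high-performance hardware, which makes them difficult to apply in general coding scenarios.
To address this limitation, and inspired by research in image restoration, we propose an efficient and practical in-loop filtering scheme that adopts a look-up table (LUT).
We train the in-loop filtering DNN within a fixed filtering reference range, then traverse all possible inputs and cache the DNN's output values in a LUT.
At test time in the coding process, a filtered pixel is generated by locating the input pixels (the to-be-filtered pixel together with its reference pixels) and interpolating the cached filtered pixel values.
To further enable a large filtering reference range within the limited storage cost of the LUT, we introduce an enhanced indexing mechanism in the filtering process and a clipping/finetuning mechanism in training.
The proposed method is implemented in the Versatile Video Coding (VVC) reference software, VTM-11.0.
Experimental results show that the ultrafast, very fast, and fast modes of the proposed method achieve average BD-rate reductions of 0.13%/0.34%/0.51% under the all-intra (AI) configuration and 0.10%/0.27%/0.39% under the random-access (RA) configuration.
In particular, the method has modest time and computational complexity: a time increase of only 101%/102% to 104%/108% at 0.13 to 0.93 kMACs/pixel, and a storage cost of only 164 to 1148 KB for a single model.
Our solution may shed light on the path toward practical neural network-based coding tools.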
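
The cache-then-interpolate idea in the summary can be illustrated with a small sketch. This is not the authors' implementation: `toy_filter` is a hypothetical stand-in for the trained DNN, the reference range is assumed to be a single neighboring pixel, the sampling interval of 16 is an assumed value, and plain bilinear interpolation is assumed in place of whatever interpolation and enhanced indexing the paper actually uses.

```python
import numpy as np

STEP = 16                        # assumed sampling interval for 8-bit inputs
GRID = np.arange(0, 257, STEP)   # 17 sample points per axis: 0, 16, ..., 256

def toy_filter(center, ref):
    """Hypothetical stand-in for the trained DNN: a fixed smoothing filter."""
    return 0.75 * center + 0.25 * ref

def build_lut(fn):
    """Traverse every sampled (center, ref) input pair and cache the output."""
    c, r = np.meshgrid(GRID, GRID, indexing="ij")
    return fn(c.astype(np.float64), r.astype(np.float64))

def lut_filter(lut, center, ref):
    """Locate the four surrounding LUT entries and bilinearly interpolate."""
    ci, cf = divmod(int(center), STEP)   # cell index and offset for the center pixel
    ri, rf = divmod(int(ref), STEP)      # cell index and offset for the reference pixel
    wc, wr = cf / STEP, rf / STEP        # interpolation weights in [0, 1)
    low = (1 - wr) * lut[ci, ri] + wr * lut[ci, ri + 1]
    high = (1 - wr) * lut[ci + 1, ri] + wr * lut[ci + 1, ri + 1]
    return (1 - wc) * low + wc * high

lut = build_lut(toy_filter)              # 17 x 17 cached outputs
filtered = lut_filter(lut, 100, 50)      # filtered value for one pixel pair
```

At filtering time no network is evaluated; each pixel costs a few table reads and multiplications. The table size grows exponentially with the number of reference pixels, which is presumably why the summary pairs a larger reference range with additional indexing and training mechanisms.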

Abstract (original)

In-loop filtering (ILF) is a key technology for removing the artifacts in image/video coding standards. Recently, neural network-based in-loop filtering methods achieve remarkable coding gains beyond the capability of advanced video coding standards, which becomes a powerful coding tool candidate for future video coding standards. However, the utilization of deep neural networks brings heavy time and computational complexity, and high demands of high-performance hardware, which is challenging to apply to the general uses of coding scene. To address this limitation, inspired by explorations in image restoration, we propose an efficient and practical in-loop filtering scheme by adopting the Look-up Table (LUT). We train the DNN of in-loop filtering within a fixed filtering reference range, and cache the output values of the DNN into a LUT via traversing all possible inputs. At testing time in the coding process, the filtered pixel is generated by locating input pixels (to-be-filtered pixel with reference pixels) and interpolating cached filtered pixel values. To further enable the large filtering reference range with the limited storage cost of LUT, we introduce the enhanced indexing mechanism in the filtering process, and clipping/finetuning mechanism in the training. The proposed method is implemented into the Versatile Video Coding (VVC) reference software, VTM-11.0. Experimental results show that the ultrafast, very fast, and fast mode of the proposed method achieves on average 0.13%/0.34%/0.51%, and 0.10%/0.27%/0.39% BD-rate reduction, under the all intra (AI) and random access (RA) configurations. Especially, our method has friendly time and computational complexity, only 101%/102%-104%/108% time increase with 0.13-0.93 kMACs/pixel, and only 164-1148 KB storage cost for a single model. Our solution may shed light on the journey of practical neural network-based coding tool evolution.

arXiv information

Authors	Zhuoyuan Li, Jiacheng Li, Yao Li, Li Li, Dong Liu, Feng Wu
Published	2024-07-15 17:25:42+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
