NVRC: Neural Video Representation Compression

Abstract

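Recent advances in implicit neural representation (INR)-based video coding have shown that it can compete with both conventional codecs and other learning-based approaches.
An INR method trains a neural network to overfit a single video sequence and then compresses the network parameters, so that the compressed parameters serve as a compact representation of the video content.
Despite promising results, even the best INR-based methods are still outperformed by the latest standard codecs such as VVC VTM, partly because of the simple model compression techniques they employ.
Rather than focusing on representation architectures, as many existing works do, this paper proposes Neural Video Representation Compression (NVRC), a novel INR-based video compression framework that targets compression of the representation itself.
Based on the proposed entropy coding and quantization models, NVRC is, for the first time, able to optimize an INR-based video codec in a fully end-to-end manner.
To further minimize the additional bitrate overhead introduced by the entropy models, the authors also propose a new model compression framework that hierarchically codes all network, quantization, and entropy model parameters.
In the experiments, NVRC outperforms many conventional and learning-based benchmark codecs, achieving a 24% average coding gain over VVC VTM (Random Access) on the UVG dataset, measured in PSNR.
As far as the authors are aware, this is the first time an INR-based video codec has achieved such performance.
The implementation of NVRC will be released at www.github.com.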
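
To make the rate-distortion trade-off behind parameter compression concrete, here is a minimal sketch in plain Python. It is not the NVRC method: the abstract describes learned quantization and entropy models, whereas this sketch assumes fixed-step uniform quantization and uses the empirical entropy of the quantized indices as a stand-in for a rate estimate. All function names (`quantize`, `dequantize`, `estimated_bits`, `psnr`, `rd_cost`) and the sample values are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def quantize(params, step):
    """Uniform scalar quantization: map each parameter to an integer index."""
    return [round(p / step) for p in params]


def dequantize(indices, step):
    """Reconstruct approximate parameter values from integer indices."""
    return [i * step for i in indices]


def estimated_bits(indices):
    """Total bits under the empirical (zeroth-order) entropy of the indices.

    Assumption: this replaces a learned entropy model. An ideal entropy
    coder that knew the symbol frequencies would need about this many bits.
    """
    counts = Counter(indices)
    n = len(indices)
    return -sum(c * math.log2(c / n) for c in counts.values())


def psnr(reference, reconstruction, peak=1.0):
    """PSNR in dB between two equal-length sequences of sample values."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstruction)) / len(reference)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)


def rd_cost(distortion, bits, lam):
    """Rate-distortion objective D + lambda * R."""
    return distortion + lam * bits


if __name__ == "__main__":
    params = [0.0, 0.26, 0.24, -0.5, 0.51]  # hypothetical network parameters
    for step in (0.25, 1.0):
        idx = quantize(params, step)
        rec = dequantize(idx, step)
        # Distortion here is measured on the parameters themselves (MSE).
        mse = sum((a - b) ** 2 for a, b in zip(params, rec)) / len(params)
        bits = estimated_bits(idx)
        print(step, idx, round(bits, 3), round(rd_cost(mse, bits, 0.01), 4))
```

A coarser step produces fewer distinct indices and therefore a lower bit estimate, at the cost of larger reconstruction error. In an actual INR codec the distortion would be measured on the decoded video frames rather than on the parameters, and the objective would be minimized during training.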

Abstract (original)

Recent advances in implicit neural representation (INR)-based video coding have demonstrated its potential to compete with both conventional and other learning-based approaches. With INR methods, a neural network is trained to overfit a video sequence, with its parameters compressed to obtain a compact representation of the video content. However, although promising results have been achieved, the best INR-based methods are still out-performed by the latest standard codecs, such as VVC VTM, partially due to the simple model compression techniques employed. In this paper, rather than focusing on representation architectures as in many existing works, we propose a novel INR-based video compression framework, Neural Video Representation Compression (NVRC), targeting compression of the representation. Based on the novel entropy coding and quantization models proposed, NVRC, for the first time, is able to optimize an INR-based video codec in a fully end-to-end manner. To further minimize the additional bitrate overhead introduced by the entropy models, we have also proposed a new model compression framework for coding all the network, quantization and entropy model parameters hierarchically. Our experiments show that NVRC outperforms many conventional and learning-based benchmark codecs, with a 24% average coding gain over VVC VTM (Random Access) on the UVG dataset, measured in PSNR. As far as we are aware, this is the first time an INR-based video codec achieving such performance. The implementation of NVRC will be released at www.github.com.

arXiv information

Authors	Ho Man Kwan, Ge Gao, Fan Zhang, Andrew Gower, David Bull
Published	2024-09-11 16:57:12+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
