Multi-View Learning with Context-Guided Receptance for Image Denoising

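Summary

Image denoising is essential in low-level vision applications such as photography and automated driving. Existing methods struggle to distinguish complex noise patterns in real-world scenes, and their reliance on Transformer-based models consumes substantial computational resources. This work proposes the Context-guided Receptance Weighted Key-Value model, which combines efficient sequence modeling with enhanced multi-view feature integration. The approach introduces the Context-guided Token Shift (CTS) paradigm, which effectively captures local spatial dependencies and improves the model's ability to model real-world noise distributions. In addition, the Frequency Mix (FMix) module, which extracts frequency-domain features, is designed to isolate noise in high-frequency spectra and is integrated with spatial representations through a multi-view learning process. To improve computational efficiency, the Bidirectional WKV (BiWKV) mechanism is adopted, enabling full pixel-sequence interaction with linear complexity while overcoming causal selection constraints. The model is validated on multiple real-world image denoising datasets, where it quantitatively outperforms existing state-of-the-art methods and reduces inference time by up to 40%. Qualitative results further show the model's ability to restore fine details in a variety of scenes.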

Summary (original)

Image denoising is essential in low-level vision applications such as photography and automated driving. Existing methods struggle to distinguish complex noise patterns in real-world scenes and consume significant computational resources because they rely on Transformer-based models. In this work, the Context-guided Receptance Weighted Key-Value (CRWKV) model is proposed, combining enhanced multi-view feature integration with efficient sequence modeling. Our approach introduces the Context-guided Token Shift (CTS) paradigm, which effectively captures local spatial dependencies and enhances the model's ability to model real-world noise distributions. Additionally, the Frequency Mix (FMix) module, which extracts frequency-domain features, is designed to isolate noise in high-frequency spectra and is integrated with spatial representations through a multi-view learning process. To improve computational efficiency, the Bidirectional WKV (BiWKV) mechanism is adopted, enabling full pixel-sequence interaction with linear complexity while overcoming causal selection constraints. The model is validated on multiple real-world image denoising datasets, outperforming existing state-of-the-art methods quantitatively and reducing inference time by up to 40%. Qualitative results further demonstrate the ability of our model to restore fine details in various scenes.
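The abstract does not say how FMix extracts frequency-domain features, so the following is only a generic sketch of the underlying idea that noise can be separated in the high-frequency part of the spectrum. It splits a grayscale image into low- and high-frequency components with a 2-D FFT and a radial mask. The function name `split_frequency_bands` and the `cutoff` parameter are assumptions made for illustration; this is not the paper's module.

```python
import numpy as np


def split_frequency_bands(image, cutoff=0.25):
    """Split a 2-D grayscale image into low- and high-frequency parts.

    `cutoff` (hypothetical parameter) is the radius of the low-pass mask
    in cycles per pixel; the largest per-axis frequency is 0.5.
    """
    height, width = image.shape
    spectrum = np.fft.fft2(image)

    # Frequency coordinate of every FFT bin, in cycles per pixel.
    fy = np.fft.fftfreq(height)[:, None]
    fx = np.fft.fftfreq(width)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)

    # The mask is symmetric in frequency, so both inverse FFTs are real
    # up to rounding error; .real drops the residual imaginary part.
    low_mask = radius <= cutoff
    low = np.fft.ifft2(spectrum * low_mask).real
    high = np.fft.ifft2(spectrum * ~low_mask).real
    return low, high


# Illustrative use: a smooth pattern plus white Gaussian noise.
rng = np.random.default_rng(0)
x = np.arange(64)
clean = np.tile(np.sin(2 * np.pi * 2 * x / 64), (64, 1))
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
low, high = split_frequency_bands(noisy)
```

In this toy setup the smooth pattern falls entirely in the low band, while most of the white noise energy lands in `high`. A learned module would presumably process such bands with trainable layers rather than a fixed mask.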

arXiv information

Authors	Binghong Chen, Tingting Chai, Wei Jiang, Yuanrong Xu, Guanglu Zhou, Xiangqian Wu
Published	2025-05-05 14:57:43+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
