RainMamba: Enhanced Locality Learning with State Space Models for Video Deraining

Abstract

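Outdoor vision systems are frequently contaminated by rain streaks and raindrops, which significantly degrades the performance of visual tasks and multimedia applications.
Videos naturally provide redundant temporal cues that allow rain to be removed more stably.
Traditional video deraining methods rely heavily on optical flow estimation and kernel-based approaches, which have a limited receptive field.
Transformer architectures enable long-term dependencies, but at the cost of a significant increase in computational complexity.
Recently, the linear-complexity operator of state space models (SSMs) has instead made efficient long-term temporal modeling feasible, which is crucial for removing rain streaks and raindrops from videos.
However, its one-dimensional sequential processing of videos destroys local correlations across the spatio-temporal dimensions by pushing adjacent pixels apart.
To address this, we present an improved SSM-based video deraining network (RainMamba) with a novel Hilbert scanning mechanism that better captures sequence-level local information.
We also introduce a difference-guided dynamic contrastive locality learning strategy to enhance the network's patch-level self-similarity learning ability.
Extensive experiments on four synthesized video deraining datasets and real-world rainy videos demonstrate the effectiveness and efficiency of the network in removing rain streaks and raindrops.
Code and results are available at https://github.com/TonyHongtaoWu/RainMamba.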
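
The locality problem described in the abstract (a one-dimensional scan pushing adjacent pixels apart) can be illustrated with a small sketch. This is not the authors' implementation: it works on a single 2D grid rather than a spatio-temporal video volume, and the function names (`hilbert_d2xy`, `hilbert_order`, `raster_order`, `max_step`) are hypothetical names chosen for this example. It compares a row-by-row raster scan with a standard 2D Hilbert curve ordering and measures the largest spatial jump between consecutive sequence positions.

```python
# Illustrative sketch only: 2D Hilbert ordering vs. raster ordering.
# Assumption: the grid side n is a power of two. The paper applies
# Hilbert scanning to video features; this toy uses one 2D grid.

def hilbert_d2xy(n, d):
    """Map position d along a Hilbert curve to (x, y) on an n x n grid."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate/flip the current quadrant so sub-curves connect end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_order(n):
    """All grid cells in Hilbert-curve order."""
    return [hilbert_d2xy(n, d) for d in range(n * n)]


def raster_order(n):
    """All grid cells in row-by-row (raster) order."""
    return [(x, y) for y in range(n) for x in range(n)]


def max_step(order):
    """Largest Manhattan distance between consecutive cells in a scan."""
    return max(
        abs(x1 - x0) + abs(y1 - y0)
        for (x0, y0), (x1, y1) in zip(order, order[1:])
    )


if __name__ == "__main__":
    n = 8
    print("Hilbert max step:", max_step(hilbert_order(n)))
    print("Raster max step:", max_step(raster_order(n)))
```

On an 8 x 8 grid, every consecutive pair in the Hilbert order is spatially adjacent (maximum step 1), while the raster order jumps by 8 at each row boundary. This is the sense in which a Hilbert scan keeps sequence neighbors close in space.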

Abstract (original)

The outdoor vision systems are frequently contaminated by rain streaks and raindrops, which significantly degenerate the performance of visual tasks and multimedia applications. The nature of videos exhibits redundant temporal cues for rain removal with higher stability. Traditional video deraining methods heavily rely on optical flow estimation and kernel-based manners, which have a limited receptive field. Yet, transformer architectures, while enabling long-term dependencies, bring about a significant increase in computational complexity. Recently, the linear-complexity operator of the state space models (SSMs) has contrarily facilitated efficient long-term temporal modeling, which is crucial for rain streaks and raindrops removal in videos. Unexpectedly, its uni-dimensional sequential process on videos destroys the local correlations across the spatio-temporal dimension by distancing adjacent pixels. To address this, we present an improved SSMs-based video deraining network (RainMamba) with a novel Hilbert scanning mechanism to better capture sequence-level local information. We also introduce a difference-guided dynamic contrastive locality learning strategy to enhance the patch-level self-similarity learning ability of the proposed network. Extensive experiments on four synthesized video deraining datasets and real-world rainy videos demonstrate the effectiveness and efficiency of our network in the removal of rain streaks and raindrops. Our code and results are available at https://github.com/TonyHongtaoWu/RainMamba.

arXiv information

Authors	Hongtao Wu, Yijun Yang, Huihui Xu, Weiming Wang, Jinni Zhou, Lei Zhu
Published	2024-09-11 17:47:33+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
