Image Coding for Machines with Edge Information Learning Using Segment Anything

Summary

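Image Coding for Machines (ICM) is an image compression technique designed for image recognition.
It is essential because demand for image recognition AI is growing.
This paper proposes an ICM method, called SA-ICM, that focuses on encoding and decoding only the edge information of object parts in an image.
SA-ICM is a Learned Image Compression (LIC) model trained using edge information created by Segment Anything.
The method can be used with image recognition models for various tasks.
SA-ICM is also robust to changes in input data, making it effective for a variety of use cases.
In addition, the method offers a privacy benefit because it removes human facial information on the encoder side.
Furthermore, this LIC training method can also be used to train Neural Representations for Videos (NeRV), a video compression model.
Training NeRV with edge information created by Segment Anything yields a NeRV that is effective for image recognition (SA-NeRV).
Experimental results confirm the advantages of SA-ICM, showing the best performance in image compression for image recognition.
They also show that SA-NeRV outperforms ordinary NeRV in video compression for machines.
Code is available at https://github.com/final-0/SA-ICM.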

Abstract (original)

Image Coding for Machines (ICM) is an image compression technique for image recognition. This technique is essential due to the growing demand for image recognition AI. In this paper, we propose a method for ICM that focuses on encoding and decoding only the edge information of object parts in an image, which we call SA-ICM. This is a Learned Image Compression (LIC) model trained using edge information created by Segment Anything. Our method can be used for image recognition models with various tasks. SA-ICM is also robust to changes in input data, making it effective for a variety of use cases. Additionally, our method provides benefits from a privacy point of view, as it removes human facial information on the encoder’s side, thus protecting one’s privacy. Furthermore, this LIC model training method can be used to train Neural Representations for Videos (NeRV), which is a video compression model. By training NeRV using edge information created by Segment Anything, it is possible to create a NeRV that is effective for image recognition (SA-NeRV). Experimental results confirm the advantages of SA-ICM, presenting the best performance in image compression for image recognition. We also show that SA-NeRV is superior to ordinary NeRV in video compression for machines. Code is available at https://github.com/final-0/SA-ICM.
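
The abstract does not say how the edge information is derived from Segment Anything output or how it enters the training loss. As a rough illustration only, the sketch below assumes the segmentation has already been flattened into an integer label map (one segment ID per pixel) and marks a pixel as an edge when a horizontal or vertical neighbour has a different ID. It then computes a mean squared error against an edge-only target. The names `edge_mask_from_labels` and `masked_mse` are hypothetical, and this distortion term is an assumption for illustration, not the loss confirmed by the paper.

```python
import numpy as np


def edge_mask_from_labels(labels: np.ndarray) -> np.ndarray:
    """Mark pixels whose segment ID differs from a horizontal or vertical neighbour.

    `labels` is an (H, W) integer array of segment IDs (assumed input format).
    Returns an (H, W) boolean array that is True on segment boundaries.
    """
    edges = np.zeros(labels.shape, dtype=bool)

    # Compare each pixel with its right-hand neighbour; flag both sides of a change.
    diff_h = labels[:, 1:] != labels[:, :-1]
    edges[:, 1:] |= diff_h
    edges[:, :-1] |= diff_h

    # Compare each pixel with the neighbour below; flag both sides of a change.
    diff_v = labels[1:, :] != labels[:-1, :]
    edges[1:, :] |= diff_v
    edges[:-1, :] |= diff_v

    return edges


def masked_mse(image: np.ndarray, decoded: np.ndarray, mask: np.ndarray) -> float:
    """Mean squared error between an edge-only target and a decoded image.

    `image` and `decoded` are (H, W, C) float arrays; `mask` is (H, W) boolean.
    The target keeps pixel values on edges and is zero elsewhere (an assumption).
    """
    target = image * mask[..., None]
    return float(np.mean((target - decoded) ** 2))
```

In an actual learned-compression setup a term like this would be combined with a rate term and computed on framework tensors rather than NumPy arrays; the sketch only shows the masking idea.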

arXiv information

Authors	Takahiro Shindo, Kein Yamada, Taiju Watanabe, Hiroshi Watanabe
Published	2024-06-07 15:11:34+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
