VI-Diff: Unpaired Visible-Infrared Translation Diffusion Model for Single Modality Labeled Visible-Infrared Person Re-identification

Abstract

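Visible-infrared person re-identification (VI-ReID) is hard to deploy in real-world scenarios because cross-modality data annotation is expensive.
When different sensing cameras are used, such as RGB cameras in good lighting and IR cameras in poor lighting, identifying the same person across modalities is costly and error-prone.
To overcome this, we explore training for the VI-ReID task with single-modality labeled data, which is more cost-effective and practical.
Pedestrians are labeled in only one modality (e.g., visible images) and retrieved in another (e.g., infrared images); unpaired image-to-image translation is then used to build a training set containing both the originally labeled data and its modality-translated counterpart.
In this paper, we propose VI-Diff, a diffusion model that effectively addresses visible-infrared person image translation.
Comprehensive experiments show that VI-Diff outperforms existing diffusion and GAN models on this task.
The approach is a promising solution for VI-ReID with single-modality labeled data and a good starting point for future study.
Code will be made available.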
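
The training-set construction described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the `Sample` type, the `build_training_set` function, and the `placeholder_translate` function are all hypothetical names, and the placeholder simply inverts pixel intensities where a trained translator such as VI-Diff would be used.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical stand-in for an image: a 2-D grid of intensities in [0, 1].
Image = List[List[float]]


@dataclass(frozen=True)
class Sample:
    image: Image
    person_id: int   # identity label, annotated in one modality only
    modality: str    # "visible" or "infrared"


def build_training_set(
    labeled: Sequence[Sample],
    translate: Callable[[Image], Image],
    target_modality: str = "infrared",
) -> List[Sample]:
    """Return the labeled samples plus one translated copy of each.

    Every translated copy inherits the identity label of its source image,
    so no cross-modality annotation is required.
    """
    translated = [
        Sample(image=translate(s.image), person_id=s.person_id, modality=target_modality)
        for s in labeled
    ]
    return list(labeled) + translated


def placeholder_translate(image: Image) -> Image:
    """Assumed placeholder for a trained translator; it only inverts intensities."""
    return [[1.0 - px for px in row] for row in image]


visible = [
    Sample(image=[[0.25, 0.75]], person_id=7, modality="visible"),
    Sample(image=[[0.5, 1.0]], person_id=9, modality="visible"),
]
train_set = build_training_set(visible, placeholder_translate)
```

The resulting `train_set` holds the two labeled visible samples followed by two infrared-tagged copies carrying the same person IDs, which is the mixed set a re-identification model would then be trained on.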

Abstract (original)

Visible-Infrared person re-identification (VI-ReID) in real-world scenarios poses a significant challenge due to the high cost of cross-modality data annotation. Different sensing cameras, such as RGB/IR cameras for good/poor lighting conditions, make it costly and error-prone to identify the same person across modalities. To overcome this, we explore the use of single-modality labeled data for the VI-ReID task, which is more cost-effective and practical. By labeling pedestrians in only one modality (e.g., visible images) and retrieving in another modality (e.g., infrared images), we aim to create a training set containing both originally labeled and modality-translated data using unpaired image-to-image translation techniques. In this paper, we propose VI-Diff, a diffusion model that effectively addresses the task of Visible-Infrared person image translation. Through comprehensive experiments, we demonstrate that VI-Diff outperforms existing diffusion and GAN models, making it a promising solution for VI-ReID with single-modality labeled data. Our approach can be a promising solution to the VI-ReID task with single-modality labeled data and serves as a good starting point for future study. Code will be available.

arXiv information

Authors	Han Huang, Yan Huang, Liang Wang
Published	2023-10-06 09:42:12+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
