SeaTurtleID: A novel long-span dataset highlighting the importance of timestamps in wildlife re-identification

Summary

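This paper introduces SeaTurtleID, the first publicly available large-scale, long-span dataset of sea turtle photographs taken in the wild.
The dataset is suited to benchmarking re-identification methods and to evaluating several other computer vision tasks.
It contains 7774 high-resolution photographs of 400 unique individual turtles, collected over 12 years in 1081 encounters.
Each photograph comes with rich metadata, including an identity label, a head segmentation mask, and an encounter timestamp.
With its 12-year span, it is the longest-running public wild animal dataset with timestamps.
Using this property, we show that timestamps are necessary for an unbiased evaluation of animal re-identification methods, because they make it possible to split the dataset into reference and query sets in a time-aware way.
For both feature-based and CNN-based re-identification methods, we show that time-unaware splits can overestimate performance by more than 100% compared with time-aware splits.
We also argue that time-aware splits correspond to more realistic re-identification pipelines than time-unaware splits.
We recommend testing animal re-identification methods only on timestamped datasets using time-aware splits, and we encourage dataset curators to include this information in the associated metadata.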
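
The difference between the two kinds of split can be illustrated with a minimal sketch. It is not the authors' code: the `Photo` record, its field names, the cutoff date, and the toy data are all assumptions made for illustration, and the real dataset's metadata schema may differ. The time-aware split puts every photo taken before a cutoff date into the reference set and every later photo into the query set, while the time-unaware split shuffles photos at random, so photos from the same encounter can end up on both sides.

```python
import random
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Photo:
    # Hypothetical record; field names are illustrative, not the dataset's schema.
    photo_id: str
    identity: str
    taken_on: date


def time_aware_split(photos, cutoff):
    """Reference = photos taken before the cutoff, query = photos taken on or after it."""
    reference = [p for p in photos if p.taken_on < cutoff]
    query = [p for p in photos if p.taken_on >= cutoff]
    return reference, query


def time_unaware_split(photos, query_fraction=0.5, seed=0):
    """Random split that ignores timestamps; one encounter can land in both sets."""
    shuffled = list(photos)
    random.Random(seed).shuffle(shuffled)
    n_query = int(len(shuffled) * query_fraction)
    return shuffled[n_query:], shuffled[:n_query]


# Toy data: a1 and a2 come from the same (made-up) encounter.
photos = [
    Photo("a1", "turtle_A", date(2012, 7, 1)),
    Photo("a2", "turtle_A", date(2012, 7, 1)),
    Photo("a3", "turtle_A", date(2019, 6, 15)),
    Photo("b1", "turtle_B", date(2014, 8, 3)),
    Photo("b2", "turtle_B", date(2020, 7, 9)),
]

reference, query = time_aware_split(photos, cutoff=date(2018, 1, 1))
print([p.photo_id for p in reference])  # ['a1', 'a2', 'b1']
print([p.photo_id for p in query])      # ['a3', 'b2']
```

With the time-aware split, a query photo is always matched against photos from an earlier date, which is the situation a deployed re-identification system would presumably face. With the random split, a query photo may be matched against a near-duplicate from the same encounter, which is one plausible way the overestimation described above could arise.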

Abstract (original)

This paper introduces SeaTurtleID, the first public large-scale, long-span dataset with sea turtle photographs captured in the wild. The dataset is suitable for benchmarking re-identification methods and evaluating several other computer vision tasks. The dataset consists of 7774 high-resolution photographs of 400 unique individuals collected within 12 years in 1081 encounters. Each photograph is accompanied by rich metadata, e.g., identity label, head segmentation mask, and encounter timestamp. The 12-year span of the dataset makes it the longest-spanned public wild animal dataset with timestamps. By exploiting this unique property, we show that timestamps are necessary for an unbiased evaluation of animal re-identification methods because they allow time-aware splits of the dataset into reference and query sets. We show that time-unaware splits can lead to performance overestimation of more than 100% compared to the time-aware splits for both feature- and CNN-based re-identification methods. We also argue that time-aware splits correspond to more realistic re-identification pipelines than the time-unaware ones. We recommend that animal re-identification methods should only be tested on datasets with timestamps using time-aware splits, and we encourage dataset curators to include such information in the associated metadata.

arXiv information

Authors	Kostas Papafitsoros, Lukáš Adam, Vojtěch Čermák, Lukáš Picek
Published	2022-11-18 15:46:24+00:00
arXiv page	arxiv_id (pdf)

Source and services used

arxiv.jp, Google
