BURST: A Benchmark for Unifying Object Recognition, Segmentation and Tracking in Video

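Abstract

Several existing benchmarks, such as Video Object Segmentation (VOS) and Multi-Object Tracking and Segmentation (MOTS), involve tracking and segmenting objects in video, but they rarely interact because they rely on different benchmark datasets and metrics (J&F, mAP, sMOTSA).
As a result, published works usually target one specific benchmark and cannot easily be compared with one another.
We believe that developing generalized methods able to tackle multiple tasks requires greater cohesion among these research sub-communities.
In this paper, we aim to facilitate this by proposing BURST, a dataset containing thousands of diverse videos with high-quality object masks, together with an associated benchmark of six tasks involving object tracking and segmentation in video.
All tasks are evaluated using the same data and comparable metrics. This lets researchers consider them in unison and so pool knowledge more effectively from different methods across different tasks.
In addition, we present several baselines for all tasks and show that an approach designed for one task can be applied to another with a quantifiable and explainable difference in performance.
The dataset annotations and evaluation code are available at https://github.com/Ali2500/BURST-benchmark.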

Abstract (original)

Multiple existing benchmarks involve tracking and segmenting objects in video, e.g., Video Object Segmentation (VOS) and Multi-Object Tracking and Segmentation (MOTS), but there is little interaction between them due to the use of disparate benchmark datasets and metrics (e.g., J&F, mAP, sMOTSA). As a result, published works usually target a particular benchmark, and are not easily comparable to one another. We believe that the development of generalized methods that can tackle multiple tasks requires greater cohesion among these research sub-communities. In this paper, we aim to facilitate this by proposing BURST, a dataset which contains thousands of diverse videos with high-quality object masks, and an associated benchmark with six tasks involving object tracking and segmentation in video. All tasks are evaluated using the same data and comparable metrics, which enables researchers to consider them in unison, and hence, more effectively pool knowledge from different methods across different tasks. Additionally, we demonstrate several baselines for all tasks and show that approaches for one task can be applied to another with a quantifiable and explainable performance difference. Dataset annotations and evaluation code are available at: https://github.com/Ali2500/BURST-benchmark.

arXiv information

Authors	Ali Athar, Jonathon Luiten, Paul Voigtlaender, Tarasha Khurana, Achal Dave, Bastian Leibe, Deva Ramanan
Published	2022-11-22 17:18:39+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
