TrackFlow: Multi-Object Tracking with Normalizing Flows


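We aim to extend tracking-by-detection to multi-modal settings, where a comprehensive cost has to be computed from heterogeneous information, e.g., 2D motion cues, visual appearance, and pose estimates.
More precisely, we follow a case study where a rough estimate of 3D information is also available and must be merged with other traditional metrics (e.g., the IoU).
However, recent approaches that balance the contribution of each cost with simple rules or complex heuristics i) require careful tuning of tailored hyperparameters on a hold-out set, and ii) assume these costs are independent, which does not hold in reality.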


The field of multi-object tracking has recently seen a renewed interest in the good old schema of tracking-by-detection, as its simplicity and strong priors spare it from the complex design and painful babysitting of tracking-by-attention approaches. In view of this, we aim to extend tracking-by-detection to multi-modal settings, where a comprehensive cost has to be computed from heterogeneous information, e.g., 2D motion cues, visual appearance, and pose estimates. More precisely, we follow a case study where a rough estimate of 3D information is also available and must be merged with other traditional metrics (e.g., the IoU). To achieve that, recent approaches resort to either simple rules or complex heuristics to balance the contribution of each cost. However, i) they require careful tuning of tailored hyperparameters on a hold-out set, and ii) they assume these costs are independent, which does not hold in reality. We address these issues by building upon an elegant probabilistic formulation, which treats the cost of a candidate association as the negative log-likelihood yielded by a deep density estimator, trained to model the conditional joint probability distribution of correct associations. Our experiments, conducted on both simulated and real benchmarks, show that our approach consistently enhances the performance of several tracking-by-detection algorithms.
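
The formulation in the abstract (association cost as the negative log-likelihood of a density fitted to correct associations) can be illustrated with a minimal sketch. This is not the authors' implementation: a plain bivariate Gaussian stands in for the deep density estimator, the conditioning is dropped, and the two features (an IoU distance and a depth gap), the numbers, and all function names are hypothetical choices made for illustration only.

```python
import math
from itertools import permutations


def fit_gaussian_2d(samples):
    """Fit the mean and covariance of a 2D Gaussian to feature pairs.

    Stand-in for training a deep density estimator on correct associations.
    """
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples) / n
    syy = sum((y - my) ** 2 for _, y in samples) / n
    sxy = sum((x - mx) * (y - my) for x, y in samples) / n
    return (mx, my), (sxx, sxy, syy)


def nll(pair, mean, cov):
    """Negative log-likelihood of one feature pair under the joint Gaussian."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = pair[0] - mean[0], pair[1] - mean[1]
    # Squared Mahalanobis distance via the closed-form 2x2 inverse covariance.
    maha = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return 0.5 * maha + 0.5 * math.log(det) + math.log(2 * math.pi)


def best_assignment(cost):
    """Exhaustive minimum-cost matching; only sensible for tiny matrices."""
    n = len(cost)
    return min(permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))


# Hypothetical features of correct associations: (1 - IoU, depth gap in metres).
# The two cues are strongly correlated, so they are not independent.
train = [(0.10, 0.2), (0.20, 0.5), (0.15, 0.3),
         (0.30, 0.6), (0.25, 0.7), (0.05, 0.1)]
mean, cov = fit_gaussian_2d(train)

# candidates[i][j] holds the features of pairing track i with detection j.
candidates = [
    [(0.12, 0.25), (0.80, 3.00)],
    [(0.90, 2.50), (0.22, 0.50)],
]
cost = [[nll(f, mean, cov) for f in row] for row in candidates]
assignment = best_assignment(cost)  # assignment[i] = detection for track i
print(assignment)
```

Because the density is joint, a candidate whose cues disagree with the learned correlation (a large IoU distance paired with a small depth gap, say) is penalised more than one whose cues deviate together, and no per-cost weights need to be tuned by hand. For realistic problem sizes the exhaustive search would be replaced by a Hungarian solver such as `scipy.optimize.linear_sum_assignment`.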


Authors: Gianluca Mancusi, Aniello Panariello, Angelo Porrello, Matteo Fabbri, Simone Calderara, Rita Cucchiara
Published: 2023-08-22 15:40:03+00:00
