ViewBirdiformer: Learning to recover ground-plane crowd trajectories and ego-motion from a single ego-centric view

Abstract

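We introduce a new learning-based method for view birdification, the task of recovering the ground-plane trajectories of the pedestrians in a crowd, and of the observer moving within that same crowd, from the observed ego-centric video alone.
View birdification is essential for mobile robot navigation and localization in dense crowds, where the static background is hard to see and cannot be tracked reliably.
The task is difficult for two main reasons:
i) the absolute trajectories of pedestrians are entangled with the observer's own movement, which must be decoupled from the relative movements observed in the ego-centric video, and ii) the crowd motion model that describes how pedestrians interact is specific to the scene and is not known a priori.
To address this, we introduce ViewBirdiformer, a Transformer-based network that implicitly models crowd motion through self-attention and decomposes the relative 2D movement observations into the ground-plane trajectories of the crowd and the camera through cross-attention between views.
Most importantly, ViewBirdiformer performs view birdification in a single forward pass, which opens the door to accurate, real-time, always-on situational awareness.
Extensive experimental results show that ViewBirdiformer achieves accuracy comparable to or better than the state of the art while reducing execution time by three orders of magnitude.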
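
To make the first difficulty concrete, the following is a minimal sketch of how the observer's motion gets mixed into what the observer sees. It is not the ViewBirdiformer network and it is not taken from the paper: it assumes a simplified setting in which the observer measures pedestrian positions directly on the ground plane in its own local frame (the paper works from 2D movements observed in the video), and the function names `world_to_observer` and `observer_to_world` are hypothetical.

```python
import math


def world_to_observer(p_world, cam_pos, cam_heading):
    """Express a ground-plane point in the observer's local frame.

    Assumption for this sketch: the local x-axis points along the
    observer's heading, and cam_heading is in radians.
    """
    dx = p_world[0] - cam_pos[0]
    dy = p_world[1] - cam_pos[1]
    c, s = math.cos(cam_heading), math.sin(cam_heading)
    # Rotate the offset by -cam_heading (inverse of the observer's rotation).
    return (c * dx + s * dy, -s * dx + c * dy)


def observer_to_world(p_local, cam_pos, cam_heading):
    """Inverse mapping: recover the absolute position when the pose is known."""
    c, s = math.cos(cam_heading), math.sin(cam_heading)
    return (
        c * p_local[0] - s * p_local[1] + cam_pos[0],
        s * p_local[0] + c * p_local[1] + cam_pos[1],
    )


# A pedestrian standing still at (2, 0) while the observer walks forward.
pedestrian = (2.0, 0.0)
rel_t0 = world_to_observer(pedestrian, (0.0, 0.0), 0.0)
rel_t1 = world_to_observer(pedestrian, (1.0, 0.0), 0.0)
apparent_motion = (rel_t1[0] - rel_t0[0], rel_t1[1] - rel_t0[1])
```

In this toy case the pedestrian never moves, yet `apparent_motion` is non-zero because the observer moved. Undoing the mapping with `observer_to_world` requires the observer's pose, which is exactly what is unknown in view birdification, so the trajectories and the ego-motion have to be estimated together.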

Abstract (original)

We introduce a novel learning-based method for view birdification, the task of recovering ground-plane trajectories of pedestrians of a crowd and their observer in the same crowd just from the observed ego-centric video. View birdification becomes essential for mobile robot navigation and localization in dense crowds where the static background is hard to see and reliably track. It is challenging mainly for two reasons; i) absolute trajectories of pedestrians are entangled with the movement of the observer which needs to be decoupled from their observed relative movements in the ego-centric video, and ii) a crowd motion model describing the pedestrian movement interactions is specific to the scene yet unknown a priori. For this, we introduce a Transformer-based network referred to as ViewBirdiformer which implicitly models the crowd motion through self-attention and decomposes relative 2D movement observations onto the ground-plane trajectories of the crowd and the camera through cross-attention between views. Most important, ViewBirdiformer achieves view birdification in a single forward pass which opens the door to accurate real-time, always-on situational awareness. Extensive experimental results demonstrate that ViewBirdiformer achieves accuracy similar to or better than state-of-the-art with three orders of magnitude reduction in execution time.

arXiv information

Authors	Mai Nishimura, Shohei Nobuhara, Ko Nishino
Published	2022-10-12 15:53:04+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
