Evaluating the temporal understanding of neural networks on event-based action recognition with DVS-Gesture-Chain

Abstract

Enabling artificial neural networks (ANNs) to understand temporal structure in visual tasks is essential for complete perception of video sequences. A wide range of benchmark datasets is available for evaluating this capability on conventional frame-based video sequences. In contrast, evaluating it for systems targeting neuromorphic data remains a challenge due to the lack of appropriate datasets. In this work we define a new benchmark task for action recognition in event-based video sequences, DVS-Gesture-Chain (DVS-GC), which is based on the temporal combination of multiple gestures from the widely used DVS-Gesture dataset. This methodology allows the creation of datasets that are arbitrarily complex in the temporal dimension. Using the newly defined task, we evaluate the spatio-temporal understanding of different feed-forward convolutional ANNs and convolutional Spiking Neural Networks (SNNs). Our study shows that the original DVS-Gesture benchmark can be solved by networks without temporal understanding, unlike the new DVS-GC, which demands an understanding of the ordering of events. From there, we provide a study showing how certain elements, such as spiking neurons or time-dependent weights, allow for temporal understanding in feed-forward networks without the need for recurrent connections. Code is available at: https://github.com/VicenteAlex/DVS-Gesture-Chain
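
The idea of chaining gestures in time can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). The event layout `(timestamp, x, y, polarity)`, the helper names `chain_gestures` and `time_collapsed_histogram`, and the toy "wave" and "clap" recordings are all assumptions made for illustration. The sketch concatenates event streams by shifting timestamps, then shows that two chains built from the same gestures in a different order look identical once the time axis is discarded, even though their labels differ.

```python
from collections import Counter
from typing import List, Sequence, Tuple

# Hypothetical event layout: (timestamp, x, y, polarity).
Event = Tuple[int, int, int, int]


def chain_gestures(
    gestures: Sequence[Tuple[str, Sequence[Event]]]
) -> Tuple[List[Event], Tuple[str, ...]]:
    """Concatenate gesture recordings in time.

    Each recording is assumed to be sorted by timestamp. Its events are
    shifted so that it starts right after the previous recording ends.
    The label of the chain is the ordered tuple of gesture labels.
    """
    chained: List[Event] = []
    labels: List[str] = []
    offset = 0
    for label, events in gestures:
        if not events:
            continue  # skip empty recordings in this toy sketch
        start = events[0][0]
        for t, x, y, p in events:
            chained.append((t - start + offset, x, y, p))
        offset = chained[-1][0] + 1  # next gesture begins after this one
        labels.append(label)
    return chained, tuple(labels)


def time_collapsed_histogram(events: Sequence[Event]) -> Counter:
    """Count events per (x, y, polarity), discarding all timing information."""
    return Counter((x, y, p) for _, x, y, p in events)


# Two tiny made-up recordings standing in for DVS-Gesture samples.
wave = [(0, 1, 1, 1), (10, 2, 1, 1), (20, 3, 1, 0)]
clap = [(5, 7, 7, 1), (15, 7, 8, 0)]

ab, label_ab = chain_gestures([("wave", wave), ("clap", clap)])
ba, label_ba = chain_gestures([("clap", clap), ("wave", wave)])

# Same events once time is ignored, but different class labels.
same_without_time = time_collapsed_histogram(ab) == time_collapsed_histogram(ba)
```

In this toy setting, a model that only sees the time-collapsed histogram cannot tell `label_ab` from `label_ba`, which is the kind of ordering sensitivity the abstract attributes to DVS-GC.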

arXiv information

Authors	Alex Vicente-Sola, Davide L. Manna, Paul Kirkland, Gaetano Di Caterina, Trevor Bihl
Published	2022-09-29 16:22:46+00:00
