GCN-DevLSTM: Path Development for Skeleton-Based Action Recognition

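Abstract

Skeleton-based action recognition (SAR) in videos is an important but challenging task in computer vision.
Recent state-of-the-art SAR models are primarily based on graph convolutional neural networks (GCNs), which are powerful at extracting spatial information from skeleton data.
However, it is not yet clear whether such GCN-based models can effectively capture the temporal dynamics of human action sequences.
To this end, we propose the DevLSTM module, which exploits path development, a principled and parsimonious representation of sequential data that leverages Lie group structure.
Path development, which originates from rough path theory, can effectively capture the order of events in high-dimensional streamed data with massive dimension reduction, and consequently enhances the LSTM module substantially.
Our proposed G-DevLSTM module can be conveniently plugged into the temporal graph, complementing existing advanced GCN-based models.
Empirical studies on the NTU60, NTU120 and Chalearn2013 datasets demonstrate that the proposed hybrid model significantly outperforms the current best-performing methods on SAR tasks.
The code is available at https://github.com/DeepIntoStreams/GCN-DevLSTM.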
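
To make the notion of path development more concrete, here is a minimal sketch. It is not the authors' implementation. It assumes a piecewise-linear path, takes the special orthogonal group SO(n) as the Lie group, and uses random skew-symmetric matrices where a trained model would use learned weights. The names skew, expm, path_development and generators are hypothetical and chosen only for this illustration. The development is computed as the ordered product of matrix exponentials of the path increments mapped into the Lie algebra.

```python
import numpy as np


def skew(params, n):
    """Map a flat vector of length n*(n-1)/2 to an n x n skew-symmetric matrix (an element of so(n))."""
    a = np.zeros((n, n))
    a[np.triu_indices(n, k=1)] = params
    return a - a.T


def expm(a, terms=20, squarings=6):
    """Matrix exponential via scaling and squaring with a truncated Taylor series."""
    scaled = a / (2 ** squarings)
    result = np.eye(a.shape[0])
    term = np.eye(a.shape[0])
    for k in range(1, terms):
        term = term @ scaled / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result


def path_development(path, generators):
    """Development of a piecewise-linear path.

    path:       array of shape (T, d), one d-dimensional point per time step.
    generators: array of shape (d, n, n), one skew-symmetric matrix per input
                channel (a hypothetical stand-in for learned weights).
    Returns an n x n matrix whose size does not depend on T.
    """
    n = generators.shape[1]
    dev = np.eye(n)
    for dx in np.diff(path, axis=0):
        # Linear map from the increment to the Lie algebra so(n).
        a = np.tensordot(dx, generators, axes=1)
        # Multiply in time order, so the order of events matters.
        dev = dev @ expm(a)
    return dev


rng = np.random.default_rng(0)
d, n = 3, 4
generators = np.stack([skew(rng.normal(size=n * (n - 1) // 2), n) for _ in range(d)])
path = rng.normal(size=(10, d)).cumsum(axis=0)
dev = path_development(path, generators)
```

In this sketch the output is always an n x n orthogonal matrix regardless of the path length, which illustrates the dimension reduction mentioned above. Running the path backwards in time yields the inverse matrix, which shows that the representation is sensitive to the order of events.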

Abstract (original)

Skeleton-based action recognition (SAR) in videos is an important but challenging task in computer vision. The recent state-of-the-art models for SAR are primarily based on graph convolutional neural networks (GCNs), which are powerful in extracting the spatial information of skeleton data. However, it is not yet clear whether such GCN-based models can effectively capture the temporal dynamics of human action sequences. To this end, we propose the DevLSTM module, which exploits the path development, a principled and parsimonious representation for sequential data that leverages the Lie group structure. The path development, originating from rough path theory, can effectively capture the order of events in high-dimensional stream data with massive dimension reduction and consequently enhance the LSTM module substantially. Our proposed G-DevLSTM module can be conveniently plugged into the temporal graph, complementing existing advanced GCN-based models. Our empirical studies on the NTU60, NTU120 and Chalearn2013 datasets demonstrate that our proposed hybrid model significantly outperforms the current best-performing methods in SAR tasks. The code is available at https://github.com/DeepIntoStreams/GCN-DevLSTM.

arXiv information

Authors	Lei Jiang, Weixin Yang, Xin Zhang, Hao Ni
Published	2024-03-22 13:55:52+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
