Duo Streamers: A Streaming Gesture Recognition Framework

Summary

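Gesture recognition in resource-constrained scenarios faces significant challenges in achieving both high accuracy and low latency.
Duo Streamers, the streaming gesture recognition framework proposed in this paper, addresses these challenges with a three-stage sparse recognition mechanism, an RNN-lite model with an external hidden state, and specialized training and post-processing pipelines, improving both real-time performance and lightweight design.
Experimental results show that Duo Streamers matches mainstream methods on accuracy metrics while reducing the real-time factor by approximately 92.3%, a nearly 13-fold speedup.
The framework also shrinks parameter counts to 1/38 (idle state) and 1/9 (busy state) of those of mainstream models.
In summary, Duo Streamers offers an efficient and practical solution for streaming gesture recognition on resource-constrained devices and lays a solid foundation for extended applications in multimodal and diverse scenarios.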
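
The link between the 92.3% reduction and the nearly 13-fold speedup can be checked with a few lines of arithmetic. The sketch below assumes the usual convention that the real-time factor (RTF) is processing time divided by input duration, and that speedup means the ratio of the old RTF to the new one. The function name is hypothetical and not taken from the paper.

```python
def speedup_from_rtf_reduction(reduction: float) -> float:
    """Speedup implied by a fractional reduction in real-time factor.

    Assumption: speedup = old_rtf / new_rtf, with new_rtf = old_rtf * (1 - reduction).
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


# A 92.3% reduction leaves 7.7% of the original RTF.
print(round(speedup_from_rtf_reduction(0.923), 2))  # 12.99
```

Under these assumptions a 92.3% reduction corresponds to a factor of about 12.99, consistent with the "nearly 13-fold" figure.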

Summary (original)

Gesture recognition in resource-constrained scenarios faces significant challenges in achieving high accuracy and low latency. The streaming gesture recognition framework, Duo Streamers, proposed in this paper, addresses these challenges through a three-stage sparse recognition mechanism, an RNN-lite model with an external hidden state, and specialized training and post-processing pipelines, thereby making innovative progress in real-time performance and lightweight design. Experimental results show that Duo Streamers matches mainstream methods in accuracy metrics, while reducing the real-time factor by approximately 92.3%, i.e., delivering a nearly 13-fold speedup. In addition, the framework shrinks parameter counts to 1/38 (idle state) and 1/9 (busy state) compared to mainstream models. In summary, Duo Streamers not only offers an efficient and practical solution for streaming gesture recognition in resource-constrained devices but also lays a solid foundation for extended applications in multimodal and diverse scenarios.
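
The abstract does not describe the RNN-lite model in detail, so the following is only a generic illustration of what an "external hidden state" can look like in a streaming setting: the recurrent step is a pure function, and the caller owns the state and passes it back in for each new frame. The cell type (a plain tanh RNN), the weights, the sizes, and all function names are assumptions made for illustration, not the paper's architecture.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rnn_step(x: Vector, h: Vector, w_in: Matrix, w_rec: Matrix) -> Vector:
    """One stateless step: returns tanh(W_in x + W_rec h) as the new hidden state."""
    a = matvec(w_in, x)
    b = matvec(w_rec, h)
    return [math.tanh(ai + bi) for ai, bi in zip(a, b)]


def run_chunk(frames: List[Vector], h: Vector, w_in: Matrix, w_rec: Matrix) -> Vector:
    """Process a chunk of frames, starting from a caller-supplied hidden state."""
    for x in frames:
        h = rnn_step(x, h, w_in, w_rec)
    return h


# Hypothetical toy weights and input frames (2 features, 2 hidden units).
w_in = [[0.5, -0.3], [0.2, 0.8]]
w_rec = [[0.1, 0.0], [-0.2, 0.4]]
frames = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.0]]
h0 = [0.0, 0.0]

# Because the state lives outside the step function, a stream can be
# split at any frame boundary and resumed without changing the result.
h_one_pass = run_chunk(frames, h0, w_in, w_rec)
h_mid = run_chunk(frames[:2], h0, w_in, w_rec)
h_two_chunks = run_chunk(frames[2:], h_mid, w_in, w_rec)
print(h_two_chunks == h_one_pass)  # True
```

Keeping the state outside the model means each call only needs the newest frame plus the previous state, which is one common way to keep per-frame cost constant in streaming inference.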

arXiv information

Authors	Boxuan Zhu, Sicheng Yang, Zhuo Wang, Haining Liang, Junxiao Shen
Published	2025-02-25 15:39:52+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
