SMART-Vision: Survey of Modern Action Recognition Techniques in Vision

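Abstract

Human action recognition (HAR) is a challenging area of computer vision that involves recognizing complex patterns by analyzing the spatiotemporal dynamics of individuals' movements in videos.
These patterns arise in sequential data such as video frames, and they are often essential for accurately distinguishing actions that would be ambiguous in a single image.
HAR has attracted considerable interest because of its broad applicability, from robotics and surveillance systems to sports motion analysis, healthcare, and the fast-growing field of autonomous vehicles.
Several taxonomies have been proposed to categorize HAR approaches in surveys, but they often overlook hybrid methods and fail to show how different models incorporate various architectures and modalities.
In this comprehensive survey, we present the novel SMART-Vision taxonomy, which shows how deep learning innovations for HAR complement one another and lead to hybrid approaches that go beyond traditional categories.
Our survey provides a clear roadmap from foundational HAR work to current state-of-the-art systems, highlights emerging research directions, and addresses unresolved challenges in discussion sections for the architectures within the HAR domain.
We detail the research datasets that various approaches have used to measure and compare the quality of HAR methods.
We also explore the rapidly emerging field of Open-HAR systems, which challenges HAR systems by presenting samples from unknown, novel classes at test time.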

Abstract (original)

Human Action Recognition (HAR) is a challenging domain in computer vision, involving recognizing complex patterns by analyzing the spatiotemporal dynamics of individuals’ movements in videos. These patterns arise in sequential data, such as video frames, which are often essential to accurately distinguish actions that would be ambiguous in a single image. HAR has garnered considerable interest due to its broad applicability, ranging from robotics and surveillance systems to sports motion analysis, healthcare, and the burgeoning field of autonomous vehicles. While several taxonomies have been proposed to categorize HAR approaches in surveys, they often overlook hybrid methodologies and fail to demonstrate how different models incorporate various architectures and modalities. In this comprehensive survey, we present the novel SMART-Vision taxonomy, which illustrates how innovations in deep learning for HAR complement one another, leading to hybrid approaches beyond traditional categories. Our survey provides a clear roadmap from foundational HAR works to current state-of-the-art systems, highlighting emerging research directions and addressing unresolved challenges in discussion sections for architectures within the HAR domain. We provide details of the research datasets that various approaches used to measure and compare goodness HAR approaches. We also explore the rapidly emerging field of Open-HAR systems, which challenges HAR systems by presenting samples from unknown, novel classes during test time.

arXiv information

Authors	Ali K. AlShami, Ryan Rabinowitz, Khang Lam, Yousra Shleibik, Melkamu Mersha, Terrance Boult, Jugal Kalita
Published	2025-01-22 18:21:55+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
