Continuous Sign Language Recognition with Correlation Network

Summary

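Human body trajectories are a salient cue for identifying actions in video. In sign language, these trajectories are conveyed mainly by the hands and face across consecutive frames. Current continuous sign language recognition (CSLR) methods, however, usually process frames independently and therefore fail to capture the cross-frame trajectories needed to identify a sign effectively. To address this limitation, we propose the correlation network (CorrNet), which explicitly captures and exploits body trajectories across frames to identify signs. First, a correlation module dynamically computes correlation maps between the current frame and its adjacent frames to identify the trajectories of all spatial patches. Next, an identification module dynamically emphasizes the body trajectories within these correlation maps. The resulting features gain an overview of local temporal movements for identifying a sign. By paying special attention to body trajectories, CorrNet achieves state-of-the-art accuracy on four large-scale datasets: PHOENIX14, PHOENIX14-T, CSL-Daily, and CSL. A comprehensive comparison with previous spatial-temporal reasoning methods verifies the effectiveness of CorrNet, and visualizations show how it emphasizes human body trajectories across adjacent frames.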

Abstract (original)

Human body trajectories are a salient cue to identify actions in the video. Such body trajectories are mainly conveyed by hands and face across consecutive frames in sign language. However, current methods in continuous sign language recognition (CSLR) usually process frames independently, thus failing to capture cross-frame trajectories to effectively identify a sign. To handle this limitation, we propose correlation network (CorrNet) to explicitly capture and leverage body trajectories across frames to identify signs. In specific, a correlation module is first proposed to dynamically compute correlation maps between the current frame and adjacent frames to identify trajectories of all spatial patches. An identification module is then presented to dynamically emphasize the body trajectories within these correlation maps. As a result, the generated features are able to gain an overview of local temporal movements to identify a sign. Thanks to its special attention on body trajectories, CorrNet achieves new state-of-the-art accuracy on four large-scale datasets, i.e., PHOENIX14, PHOENIX14-T, CSL-Daily, and CSL. A comprehensive comparison with previous spatial-temporal reasoning methods verifies the effectiveness of CorrNet. Visualizations demonstrate the effects of CorrNet on emphasizing human body trajectories across adjacent frames.
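The correlation step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes each frame is already reduced to a list of patch feature vectors and that correlation is a plain dot product between patches. The names `correlation_map`, `strongest_match`, and `correlate_with_adjacent` are hypothetical helpers introduced only for this example.

```python
# Illustrative sketch only. Assumptions (not from the paper): frames are lists
# of patch feature vectors, and correlation is an unnormalized dot product.

def correlation_map(current, neighbor):
    """Return an N x M map of dot products between every patch of
    `current` (N patches) and every patch of `neighbor` (M patches)."""
    return [
        [sum(a * b for a, b in zip(p, q)) for q in neighbor]
        for p in current
    ]


def strongest_match(corr):
    """For each patch of the current frame, return the index of the
    neighbor patch with the highest correlation (first index on ties)."""
    return [row.index(max(row)) for row in corr]


def correlate_with_adjacent(frames, t):
    """Compute correlation maps between frame t and whichever of its
    neighbors t-1 and t+1 exist, keyed by temporal offset."""
    maps = {}
    for offset in (-1, 1):
        s = t + offset
        if 0 <= s < len(frames):
            maps[offset] = correlation_map(frames[t], frames[s])
    return maps


# Toy clip: a "hand-like" feature [1, 0] moves from patch 0 to patch 2,
# while the other patches carry a "background" feature [0, 1].
frame_t = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
frame_t1 = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]

maps = correlate_with_adjacent([frame_t, frame_t1], 0)
matches = strongest_match(maps[1])
```

In this toy clip, the row of the correlation map for patch 0 peaks at patch 2 of the next frame, which is the kind of cross-frame correspondence the abstract refers to as a trajectory. How CorrNet normalizes these maps and how its identification module emphasizes body regions are not specified in the abstract, so neither is modeled here.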

arXiv information

Authors	Lianyu Hu, Liqing Gao, Zekang Liu, Wei Feng
Published	2023-03-06 15:02:12+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
