Short and Long Range Relation Based Spatio-Temporal Transformer for Micro-Expression Recognition

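Summary

Because they are spontaneous, micro-expressions help infer a person's true emotions even when the person tries to conceal them.
Their short duration and low intensity make micro-expression recognition a difficult task in affective computing.
Early work based on handcrafted spatio-temporal features showed some promise, but it has recently been superseded by various deep learning approaches that now compete for state-of-the-art performance.
Nevertheless, capturing both local and global spatio-temporal patterns remains challenging.
To this end, we propose a new spatio-temporal transformer architecture which, to the best of our knowledge, is the first purely transformer-based approach (i.e. one that uses no convolutional network) for micro-expression recognition.
The architecture consists of a spatial encoder that learns spatial patterns, a temporal aggregator for analysis along the temporal dimension, and a classification head.
A comprehensive evaluation on three widely used spontaneous micro-expression data sets, namely SMIC-HS, CASME II and SAMM, shows that the proposed approach consistently outperforms the state of the art. It is also the first framework in the published micro-expression recognition literature to achieve an unweighted F1-score above 0.9 on any of these data sets.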

Abstract (original)

Being spontaneous, micro-expressions are useful in the inference of a person’s true emotions even if an attempt is made to conceal them. Due to their short duration and low intensity, the recognition of micro-expressions is a difficult task in affective computing. The early work based on handcrafted spatio-temporal features which showed some promise, has recently been superseded by different deep learning approaches which now compete for the state of the art performance. Nevertheless, the problem of capturing both local and global spatio-temporal patterns remains challenging. To this end, herein we propose a novel spatio-temporal transformer architecture — to the best of our knowledge, the first purely transformer based approach (i.e. void of any convolutional network use) for micro-expression recognition. The architecture comprises a spatial encoder which learns spatial patterns, a temporal aggregator for temporal dimension analysis, and a classification head. A comprehensive evaluation on three widely used spontaneous micro-expression data sets, namely SMIC-HS, CASME II and SAMM, shows that the proposed approach consistently outperforms the state of the art, and is the first framework in the published literature on micro-expression recognition to achieve the unweighted F1-score greater than 0.9 on any of the aforementioned data sets.
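
The abstract reports results as an unweighted F1-score. A minimal sketch of that metric follows, assuming it denotes the macro-averaged F1 (per-class F1 scores averaged with equal weight, regardless of class size). The function name `unweighted_f1`, the optional `classes` parameter, and the toy labels are illustrative only and do not come from the paper, whose exact evaluation protocol is not described here.

```python
def unweighted_f1(y_true, y_pred, classes=None):
    """Macro-averaged F1: each class contributes equally to the mean.

    Assumption: `classes` defaults to every label seen in either list.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if classes is None:
        classes = sorted(set(y_true) | set(y_pred))

    per_class_f1 = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        per_class_f1.append(2 * tp / denom if denom else 0.0)

    return sum(per_class_f1) / len(per_class_f1)


# Toy example with hypothetical labels (not data from the paper)
y_true = ["neg", "neg", "neg", "pos", "pos", "sur"]
y_pred = ["neg", "neg", "pos", "pos", "pos", "neg"]
score = unweighted_f1(y_true, y_pred)
```

Because every class carries the same weight, a rare class that is always misclassified (here `sur`) pulls the score down as much as a frequent one would, which is why this metric is commonly preferred on imbalanced data sets.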

arXiv information

Authors	Liangfei Zhang, Xiaopeng Hong, Ognjen Arandjelovic, Guoying Zhao
Published	2022-10-10 16:56:49+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
