Robust Multimodal Fusion for Human Activity Recognition

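Summary

The proliferation of IoT and mobile devices equipped with heterogeneous sensors has enabled new applications that depend on fusing time-series data generated by multiple sensors of different modalities.
Promising deep neural network architectures for multimodal fusion exist, but their performance degrades rapidly when consecutive missing data and noise occur across multiple modalities or sensors, issues that are prevalent in real-world settings.
We propose Centaur, a multimodal fusion model for human activity recognition (HAR) that is robust to these data quality issues.
Centaur combines a data cleaning module, which is a denoising autoencoder with convolutional layers, with a multimodal fusion module, which is a deep convolutional neural network with a self-attention mechanism that captures cross-sensor correlations.
Centaur is trained with a stochastic data corruption scheme and evaluated on three datasets containing data generated by multiple inertial measurement units.
Its data cleaning module outperforms two state-of-the-art autoencoder-based models, and its multimodal fusion module outperforms four strong baselines.
Compared with two related robust fusion architectures, Centaur is more robust, achieving 11.59 to 17.52% higher HAR accuracy, especially when data are missing consecutively in multiple sensor channels.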
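
To make the idea of a stochastic data corruption scheme more concrete, here is a minimal sketch of how a clean multi-channel sensor window might be corrupted before being fed to a denoising autoencoder. The function name `corrupt_window`, its parameters, and the choice to mark missing samples with 0.0 are all assumptions made for illustration; the abstract does not specify the scheme Centaur actually uses.

```python
import random


def corrupt_window(window, max_gap, p_channel, noise_std, rng):
    """Return a corrupted copy of `window` (a list of channels, each a list of floats).

    Hypothetical scheme, not taken from the paper: every channel receives
    additive Gaussian noise, and with probability `p_channel` it also loses
    one run of consecutive samples (set to 0.0) of random length in
    [1, max_gap].
    """
    corrupted = []
    for channel in window:
        # Additive Gaussian noise on every sample of the channel.
        noisy = [x + rng.gauss(0.0, noise_std) for x in channel]
        if noisy and rng.random() < p_channel:
            gap = rng.randint(1, min(max_gap, len(noisy)))
            start = rng.randint(0, len(noisy) - gap)
            # Zero out a run of consecutive samples to mimic a sensor dropout.
            noisy[start:start + gap] = [0.0] * gap
        corrupted.append(noisy)
    return corrupted


rng = random.Random(0)
clean = [[1.0] * 20 for _ in range(3)]  # 3 channels, 20 time steps each
noisy = corrupt_window(clean, max_gap=5, p_channel=1.0, noise_std=0.0, rng=rng)
```

In a denoising setup, pairs such as `(noisy, clean)` would typically serve as input and reconstruction target, so that the model learns to fill gaps and suppress noise.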

Abstract (original)

The proliferation of IoT and mobile devices equipped with heterogeneous sensors has enabled new applications that rely on the fusion of time-series data generated by multiple sensors with different modalities. While there are promising deep neural network architectures for multimodal fusion, their performance falls apart quickly in the presence of consecutive missing data and noise across multiple modalities/sensors, the issues that are prevalent in real-world settings. We propose Centaur, a multimodal fusion model for human activity recognition (HAR) that is robust to these data quality issues. Centaur combines a data cleaning module, which is a denoising autoencoder with convolutional layers, and a multimodal fusion module, which is a deep convolutional neural network with the self-attention mechanism to capture cross-sensor correlation. We train Centaur using a stochastic data corruption scheme and evaluate it on three datasets that contain data generated by multiple inertial measurement units. Centaur’s data cleaning module outperforms 2 state-of-the-art autoencoder-based models and its multimodal fusion module outperforms 4 strong baselines. Compared to 2 related robust fusion architectures, Centaur is more robust, achieving 11.59-17.52% higher accuracy in HAR, especially in the presence of consecutive missing data in multiple sensor channels.
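
As a rough illustration of how self-attention can capture cross-sensor correlation, the sketch below applies scaled dot-product self-attention to one feature vector per sensor. It is a simplification under stated assumptions: the function name `cross_sensor_attention` is hypothetical, queries, keys, and values are the raw features with no learned projections, and nothing here reflects the layer sizes or exact attention variant used in Centaur.

```python
import math


def cross_sensor_attention(features):
    """Scaled dot-product self-attention over per-sensor feature vectors.

    `features` holds one equal-length feature vector per sensor. Queries,
    keys, and values are the features themselves (no learned projections),
    which is an assumption made to keep the example short.
    """
    d = len(features[0])
    outputs = []
    for query in features:
        # Similarity of this sensor's features to every sensor's features.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            for key in features
        ]
        # Softmax with max-subtraction for numerical stability.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Each output is a weighted mix of all sensors' feature vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, features)) for j in range(d)]
        )
    return outputs


sensor_features = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
attended = cross_sensor_attention(sensor_features)
```

Each output vector mixes information from all sensors, weighted by similarity, which is one way a fusion model could lean on correlated sensors when a given channel is unreliable.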

arXiv information

Authors	Sanju Xaviar, Xin Yang, Omid Ardakanian
Published	2023-03-08 14:56:11+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
