Hand Pose Estimation via Multiview Collaborative Self-Supervised Learning

Summary

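3D hand pose estimation has advanced considerably in recent years.
However, that progress depends heavily on the availability of large-scale annotated datasets.
To reduce this dependence on labels, we propose HaMuCo, a multi-view collaborative self-supervised learning framework that estimates hand pose using only pseudo labels for training.
We use a two-stage strategy to address noisy labels and the multi-view "groupthink" problem.
In the first stage, the 3D hand pose is estimated independently for each view.
In the second stage, a cross-view interaction network captures cross-view correlated features, and a multi-view consistency loss drives collaborative learning among views.
To further strengthen the collaboration between single-view and multi-view estimation, the results from all views are fused to supervise the single-view network.
In short, collaborative learning is introduced at two levels: the cross-view level and the multi-view-to-single-view level.
Extensive experiments show that the method achieves state-of-the-art performance on multi-view self-supervised hand pose estimation.
Ablation studies verify the effectiveness of each component.
Results on multiple datasets further demonstrate the generalization ability of the network.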
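
The summary names two collaborative signals: a multi-view consistency loss and a fused multi-view result that supervises each single-view estimate. The sketch below illustrates one plausible form of each using NumPy. It is not the HaMuCo implementation: the function names, the camera convention (x_cam = R x_world + t), the pairwise-distance form of the loss, and the weighted-mean fusion are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def to_world(joints_cam, R, t):
    """Map (J, 3) joints from a camera frame to the world frame.

    Assumed convention: x_cam = R @ x_world + t, so x_world = R.T @ (x_cam - t).
    With row vectors this is (x_cam - t) @ R.
    """
    return (joints_cam - t) @ R


def consistency_loss(joints_world):
    """Mean joint distance over all ordered pairs of distinct views.

    joints_world has shape (V, J, 3) with V >= 2 views in a shared frame.
    """
    V, J, _ = joints_world.shape
    diffs = joints_world[:, None] - joints_world[None, :]  # (V, V, J, 3)
    dists = np.linalg.norm(diffs, axis=-1)                  # (V, V, J)
    # Diagonal entries (a view against itself) are zero, so dividing the
    # full sum by V * (V - 1) * J averages over distinct pairs only.
    return float(dists.sum() / (V * (V - 1) * J))


def fuse_views(joints_world, weights=None):
    """Fuse (V, J, 3) predictions into one (J, 3) pose by weighted mean.

    The per-view weights are hypothetical (e.g. confidences); uniform by default.
    """
    V = joints_world.shape[0]
    w = np.ones(V) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()
    return np.tensordot(w, joints_world, axes=1)


def fused_supervision_loss(joints_world, fused):
    """Mean distance from each single-view pose to the fused pose.

    In training, the fused pose would be treated as a fixed pseudo label.
    """
    return float(np.linalg.norm(joints_world - fused[None], axis=-1).mean())
```

In this sketch, per-view predictions would first be passed through `to_world` so that they share a coordinate frame, then stacked into a `(V, J, 3)` array for the two loss functions. Views that agree exactly give a loss of zero in both cases.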

Summary (original)

3D hand pose estimation has made significant progress in recent years. However, the improvement is highly dependent on the emergence of large-scale annotated datasets. To alleviate the label-hungry limitation, we propose a multi-view collaborative self-supervised learning framework, HaMuCo, that estimates hand pose only with pseudo labels for training. We use a two-stage strategy to tackle the noisy label challenge and the multi-view “groupthink” problem. In the first stage, we estimate the 3D hand poses for each view independently. In the second stage, we employ a cross-view interaction network to capture the cross-view correlated features and use multi-view consistency loss to achieve collaborative learning among views. To further enhance the collaboration between single-view and multi-view, we fuse the results of all views to supervise the single-view network. To summarize, we introduce collaborative learning in two folds, the cross-view level and the multi- to single-view level. Extensive experiments show that our method can achieve state-of-the-art performance on multi-view self-supervised hand pose estimation. Moreover, ablation studies verify the effectiveness of each component. Results on multiple datasets further demonstrate the generalization ability of our network.

arXiv information

Authors	Xiaozheng Zheng, Chao Wen, Zhou Xue, Jingyu Wang
Published	2023-02-02 10:13:04+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
