Motion2Language, unsupervised learning of synchronized semantic motion segmentation

Abstract

In this paper, we investigate building a sequence-to-sequence architecture for motion-to-language translation and synchronization. The aim is to translate motion capture inputs into English natural-language descriptions, such that the descriptions are generated synchronously with the actions performed, enabling semantic segmentation as a byproduct, without requiring synchronized training data. We propose a new recurrent formulation of local attention that is suited to synchronous/live text generation, as well as an improved motion encoder architecture better suited to smaller datasets and to synchronous generation. We evaluate both contributions in individual experiments, using the standard BLEU4 metric as well as a simple semantic equivalence measure, on the KIT motion language dataset. In a follow-up experiment, we assess the synchronization quality of the text generated by our proposed approaches using multiple evaluation metrics. We find that the contributions to the attention mechanism and to the encoder architecture additively improve not only the quality of the generated text (BLEU and semantic equivalence) but also the quality of synchronization. Our code is available at https://github.com/rd20karim/M2T-Segmentation/tree/main
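
For readers unfamiliar with the BLEU4 metric mentioned in the abstract, here is a minimal sketch of a sentence-level BLEU-4 score against a single reference, without smoothing. It is an illustration of the general metric only, not the authors' evaluation code: the function names `ngrams` and `bleu4` and the example sentences are assumptions, and published results are usually computed at corpus level with an established toolkit.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, reference):
    """Sentence-level BLEU-4 with one reference and no smoothing (illustrative)."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, 5):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clipped matches: a candidate n-gram is credited at most as many
        # times as it occurs in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            # Without smoothing, any zero n-gram precision gives a score of 0.
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    # Geometric mean of the four modified n-gram precisions.
    return brevity_penalty * math.exp(sum(log_precisions) / 4)


# Hypothetical motion descriptions, used only to exercise the function.
reference = "a person walks forward slowly".split()
generated = "a person walks forward".split()
score = bleu4(generated, reference)
```

In this example every n-gram of the generated sentence appears in the reference, so the score is determined entirely by the brevity penalty for the missing final word.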

arXiv information

Authors	Karim Radouane, Andon Tchechmedjiev, Julien Lagarde, Sylvie Ranwez
Published	2023-12-13 17:29:15+00:00
