Pose2Room: Understanding 3D Scenes from Human Activities

Summary

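Wearable IMU sensors make it possible to estimate human poses without any visual input~\cite{von2017sparse}.
In this work, we ask: can we reason about object structure in real-world environments from human trajectory information alone?
The key observation is that human motion and interactions tend to give strong information about the objects in a scene. For instance, a person sitting indicates the likely presence of a chair or sofa.
To this end, we propose P2R-Net, which takes an observed human trajectory in an environment as input and learns a probabilistic 3D model of the objects in the scene, characterized by their class categories and oriented 3D bounding boxes.
P2R-Net models the probability distribution over object classes together with a deep Gaussian mixture model over object boxes, which enables sampling multiple, diverse, likely modes of object configurations from an observed human trajectory.
Our experiments show that P2R-Net effectively learns multi-modal distributions of likely objects for human motions and produces a variety of plausible object structures for the environment, even without any visual information.
The results demonstrate that P2R-Net consistently outperforms the baselines on the PROX dataset and the VirtualHome platform.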
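
The sampling idea in the summary (a class distribution plus a Gaussian mixture over boxes, from which several plausible hypotheses are drawn) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name sample_object_hypotheses, the 7-value box layout (center, size, yaw), the diagonal covariance, and all numeric values are assumptions made for illustration. In P2R-Net the mixture parameters would be predicted by the network from the pose trajectory; here they are hard-coded.

```python
import numpy as np


def sample_object_hypotheses(class_probs, mix_weights, mix_means, mix_stds,
                             num_samples, rng):
    """Sample (class index, component index, box vector) triples for one object.

    Hypothetical layout, not taken from the paper: each box is a 7-vector
    (cx, cy, cz, sx, sy, sz, yaw) and the mixture has diagonal covariance.
    """
    class_probs = np.asarray(class_probs, dtype=float)
    mix_weights = np.asarray(mix_weights, dtype=float)
    mix_means = np.asarray(mix_means, dtype=float)
    mix_stds = np.asarray(mix_stds, dtype=float)

    # Normalize so both vectors are valid categorical distributions.
    class_probs = class_probs / class_probs.sum()
    mix_weights = mix_weights / mix_weights.sum()

    # Draw a class label for every sample from the class distribution.
    classes = rng.choice(len(class_probs), size=num_samples, p=class_probs)

    # Pick one mixture component per sample, then perturb that component's mean.
    components = rng.choice(len(mix_weights), size=num_samples, p=mix_weights)
    noise = rng.standard_normal((num_samples, mix_means.shape[1]))
    boxes = mix_means[components] + mix_stds[components] * noise
    return classes, components, boxes


# Made-up parameters: two classes (e.g. chair, sofa) and two box modes.
rng = np.random.default_rng(0)
class_probs = [0.7, 0.3]
mix_weights = [0.5, 0.5]
mix_means = np.array([[1.0, 0.0, 0.4, 0.5, 0.5, 0.8, 0.0],
                      [2.0, 1.0, 0.4, 1.8, 0.9, 0.8, 1.57]])
mix_stds = np.full((2, 7), 0.05)

classes, components, boxes = sample_object_hypotheses(
    class_probs, mix_weights, mix_means, mix_stds, num_samples=5, rng=rng)
print(classes, components)
print(boxes.round(2))
```

Each row of boxes is one hypothesis; rows drawn from different components correspond to different modes of the box distribution, which is the sense in which a mixture model can return several distinct but plausible object placements for the same observed motion.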

Summary (original)

With wearable IMU sensors, one can estimate human poses from wearable devices without requiring visual input~\cite{von2017sparse}. In this work, we pose the question: Can we reason about object structure in real-world environments solely from human trajectory information? Crucially, we observe that human motion and interactions tend to give strong information about the objects in a scene — for instance a person sitting indicates the likely presence of a chair or sofa. To this end, we propose P2R-Net to learn a probabilistic 3D model of the objects in a scene characterized by their class categories and oriented 3D bounding boxes, based on an input observed human trajectory in the environment. P2R-Net models the probability distribution of object class as well as a deep Gaussian mixture model for object boxes, enabling sampling of multiple, diverse, likely modes of object configurations from an observed human trajectory. In our experiments we show that P2R-Net can effectively learn multi-modal distributions of likely objects for human motions, and produce a variety of plausible object structures of the environment, even without any visual information. The results demonstrate that P2R-Net consistently outperforms the baselines on the PROX dataset and the VirtualHome platform.

arXiv information

Authors	Yinyu Nie, Angela Dai, Xiaoguang Han, Matthias Nießner
Published	2022-07-14 16:20:50+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
