Learning Explicit Contact for Implicit Reconstruction of Hand-held Objects from Monocular Images

Summary

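Reconstructing hand-held objects from monocular RGB images is an appealing yet challenging task.
In this task, hand-object contact is an important cue for recovering the 3D geometry of the hand-held object.
Recent work has made impressive progress using implicit functions, but it does not formulate contact within its frameworks, which leads to less realistic object meshes.
This work explores how to model contact explicitly to benefit the implicit reconstruction of hand-held objects.
The method consists of two components: explicit contact prediction and implicit shape reconstruction.
The first part proposes a new subtask: directly estimating 3D hand-object contact from a single image.
Part-level and vertex-level graph-based transformers are cascaded and jointly learned in a coarse-to-fine manner, yielding more accurate contact probabilities.
The second part introduces a new method that diffuses the estimated contact states from the hand mesh surface into nearby 3D space and uses the diffused contact probabilities to construct an implicit neural representation of the manipulated object.
By estimating the interaction patterns between hand and object, the method reconstructs more realistic object meshes, especially for the object parts in contact with the hand.
Extensive experiments on challenging benchmarks show that the proposed method outperforms the current state of the art by a large margin.
The code is publicly available at https://junxinghu.github.io/projects/hoi.html.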
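
The idea of diffusing contact from the hand mesh surface into nearby 3D space can be pictured with a small sketch. This is not the authors' implementation: the Gaussian kernel, the k-nearest-vertex interpolation, the function name `diffuse_contact`, and the parameters `k` and `sigma` are all assumptions chosen for illustration. It only shows one plausible way that per-vertex contact probabilities could be turned into a value at an arbitrary query point.

```python
import math


def diffuse_contact(query, hand_vertices, contact_probs, k=3, sigma=0.01):
    """Illustrative contact diffusion (hypothetical, not the paper's method).

    query          -- (x, y, z) point in 3D space, in metres
    hand_vertices  -- list of (x, y, z) hand mesh vertices
    contact_probs  -- per-vertex contact probability in [0, 1]
    k              -- number of nearest vertices to interpolate from (assumed)
    sigma          -- Gaussian bandwidth in metres (assumed)
    """
    if len(hand_vertices) != len(contact_probs):
        raise ValueError("one contact probability is required per vertex")
    if not hand_vertices:
        raise ValueError("hand_vertices must not be empty")

    # Pair each vertex's distance to the query with its contact probability,
    # then keep the k closest vertices.
    nearest = sorted(
        (math.dist(query, v), p) for v, p in zip(hand_vertices, contact_probs)
    )[:k]

    # Gaussian weights: closer vertices contribute more.
    two_sigma_sq = 2.0 * sigma * sigma
    weights = [math.exp(-(d * d) / two_sigma_sq) for d, _ in nearest]
    total = sum(weights)
    if total == 0.0:
        # The query is so far from the hand that every weight underflows.
        return 0.0

    # Weighted average of the neighbours' contact probabilities.
    interpolated = sum(w * p for w, (_, p) in zip(weights, nearest)) / total

    # Attenuate by the distance to the closest vertex so that the diffused
    # probability decays to zero away from the hand surface.
    falloff = math.exp(-(nearest[0][0] ** 2) / two_sigma_sq)
    return interpolated * falloff


# Toy hand "mesh" with three vertices and made-up contact probabilities.
vertices = [(0.0, 0.0, 0.0), (0.02, 0.0, 0.0), (0.0, 0.02, 0.0)]
probs = [1.0, 0.0, 0.5]
near_value = diffuse_contact((0.0, 0.0, 0.005), vertices, probs)
far_value = diffuse_contact((1.0, 1.0, 1.0), vertices, probs)
```

In this sketch a query point close to a vertex with high contact probability receives a high diffused value, while a point far from the hand receives a value near zero. In a pipeline like the one the summary describes, such a value would be one of the inputs to the implicit function for the object; how the paper actually performs the diffusion and conditions the network is not specified here.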

Summary (original)

Reconstructing hand-held objects from monocular RGB images is an appealing yet challenging task. In this task, contacts between hands and objects provide important cues for recovering the 3D geometry of the hand-held objects. Though recent works have employed implicit functions to achieve impressive progress, they ignore formulating contacts in their frameworks, which results in producing less realistic object meshes. In this work, we explore how to model contacts in an explicit way to benefit the implicit reconstruction of hand-held objects. Our method consists of two components: explicit contact prediction and implicit shape reconstruction. In the first part, we propose a new subtask of directly estimating 3D hand-object contacts from a single image. The part-level and vertex-level graph-based transformers are cascaded and jointly learned in a coarse-to-fine manner for more accurate contact probabilities. In the second part, we introduce a novel method to diffuse estimated contact states from the hand mesh surface to nearby 3D space and leverage diffused contact probabilities to construct the implicit neural representation for the manipulated object. Benefiting from estimating the interaction patterns between the hand and the object, our method can reconstruct more realistic object meshes, especially for object parts that are in contact with hands. Extensive experiments on challenging benchmarks show that the proposed method outperforms the current state of the arts by a great margin. Our code is publicly available at https://junxinghu.github.io/projects/hoi.html.

arXiv information

Authors	Junxing Hu, Hongwen Zhang, Zerui Chen, Mengcheng Li, Yunlong Wang, Yebin Liu, Zhenan Sun
Published	2024-01-16 08:10:46+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
