T4DT: Tensorizing Time for Learning Temporal 3D Visual Data

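Abstract

Unlike 2D raster images, 3D visual data processing has no single dominant representation.
Formats such as point clouds, meshes, and implicit functions each have their own strengths and weaknesses.
Even so, grid representations such as signed distance functions retain attractive properties in 3D.
In particular, they offer constant-time random access and are well suited to modern machine learning.
Unfortunately, the storage size of a grid grows exponentially with its dimension.
As a result, grids often exceed memory limits even at moderate resolution.
This work explores several low-rank tensor formats, including the Tucker, tensor train, and quantics tensor train decompositions, for compressing time-varying 3D data.
Our method iteratively computes, voxelizes, and compresses the truncated signed distance function of each frame, then applies tensor rank truncation to condense all frames into a single compressed tensor that represents the entire 4D scene.
We show that low-rank tensor compression yields an extremely compact format for storing and querying time-varying signed distance functions.
It significantly reduces the memory footprint of 4D scenes while, surprisingly, preserving their geometric quality.
Unlike existing iterative learning-based approaches such as DeepSDF and NeRF, our method uses a closed-form algorithm with theoretical guarantees.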
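
To make the idea of compressing a time-varying signed distance grid with a tensor train more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the toy scene (a sphere moving along x), the function names `moving_sphere_tsdf`, `tt_svd`, `tt_to_full`, and `tt_entry`, and all parameter values are assumptions made for illustration. It also compresses the full 4D grid in one pass with the standard TT-SVD procedure, whereas the abstract describes compressing frame by frame and then truncating ranks.

```python
import numpy as np


def moving_sphere_tsdf(n_frames=8, res=32, radius=0.3, trunc=0.1):
    """Toy 4D scene (hypothetical): truncated SDF of a sphere moving along x."""
    axis = np.linspace(-1.0, 1.0, res)
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    frames = []
    for t in range(n_frames):
        cx = -0.4 + 0.8 * t / (n_frames - 1)  # sphere centre at frame t
        sdf = np.sqrt((x - cx) ** 2 + y ** 2 + z ** 2) - radius
        frames.append(np.clip(sdf, -trunc, trunc))  # truncate the distance
    return np.stack(frames, axis=-1)  # shape (res, res, res, n_frames)


def tt_svd(tensor, rel_eps):
    """Tensor-train cores via sequential truncated SVDs (standard TT-SVD)."""
    shape, d = tensor.shape, tensor.ndim
    # Per-unfolding threshold chosen so the total relative error is <= rel_eps.
    delta = rel_eps * np.linalg.norm(tensor) / np.sqrt(d - 1)
    cores, rank = [], 1
    unfolding = tensor.reshape(shape[0], -1)
    for k in range(d - 1):
        u, s, vt = np.linalg.svd(unfolding, full_matrices=False)
        # tail[i] is the norm of the singular values s[i:]; keep the smallest
        # rank whose discarded tail does not exceed delta.
        tail = np.sqrt(np.cumsum(s[::-1] ** 2))[::-1]
        new_rank = max(1, int(np.sum(tail > delta)))
        cores.append(u[:, :new_rank].reshape(rank, shape[k], new_rank))
        rest = s[:new_rank, None] * vt[:new_rank]
        unfolding = rest.reshape(new_rank * shape[k + 1], -1)
        rank = new_rank
    cores.append(unfolding.reshape(rank, shape[-1], 1))
    return cores


def tt_to_full(cores):
    """Contract all cores back into the dense grid."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.squeeze(axis=(0, -1))


def tt_entry(cores, index):
    """Query one voxel from the compressed format with one small product per core."""
    vec = np.ones((1,))
    for core, i in zip(cores, index):
        vec = vec @ core[:, i, :]
    return vec[0]


scene = moving_sphere_tsdf()
cores = tt_svd(scene, rel_eps=0.05)
approx = tt_to_full(cores)
rel_err = np.linalg.norm(approx - scene) / np.linalg.norm(scene)
n_params = sum(core.size for core in cores)

print("dense entries:   ", scene.size)
print("TT parameters:   ", n_params)
print("TT ranks:        ", [core.shape[2] for core in cores[:-1]])
print("relative error:  ", rel_err)
print("voxel (3,10,20,5):", tt_entry(cores, (3, 10, 20, 5)))
```

Lowering `rel_eps` keeps more singular values, which raises the ranks and the parameter count in exchange for a smaller reconstruction error. The error bound enforced by `delta` is the usual TT-SVD guarantee, which is one example of the kind of closed-form guarantee that learned representations do not provide.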

Abstract (original)

Unlike 2D raster images, there is no single dominant representation for 3D visual data processing. Different formats like point clouds, meshes, or implicit functions each have their strengths and weaknesses. Still, grid representations such as signed distance functions have attractive properties also in 3D. In particular, they offer constant-time random access and are eminently suitable for modern machine learning. Unfortunately, the storage size of a grid grows exponentially with its dimension. Hence they often exceed memory limits even at moderate resolution. This work explores various low-rank tensor formats, including the Tucker, tensor train, and quantics tensor train decompositions, to compress time-varying 3D data. Our method iteratively computes, voxelizes, and compresses each frame’s truncated signed distance function and applies tensor rank truncation to condense all frames into a single, compressed tensor that represents the entire 4D scene. We show that low-rank tensor compression is extremely compact to store and query time-varying signed distance functions. It significantly reduces the memory footprint of 4D scenes while surprisingly preserving their geometric quality. Unlike existing iterative learning-based approaches like DeepSDF and NeRF, our method uses a closed-form algorithm with theoretical guarantees.

arXiv information

Authors	Mikhail Usvyatsov,Rafael Ballester-Rippoll,Lina Bashaeva,Konrad Schindler,Gonzalo Ferrer,Ivan Oseledets
Published	2022-08-02 12:57:08+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
