TDAvec: Computing Vector Summaries of Persistence Diagrams for Topological Data Analysis in R and Python

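Summary

Persistent homology is a widely used tool in topological data analysis (TDA) for understanding the underlying shape of complex data.
By building a filtration of simplicial complexes from the data points, it captures topological features such as connected components, loops, and voids across multiple scales.
These features are encoded in persistence diagrams (PDs), which give a concise summary of the data's topological structure.
However, the non-Hilbert nature of the space of PDs makes them difficult to use directly in machine learning applications.
To address this, kernel methods and vectorization techniques have been developed that convert PDs into machine-learning-compatible formats.
This paper introduces a new software package designed to streamline the vectorization of PDs, offering an intuitive workflow and advanced functionality.
It demonstrates the need for the package through practical examples and discusses in detail its contributions to applied TDA.
Definitions of all vectorization summaries used in the package are given in the appendix.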

Summary (original)

Persistent homology is a widely-used tool in topological data analysis (TDA) for understanding the underlying shape of complex data. By constructing a filtration of simplicial complexes from data points, it captures topological features such as connected components, loops, and voids across multiple scales. These features are encoded in persistence diagrams (PDs), which provide a concise summary of the data’s topological structure. However, the non-Hilbert nature of the space of PDs poses challenges for their direct use in machine learning applications. To address this, kernel methods and vectorization techniques have been developed to transform PDs into machine-learning-compatible formats. In this paper, we introduce a new software package designed to streamline the vectorization of PDs, offering an intuitive workflow and advanced functionalities. We demonstrate the necessity of the package through practical examples and provide a detailed discussion on its contributions to applied TDA. Definitions of all vectorization summaries used in the package are included in the appendix.
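
To make the idea of vectorizing a persistence diagram concrete, here is a minimal sketch of one of the simplest vector summaries, a Betti curve evaluated pointwise on a grid of scale values. The function name `betti_curve`, the toy diagram, and the grid are assumptions made for illustration only; this is not the package's API, and the package's own definitions (given in the paper's appendix) may differ, for example by averaging over grid intervals instead of sampling at grid points.

```python
import numpy as np

def betti_curve(diagram, grid):
    """Count, for each scale t in `grid`, the features alive at t.

    A feature with (birth, death) is counted at t when birth <= t < death.
    `diagram` is a sequence of (birth, death) pairs for one homological dimension.
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    grid = np.asarray(grid, dtype=float)
    births = diagram[:, 0][:, None]   # shape (n_points, 1)
    deaths = diagram[:, 1][:, None]   # shape (n_points, 1)
    # Broadcast against the grid to get an (n_points, n_grid) boolean table.
    alive = (births <= grid[None, :]) & (grid[None, :] < deaths)
    return alive.sum(axis=0)          # fixed-length vector, one entry per grid value

# Hypothetical toy diagram with three features (illustrative values only).
pd_h1 = [(0.2, 0.9), (0.4, 0.6), (0.5, 1.5)]
grid = np.linspace(0.0, 2.0, 5)       # scale values 0.0, 0.5, 1.0, 1.5, 2.0
vec = betti_curve(pd_h1, grid)
print(vec)                            # [0 3 1 0 0]
```

The output has the same length for every diagram evaluated on the same grid, regardless of how many points the diagram contains, which is what lets it be passed to standard machine learning models as a feature vector.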

arXiv information

Authors	Aleksei Luchinsky, Umar Islambekov
Published	2025-04-25 13:07:56+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
