Aggregation Schemes for Single-Vector WSI Representation Learning in Digital Pathology

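Summary

A key step toward efficiently integrating whole slide images (WSIs) into computational pathology is assigning each WSI a single high-quality feature vector, i.e., one embedding.
Given the many pre-trained deep neural networks available and the emergence of foundation models, extracting embeddings for sub-images (tiles or patches) is straightforward.
A WSI, however, is a high-resolution gigapixel image and cannot feasibly be fed to existing GPUs as a single image.
WSIs are therefore usually split into many patches.
Feeding each patch to a pre-trained model represents each WSI as a set of patches and hence as a set of embeddings.
In this setup, WSI representation learning reduces to set representation learning, in which a set of patch embeddings is available for each WSI.
Multiple set-based learning schemes have been proposed in the literature for obtaining a single embedding from each WSI's set of patch embeddings.
This paper evaluates the WSI search performance of several recently developed aggregation techniques (mainly set representation learning techniques), including simple average or max pooling, Deep Sets, Memory networks, Focal attention, Gaussian Mixture Model (GMM) Fisher Vector, and deep sparse and binary Fisher Vector, on four primary sites from TCGA: bladder, breast, kidney, and colon.
It further benchmarks the search performance of these methods against the median of minimum distances of patch embeddings, a non-aggregating approach used for WSI retrieval.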
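
The simplest aggregation schemes named here, average and max pooling, can be sketched in a few lines. The following is a minimal illustration and not the paper's implementation: the function name `pool_patches`, the plain-list representation of embeddings, and the toy numbers are all assumptions made for the example.

```python
from statistics import fmean
from typing import List, Sequence

Embedding = Sequence[float]


def pool_patches(patches: Sequence[Embedding], method: str = "mean") -> List[float]:
    """Collapse a set of equal-length patch embeddings into one vector.

    Hypothetical helper for illustration; the name and signature are not
    taken from the paper.
    """
    if not patches:
        raise ValueError("need at least one patch embedding")
    if len({len(p) for p in patches}) != 1:
        raise ValueError("all patch embeddings must have the same length")

    # zip(*patches) walks the set dimension by dimension, so each `col`
    # holds one coordinate taken from every patch.
    columns = zip(*patches)
    if method == "mean":
        return [fmean(col) for col in columns]
    if method == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown pooling method: {method!r}")


# Toy WSI with three 4-dimensional patch embeddings (made-up numbers).
toy_wsi = [
    [1.0, 0.0, 2.0, 4.0],
    [3.0, 2.0, 2.0, 0.0],
    [2.0, 4.0, 2.0, 2.0],
]
mean_vec = pool_patches(toy_wsi, "mean")
max_vec = pool_patches(toy_wsi, "max")
```

Both operations ignore the order of the patches, which is the property a set representation needs. The learned schemes in the list (Deep Sets, attention-based pooling, Fisher Vectors) can be read as richer replacements for this pooling step.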

Abstract (original)

A crucial step to efficiently integrate Whole Slide Images (WSIs) in computational pathology is assigning a single high-quality feature vector, i.e., one embedding, to each WSI. With the existence of many pre-trained deep neural networks and the emergence of foundation models, extracting embeddings for sub-images (i.e., tiles or patches) is straightforward. However, for WSIs, given their high resolution and gigapixel nature, inputting them into existing GPUs as a single image is not feasible. As a result, WSIs are usually split into many patches. Feeding each patch to a pre-trained model, each WSI can then be represented by a set of patches, hence, a set of embeddings. Hence, in such a setup, WSI representation learning reduces to set representation learning where for each WSI we have access to a set of patch embeddings. To obtain a single embedding from a set of patch embeddings for each WSI, multiple set-based learning schemes have been proposed in the literature. In this paper, we evaluate the WSI search performance of multiple recently developed aggregation techniques (mainly set representation learning techniques) including simple average or max pooling operations, Deep Sets, Memory networks, Focal attention, Gaussian Mixture Model (GMM) Fisher Vector, and deep sparse and binary Fisher Vector on four different primary sites including bladder, breast, kidney, and Colon from TCGA. Further, we benchmark the search performance of these methods against the median of minimum distances of patch embeddings, a non-aggregating approach used for WSI retrieval.
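
The non-aggregating baseline mentioned at the end of the abstract compares two WSIs directly through their patch sets instead of through a single vector. The sketch below shows one plausible reading of "median of minimum distances": for every query patch, take the distance to its nearest patch in the other slide, then take the median of those values. The Euclidean metric, the function name `median_of_min_distances`, and the toy data are assumptions. The abstract does not specify the metric, and retrieval systems that use this measure may instead apply Hamming distance to binarized features.

```python
from math import dist
from statistics import median
from typing import Sequence

Embedding = Sequence[float]


def median_of_min_distances(
    query: Sequence[Embedding], candidate: Sequence[Embedding]
) -> float:
    """Distance from a query WSI to a candidate WSI, both given as patch sets.

    Hypothetical helper for illustration. Euclidean distance is assumed.
    """
    if not query or not candidate:
        raise ValueError("both slides need at least one patch embedding")

    # For each query patch, distance to its closest candidate patch.
    nearest = [min(dist(q, c) for c in candidate) for q in query]
    # The median keeps a few badly matched patches from dominating the score.
    return median(nearest)


# Two toy slides with two 2-dimensional patch embeddings each (made-up numbers).
slide_a = [[0.0, 0.0], [1.0, 0.0]]
slide_b = [[0.0, 0.0], [3.0, 0.0]]
a_to_b = median_of_min_distances(slide_a, slide_b)
b_to_a = median_of_min_distances(slide_b, slide_a)
```

Under this reading the measure is not symmetric in general (`a_to_b` and `b_to_a` differ in the toy example), and every comparison costs one distance computation per pair of patches. That cost is part of what motivates a single-vector representation for search.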

arXiv information

Authors	Sobhan Hemati,Ghazal Alabtah,Saghir Alfasly,H. R. Tizhoosh
Published	2025-01-29 18:14:51+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
