Are we using appropriate segmentation metrics? Identifying correlates of human expert perception for CNN training beyond rolling the DICE coefficient

Summary

Title: Are we using appropriate segmentation metrics? Identifying correlates of human expert perception for CNN training beyond rolling the DICE coefficient

Summary:

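– Metrics optimized in complex machine learning tasks are often selected in an ad-hoc manner.
– It is unknown how well they align with human expert perception.
– For two complex biomedical semantic segmentation problems, the authors explore the correlation between established quantitative segmentation quality metrics and qualitative evaluations by professionally trained human raters.
– Current standard metrics and loss functions correlate only moderately with experts' segmentation quality assessments.
– Importantly, this effect is particularly pronounced for clinically relevant structures, such as the enhancing tumor compartment of glioma in brain magnetic resonance imaging and grey matter in ultrasound imaging.
– It is often unclear how to optimize abstract metrics, such as human expert perception, in CNN training.
– To address this challenge, the authors propose a novel strategy that uses techniques of classical statistics to create complementary compound loss functions that better approximate human expert perception.
– Across all rating experiments, human experts consistently scored computer-generated segmentations better than the human-curated reference labels.
– The results therefore strongly question many current practices in medical image segmentation and provide meaningful cues for future research.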
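To make the comparison above concrete, here is a minimal sketch of how a Dice coefficient can be computed for binary masks and then correlated with rater scores. It is not the authors' pipeline: the function names, the convention for two empty masks, the choice of Pearson correlation, and the toy masks and ratings are all assumptions made for illustration.

```python
import numpy as np


def dice_coefficient(pred, ref):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    intersection = np.logical_and(pred, ref).sum()
    denom = pred.sum() + ref.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement (an assumed convention).
        return 1.0
    return float(2.0 * intersection / denom)


def pearson_correlation(x, y):
    """Pearson correlation between two equally long 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


if __name__ == "__main__":
    # Toy data, invented purely for illustration (not from the paper).
    reference = np.array([1, 1, 1, 0, 0, 0])
    candidates = [
        np.array([1, 1, 1, 0, 0, 0]),
        np.array([1, 1, 0, 0, 0, 0]),
        np.array([1, 0, 0, 1, 0, 0]),
    ]
    hypothetical_ratings = [5, 4, 2]  # made-up expert scores

    dice_scores = [dice_coefficient(c, reference) for c in candidates]
    print("Dice scores:", dice_scores)
    print("Correlation with ratings:",
          pearson_correlation(dice_scores, hypothetical_ratings))
```

A rank-based correlation could be substituted for Pearson if the ratings are treated as ordinal; the summary does not say which statistic the authors used.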

Summary (original)

Metrics optimized in complex machine learning tasks are often selected in an ad-hoc manner. It is unknown how they align with human expert perception. We explore the correlations between established quantitative segmentation quality metrics and qualitative evaluations by professionally trained human raters. Therefore, we conduct psychophysical experiments for two complex biomedical semantic segmentation problems. We discover that current standard metrics and loss functions correlate only moderately with the segmentation quality assessment of experts. Importantly, this effect is particularly pronounced for clinically relevant structures, such as the enhancing tumor compartment of glioma in brain magnetic resonance and grey matter in ultrasound imaging. It is often unclear how to optimize abstract metrics, such as human expert perception, in convolutional neural network (CNN) training. To cope with this challenge, we propose a novel strategy employing techniques of classical statistics to create complementary compound loss functions to better approximate human expert perception. Across all rating experiments, human experts consistently scored computer-generated segmentations better than the human-curated reference labels. Our results, therefore, strongly question many current practices in medical image segmentation and provide meaningful cues for future research.
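As a rough illustration of what a compound loss function can look like, the sketch below combines a soft Dice loss with binary cross-entropy as a weighted sum. The abstract does not state which component losses or weights the authors selected, so the two components, the equal default weights, and all function names here are assumptions, not the paper's method.

```python
import numpy as np


def soft_dice_loss(prob, ref, eps=1e-6):
    """1 - soft Dice, computed on predicted probabilities in [0, 1]."""
    prob = np.asarray(prob, dtype=float)
    ref = np.asarray(ref, dtype=float)
    intersection = (prob * ref).sum()
    return float(1.0 - (2.0 * intersection + eps) / (prob.sum() + ref.sum() + eps))


def binary_cross_entropy(prob, ref, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    prob = np.clip(np.asarray(prob, dtype=float), eps, 1.0 - eps)
    ref = np.asarray(ref, dtype=float)
    return float(-(ref * np.log(prob) + (1.0 - ref) * np.log(1.0 - prob)).mean())


def compound_loss(prob, ref, weights=(0.5, 0.5)):
    """Weighted sum of the two component losses (weights are hypothetical)."""
    w_dice, w_bce = weights
    return w_dice * soft_dice_loss(prob, ref) + w_bce * binary_cross_entropy(prob, ref)


if __name__ == "__main__":
    reference = np.array([1.0, 0.0, 1.0, 0.0])
    good_prediction = np.array([0.9, 0.1, 0.8, 0.2])
    poor_prediction = np.array([0.2, 0.7, 0.4, 0.6])
    print("good:", compound_loss(good_prediction, reference))
    print("poor:", compound_loss(poor_prediction, reference))
```

In a training setup, the same weighted-sum structure would be written with a differentiable framework; NumPy is used here only to keep the sketch self-contained.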

arXiv information

Authors	Florian Kofler, Ivan Ezhov, Fabian Isensee, Fabian Balsiger, Christoph Berger, Maximilian Koerner, Beatrice Demiray, Julia Rackerseder, Johannes Paetzold, Hongwei Li, Suprosanna Shit, Richard McKinley, Marie Piraud, Spyridon Bakas, Claus Zimmer, Nassir Navab, Jan Kirschke, Benedikt Wiestler, Bjoern Menze
Published	2023-05-02 13:42:03+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, OpenAI
