Concordance in basal cell carcinoma diagnosis. Building a proper ground truth to train Artificial Intelligence tools

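Abstract

Background: The presence of the various basal cell carcinoma (BCC) clinical criteria cannot be objectively validated.
An adequate ground truth is needed to train an artificial intelligence (AI) tool that explains a BCC diagnosis by reporting its dermoscopic features.
Objectives: To determine the consensus among dermatologists on the dermoscopic criteria of 204 BCCs.
To analyze the performance of an AI tool when the ground truth is inferred.
Methods: A single-center, diagnostic, prospective study was conducted to analyze the agreement among four dermatologists on dermoscopic criteria and to derive a reference standard.
A total of 1,434 dermoscopic images were used, each taken by a primary health physician, sent via teledermatology, and diagnosed by a dermatologist.
The images were randomly selected from the teledermatology platform (2019-2021).
Of these, 204 were used to test the AI tool.
The remainder were used to train it.
The performance of the AI tool trained on the ground truth of one dermatologist was compared with that of the tool trained on the ground truth statistically inferred from the consensus of four dermatologists, using McNemar's test and Hamming distance.
Results: Dermatologists achieved perfect agreement in the diagnosis of BCC (Fleiss-Kappa=0.9079) and a high correlation with biopsy (PPV=0.9670).
However, agreement was low in detecting some dermoscopic criteria.
Statistical differences were found between the performance of the AI tool trained on the ground truth of one dermatologist and that of the tool trained on the ground truth statistically inferred from the consensus of four dermatologists.
Conclusions: Care should be taken when training an AI tool to determine the BCC patterns present in a lesion.
Ground truth should be established from multiple dermatologists.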
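
The abstract reports inter-rater agreement as Fleiss' kappa but does not show how it is computed. As an illustration only, the following minimal sketch implements the standard Fleiss' kappa formula in plain Python. The function name `fleiss_kappa` and the toy ratings are assumptions made for this example and are not the study's code or data.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list with one label per rater.

    Every item must be rated by the same number of raters (at least two).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    categories = sorted({label for item in ratings for label in item})
    counts = [Counter(item) for item in ratings]

    # Per-item agreement: fraction of rater pairs that chose the same label.
    per_item = [
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    observed = sum(per_item) / n_items

    # Chance agreement from the overall share of each category.
    shares = [sum(c[k] for c in counts) / (n_items * n_raters) for k in categories]
    expected = sum(s * s for s in shares)

    if expected == 1.0:
        return float("nan")  # only one category was ever used: kappa is undefined
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data: 4 raters labelling 5 lesions as BCC (1) or not (0).
toy_ratings = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
]
print(f"Fleiss' kappa on toy data: {fleiss_kappa(toy_ratings):.4f}")
```

The same function could be applied separately to each dermoscopic criterion (one present/absent label per rater per image) to see which criteria have low agreement. How the study actually organized that analysis is not stated in the abstract.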

Abstract (original)

Background: The existence of different basal cell carcinoma (BCC) clinical criteria cannot be objectively validated. An adequate ground-truth is needed to train an artificial intelligence (AI) tool that explains the BCC diagnosis by providing its dermoscopic features. Objectives: To determine the consensus among dermatologists on dermoscopic criteria of 204 BCC. To analyze the performance of an AI tool when the ground-truth is inferred. Methods: A single center, diagnostic and prospective study was conducted to analyze the agreement in dermoscopic criteria by four dermatologists and then derive a reference standard. 1434 dermoscopic images have been used, that were taken by a primary health physician, sent via teledermatology, and diagnosed by a dermatologist. They were randomly selected from the teledermatology platform (2019-2021). 204 of them were tested with an AI tool; the remainder trained it. The performance of the AI tool trained using the ground-truth of one dermatologist versus the ground-truth statistically inferred from the consensus of four dermatologists was analyzed using McNemar’s test and Hamming distance. Results: Dermatologists achieve perfect agreement in the diagnosis of BCC (Fleiss-Kappa=0.9079), and a high correlation with the biopsy (PPV=0.9670). However, there is low agreement in detecting some dermoscopic criteria. Statistical differences were found in the performance of the AI tool trained using the ground-truth of one dermatologist versus the ground-truth statistically inferred from the consensus of four dermatologists. Conclusions: Care should be taken when training an AI tool to determine the BCC patterns present in a lesion. Ground-truth should be established from multiple dermatologists.
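
The comparison named in the abstract relies on McNemar's test and Hamming distance. The sketch below shows one way these two measures can be computed for paired predictions. It is an assumption-laden illustration: the abstract does not say which variant of McNemar's test was used (the exact binomial form is chosen here), and the function names and toy vectors are hypothetical.

```python
from math import comb


def hamming_distance(a, b):
    """Number of positions at which two equal-length label vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def mcnemar_exact(correct_a, correct_b):
    """Two-sided exact McNemar test on paired correctness flags.

    Returns (b, c, p_value), where b counts cases only model A got right
    and c counts cases only model B got right.
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("paired inputs must have the same length")
    b = sum(1 for x, y in zip(correct_a, correct_b) if x and not y)
    c = sum(1 for x, y in zip(correct_a, correct_b) if y and not x)
    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    # Binomial(n, 0.5) lower tail, doubled for a two-sided test, capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2.0 * tail)


# Hypothetical example: presence (1) or absence (0) of four dermoscopic
# criteria for one lesion, as predicted by a model and as given by a reference.
predicted = [1, 0, 1, 1]
reference = [1, 1, 0, 1]
print("Hamming distance:", hamming_distance(predicted, reference))

# Hypothetical per-image correctness of two models on the same test images.
model_single = [1, 1, 1, 1, 0, 1]
model_consensus = [0, 0, 0, 0, 0, 1]
print("McNemar (b, c, p):", mcnemar_exact(model_single, model_consensus))
```

McNemar's test looks only at the discordant pairs, which is why it suits two models evaluated on the same test images. Hamming distance suits multi-label outputs, where each lesion carries several present/absent criteria.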

arXiv information

Authors	Francisca Silva-Clavería,Carmen Serrano,Iván Matas,Amalia Serrano,Tomás Toledo-Pastrana,David Moreno-Ramírez,Begoña Acha
Published	2024-06-26 10:44:48+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
