Reproducible Machine Learning-based Voice Pathology Detection: Introducing the Pitch Difference Feature

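Abstract

This study presents a new method for voice pathology detection that uses the publicly available Saarbrücken Voice Database (SVD) and a robust feature set combining commonly used handcrafted acoustic features with two novel ones: pitch difference (the relative variation in fundamental frequency) and a NaN feature (failed estimation of the fundamental frequency). Six machine learning (ML) classifiers are evaluated: support vector machine, k-nearest neighbors, naive Bayes, decision tree, random forest, and AdaBoost. A grid search covers the feasible hyperparameters of the selected classifiers and 20480 different feature subsets. For each classifier type, the top 1000 combinations of classifier and feature subset are validated with repeated stratified cross-validation. To address class imbalance, K-Means SMOTE is applied to augment the training data. The approach reaches an unweighted average recall (UAR) of 85.61%, 84.69%, and 85.22% for females, males, and combined results, respectively. Accuracy is intentionally omitted because it is a highly biased metric for imbalanced data. This result points to significant potential for the clinical deployment of ML methods as a supportive tool for the objective examination of voice pathologies. To make the methodology easier to use and to support the claims, the authors publish a GitHub repository with DOI 10.5281/zenodo.13771573. Finally, they provide a REFORMS checklist to improve the readability, reproducibility, and justification of the approach.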

Abstract (original)

This study introduces a novel methodology for voice pathology detection using the publicly available Saarbrücken Voice Database (SVD) and a robust feature set combining commonly used acoustic handcrafted features with two novel ones: pitch difference (relative variation in fundamental frequency) and a NaN feature (failed fundamental frequency estimation). We evaluate six machine learning (ML) classifiers (support vector machine, k-nearest neighbors, naive Bayes, decision tree, random forest, and AdaBoost) using grid search for feasible hyperparameters of selected classifiers and 20480 different feature subsets. The top 1000 classifier-feature subset combinations for each classifier type are validated with repeated stratified cross-validation. To address class imbalance, we apply K-Means SMOTE to augment the training data. Our approach achieves outstanding performance, reaching 85.61%, 84.69% and 85.22% unweighted average recall (UAR) for females, males and combined results, respectively. We intentionally omit accuracy as it is a highly biased metric for imbalanced data. This advancement demonstrates significant potential for clinical deployment of ML methods, offering a valuable supportive tool for an objective examination of voice pathologies. To enable easier use of our methodology and to support our claims, we provide a publicly available GitHub repository with DOI 10.5281/zenodo.13771573. Finally, we provide a REFORMS checklist to enhance readability, reproducibility and justification of our approach.
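
The abstract reports unweighted average recall (UAR) and omits accuracy because of class imbalance. The following minimal sketch illustrates that choice. UAR is the plain mean of the per-class recalls (the quantity scikit-learn calls balanced accuracy). The function names and the toy labels are hypothetical and are not taken from the authors' repository.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; each class counts equally, whatever its size."""
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


def accuracy(y_true, y_pred):
    """Fraction of all samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical imbalanced toy data: 9 samples of class 1, 1 sample of class 0.
y_true = [1] * 9 + [0]
# A degenerate classifier that always predicts the majority class.
y_pred = [1] * 10

acc = accuracy(y_true, y_pred)                   # 9 of 10 correct
uar = unweighted_average_recall(y_true, y_pred)  # recalls are 1.0 and 0.0
print(f"accuracy={acc:.2f}, UAR={uar:.2f}")
```

On this toy data the always-majority classifier looks strong by accuracy, while UAR stays at chance level for two classes, which is the bias the abstract refers to.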

arXiv information

Authors	Jan Vrba, Jakub Steinbach, Tomáš Jirsa, Laura Verde, Roberta De Fazio, Yuwen Zeng, Kei Ichiji, Lukáš Hájek, Zuzana Sedláková, Zuzana Urbániová, Martin Chovanec, Jan Mareš, Noriyasu Homma
Published	2025-02-03 12:57:09+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
