Improving the quality of Persian clinical text with a novel spelling correction system

Summary

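Background: Spelling accuracy in electronic health records (EHRs) is critical for efficient clinical care, research, and patient safety.
Persian, with its large vocabulary and complex characteristics, poses unique challenges for real-word error correction.
This study aimed to develop an approach for detecting and correcting spelling errors in Persian clinical text.
Methods: The approach uses a state-of-the-art pre-trained model fine-tuned specifically for spelling correction in the Persian clinical domain.
The model is complemented by PERTO, an orthographic similarity matching algorithm that ranks correction candidates by the visual similarity of characters.
Results: The evaluation showed that the approach is robust and precise at detecting and correcting word errors in Persian clinical text.
For non-word error correction, the model achieved an F1-score of 90.0% when PERTO was used.
For real-word error detection, the model achieved its best performance, an F1-score of 90.6%.
For real-word error correction, the model reached its highest F1-score, 91.5%, when PERTO was used.
Conclusions: Despite certain limitations, the method is a substantial advance in spelling error detection and correction for Persian clinical text.
By addressing the challenges specific to Persian, it paves the way for more accurate and efficient clinical documentation, contributing to improved patient care and safety.
Future research could explore its use in other areas of the Persian medical domain, increasing its impact and utility.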
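
The abstract does not describe how PERTO works internally, so the following is only a rough Python sketch of the general idea of ranking correction candidates by visual similarity. It is not the authors' algorithm. The letter groups, the 0.5 substitution cost, and the function names are all assumptions made for illustration: the sketch uses a weighted edit distance in which swapping two Persian letters that share a base shape (and differ mainly in dots or strokes) is cheaper than swapping unrelated letters.

```python
# Hypothetical groups of Persian letters that share a base shape and differ
# mainly in dots or strokes. Illustrative only; this is NOT the PERTO table.
VISUAL_GROUPS = ["بپتث", "جچحخ", "دذ", "رزژ", "سش", "صض", "طظ", "عغ", "کگ"]


def substitution_cost(a, b, similar_cost=0.5):
    """Cost of replacing letter a with letter b (similar_cost is an assumed value)."""
    if a == b:
        return 0.0
    if any(a in group and b in group for group in VISUAL_GROUPS):
        return similar_cost
    return 1.0


def weighted_distance(source, target):
    """Edit distance where visually similar substitutions are cheaper."""
    prev = [float(j) for j in range(len(target) + 1)]
    for i, s in enumerate(source, start=1):
        curr = [float(i)]
        for j, t in enumerate(target, start=1):
            curr.append(min(
                prev[j] + 1.0,                          # delete s
                curr[j - 1] + 1.0,                      # insert t
                prev[j - 1] + substitution_cost(s, t),  # substitute s with t
            ))
        prev = curr
    return prev[-1]


def rank_candidates(misspelled, candidates):
    """Order candidates from most to least visually similar to the input."""
    return sorted(candidates, key=lambda c: weighted_distance(misspelled, c))


# A made-up candidate list for the input "تپ": "تب" differs by one
# same-shape letter, "تر" by one unrelated letter.
ranked = rank_candidates("تپ", ["تر", "تب"])
```

In a full system, the candidates would presumably come from the fine-tuned language model or a lexicon, and a score like this would be only one signal among several.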

Abstract (original)

Background: The accuracy of spelling in Electronic Health Records (EHRs) is a critical factor for efficient clinical care, research, and ensuring patient safety. The Persian language, with its abundant vocabulary and complex characteristics, poses unique challenges for real-word error correction. This research aimed to develop an innovative approach for detecting and correcting spelling errors in Persian clinical text.
Methods: Our strategy employs a state-of-the-art pre-trained model that has been meticulously fine-tuned specifically for the task of spelling correction in the Persian clinical domain. This model is complemented by an innovative orthographic similarity matching algorithm, PERTO, which uses visual similarity of characters for ranking correction candidates.
Results: The evaluation of our approach demonstrated its robustness and precision in detecting and rectifying word errors in Persian clinical text. In terms of non-word error correction, our model achieved an F1-Score of 90.0% when the PERTO algorithm was employed. For real-word error detection, our model demonstrated its highest performance, achieving an F1-Score of 90.6%. Furthermore, the model reached its highest F1-Score of 91.5% for real-word error correction when the PERTO algorithm was employed.
Conclusions: Despite certain limitations, our method represents a substantial advancement in the field of spelling error detection and correction for Persian clinical text. By effectively addressing the unique challenges posed by the Persian language, our approach paves the way for more accurate and efficient clinical documentation, contributing to improved patient care and safety. Future research could explore its use in other areas of the Persian medical domain, enhancing its impact and utility.
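
For readers unfamiliar with the metric reported above, the F1-score is the harmonic mean of precision and recall. The snippet below is a minimal sketch of that standard formula. The counts in the example are made up for illustration and are not taken from the paper, and the abstract does not say how the authors counted true and false positives.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall, computed from raw counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up counts: 8 errors fixed correctly, 2 wrong fixes, 2 errors missed.
example = f1_score(true_positives=8, false_positives=2, false_negatives=2)
```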

arXiv information

Authors	Seyed Mohammad Sadegh Dashti, Seyedeh Fatemeh Dashti
Published	2024-08-07 08:31:42+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
