Frozen Language Model Helps ECG Zero-Shot Learning

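Summary

The electrocardiogram (ECG) is one of the most widely used non-invasive, convenient medical monitoring tools for supporting the clinical diagnosis of heart disease.
Recently, deep learning (DL) techniques, and self-supervised learning (SSL) in particular, have shown great potential for ECG classification.
After fine-tuning on only a small amount of annotated data, SSL pre-training achieves competitive performance.
However, current SSL methods depend on the availability of annotated data and cannot predict labels that are absent from the fine-tuning dataset.
To address this challenge, we propose Multimodal ECG-Text Self-supervised pre-training (METS), the first work to use auto-generated clinical reports to guide ECG SSL pre-training.
A trainable ECG encoder and a frozen language model separately embed each ECG and its paired machine-generated clinical report.
The SSL objective maximizes the similarity between each ECG and its paired auto-generated report while minimizing the similarity between that ECG and the other reports.
In downstream classification tasks, METS achieves around a 10% performance improvement via zero-shot classification, without using any annotated data, over supervised and SSL baselines that rely on annotated data.
Furthermore, METS achieves the highest recall and F1 score on the MIT-BIH dataset, even though MIT-BIH contains ECG classes that differ from those in the pre-training dataset.
Extensive experiments demonstrate the advantages of ECG-Text multimodal self-supervised learning in terms of generalizability, effectiveness, and efficiency.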
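
The pre-training objective described in the summary (pull each ECG toward its paired report, push it away from the other reports) can be sketched as a contrastive loss. The abstract does not give the exact formula, so the InfoNCE-style form below, the `temperature` value, and the function names are all assumptions made for illustration, not the authors' implementation. The embeddings are plain Python lists standing in for the outputs of the trainable ECG encoder and the frozen language model.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def contrastive_loss(ecg_embs, text_embs, temperature=0.1):
    """ECG-to-report InfoNCE-style loss over a batch (assumed form, hypothetical name).

    ecg_embs[i] and text_embs[i] are assumed to come from the same record;
    every other report in the batch acts as a negative for ECG i.
    """
    ecg = [l2_normalize(e) for e in ecg_embs]
    txt = [l2_normalize(t) for t in text_embs]
    total = 0.0
    for i, e in enumerate(ecg):
        # Cosine similarity of ECG i to every report, scaled by the temperature.
        logits = [sum(a * b for a, b in zip(e, t)) / temperature for t in txt]
        # Numerically stable log-sum-exp over all reports in the batch.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
        # Negative log-probability assigned to the paired report i.
        total += log_denominator - logits[i]
    return total / len(ecg)
```

In this sketch the loss is small when each ECG embedding is closest to its own report and large when it is closer to another report in the batch. In a real training loop only the ECG encoder would receive gradients, since the language model is frozen.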

Abstract (original)

The electrocardiogram (ECG) is one of the most commonly used non-invasive, convenient medical monitoring tools that assist in the clinical diagnosis of heart diseases. Recently, deep learning (DL) techniques, particularly self-supervised learning (SSL), have demonstrated great potential in the classification of ECG. SSL pre-training has achieved competitive performance with only a small amount of annotated data after fine-tuning. However, current SSL methods rely on the availability of annotated data and are unable to predict labels not existing in fine-tuning datasets. To address this challenge, we propose Multimodal ECG-Text Self-supervised pre-training (METS), the first work to utilize auto-generated clinical reports to guide ECG SSL pre-training. We use a trainable ECG encoder and a frozen language model to embed paired ECG and automatically machine-generated clinical reports separately. The SSL aims to maximize the similarity between each paired ECG and auto-generated report while minimizing the similarity between the ECG and other reports. In downstream classification tasks, METS achieves around 10% improvement in performance without using any annotated data via zero-shot classification, compared to other supervised and SSL baselines that rely on annotated data. Furthermore, METS achieves the highest recall and F1 scores on the MIT-BIH dataset, despite MIT-BIH containing different classes of ECG compared to the pre-training dataset. The extensive experiments have demonstrated the advantages of using ECG-Text multimodal self-supervised learning in terms of generalizability, effectiveness, and efficiency.
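
The zero-shot classification step mentioned in the abstract can be illustrated with a minimal sketch: each candidate label is written as text, embedded, and the label whose embedding is most similar to the ECG embedding is chosen, so no annotated ECGs are needed. The abstract does not describe the procedure in detail, so the use of cosine similarity, the function names, and the dictionary of label embeddings are assumptions for illustration only. The vectors are plain Python lists standing in for encoder outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_classify(ecg_emb, class_text_embs):
    """Return the label whose text embedding is most similar to the ECG embedding.

    class_text_embs maps a label string to its (hypothetical) text embedding,
    for example the frozen language model's embedding of a short label prompt.
    """
    return max(class_text_embs, key=lambda label: cosine(ecg_emb, class_text_embs[label]))
```

Because the label set is supplied only at inference time as text, a sketch like this can score classes that never appeared during pre-training, which is the property the abstract highlights for MIT-BIH.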

arXiv information

Authors	Jun Li, Che Liu, Sibo Cheng, Rossella Arcucci, Shenda Hong
Published	2023-03-22 05:01:14+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
