MvKeTR: Chest CT Report Generation with Multi-View Perception and Knowledge Enhancement

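Summary

CT report generation (CTRG) aims to automatically generate diagnostic reports for 3D volumes, reducing clinicians' workload and improving patient care.
Despite its clinical value, existing work fails to effectively incorporate diagnostic information from multiple anatomical views and lacks the related clinical expertise that is essential for accurate and reliable diagnosis.
To address these limitations, the authors propose a novel Multi-view perception Knowledge-enhanced Transformer (MvKeTR) that mimics the diagnostic workflow of clinicians.
Just as radiologists first examine CT scans from multiple planes, a Multi-View Perception Aggregator (MVPA) with view-aware attention integrates diagnostic information from multiple anatomical views.
Then, inspired by how radiologists further consult relevant clinical records to guide diagnostic decision-making, a Cross-Modal Knowledge Enhancer (CMKE) retrieves the most similar reports based on the query volume and incorporates domain knowledge into the diagnosis procedure.
Furthermore, instead of traditional MLPs, Kolmogorov-Arnold Networks (KANs) with learnable nonlinear activation functions are used as the fundamental building blocks of both modules to better capture intricate diagnostic patterns in CT interpretation.
Extensive experiments on the public CTRG-Chest-548K dataset show that the method outperforms prior state-of-the-art models across all metrics.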

Summary (original)

CT report generation (CTRG) aims to automatically generate diagnostic reports for 3D volumes, relieving clinicians’ workload and improving patient care. Despite clinical value, existing works fail to effectively incorporate diagnostic information from multiple anatomical views and lack related clinical expertise essential for accurate and reliable diagnosis. To resolve these limitations, we propose a novel Multi-view perception Knowledge-enhanced Transformer (MvKeTR) to mimic the diagnostic workflow of clinicians. Just as radiologists first examine CT scans from multiple planes, a Multi-View Perception Aggregator (MVPA) with view-aware attention effectively synthesizes diagnostic information from multiple anatomical views. Then, inspired by how radiologists further refer to relevant clinical records to guide diagnostic decision-making, a Cross-Modal Knowledge Enhancer (CMKE) retrieves the most similar reports based on the query volume to incorporate domain knowledge into the diagnosis procedure. Furthermore, instead of traditional MLPs, we employ Kolmogorov-Arnold Networks (KANs) with learnable nonlinear activation functions as the fundamental building blocks of both modules to better capture intricate diagnostic patterns in CT interpretation. Extensive experiments on the public CTRG-Chest-548K dataset demonstrate that our method outpaces prior state-of-the-art models across all metrics.
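
The retrieval idea behind CMKE (fetching the reports most similar to a query volume) can be illustrated with a minimal sketch. The abstract does not state the similarity measure, the embedding model, or how many reports are retrieved, so everything here is an assumption for illustration: cosine similarity as the score, hand-written toy vectors standing in for volume and report embeddings, and the hypothetical names `cosine_similarity` and `retrieve_top_k`. It is not the authors' implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def retrieve_top_k(query_embedding, report_embeddings, reports, k=3):
    """Return the k reports whose embeddings score highest against the query."""
    scored = [
        (cosine_similarity(query_embedding, emb), text)
        for emb, text in zip(report_embeddings, reports)
    ]
    # Sort by similarity only, highest first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy data: in a real system these vectors would come from learned encoders.
query = [1.0, 0.0, 0.5]
bank_embeddings = [[0.9, 0.1, 0.4], [0.0, 1.0, 0.0], [0.5, 0.5, 0.5]]
bank_reports = ["report A", "report B", "report C"]

top_reports = retrieve_top_k(query, bank_embeddings, bank_reports, k=2)
print(top_reports)
```

With these toy vectors, "report A" scores highest and "report C" second, while "report B" is orthogonal to the query and is left out. In a knowledge-enhanced generator, the retrieved reports would then be encoded and fused with the visual features; how MvKeTR performs that fusion is not described in the abstract.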

arXiv information

Authors	Xiwei Deng,Xianchun He,Yudan Zhou,Shuhui Cai,Congbo Cai,Zhong Chen
Published	2024-11-27 12:58:23+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
