CheXalign: Preference fine-tuning in chest X-ray interpretation models without human feedback

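Summary

Radiologists play a crucial role in translating medical images into actionable reports.
However, the field faces staffing shortages and increasing workloads.
Automated approaches using vision-language models (VLMs) show promise as assistants, but they require exceptionally high accuracy.
Most current VLMs in radiology rely solely on supervised fine-tuning.
Meanwhile, additional preference fine-tuning in the post-training pipeline has become standard practice in the general domain.
The challenge in radiology lies in the prohibitive cost of obtaining radiologist feedback at scale.
To address this challenge, the authors propose an automated pipeline for preference feedback, focusing on chest X-ray radiology report generation (RRG).
Specifically, the method combines publicly available datasets containing pairs of images and radiologist-written reference reports with reference-based metrics, or Judges, which eliminates the need for additional radiologist feedback.
The authors investigate reward overoptimization via length exploitation in this setting and introduce a length-controlled version of the GREEN score.
The best-performing setup achieves state-of-the-art CheXbert scores on the MIMIC-CXR dataset for the RRG task while, on average, maintaining robust performance across six additional image perception and reasoning tasks.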

Summary (original)

Radiologists play a crucial role in translating medical images into actionable reports. However, the field faces staffing shortages and increasing workloads. While automated approaches using vision-language models (VLMs) show promise as assistants, they require exceptionally high accuracy. Most current VLMs in radiology rely solely on supervised fine-tuning. Meanwhile, additional preference fine-tuning in the post-training pipeline has become standard practice in the general domain. The challenge in radiology lies in the prohibitive cost of obtaining radiologist feedback at scale. To address this challenge, we propose an automated pipeline for preference feedback, focusing on chest X-ray radiology report generation (RRG). Specifically, our method leverages publicly available datasets containing pairs of images and radiologist-written reference reports with reference-based metrics, or Judges, eliminating the need for additional radiologist feedback. We investigate reward overoptimization via length exploitation in this setting and introduce a length-controlled version of the GREEN score. Our best-performing setup achieves state-of-the-art CheXbert scores on the MIMIC-CXR dataset for the RRG task while on average maintaining robust performance across six additional image perception and reasoning tasks.
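The abstract does not spell out how preference pairs are produced, so the following is only a minimal sketch of the general idea: score several candidate reports against the radiologist-written reference with a reference-based judge, then keep the best and worst as a chosen/rejected pair. Everything named here is an assumption for illustration. `token_f1` is a toy stand-in for a real judge such as GREEN, and `length_controlled` is a hypothetical penalty, not the length-controlled GREEN formula from the paper.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Optional


def token_f1(candidate: str, reference: str) -> float:
    """Toy reference-based judge: unigram F1 overlap (stand-in for a real metric)."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand or not ref:
        return 0.0
    # Count shared tokens with multiplicity.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(cand), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def length_controlled(score: float, candidate: str, reference: str,
                      alpha: float = 0.5) -> float:
    """Hypothetical length control: shrink the score of candidates longer than the reference."""
    ratio = len(candidate.split()) / max(len(reference.split()), 1)
    penalty = 1.0 if ratio <= 1.0 else ratio ** (-alpha)
    return score * penalty


@dataclass
class PreferencePair:
    chosen: str
    rejected: str
    margin: float


def build_pair(candidates: List[str], reference: str,
               judge: Callable[[str, str], float]) -> Optional[PreferencePair]:
    """Rank candidates by judge score; return best vs. worst, or None if all scores tie."""
    scored = sorted(((judge(c, reference), c) for c in candidates), reverse=True)
    (best_score, best), (worst_score, worst) = scored[0], scored[-1]
    if best_score == worst_score:
        return None  # no preference signal in this sample
    return PreferencePair(best, worst, best_score - worst_score)


if __name__ == "__main__":
    reference = "no pleural effusion or pneumothorax"
    candidates = [
        "no pleural effusion or pneumothorax",
        "large right pneumothorax",
    ]
    # Compose the toy judge with the hypothetical length penalty.
    judge = lambda c, r: length_controlled(token_f1(c, r), c, r)
    print(build_pair(candidates, reference, judge))
```

Pairs of this shape could then feed a preference fine-tuning objective. Wrapping the judge in a length penalty is one simple way to discourage the length exploitation the abstract mentions, though the paper's actual mechanism may differ.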

arXiv information

Authors	Dennis Hein, Zhihong Chen, Sophie Ostmeier, Justin Xu, Maya Varma, Eduardo Pontes Reis, Arne Edward Michalson, Christian Bluethgen, Hyun Joo Shin, Curtis Langlotz, Akshay S Chaudhari
Published	2025-02-25 12:35:17+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
