Can artificial intelligence predict clinical trial outcomes?

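Summary

The growing complexity and cost of clinical trials, particularly in oncology and advanced therapies, pose significant challenges for drug development.
This study evaluates how well large language models (LLMs) such as GPT-3.5 and GPT-4, alongside the HINT model, predict clinical trial outcomes.
Using a curated dataset of trials from ClinicalTrials.gov, it compares model performance on metrics including balanced accuracy, specificity, recall, and the Matthews correlation coefficient (MCC).
The results indicate that GPT-4o performs robustly in early trial phases, achieving high recall but limited specificity.
Conversely, the HINT model excels at recognizing negative outcomes, particularly in later trial phases, and offers a balanced approach across diverse endpoints.
Oncology trials, which are characterized by high complexity, remain challenging for all models.
Trial duration and disease category also influence predictive performance: longer durations and complex diseases such as neoplasms reduce accuracy.
The study highlights the complementary strengths of LLMs and HINT and offers insights into optimizing predictive tools for clinical trial design and risk management.
Future advances in LLMs are essential to close current gaps in handling negative outcomes and complex domains.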
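
The four metrics named in the summary can all be derived from a binary confusion matrix (trial success as the positive class). The following is a minimal sketch using the standard textbook definitions, not the authors' evaluation code; the function name `binary_metrics` and the example counts are hypothetical and chosen only to illustrate a high-recall, low-specificity predictor.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute recall, specificity, balanced accuracy, and MCC.

    tp/fp/tn/fn are confusion-matrix counts with "trial succeeded"
    treated as the positive class (an assumption for this sketch).
    """
    # Recall (sensitivity): share of actual successes predicted as success.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: share of actual failures predicted as failure.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Balanced accuracy: mean of recall and specificity.
    balanced_accuracy = (recall + specificity) / 2
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "recall": recall,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }


# Hypothetical counts: most successes are caught, many failures are missed.
example = binary_metrics(tp=45, fp=30, tn=20, fn=5)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Reporting balanced accuracy and MCC alongside recall matters here because a model that predicts "success" for nearly every trial can score high recall while still failing to recognize negative outcomes.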

Abstract (original)

The increasing complexity and cost of clinical trials, particularly in the context of oncology and advanced therapies, pose significant challenges for drug development. This study evaluates the predictive capabilities of large language models (LLMs) such as GPT-3.5, GPT-4, and HINT in determining clinical trial outcomes. By leveraging a curated dataset of trials from ClinicalTrials.gov, we compare the models’ performance using metrics including balanced accuracy, specificity, recall, and Matthews Correlation Coefficient (MCC). Results indicate that GPT-4o demonstrates robust performance in early trial phases, achieving high recall but facing limitations in specificity. Conversely, the HINT model excels in recognizing negative outcomes, particularly in later trial phases, offering a balanced approach across diverse endpoints. Oncology trials, characterized by high complexity, remain challenging for all models. Additionally, trial duration and disease categories influence predictive performance, with longer durations and complex diseases such as neoplasms reducing accuracy. This study highlights the complementary strengths of LLMs and HINT, providing insights into optimizing predictive tools for clinical trial design and risk management. Future advancements in LLMs are essential to address current gaps in handling negative outcomes and complex domains.

arXiv information

Authors	Shuyi Jin, Lu Chen, Hongru Ding, Meijie Wang, Lun Yu
Published	2024-11-26 17:05:27+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
