Forecasting from Clinical Textual Time Series: Adaptations of the Encoder and Decoder Language Model Families

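Summary

Clinical case reports encode rich temporal patient trajectories that traditional machine learning methods, which rely on structured data, often underexploit.
This work introduces the problem of forecasting from textual time series, in which timestamped clinical findings, extracted via an LLM-assisted annotation pipeline, serve as the primary input for prediction.
A diverse suite of models, including fine-tuned decoder-based large language models and encoder-based transformers, is systematically evaluated on three tasks: event occurrence prediction, temporal ordering, and survival analysis.
The experiments show that encoder-based models consistently achieve higher F1 scores and better temporal concordance for both short- and long-horizon event forecasting, and that fine-tuned masking approaches improve ranking performance.
In contrast, instruction-tuned decoder models show a relative advantage in survival analysis, especially in early prognosis settings.
The sensitivity analyses further demonstrate the importance of time ordering, which requires constructing a clinical time series, compared with text ordering, the input format on which LLMs are classically trained.
This highlights the additional benefit that can be gained from time-ordered corpora, with implications for temporal tasks in an era of widespread LLM use.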

Summary (original)

Clinical case reports encode rich, temporal patient trajectories that are often underexploited by traditional machine learning methods relying on structured data. In this work, we introduce the forecasting problem from textual time series, where timestamped clinical findings, extracted via an LLM-assisted annotation pipeline, serve as the primary input for prediction. We systematically evaluate a diverse suite of models, including fine-tuned decoder-based large language models and encoder-based transformers, on tasks of event occurrence prediction, temporal ordering, and survival analysis. Our experiments reveal that encoder-based models consistently achieve higher F1 scores and superior temporal concordance for short- and long-horizon event forecasting, while fine-tuned masking approaches enhance ranking performance. In contrast, instruction-tuned decoder models demonstrate a relative advantage in survival analysis, especially in early prognosis settings. Our sensitivity analyses further demonstrate the importance of time ordering, which requires clinical time series construction, as compared to text ordering, the format of the text inputs that LLMs are classically trained on. This highlights the additional benefit that can be ascertained from time-ordered corpora, with implications for temporal tasks in the era of widespread LLM use.
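
To make the distinction between text ordering and time ordering concrete, the following is a minimal sketch of a textual time series and of an event occurrence labeling step. It is an illustration under stated assumptions, not the authors' pipeline: the `Finding` type, the example findings, their timestamps, and the helper functions `time_ordered` and `occurrence_labels` are all hypothetical.

```python
from typing import NamedTuple


class Finding(NamedTuple):
    text: str     # clinical finding as free text
    hours: float  # time relative to admission (negative = before admission)


# Hypothetical findings, listed in the order they appear in a report (text order).
text_order = [
    Finding("admitted with chest pain", 0.0),
    Finding("history of hypertension", -8760.0),
    Finding("troponin elevated", 2.0),
    Finding("discharged", 72.0),
    Finding("cardiac catheterization", 24.0),
]


def time_ordered(findings):
    """Return findings sorted by timestamp; ties keep text order (sorted is stable)."""
    return sorted(findings, key=lambda f: f.hours)


def occurrence_labels(findings, cutoff_hours, horizon_hours):
    """Split findings into a history (time <= cutoff) and label each later
    finding by whether it occurs within `horizon_hours` after the cutoff."""
    ordered = time_ordered(findings)
    history = [f for f in ordered if f.hours <= cutoff_hours]
    future = [f for f in ordered if f.hours > cutoff_hours]
    labels = {f.text: (f.hours - cutoff_hours) <= horizon_hours for f in future}
    return history, labels


history, labels = occurrence_labels(text_order, cutoff_hours=2.0, horizon_hours=24.0)
```

In this toy example the report mentions the patient's hypertension after the admission and the discharge before the catheterization, so text order and time order differ; sorting by timestamp recovers the trajectory a forecasting model would condition on.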

arXiv information

Authors	Shahriar Noroozizadeh, Sayantan Kumar, Jeremy C. Weiss
Published	2025-04-14 15:48:56+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
