Demystifying Domain-adaptive Post-training for Financial LLMs

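Abstract

Domain-adaptive post-training of large language models (LLMs) has emerged as a promising approach for specialized domains such as medicine and finance.
However, identifying optimal adaptation criteria and training strategies across varying data and model configurations remains a significant challenge.
To address these challenges, we introduce FINDAP, a systematic and fine-grained investigation into domain-adaptive post-training of LLMs for the finance domain.
Our approach begins by identifying the core capabilities required for the target domain and designing a comprehensive evaluation suite tailored to these needs.
We then analyze the effectiveness of the key post-training stages: continual pretraining, instruction tuning, and preference alignment.
Building on these insights, we propose an effective training recipe centered on a novel preference data distillation method that leverages process signals from a generative reward model.
The resulting model, Llama-Fin, achieves state-of-the-art performance across a wide range of financial tasks.
Our analysis also highlights how each post-training stage contributes to distinct capabilities, uncovers specific challenges and effective solutions, and provides valuable insights for domain adaptation of LLMs.
Project page: https://github.com/SalesforceAIResearch/FinDap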

Abstract (original)

Domain-adaptive post-training of large language models (LLMs) has emerged as a promising approach for specialized domains such as medicine and finance. However, significant challenges remain in identifying optimal adaptation criteria and training strategies across varying data and model configurations. To address these challenges, we introduce FINDAP, a systematic and fine-grained investigation into domain-adaptive post-training of LLMs for the finance domain. Our approach begins by identifying the core capabilities required for the target domain and designing a comprehensive evaluation suite aligned with these needs. We then analyze the effectiveness of key post-training stages, including continual pretraining, instruction tuning, and preference alignment. Building on these insights, we propose an effective training recipe centered on a novel preference data distillation method, which leverages process signals from a generative reward model. The resulting model, Llama-Fin, achieves state-of-the-art performance across a wide range of financial tasks. Our analysis also highlights how each post-training stage contributes to distinct capabilities, uncovering specific challenges and effective solutions, providing valuable insights for domain adaptation of LLMs. Project page: https://github.com/SalesforceAIResearch/FinDap

arXiv information

Authors	Zixuan Ke, Yifei Ming, Xuan-Phi Nguyen, Caiming Xiong, Shafiq Joty
Published	2025-01-09 04:26:15+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
