Navigating Tomorrow: Reliably Assessing Large Language Models Performance on Future Event Prediction

Abstract

Predicting future events is an important activity with applications across multiple fields and domains. For example, the capacity to foresee stock market trends, natural disasters, business developments, or political events can facilitate early preventive measures and uncover new opportunities. Diverse computational methods for attempting future predictions have been proposed, including predictive analysis, time series forecasting, and simulations. This study evaluates the performance of several large language models (LLMs) in supporting future prediction tasks, an under-explored domain. We assess the models across three scenarios: Affirmative vs. Likelihood questioning, Reasoning, and Counterfactual analysis. For this, we create a dataset by finding and categorizing news articles based on entity type and popularity. We gather news articles from before and after the LLMs' training cutoff date in order to thoroughly test and compare model performance. Our research highlights LLMs' potential and limitations in predictive modeling, providing a foundation for future improvements.

arXiv information

Authors	Petraq Nako, Adam Jatowt
Published	2025-01-10 12:44:46+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
