Predictions as Surrogates: Revisiting Surrogate Outcomes in the Age of AI

Abstract

We establish a formal connection between the decades-old surrogate outcome model in biostatistics and economics and the emerging field of prediction-powered inference (PPI). The connection treats predictions from pre-trained models, prevalent in the age of AI, as cost-effective surrogates for expensive outcomes. Building on the surrogate outcomes literature, we develop recalibrated prediction-powered inference, a more efficient approach to statistical inference than existing PPI proposals. Our method departs from the existing proposals by using flexible machine learning techniques to learn the optimal “imputed loss” through a step we call recalibration. Importantly, the method always improves upon the estimator that relies solely on the data with available true outcomes, even when the optimal imputed loss is estimated imperfectly, and it achieves the smallest asymptotic variance among PPI estimators if the estimate is consistent. Computationally, our optimization objective is convex whenever the loss function that defines the target parameter is convex. We further analyze the benefits of recalibration, both theoretically and numerically, in several common scenarios where machine learning predictions systematically deviate from the outcome of interest. We demonstrate significant gains in effective sample size over existing PPI proposals via three applications leveraging state-of-the-art machine learning/AI models.
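
To make the idea concrete, the following is a minimal sketch for the simplest target, a population mean, and is not the authors' implementation. It compares three estimators: one using only the labeled outcomes, a plain PPI-style estimator that debiases the raw predictions, and a recalibrated variant. Everything here is an assumption made for illustration: the function names, the linear recalibration map fitted with two-fold cross-fitting (standing in for the flexible machine learning step the abstract describes), and the simulated data in which predictions deviate systematically from the outcome (f is roughly 2y + 1).

```python
import numpy as np


def labeled_only_mean(y_lab):
    """Baseline: uses only the observations with true outcomes."""
    return float(np.mean(y_lab))


def ppi_mean(y_lab, f_lab, f_unlab):
    """PPI-style estimate: mean prediction on unlabeled data,
    debiased by the mean prediction error on labeled data."""
    return float(np.mean(f_unlab) + np.mean(y_lab - f_lab))


def recalibrated_ppi_mean(y_lab, f_lab, f_unlab, n_folds=2, seed=0):
    """Hypothetical recalibrated variant: learn a map g from predictions to
    outcomes (a linear fit here), cross-fitted so that g is never evaluated
    on the labeled points it was trained on, then debias with g instead of f."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y_lab)), n_folds)
    estimates = []
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # np.polyfit returns coefficients from highest degree to lowest.
        slope, intercept = np.polyfit(f_lab[train], y_lab[train], deg=1)
        g_lab = intercept + slope * f_lab[test]
        g_unlab = intercept + slope * f_unlab
        estimates.append(np.mean(g_unlab) + np.mean(y_lab[test] - g_lab))
    return float(np.mean(estimates))


def simulate(reps=200, n_lab=200, n_unlab=20000, seed=0):
    """Monte Carlo variance of each estimator on synthetic data where the
    predictions are informative but systematically miscalibrated."""
    rng = np.random.default_rng(seed)
    draws = {"labeled_only": [], "ppi": [], "recalibrated": []}
    for _ in range(reps):
        y_lab = rng.normal(1.0, 1.0, n_lab)
        y_unlab = rng.normal(1.0, 1.0, n_unlab)  # never shown to the estimators
        f_lab = 2.0 * y_lab + 1.0 + rng.normal(0.0, 0.3, n_lab)
        f_unlab = 2.0 * y_unlab + 1.0 + rng.normal(0.0, 0.3, n_unlab)
        draws["labeled_only"].append(labeled_only_mean(y_lab))
        draws["ppi"].append(ppi_mean(y_lab, f_lab, f_unlab))
        draws["recalibrated"].append(recalibrated_ppi_mean(y_lab, f_lab, f_unlab))
    return {name: float(np.var(vals)) for name, vals in draws.items()}
```

Calling simulate() returns the empirical variance of each estimator across repetitions. In this synthetic setup the raw predictions are on the wrong scale, so debiasing them directly gains little over the labeled-only mean, while the recalibrated version can exploit the strong correlation between prediction and outcome. The linear map is only a stand-in; a general loss and a flexible learner would be needed to reflect the setting the abstract describes.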

arXiv information

Authors	Wenlong Ji, Lihua Lei, Tijana Zrnic
Published	2025-01-16 18:30:33+00:00
