Sequential Underspecified Instrument Selection for Cause-Effect Estimation


Instrumental variable (IV) methods estimate causal effects in settings with unobserved confounding, where we cannot experiment directly on the treatment variable. Instruments are variables that affect the outcome only indirectly, via the treatment variable(s). Most IV applications focus on low-dimensional treatments and crucially require at least as many instruments as treatments. This assumption is restrictive: in the natural sciences we often seek to infer causal effects of high-dimensional treatments (e.g., the effect of gene expression or microbiota on health and disease), but can run only a few experiments with a limited number of instruments (e.g., drugs or antibiotics). In such underspecified problems, the full treatment effect is not identifiable from a single experiment, even in the linear case. We show that one can still reliably recover the projection of the treatment effect onto the instrumented subspace, and we develop techniques to consistently combine such partial estimates from different sets of instruments. We then leverage our combined estimators in an algorithm that iteratively proposes the most informative instruments at each round of experimentation, maximizing the overall information about the full causal effect.
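To make the linear underspecified case concrete, the following is a minimal simulation sketch, not the estimator developed in this work. It assumes a linear model with first-stage matrix A (instruments to treatments), a single scalar hidden confounder, and a minimum-norm two-stage estimate computed with a pseudoinverse. All names (`run_experiment`, `A1`, `A2`, `beta`) and all numeric values are hypothetical, and stacking the two experiments is only one simple way to combine partial estimates.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 4, 200_000                        # 4 treatments, samples per experiment
beta = np.array([1.0, -2.0, 0.5, 3.0])   # hypothetical true causal effect

def run_experiment(A):
    """Simulate one experiment with m < d instruments (first stage A is m x d)."""
    m = A.shape[0]
    Z = rng.normal(size=(n, m))                  # instruments
    H = rng.normal(size=(n, 1))                  # unobserved confounder
    X = Z @ A + 0.5 * H + 0.3 * rng.normal(size=(n, d))   # treatments
    Y = X @ beta + H[:, 0] + 0.5 * rng.normal(size=n)     # outcome
    A_hat = np.linalg.lstsq(Z, X, rcond=None)[0]  # first stage: Z -> X
    g_hat = np.linalg.lstsq(Z, Y, rcond=None)[0]  # reduced form: Z -> Y
    return A_hat, g_hat

# Two assumed instrument sets, each covering only a 2-dimensional subspace.
A1 = np.array([[1.0, 0.5, 0.0, 0.0],
               [0.0, 1.0, 0.5, 0.0]])
A2 = np.array([[0.0, 0.0, 1.0, 0.5],
               [0.5, 0.0, 0.0, 1.0]])

A_hat1, g_hat1 = run_experiment(A1)
A_hat2, g_hat2 = run_experiment(A2)

# In population g = A @ beta, which is underdetermined when m < d.
# The minimum-norm solution lies in the row space of A.
beta_hat1 = np.linalg.pinv(A_hat1) @ g_hat1

# Orthogonal projector onto the instrumented subspace (row space of A1).
P1 = np.linalg.pinv(A1) @ A1

# Combine both experiments by stacking their moment equations.
A_stack = np.vstack([A_hat1, A_hat2])
g_stack = np.concatenate([g_hat1, g_hat2])
beta_comb = np.linalg.pinv(A_stack) @ g_stack

print("single experiment :", np.round(beta_hat1, 2))
print("projection P1 beta:", np.round(P1 @ beta, 2))
print("combined estimate :", np.round(beta_comb, 2))
```

In this sketch the single-experiment estimate should track `P1 @ beta` rather than `beta` itself, while the stacked estimate approaches `beta` because the two assumed instrument sets together span all four treatment directions.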


