Posterior sampling with the spike-and-slab prior [MB88], a popular multimodal distribution used to model uncertainty in variable selection, is considered the theoretical gold standard method for Bayesian sparse linear regression [CPS09, Roc18]. However, designing provable algorithms for performing this sampling task is notoriously challenging. Existing posterior samplers for Bayesian sparse variable selection tasks either require strong assumptions about the signal-to-noise ratio (SNR) [YWJ16], only work when the measurement count grows at least linearly in the dimension [MW24], or rely on heuristic approximations to the posterior. We give the first provable algorithms for spike-and-slab posterior sampling that apply for any SNR, and use a measurement count sublinear in the problem dimension. Concretely, assume we are given a measurement matrix $\mathbf{X} \in \mathbb{R}^{n \times d}$ and noisy observations $\mathbf{y} = \mathbf{X}\mathbf{\theta}^\star + \mathbf{\xi}$ of a signal $\mathbf{\theta}^\star$ drawn from a spike-and-slab prior $\pi$ with a Gaussian diffuse density and expected sparsity $k$, where $\mathbf{\xi} \sim \mathcal{N}(\mathbf{0}_n, \sigma^2\mathbf{I}_n)$. We give a polynomial-time high-accuracy sampler for the posterior $\pi(\cdot \mid \mathbf{X}, \mathbf{y})$, for any SNR $\sigma^{-1} > 0$, as long as $n \geq k^3 \cdot \text{polylog}(d)$ and $\mathbf{X}$ is drawn from a matrix ensemble satisfying the restricted isometry property. We further give a sampler that runs in near-linear time $\approx nd$ in the same setting, as long as $n \geq k^5 \cdot \text{polylog}(d)$. To demonstrate the flexibility of our framework, we extend our result to spike-and-slab posterior sampling with Laplace diffuse densities, achieving similar guarantees when $\sigma = O(\frac{1}{k})$ is bounded.
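To make the observation model in the abstract concrete, the following is a minimal sketch that draws one synthetic instance $(\mathbf{X}, \mathbf{y}, \mathbf{\theta}^\star)$. It only generates data and does not implement the paper's posterior sampler. Several choices are assumptions made for illustration and are not stated in the abstract: each coordinate is included independently with probability $k/d$ (which gives expected sparsity $k$), the Gaussian slab has unit variance, and $\mathbf{X}$ has i.i.d. Gaussian entries scaled by $1/\sqrt{n}$ as one example of an ensemble commonly associated with the restricted isometry property. The function name `sample_spike_and_slab_instance` is hypothetical.

```python
import numpy as np


def sample_spike_and_slab_instance(n, d, k, sigma, seed=0):
    """Draw (X, y, theta_star) from y = X @ theta_star + xi.

    Assumptions (not from the abstract): inclusion probability k/d per
    coordinate, unit-variance Gaussian slab, and an i.i.d. Gaussian
    measurement matrix scaled by 1/sqrt(n).
    """
    rng = np.random.default_rng(seed)

    # Spike-and-slab draw: a coordinate is exactly zero (spike) unless it is
    # selected, in which case it takes a Gaussian value (slab).
    support = rng.random(d) < k / d
    theta_star = np.where(support, rng.standard_normal(d), 0.0)

    # Measurement matrix in R^{n x d}; the scaling keeps columns near unit norm.
    X = rng.standard_normal((n, d)) / np.sqrt(n)

    # Noise xi ~ N(0_n, sigma^2 I_n).
    xi = sigma * rng.standard_normal(n)

    y = X @ theta_star + xi
    return X, y, theta_star
```

For example, `sample_spike_and_slab_instance(n=200, d=1000, k=5, sigma=0.1)` produces an instance in the sublinear-measurement regime ($n < d$) that the abstract targets; the specific sizes are arbitrary.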
Authors | Syamantak Kumar, Purnamrita Sarkar, Kevin Tian, Yusong Zhu |
Published | 2025-03-04 17:16:07+00:00 |
arXiv page | arxiv_id(pdf) |