Efficient Parallelization of an Ubiquitous Sequential Computation

Summary

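Given $t = (1, 2, \dots, n)$, $a_t \in \mathbb{R}^n$, $b_t \in \mathbb{R}^n$, and an initial value $x_0 \in \mathbb{R}$, we find a succinct expression for computing the sequence $x_t = a_t x_{t-1} + b_t$ in parallel using two prefix sums.
On $n$ parallel processors, computing $n$ elements takes $\mathcal{O}(\log n)$ time and $\mathcal{O}(n)$ space.
Sequences of this form appear throughout science and engineering, so efficient parallelization benefits a vast number of applications.
We implement the expression in software, test it on parallel hardware, and confirm that it runs faster than sequential computation by a factor of $\frac{n}{\log n}$.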
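
As a rough illustration of how a recurrence of this form can be reduced to two prefix sums, the sketch below uses the identity $x_t = A_t \left( x_0 + \sum_{i \le t} b_i / A_i \right)$ with $A_t = \prod_{j \le t} a_j$, which follows from unrolling $x_t = a_t x_{t-1} + b_t$. It assumes every $a_t > 0$ so that real logarithms apply, and it is not necessarily the exact expression derived in the paper. The function names are hypothetical, and `itertools.accumulate` runs sequentially here; it only marks the two places where a parallel scan would go.

```python
import math
from itertools import accumulate


def recurrence_sequential(a, b, x0):
    """Reference implementation: evaluate x_t = a_t * x_{t-1} + b_t one step at a time."""
    xs, x = [], x0
    for a_t, b_t in zip(a, b):
        x = a_t * x + b_t
        xs.append(x)
    return xs


def recurrence_two_prefix_sums(a, b, x0):
    """Same sequence via two prefix sums (hypothetical helper; assumes all a_t > 0)."""
    # Prefix sum 1: log A_t = sum_{j <= t} log a_j, the log of the cumulative product.
    log_A = list(accumulate(math.log(a_t) for a_t in a))
    # Prefix sum 2: S_t = sum_{i <= t} b_i / A_i, with each b_i rescaled by 1 / A_i.
    S = list(accumulate(b_t * math.exp(-la) for b_t, la in zip(b, log_A)))
    # Element-wise combination: x_t = A_t * (x_0 + S_t).
    return [math.exp(la) * (x0 + s) for la, s in zip(log_A, S)]
```

Calling both functions on the same inputs, for example `a = [0.5, 2.0, 1.5, 0.25]`, `b = [1.0, -3.0, 0.5, 2.0]`, `x0 = 1.0`, should give results that agree up to floating-point rounding. Dividing by $A_i$ can overflow or underflow for long sequences, so this form is meant for reading rather than production use.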

Abstract (original)

We find a succinct expression for computing the sequence $x_t = a_t x_{t-1} + b_t$ in parallel with two prefix sums, given $t = (1, 2, \dots, n)$, $a_t \in \mathbb{R}^n$, $b_t \in \mathbb{R}^n$, and initial value $x_0 \in \mathbb{R}$. On $n$ parallel processors, the computation of $n$ elements incurs $\mathcal{O}(\log n)$ time and $\mathcal{O}(n)$ space. Sequences of this form are ubiquitous in science and engineering, making efficient parallelization useful for a vast number of applications. We implement our expression in software, test it on parallel hardware, and verify that it executes faster than sequential computation by a factor of $\frac{n}{\log n}$.
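
The $\mathcal{O}(\log n)$ time bound is plausible because a prefix sum over $n$ elements can be computed in $\lceil \log_2 n \rceil$ rounds when each round updates all elements at once. The sketch below simulates one well-known scheme of this kind (a Hillis-Steele style inclusive scan) in plain Python and counts the rounds. The function name is hypothetical, the list comprehension stands in for a step that $n$ processors would execute simultaneously, and the paper's own implementation may use a different scan.

```python
def inclusive_scan_rounds(values):
    """Simulate a Hillis-Steele style inclusive prefix sum.

    Returns the prefix sums and the number of rounds used. Each round is
    written as a list comprehension; on parallel hardware every index i
    would be updated at the same time within a round.
    """
    out = list(values)
    n = len(out)
    rounds = 0
    shift = 1
    while shift < n:
        # Each element adds the partial sum located `shift` positions to its left.
        out = [out[i] + out[i - shift] if i >= shift else out[i] for i in range(n)]
        shift *= 2
        rounds += 1
    return out, rounds
```

For eight inputs the loop runs three rounds, and for five inputs it also runs three, matching $\lceil \log_2 n \rceil$. Building a new list in every round keeps the extra storage proportional to $n$.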

arXiv information

Author	Franz A. Heinsen
Published	2023-11-15 14:53:19+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
