Fast and Private Inference of Deep Neural Networks by Co-designing Activation Functions

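Summary

Machine Learning as a Service (MLaaS) is an increasingly popular design in which a company with abundant computing resources trains a deep neural network and offers query access for tasks such as image classification.
The challenge with this design is that MLaaS requires clients to reveal their potentially sensitive queries to the company hosting the model.
Multi-party computation (MPC) protects the client's data by allowing encrypted inference.
However, current approaches suffer from prohibitively long inference times.
The inference-time bottleneck in MPC is the evaluation of non-linear layers such as ReLU activation functions.
Motivated by the success of previous work co-designing machine learning and MPC, the authors develop an activation function co-design.
They replace every ReLU with a polynomial approximation and evaluate it with single-round MPC protocols, which yields state-of-the-art inference times in wide-area networks.
Furthermore, to address the accuracy problems previously encountered with polynomial activations, they propose a novel training algorithm that gives accuracy competitive with plaintext models.
Their evaluation shows speedups of between 3× and 110× in inference time on large models with up to 23 million parameters, while maintaining competitive inference accuracy.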

Abstract (original)

Machine Learning as a Service (MLaaS) is an increasingly popular design where a company with abundant computing resources trains a deep neural network and offers query access for tasks like image classification. The challenge with this design is that MLaaS requires the client to reveal their potentially sensitive queries to the company hosting the model. Multi-party computation (MPC) protects the client’s data by allowing encrypted inferences. However, current approaches suffer from prohibitively large inference times. The inference time bottleneck in MPC is the evaluation of non-linear layers such as ReLU activation functions. Motivated by the success of previous work co-designing machine learning and MPC, we develop an activation function co-design. We replace all ReLUs with a polynomial approximation and evaluate them with single-round MPC protocols, which give state-of-the-art inference times in wide-area networks. Furthermore, to address the accuracy issues previously encountered with polynomial activations, we propose a novel training algorithm that gives accuracy competitive with plaintext models. Our evaluation shows between $3$ and $110\times$ speedups in inference time on large models with up to $23$ million parameters while maintaining competitive inference accuracy.
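The abstract does not state which polynomial replaces ReLU or how the training algorithm works, so the following is only a minimal sketch of the general idea, not the authors' method. It fits a low-degree polynomial to ReLU by ordinary least squares and evaluates it with Horner's rule, which uses only additions and multiplications and no comparison. The degree (2), the fitting interval ([-1, 1]), and the helper names `fit_polynomial` and `poly_eval` are assumptions chosen for illustration.

```python
def relu(x):
    return max(0.0, x)


def fit_polynomial(xs, ys, degree):
    """Least-squares fit; returns coefficients lowest degree first."""
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs


def poly_eval(coeffs, x):
    """Horner's rule: additions and multiplications only, no comparison."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc


# Assumed setup: degree-2 fit on 201 evenly spaced points in [-1, 1].
xs = [-1.0 + 2.0 * i / 200 for i in range(201)]
ys = [relu(x) for x in xs]
coeffs = fit_polynomial(xs, ys, degree=2)
max_err = max(abs(poly_eval(coeffs, x) - relu(x)) for x in xs)
print("coefficients:", coeffs)
print("max abs error on the grid:", max_err)
```

The approximation error grows quickly outside the fitting interval, which is one plausible reason a network with polynomial activations needs a dedicated training procedure to stay accurate, as the abstract indicates.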

arXiv information

Authors	Abdulrahman Diaa, Lucas Fenaux, Thomas Humphries, Marian Dietz, Faezeh Ebrahimianghazani, Bailey Kacsmar, Xinda Li, Nils Lukas, Rasoul Akhavan Mahdavi, Simon Oya, Ehsan Amjadian, Florian Kerschbaum
Published	2024-04-16 16:48:07+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
