Federated Linear Contextual Bandits with User-level Differential Privacy

Summary

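This paper studies federated linear contextual bandits under the notion of user-level differential privacy (DP).
We first introduce a unified federated bandits framework that can accommodate various definitions of DP in the sequential decision-making setting.
We then formally introduce user-level central DP (CDP) and local DP (LDP) in the federated bandits framework and investigate the fundamental trade-offs between learning regret and the corresponding DP guarantees in a federated linear contextual bandits model.
For CDP, we propose a federated algorithm called $\texttt{ROBIN}$ and show that it is near-optimal in the number of clients $M$ and the privacy budget $\varepsilon$ by deriving nearly matching upper and lower regret bounds when user-level DP is satisfied.
For LDP, we obtain several lower bounds, which indicate that learning under user-level $(\varepsilon,\delta)$-LDP must suffer a regret blow-up factor of at least $\min\{1/\varepsilon,M\}$ or $\min\{1/\sqrt{\varepsilon},\sqrt{M}\}$, depending on the conditions.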

Abstract (original)

This paper studies federated linear contextual bandits under the notion of user-level differential privacy (DP). We first introduce a unified federated bandits framework that can accommodate various definitions of DP in the sequential decision-making setting. We then formally introduce user-level central DP (CDP) and local DP (LDP) in the federated bandits framework, and investigate the fundamental trade-offs between the learning regrets and the corresponding DP guarantees in a federated linear contextual bandits model. For CDP, we propose a federated algorithm termed as $\texttt{ROBIN}$ and show that it is near-optimal in terms of the number of clients $M$ and the privacy budget $\varepsilon$ by deriving nearly-matching upper and lower regret bounds when user-level DP is satisfied. For LDP, we obtain several lower bounds, indicating that learning under user-level $(\varepsilon,\delta)$-LDP must suffer a regret blow-up factor at least $\min\{1/\varepsilon,M\}$ or $\min\{1/\sqrt{\varepsilon},\sqrt{M}\}$ under different conditions.
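To make the LDP lower bounds concrete, the following minimal sketch evaluates the two blow-up factors quoted in the abstract for a given privacy budget and number of clients. The function name and arguments are hypothetical and chosen only for illustration. The abstract does not say which condition yields which factor, and constants and logarithmic terms are ignored, so the values indicate scaling only.

```python
import math
from typing import Tuple


def ldp_regret_blowup_factors(epsilon: float, num_clients: int) -> Tuple[float, float]:
    """Evaluate the two blow-up factors quoted in the abstract.

    Hypothetical helper, not taken from the paper: it returns
    min{1/eps, M} and min{1/sqrt(eps), sqrt(M)} with constants and
    log factors dropped.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if num_clients < 1:
        raise ValueError("num_clients must be at least 1")

    # First form: min{1/eps, M}
    linear_factor = min(1.0 / epsilon, float(num_clients))
    # Second form: min{1/sqrt(eps), sqrt(M)}
    sqrt_factor = min(1.0 / math.sqrt(epsilon), math.sqrt(num_clients))
    return linear_factor, sqrt_factor


if __name__ == "__main__":
    for eps, m in [(0.25, 100), (0.001, 16)]:
        linear, root = ldp_regret_blowup_factors(eps, m)
        print(f"eps={eps}, M={m}: min(1/eps, M)={linear}, "
              f"min(1/sqrt(eps), sqrt(M))={root}")
```

In both forms the factor grows as $\varepsilon$ shrinks but saturates once $1/\varepsilon$ exceeds $M$ (respectively once $1/\sqrt{\varepsilon}$ exceeds $\sqrt{M}$), so the number of clients caps the blow-up.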

arXiv information

Authors	Ruiquan Huang, Huanyu Zhang, Luca Melis, Milan Shen, Meisam Hajzinia, Jing Yang
Published	2023-06-09 11:32:04+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
