Graph-Dependent Regret Bounds in Multi-Armed Bandits with Interference

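Summary

Multi-armed bandits (MABs) are frequently used for online sequential decision-making in applications ranging from recommending personalized content to assigning treatments to patients.
A recurring challenge in applying the classic MAB framework to real-world settings is that it ignores \textit{interference}, where a unit's outcome depends on the treatments assigned to other units.
Interference leads to an exponentially growing action space, which makes standard approaches computationally impractical.
The paper studies the MAB problem under network interference, where each unit's reward depends on its own treatment and on the treatments of its neighbors in a given interference graph.
The authors propose a novel algorithm that uses the local structure of the interference graph to minimize regret.
They derive a graph-dependent upper bound on cumulative regret and show that it improves over prior work.
They also provide the first lower bounds for bandits with arbitrary network interference, where each bound involves a distinct structural property of the interference graph.
These bounds show that when the graph is either dense or sparse, the algorithm is nearly optimal, with upper and lower bounds that match up to logarithmic factors.
Numerical experiments complement the theoretical results and show that the approach outperforms baseline methods.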

Abstract (original)

Multi-armed bandits (MABs) are frequently used for online sequential decision-making in applications ranging from recommending personalized content to assigning treatments to patients. A recurring challenge in the applicability of the classic MAB framework to real-world settings is ignoring \textit{interference}, where a unit’s outcome depends on treatment assigned to others. This leads to an exponentially growing action space, rendering standard approaches computationally impractical. We study the MAB problem under network interference, where each unit’s reward depends on its own treatment and those of its neighbors in a given interference graph. We propose a novel algorithm that uses the local structure of the interference graph to minimize regret. We derive a graph-dependent upper bound on cumulative regret showing that it improves over prior work. Additionally, we provide the first lower bounds for bandits with arbitrary network interference, where each bound involves a distinct structural property of the interference graph. These bounds demonstrate that when the graph is either dense or sparse, our algorithm is nearly optimal, with upper and lower bounds that match up to logarithmic factors. We complement our theoretical results with numerical experiments, which show that our approach outperforms baseline methods.
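To make the "exponentially growing action space" concrete, the following minimal sketch counts joint actions against per-unit local configurations. It is only a counting illustration of the setting described in the abstract, not the paper's algorithm. The function names, the adjacency-list representation, and the 5-unit path graph are all hypothetical choices made for this example.

```python
from typing import Dict, List


def joint_action_count(num_units: int, num_arms: int) -> int:
    """Number of joint treatment assignments when every unit picks one of num_arms arms."""
    return num_arms ** num_units


def local_configuration_counts(
    neighbors: Dict[int, List[int]], num_arms: int
) -> Dict[int, int]:
    """For each unit, count the assignments to the unit and its neighbors.

    Under network interference as described in the abstract, a unit's reward
    depends only on these local assignments, so this is the number of distinct
    reward-relevant configurations per unit.
    """
    return {unit: num_arms ** (len(nbrs) + 1) for unit, nbrs in neighbors.items()}


# Hypothetical interference graph: a path 0 - 1 - 2 - 3 - 4 with two arms per unit.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

print(joint_action_count(len(path_graph), 2))        # 32
print(local_configuration_counts(path_graph, 2))     # {0: 4, 1: 8, 2: 8, 3: 8, 4: 4}
```

With more units, the joint count grows as the number of arms raised to the number of units, while each local count depends only on that unit's degree. This gap is why local graph structure can help in such a setting.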

arXiv information

Authors	Fateme Jamshidi, Mohammad Shahverdikondori, Negar Kiyavash
Published	2025-03-10 17:25:24+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
