Online Learning under Budget and ROI Constraints via Weak Adaptivity

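Abstract

We study online learning problems in which a decision maker must make a sequence of costly decisions, aiming to maximize expected reward while respecting budget and return-on-investment (ROI) constraints. Existing primal-dual algorithms designed for constrained online learning under adversarial inputs rely on two basic assumptions. First, the decision maker must know in advance the value of parameters tied to the problem's degree of strict feasibility (the Slater parameters). Second, a strictly feasible solution to the offline optimization problem must exist at every round. Both requirements are unrealistic for practical applications such as bidding in online ad auctions. This paper shows that these assumptions can be avoided by equipping the standard primal-dual template with weakly adaptive regret minimizers. The result is a “dual-balancing” framework that guarantees the dual variables remain sufficiently small even without knowledge of the Slater parameter. We prove the first best-of-both-worlds no-regret guarantees that hold without the two assumptions above, under both stochastic and adversarial inputs. Finally, we show how to instantiate the framework to bid optimally in a range of practically relevant mechanisms, such as first-price and second-price auctions.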

Abstract (original)

We study online learning problems in which a decision maker has to make a sequence of costly decisions, with the goal of maximizing their expected reward while adhering to budget and return-on-investment (ROI) constraints. Existing primal-dual algorithms designed for constrained online learning problems under adversarial inputs rely on two fundamental assumptions. First, the decision maker must know beforehand the value of parameters related to the degree of strict feasibility of the problem (i.e. Slater parameters). Second, a strictly feasible solution to the offline optimization problem must exist at each round. Both requirements are unrealistic for practical applications such as bidding in online ad auctions. In this paper, we show how such assumptions can be circumvented by endowing standard primal-dual templates with weakly adaptive regret minimizers. This results in a “dual-balancing” framework which ensures that dual variables stay sufficiently small, even in the absence of knowledge about Slater’s parameter. We prove the first best-of-both-worlds no-regret guarantees which hold in absence of the two aforementioned assumptions, under stochastic and adversarial inputs. Finally, we show how to instantiate the framework to optimally bid in various mechanisms of practical relevance, such as first- and second-price auctions.
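To make the primal-dual template mentioned in the abstract more concrete, the following is a minimal sketch of a generic Lagrangian primal-dual loop for bidding in repeated second-price auctions. It is not the authors' algorithm: the function names (`second_price_round`, `primal_dual_bidding`), the Hedge primal update with full feedback, the fixed step size, and the ROI constraint written as "total cost should not exceed total value" are all assumptions made for illustration. In particular, the simple updates used here are stand-ins for the weakly adaptive regret minimizers the paper relies on.

```python
import math
import random


def second_price_round(value, competing_bid):
    """Hypothetical round: we win if our bid is at least the competing bid."""
    def reward(bid):
        return value if bid >= competing_bid else 0.0

    def cost(bid):
        # In a second-price auction the winner pays the competing bid.
        return competing_bid if bid >= competing_bid else 0.0

    return reward, cost


def primal_dual_bidding(rounds, actions, budget, step=0.05, seed=0):
    """Toy Lagrangian primal-dual loop (illustrative only).

    rounds  : list of (reward_fn, cost_fn) pairs, one per round
    actions : finite list of candidate bids
    budget  : total spend allowed over all rounds
    """
    rng = random.Random(seed)
    rho = budget / len(rounds)      # per-round budget target
    lam, mu = 0.0, 0.0              # dual variables: budget, ROI
    log_w = [0.0] * len(actions)    # Hedge log-weights for the primal player
    earned, spent = 0.0, 0.0

    for reward_fn, cost_fn in rounds:
        # Primal step: sample a bid from the current Hedge distribution.
        top = max(log_w)
        weights = [math.exp(w - top) for w in log_w]
        bid = rng.choices(actions, weights=weights)[0]
        r, c = reward_fn(bid), cost_fn(bid)
        if spent + c > budget:
            break                   # stop once the budget would be exceeded
        earned += r
        spent += c

        # Hedge update: score every candidate bid by its Lagrangian payoff.
        for i, b in enumerate(actions):
            rb, cb = reward_fn(b), cost_fn(b)
            log_w[i] += step * (rb - lam * (cb - rho) - mu * (cb - rb))

        # Dual step: gradient ascent on constraint violations, projected onto >= 0.
        lam = max(0.0, lam + step * (c - rho))
        mu = max(0.0, mu + step * (c - r))

    return earned, spent, lam, mu


if __name__ == "__main__":
    data_rng = random.Random(1)
    rounds = [second_price_round(data_rng.random(), data_rng.random())
              for _ in range(200)]
    bids = [i / 10 for i in range(11)]
    print(primal_dual_bidding(rounds, bids, budget=20.0))
```

The dual variables grow when the sampled bid overspends relative to the per-round budget or pays more than it earns, which in turn lowers the Lagrangian payoff of expensive bids in later rounds. The sketch enforces the budget as a hard stop; it makes no claim about the regret guarantees discussed in the abstract.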

arXiv information

Authors	Matteo Castiglioni, Andrea Celli, Christian Kroer
Published	2024-03-02 17:26:22+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
