Exploration and Persuasion


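Agents collectively need to balance exploration and exploitation, but their incentives are skewed toward exploitation.
‘Incentivized exploration’ addresses this issue via strategic communication.
The goal is to design a communication and recommendation policy that (i) achieves a desirable balance between exploration and exploitation, and (ii) incentivizes the agents to follow recommendations.
What makes this feasible is ‘information asymmetry’: the principal collects information from many agents, so it knows more than any single agent.
Incentivized exploration combines two important problems, one from machine learning and one from theoretical economics.
On the economics side, interaction with a single agent corresponds to ‘Bayesian persuasion’, where a principal leverages information asymmetry to persuade an agent to take a particular action.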


How can self-interested agents be incentivized to explore when they prefer to exploit? Consider a population of self-interested agents that make decisions under uncertainty. They ‘explore’ to acquire new information and ‘exploit’ this information to make good decisions. Collectively they need to balance these two objectives, but their incentives are skewed toward exploitation. This is because exploration is costly to the agent who undertakes it, while its benefits are spread over many agents in the future. ‘Incentivized exploration’ addresses this issue via strategic communication. Consider a benign ‘principal’ which can communicate with the agents and make recommendations, but cannot force the agents to comply. Moreover, suppose the principal can observe the agents’ decisions and the outcomes of these decisions. The goal is to design a communication and recommendation policy which (i) achieves a desirable balance between exploration and exploitation, and (ii) incentivizes the agents to follow recommendations. What makes this feasible is ‘information asymmetry’: the principal knows more than any one agent, as it collects information from many. It is essential that the principal does not fully reveal all its knowledge to the agents. Incentivized exploration combines two important problems from machine learning and theoretical economics, respectively. First, if agents always follow recommendations, the principal faces a multi-armed bandit problem: essentially, design an algorithm that balances exploration and exploitation. Second, interaction with a single agent corresponds to ‘Bayesian persuasion’, where a principal leverages information asymmetry to convince an agent to take a particular action. We provide a brief but self-contained introduction to each problem through the lens of incentivized exploration, solving a key special case of the former as a sub-problem of the latter.
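
To make the role of information asymmetry concrete, here is a minimal sketch of a two-arm toy instance. It is an illustration under assumptions of our own, not the construction used in the paper. Assume arm 1 is ‘good’ (reward 1) with prior probability p_good and ‘bad’ (reward 0) otherwise, and that the principal has already learned which from an earlier agent. Arm 2 has a known prior mean mu2 < p_good, so an uninformed agent would pick arm 1. The principal recommends arm 2 whenever arm 1 is bad, and also with probability eps when arm 1 is good (pure exploration). The function names gain_from_arm2 and max_explore_prob are hypothetical.

```python
from fractions import Fraction


def gain_from_arm2(p_good, mu2, eps):
    """Agent's expected gain E[reward(arm 2) - reward(arm 1) | 'arm 2' recommended].

    Toy assumptions: arm 1 pays 1 if good (prob. p_good) and 0 if bad;
    arm 2 has prior mean mu2, independent of arm 1.
    """
    w_bad = 1 - p_good      # P(arm 1 bad, arm 2 recommended)
    w_good = p_good * eps   # P(arm 1 good, arm 2 recommended anyway)
    total = w_bad + w_good
    if total == 0:
        raise ValueError("arm 2 is never recommended")
    # Gain is mu2 - 0 when arm 1 is bad and mu2 - 1 when arm 1 is good.
    return (w_bad * mu2 + w_good * (mu2 - 1)) / total


def max_explore_prob(p_good, mu2):
    """Largest eps for which following 'arm 2' is still in the agent's interest.

    Solves (1 - p_good) * mu2 + p_good * eps * (mu2 - 1) >= 0 for eps,
    capped at 1.
    """
    if p_good == 0 or mu2 >= 1:
        return Fraction(1)
    bound = ((1 - p_good) * mu2) / (p_good * (1 - mu2))
    return min(Fraction(1), bound)


if __name__ == "__main__":
    p_good, mu2 = Fraction(1, 2), Fraction(2, 5)
    eps = max_explore_prob(p_good, mu2)
    print(eps)                                # 2/3
    print(gain_from_arm2(p_good, mu2, eps))   # 0
```

In this instance the agent follows an ‘arm 2’ recommendation as long as eps is at most 2/3, because the recommendation is mostly issued when arm 1 is actually bad. If the principal revealed everything it knows, an agent who learned that arm 1 is good would never try arm 2, which is why withholding some information matters.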


Author: Aleksandrs Slivkins
Published: 2024-10-22 15:13:13+00:00
