Multi-Armed Bandits Meet Large Language Models

Abstract

Bandit algorithms and Large Language Models (LLMs) have emerged as powerful tools in artificial intelligence, each addressing distinct yet complementary challenges in decision-making and natural language processing. This survey explores the synergistic potential between these two fields, highlighting how bandit algorithms can enhance the performance of LLMs and how LLMs, in turn, can provide novel insights for improving bandit-based decision-making. We first examine the role of bandit algorithms in optimizing LLM fine-tuning, prompt engineering, and adaptive response generation, focusing on their ability to balance exploration and exploitation in large-scale learning tasks. Subsequently, we explore how LLMs can augment bandit algorithms through advanced contextual understanding, dynamic adaptation, and improved policy selection using natural language reasoning. By providing a comprehensive review of existing research and identifying key challenges and opportunities, this survey aims to bridge the gap between bandit algorithms and LLMs, paving the way for innovative applications and interdisciplinary research in AI.

arXiv information

Authors	Djallel Bouneffouf, Raphael Feraud
Published	2025-05-19 16:57:57+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
