A Survey of Reinforcement Learning from Human Feedback

Abstract

Reinforcement learning from human feedback (RLHF) is a variant of reinforcement learning (RL) that learns from human feedback instead of relying on an engineered reward function. Building on prior work on the related setting of preference-based reinforcement learning (PbRL), it stands at the intersection of artificial intelligence and human-computer interaction. This positioning offers a promising avenue to enhance the performance and adaptability of intelligent systems while also improving the alignment of their objectives with human values. The training of large language models (LLMs) has impressively demonstrated this potential in recent years, where RLHF played a decisive role in directing the model’s capabilities toward human objectives. This article provides a comprehensive overview of the fundamentals of RLHF, exploring the intricate dynamics between RL agents and human input. While recent focus has been on RLHF for LLMs, our survey adopts a broader perspective, examining the diverse applications and wide-ranging impact of the technique. We delve into the core principles that underpin RLHF, shedding light on the symbiotic relationship between algorithms and human feedback, and discuss the main research trends in the field. By synthesizing the current landscape of RLHF research, this article aims to provide researchers as well as practitioners with a comprehensive understanding of this rapidly growing field of research.

arXiv information

Authors	Timo Kaufmann, Paul Weng, Viktor Bengs, Eyke Hüllermeier
Published	2024-04-30 17:59:01+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
