Ethics and Persuasion in Reinforcement Learning from Human Feedback: A Procedural Rhetorical Approach

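Abstract

Since 2022, versions of generative AI chatbots such as ChatGPT and Claude have been trained with a specialized technique called Reinforcement Learning from Human Feedback (RLHF), which fine-tunes language model output using feedback from human annotators.
As a result, the integration of RLHF has greatly enhanced the outputs of these large language models (LLMs), making their interactions and responses appear more "human-like" than those of earlier versions trained with supervised learning alone.
The growing convergence of human-written and machine-written text has potentially serious ethical, sociotechnical, and pedagogical implications for transparency, trust, bias, and interpersonal relations.
To highlight these implications, this paper presents a rhetorical analysis of some of the central procedures and processes currently being reshaped by RLHF-enhanced generative AI chatbots: upholding language conventions, information-seeking practices, and expectations for social relationships.
Rhetorical investigations of generative AI and LLMs have so far focused mainly on the persuasiveness of the generated content.
Using Ian Bogost's concept of procedural rhetoric, this paper shifts the site of rhetorical investigation from content analysis to the underlying mechanisms of persuasion built into RLHF-enhanced LLMs.
In doing so, this theoretical investigation opens a new direction for further inquiry in AI ethics, one that considers how procedures rerouted through AI-driven technologies might reinforce hegemonic language use, perpetuate biases, decontextualize learning, and encroach upon human relationships.
It will therefore be of interest to educators, researchers, scholars, and the growing number of users of generative AI chatbots.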

Abstract (original)

Since 2022, versions of generative AI chatbots such as ChatGPT and Claude have been trained using a specialized technique called Reinforcement Learning from Human Feedback (RLHF) to fine-tune language model output using feedback from human annotators. As a result, the integration of RLHF has greatly enhanced the outputs of these large language models (LLMs) and made the interactions and responses appear more ‘human-like’ than those of previous versions using only supervised learning. The increasing convergence of human and machine-written text has potentially severe ethical, sociotechnical, and pedagogical implications relating to transparency, trust, bias, and interpersonal relations. To highlight these implications, this paper presents a rhetorical analysis of some of the central procedures and processes currently being reshaped by RLHF-enhanced generative AI chatbots: upholding language conventions, information seeking practices, and expectations for social relationships. Rhetorical investigations of generative AI and LLMs have, to this point, focused largely on the persuasiveness of the content generated. Using Ian Bogost’s concept of procedural rhetoric, this paper shifts the site of rhetorical investigation from content analysis to the underlying mechanisms of persuasion built into RLHF-enhanced LLMs. In doing so, this theoretical investigation opens a new direction for further inquiry in AI ethics that considers how procedures rerouted through AI-driven technologies might reinforce hegemonic language use, perpetuate biases, decontextualize learning, and encroach upon human relationships. It will therefore be of interest to educators, researchers, scholars, and the growing number of users of generative AI chatbots.

arXiv information

Authors	Shannon Lodoen, Alexi Orchard
Published	2025-05-14 17:29:19+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
