What Should We Engineer in Prompts? Training Humans in Requirement-Driven LLM Use

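Summary

Prompting LLMs for complex tasks (for example, building a trip advisor chatbot) requires humans to clearly articulate customized requirements (for example, "start the response with a TL;DR").
However, existing prompt engineering instruction often lacks focused training on requirement articulation and instead tends to emphasize increasingly automatable strategies (for example, tricks such as role-play and "think step-by-step").
To address this gap, we introduce Requirement-Oriented Prompt Engineering (ROPE), a paradigm that focuses human attention on generating clear, complete requirements during prompting.
We implement ROPE through an assessment and training suite that provides deliberate practice with LLM-generated feedback.
In a randomized controlled experiment with 30 novices, ROPE significantly outperforms conventional prompt engineering training (20% vs. 1% gains), a gap that automatic prompt optimization cannot close.
Furthermore, we demonstrate a direct correlation between the quality of input requirements and LLM outputs.
Our work paves the way to empower more end-users to build complex LLM applications.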

Abstract (original)

Prompting LLMs for complex tasks (e.g., building a trip advisor chatbot) needs humans to clearly articulate customized requirements (e.g., ‘start the response with a tl;dr’). However, existing prompt engineering instructions often lack focused training on requirement articulation and instead tend to emphasize increasingly automatable strategies (e.g., tricks like adding role-plays and ‘think step-by-step’). To address the gap, we introduce Requirement-Oriented Prompt Engineering (ROPE), a paradigm that focuses human attention on generating clear, complete requirements during prompting. We implement ROPE through an assessment and training suite that provides deliberate practice with LLM-generated feedback. In a randomized controlled experiment with 30 novices, ROPE significantly outperforms conventional prompt engineering training (20% vs. 1% gains), a gap that automatic prompt optimization cannot close. Furthermore, we demonstrate a direct correlation between the quality of input requirements and LLM outputs. Our work paves the way to empower more end-users to build complex LLM applications.

arXiv information

Authors	Qianou Ma, Weirui Peng, Chenyang Yang, Hua Shen, Kenneth Koedinger, Tongshuang Wu
Published	2025-04-28 16:07:05+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
