Multi-party Goal Tracking with LLMs: Comparing Pre-training, Fine-tuning, and Prompt Engineering

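Summary

This paper evaluates how well current large language models (LLMs) can capture task-oriented multi-party conversations (MPCs).
The authors recorded and transcribed 29 MPCs between patients, their companions, and a social robot in a hospital.
They then annotated this corpus for multi-party goal tracking and intent-slot recognition.
In MPCs, people share goals, answer each other's goals, and provide other people's goals, none of which occurs in dyadic interactions.
To understand user goals in MPCs, they compared three methods in zero-shot and few-shot settings: fine-tuning T5, creating pre-training tasks to train DialogLM using LED, and applying prompt engineering techniques with GPT-3.5-turbo.
The aim was to determine which approach can complete this novel task with limited data.
GPT-3.5-turbo significantly outperformed the other two methods in the few-shot setting.
The 'reasoning' style prompt, given 7% of the corpus as example annotated conversations, was the best-performing method.
It correctly annotated 62.32% of the goal-tracking MPCs and 69.57% of the intent-slot recognition MPCs.
A 'story' style prompt increased model hallucination, which could be harmful if deployed in safety-critical settings.
The authors conclude that multi-party conversations still challenge state-of-the-art LLMs.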
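
The few-shot setting described above, in which a small share of annotated conversations is placed in the prompt ahead of the conversation to be annotated, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `Turn` type, the speaker labels, the bracketed annotation notation, the instruction wording, and the example utterances are all hypothetical. The sketch only assembles the prompt text and does not call any model.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str          # hypothetical labels, e.g. "PATIENT", "COMPANION", "ROBOT"
    text: str
    annotation: str = ""  # gold annotation on example turns; empty on the target


def render(turns, with_annotations):
    """Render a conversation as one line per turn, optionally with annotations."""
    lines = []
    for t in turns:
        line = f"{t.speaker}: {t.text}"
        if with_annotations and t.annotation:
            line += f" [{t.annotation}]"
        lines.append(line)
    return "\n".join(lines)


def build_few_shot_prompt(instruction, examples, target):
    """Instruction, then annotated examples, then the unannotated target.

    Passing an empty `examples` list gives the zero-shot variant.
    """
    parts = [instruction]
    for i, ex in enumerate(examples, start=1):
        parts.append(f"Example {i}:\n{render(ex, with_annotations=True)}")
    parts.append(
        f"Annotate this conversation:\n{render(target, with_annotations=False)}"
    )
    return "\n\n".join(parts)


# Hypothetical data: the companion provides the patient's goal, the robot answers it.
example = [
    Turn("COMPANION", "She needs the cafe.", "G(PATIENT, go-to(cafe))"),
    Turn("ROBOT", "The cafe is on the ground floor.", "AG(PATIENT, go-to(cafe))"),
]
target = [Turn("PATIENT", "Where can I find the toilets?")]

prompt = build_few_shot_prompt(
    "Annotate each turn with the goal it opens or answers.", [example], target
)
print(prompt)
```

The resulting string would be sent to a chat model as the user message; how the prompt is worded (for instance a 'reasoning' or 'story' style) is a separate choice that this sketch does not attempt to reproduce.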

Abstract (original)

This paper evaluates the extent to which current Large Language Models (LLMs) can capture task-oriented multi-party conversations (MPCs). We have recorded and transcribed 29 MPCs between patients, their companions, and a social robot in a hospital. We then annotated this corpus for multi-party goal-tracking and intent-slot recognition. People share goals, answer each other's goals, and provide other people's goals in MPCs – none of which occur in dyadic interactions. To understand user goals in MPCs, we compared three methods in zero-shot and few-shot settings: we fine-tuned T5, created pre-training tasks to train DialogLM using LED, and employed prompt engineering techniques with GPT-3.5-turbo, to determine which approach can complete this novel task with limited data. GPT-3.5-turbo significantly outperformed the others in a few-shot setting. The 'reasoning' style prompt, when given 7% of the corpus as example annotated conversations, was the best performing method. It correctly annotated 62.32% of the goal tracking MPCs, and 69.57% of the intent-slot recognition MPCs. A 'story' style prompt increased model hallucination, which could be detrimental if deployed in safety-critical settings. We conclude that multi-party conversations still challenge state-of-the-art LLMs.

arXiv information

Authors	Angus Addlesee,Weronika Sieińska,Nancie Gunson,Daniel Hernández Garcia,Christian Dondrup,Oliver Lemon
Published	2023-08-29 11:40:03+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
