Controllable Generation of Dialogue Acts for Dialogue Systems via Few-Shot Response Generation and Ranking

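Abstract

Dialogue systems must generate responses that realize multiple types of dialogue acts (DAs) with high semantic fidelity.
Previously, natural language generators (NLGs) for dialogue were trained on large parallel corpora that map domain-specific DAs and their semantic attributes to output utterances.
Recent work shows that pretrained language models (LLMs) open new possibilities for controllable NLG through prompt-based learning.
Here we develop a novel few-shot overgenerate-and-rank approach that achieves controlled generation of DAs.
We compare eight few-shot prompt styles, including a novel method that generates from textual pseudo-references using a textual style transfer approach.
We develop six automatic ranking functions that identify, at generation time, outputs with both the correct DA and high semantic accuracy.
We test the approach on three domains and four LLMs.
To our knowledge, this is the first work on NLG for dialogue that automatically ranks outputs using both DA accuracy and attribute accuracy.
For completeness, we compare our results with fine-tuned few-shot models trained on 5 to 100 instances per DA.
Our results show that several prompt settings achieve perfect DA accuracy and near-perfect semantic accuracy (99.81%), outperforming few-shot fine-tuning.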

Abstract (original)

Dialogue systems need to produce responses that realize multiple types of dialogue acts (DAs) with high semantic fidelity. In the past, natural language generators (NLGs) for dialogue were trained on large parallel corpora that map from a domain-specific DA and its semantic attributes to an output utterance. Recent work shows that pretrained language models (LLMs) offer new possibilities for controllable NLG using prompt-based learning. Here we develop a novel few-shot overgenerate-and-rank approach that achieves the controlled generation of DAs. We compare eight few-shot prompt styles that include a novel method of generating from textual pseudo-references using a textual style transfer approach. We develop six automatic ranking functions that identify outputs with both the correct DA and high semantic accuracy at generation time. We test our approach on three domains and four LLMs. To our knowledge, this is the first work on NLG for dialogue that automatically ranks outputs using both DA and attribute accuracy. For completeness, we compare our results to fine-tuned few-shot models trained with 5 to 100 instances per DA. Our results show that several prompt settings achieve perfect DA accuracy, and near perfect semantic accuracy (99.81%) and perform better than few-shot fine-tuning.
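
The overgenerate-and-rank idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the six ranking functions, so the `Candidate` type, the `slot_accuracy` heuristic (verbatim substring matching of attribute values), and the `rank` function are all hypothetical stand-ins. The sketch also assumes each candidate already carries a DA label from some external classifier, and that the candidates were sampled from a prompted language model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One overgenerated response (hypothetical type, not from the paper)."""
    text: str
    predicted_da: str  # assumed to come from an external DA classifier


def slot_accuracy(text: str, attributes: dict[str, str]) -> float:
    """Fraction of attribute values that appear verbatim in the text.

    A deliberately crude stand-in for a semantic accuracy measure.
    """
    if not attributes:
        return 1.0
    lowered = text.lower()
    hits = sum(1 for value in attributes.values() if value.lower() in lowered)
    return hits / len(attributes)


def rank(candidates: list[Candidate], target_da: str,
         attributes: dict[str, str]) -> list[Candidate]:
    """Order candidates: correct DA first, then higher slot accuracy."""
    def key(c: Candidate) -> tuple[bool, float]:
        return (c.predicted_da == target_da, slot_accuracy(c.text, attributes))
    return sorted(candidates, key=key, reverse=True)


if __name__ == "__main__":
    # Illustrative data only; not taken from the paper's domains.
    attrs = {"name": "Zelda", "genre": "adventure"}
    pool = [
        Candidate("Zelda is a fun game.", "inform"),
        Candidate("Have you played Zelda, the adventure game?", "request"),
        Candidate("Zelda is an adventure game.", "inform"),
    ]
    for cand in rank(pool, "inform", attrs):
        print(cand.predicted_da, "|", cand.text)
```

Sorting on a tuple puts DA correctness ahead of attribute coverage, so a response that mentions every attribute but realizes the wrong dialogue act never outranks one with the right act. Whether the paper's ranking functions combine the two signals this way is an assumption of this sketch.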

arXiv information

Authors	Angela Ramirez, Karik Agarwal, Juraj Juraska, Utkarsh Garg, Marilyn A. Walker
Published	2023-07-26 18:16:45+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
