Evaluating Task-Oriented Dialogue Consistency through Constraint Satisfaction

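Abstract

Task-oriented dialogues must stay consistent in two respects: within the dialogue itself, so that turns remain logically coherent with one another, and with the conversational domain, so that the dialogue accurately reflects external knowledge.
We propose framing dialogue consistency as a Constraint Satisfaction Problem (CSP). In this framing, variables represent the segments of the dialogue that refer to the conversational domain, and the constraints among variables encode properties of the dialogue, covering linguistic, conversational, and domain-based aspects.
To demonstrate the feasibility of the approach, we use a CSP solver to detect inconsistencies in dialogues that have been re-lexicalized by an LLM.
Our findings indicate that (i) CSP is effective at detecting dialogue inconsistencies;
and (ii) consistent dialogue re-lexicalization is difficult for state-of-the-art LLMs, which reach an accuracy of only 0.15 when compared against a CSP solver.
In addition, an ablation study shows that constraints derived from domain knowledge are the hardest to respect.
We argue that CSP captures core properties of dialogue consistency that approaches based on component pipelines have not adequately considered.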

Abstract (original)

Task-oriented dialogues must maintain consistency both within the dialogue itself, ensuring logical coherence across turns, and with the conversational domain, accurately reflecting external knowledge. We propose to conceptualize dialogue consistency as a Constraint Satisfaction Problem (CSP), wherein variables represent segments of the dialogue referencing the conversational domain, and constraints among variables reflect dialogue properties, including linguistic, conversational, and domain-based aspects. To demonstrate the feasibility of the approach, we utilize a CSP solver to detect inconsistencies in dialogues re-lexicalized by an LLM. Our findings indicate that: (i) CSP is effective to detect dialogue inconsistencies; and (ii) consistent dialogue re-lexicalization is challenging for state-of-the-art LLMs, achieving only a 0.15 accuracy rate when compared to a CSP solver. Furthermore, through an ablation study, we reveal that constraints derived from domain knowledge pose the greatest difficulty in being respected. We argue that CSP captures core properties of dialogue consistency that have been poorly considered by approaches based on component pipelines.
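
The CSP framing described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the knowledge base, the variable names (u1_food, s1_name, s1_food, s2_area), the three constraints, and the brute-force enumeration (used here in place of a real CSP solver) are all assumptions made for illustration. Each variable stands for a slot value in one dialogue turn, and each constraint is either conversational (turn-to-turn agreement) or domain-based (agreement with the knowledge base).

```python
from itertools import product

# Hypothetical toy knowledge base; none of these entries come from the paper.
KB = [
    {"name": "Pizza Roma", "food": "italian", "area": "centre"},
    {"name": "Sushi Go", "food": "japanese", "area": "north"},
    {"name": "Trattoria Sole", "food": "italian", "area": "north"},
]

# One variable per domain-referring dialogue segment (assumed dialogue):
#   U1: "I want [u1_food] food."
#   S1: "[s1_name] serves [s1_food] food."
#   S2: "It is in the [s2_area]."
DOMAINS = {
    "u1_food": {r["food"] for r in KB},
    "s1_name": {r["name"] for r in KB},
    "s1_food": {r["food"] for r in KB},
    "s2_area": {r["area"] for r in KB},
}


def kb_has(**fields):
    """Return True if some KB record matches all the given field values."""
    return any(all(r[k] == v for k, v in fields.items()) for r in KB)


# Each constraint: (label, variables in scope, predicate over their values).
CONSTRAINTS = [
    ("conversational: system echoes requested food",
     ("u1_food", "s1_food"), lambda a, b: a == b),
    ("domain: restaurant serves that food",
     ("s1_name", "s1_food"), lambda n, f: kb_has(name=n, food=f)),
    ("domain: restaurant is in that area",
     ("s1_name", "s2_area"), lambda n, a: kb_has(name=n, area=a)),
]


def violations(assignment):
    """List the labels of all constraints that the assignment breaks."""
    return [
        label
        for label, scope, pred in CONSTRAINTS
        if not pred(*(assignment[v] for v in scope))
    ]


def solutions():
    """Enumerate every consistent assignment by exhaustive search."""
    names = sorted(DOMAINS)
    for values in product(*(sorted(DOMAINS[n]) for n in names)):
        candidate = dict(zip(names, values))
        if not violations(candidate):
            yield candidate


# A re-lexicalized dialogue (e.g. one produced by an LLM) to be checked.
relexicalized = {
    "u1_food": "italian",
    "s1_name": "Sushi Go",
    "s1_food": "italian",
    "s2_area": "north",
}
print(violations(relexicalized))
```

In this sketch a dialogue counts as consistent when violations returns an empty list. Removing one group of constraints and re-running the check is one simple way to mimic an ablation over constraint types, although how the paper's ablation was actually carried out is not stated in the abstract.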

arXiv information

Authors	Tiziano Labruna, Bernardo Magnini
Published	2024-07-16 15:38:41+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
