Analysis of Utterance Embeddings and Clustering Methods Related to Intent Induction for Task-Oriented Dialogue

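Summary

This work investigates unsupervised approaches to two central challenges in designing a task-oriented dialogue schema: assigning an intent label to each dialogue turn (intent clustering) and generating a set of intents based on the intent clustering method (intent induction).
We postulate that two factors are salient for the automatic induction of intents: (1) the clustering algorithm used for intent labeling and (2) the user utterance embedding space.
We compare existing off-the-shelf clustering models and embeddings under the DSTC11 evaluation.
Our extensive experiments show that the combination of utterance embedding and clustering method must be chosen carefully for the intent induction task.
We also show that pretrained MiniLM with agglomerative clustering yields significant improvements in NMI, ARI, F1, accuracy, and example coverage on intent induction tasks.
The source code is available at https://github.com/Jeiyoon/dstc11-track2.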

Abstract (original)

The focus of this work is to investigate unsupervised approaches to overcome quintessential challenges in designing task-oriented dialog schema: assigning intent labels to each dialog turn (intent clustering) and generating a set of intents based on the intent clustering methods (intent induction). We postulate there are two salient factors for automatic induction of intents: (1) clustering algorithm for intent labeling and (2) user utterance embedding space. We compare existing off-the-shelf clustering models and embeddings based on DSTC11 evaluation. Our extensive experiments demonstrate that the combined selection of utterance embedding and clustering method in the intent induction task should be carefully considered. We also present that pretrained MiniLM with Agglomerative clustering shows significant improvement in NMI, ARI, F1, accuracy and example coverage in intent induction tasks. The source codes are available at https://github.com/Jeiyoon/dstc11-track2.
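The pipeline the abstract describes (embed utterances, cluster the embeddings, score the clusters against gold intents) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the two-dimensional vectors are made-up stand-ins for MiniLM utterance embeddings, the intent names are hypothetical, and the choices of cosine distance and average linkage are assumptions made for illustration. Only ARI is computed here; the other metrics named in the abstract are omitted.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def agglomerative(vectors, n_clusters):
    """Average-linkage agglomerative clustering; returns one label per vector."""
    clusters = [[i] for i in range(len(vectors))]  # start with singletons
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest average pairwise distance.
        for a, b in combinations(range(len(clusters)), 2):
            dists = [cosine_distance(vectors[i], vectors[j])
                     for i in clusters[a] for j in clusters[b]]
            avg = sum(dists) / len(dists)
            if best is None or avg < best[0]:
                best = (avg, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])  # merge b into a (a < b always holds)
        del clusters[b]
    labels = [0] * len(vectors)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def adjusted_rand_index(truth, pred):
    """Adjusted Rand index between two labelings of the same items."""
    sum_cells = sum(math.comb(n, 2) for n in Counter(zip(truth, pred)).values())
    sum_rows = sum(math.comb(n, 2) for n in Counter(truth).values())
    sum_cols = sum(math.comb(n, 2) for n in Counter(pred).values())
    expected = sum_rows * sum_cols / math.comb(len(truth), 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. one cluster on both sides
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Toy stand-ins for utterance embeddings (assumption: real ones come from MiniLM).
embeddings = [(1.0, 0.1), (0.9, 0.2), (0.1, 1.0), (0.2, 0.9), (-1.0, 0.1), (-0.9, -0.1)]
# Hypothetical gold intent labels, one per utterance.
gold = ["check_balance", "check_balance", "book_flight", "book_flight",
        "cancel_order", "cancel_order"]

predicted = agglomerative(embeddings, n_clusters=3)
ari = adjusted_rand_index(gold, predicted)
print(predicted, ari)
```

In practice the number of intents is not known in advance, so `n_clusters` would have to be estimated or swept rather than fixed as it is in this sketch.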

arXiv information

Authors	Jeiyoon Park, Yoonna Jang, Chanhee Lee, Heuiseok Lim
Published	2023-03-17 06:46:27+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
