DORIC : Domain Robust Fine-Tuning for Open Intent Clustering through Dependency Parsing

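Summary

We present our work on Track 2 of the Dialog System Technology Challenges 11 (DSTC11).
DSTC11-Track2 aims to provide a benchmark for zero-shot, cross-domain intent-set induction.
Without an in-domain training dataset, inducing user intents requires robust utterance representations that can be used across domains.
To achieve this, we fine-tuned a language model on a multi-domain dialogue dataset and proposed extracting verb-object pairs to remove artifacts of unnecessary information.
In addition, we devised a method that generates a name for each cluster so that the clustering results are explainable.
Our approach achieved 3rd place in the precision score and showed higher accuracy and normalized mutual information (NMI) scores than the baseline model on datasets from various domains.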
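
To make the verb-object idea concrete, here is a minimal sketch of pulling (verb, object) pairs out of a dependency parse. It is an illustration only, not the authors' implementation: the `Token` structure, the `verb_object_pairs` helper, the label names ("dobj", "obj"), and the hand-written parse of the example utterance are all assumptions. In practice the parse would come from a dependency parser.

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    """One token of a dependency parse (hypothetical structure for this sketch)."""
    text: str
    lemma: str
    pos: str    # coarse part-of-speech tag, e.g. "VERB"
    dep: str    # dependency label of the arc to the head, e.g. "dobj"
    head: int   # index of the head token within the sentence


def verb_object_pairs(tokens: List[Token]) -> List[Tuple[str, str]]:
    """Return (verb lemma, object lemma) for every direct-object arc whose head is a verb."""
    pairs = []
    for tok in tokens:
        # "dobj" and "obj" are the two common names for the direct-object relation.
        if tok.dep in ("dobj", "obj"):
            head = tokens[tok.head]
            if head.pos == "VERB":
                pairs.append((head.lemma, tok.lemma))
    return pairs


# Hand-written parse of "I want to book a flight to Seoul" (illustrative, not parser output).
parsed = [
    Token("I", "I", "PRON", "nsubj", 1),
    Token("want", "want", "VERB", "ROOT", 1),
    Token("to", "to", "PART", "aux", 3),
    Token("book", "book", "VERB", "xcomp", 1),
    Token("a", "a", "DET", "det", 5),
    Token("flight", "flight", "NOUN", "dobj", 3),
    Token("to", "to", "ADP", "prep", 5),
    Token("Seoul", "Seoul", "PROPN", "pobj", 6),
]

print(verb_object_pairs(parsed))
```

Reducing the utterance to the pair ("book", "flight") drops the subject, the modal framing, and the destination, which is the kind of unnecessary information the summary refers to.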

Abstract (original)

We present our work on Track 2 in the Dialog System Technology Challenges 11 (DSTC11). DSTC11-Track2 aims to provide a benchmark for zero-shot, cross-domain, intent-set induction. In the absence of in-domain training dataset, robust utterance representation that can be used across domains is necessary to induce users’ intentions. To achieve this, we leveraged a multi-domain dialogue dataset to fine-tune the language model and proposed extracting Verb-Object pairs to remove the artifacts of unnecessary information. Furthermore, we devised the method that generates each cluster’s name for the explainability of clustered results. Our approach achieved 3rd place in the precision score and showed superior accuracy and normalized mutual information (NMI) score than the baseline model on various domain datasets.
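
For readers unfamiliar with the NMI score mentioned in the abstract, the following is a small sketch of how it can be computed from two label assignments. The function names are hypothetical, and normalizing by the arithmetic mean of the two entropies is one common convention chosen here as an assumption; the abstract does not say which normalization the challenge uses.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two label assignments of equal length."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (n_xy / n) * math.log(n_xy * n / (count_a[x] * count_b[y]))
        for (x, y), n_xy in joint.items()
    )


def nmi(true_labels, pred_labels):
    """NMI normalized by the arithmetic mean of the two entropies (an assumed convention)."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label lists must have the same length")
    h_true, h_pred = entropy(true_labels), entropy(pred_labels)
    if h_true == 0 and h_pred == 0:
        # Both assignments put everything in a single cluster: treat as identical.
        return 1.0
    return mutual_information(true_labels, pred_labels) / ((h_true + h_pred) / 2)


# The same partition under different cluster ids scores (up to rounding) 1.
print(nmi([0, 0, 1, 1], [1, 1, 0, 0]))
# A partition that is independent of the reference scores 0.
print(nmi([0, 0, 1, 1], [0, 1, 0, 1]))
```

Because NMI depends only on how items are grouped and not on the cluster ids, it suits intent induction, where the predicted clusters carry no predefined labels.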

arXiv information

Authors	Jihyun Lee, Seungyeon Seo, Yunsu Kim, Gary Geunbae Lee
Published	2023-03-17 08:12:36+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
