Pre-training Multi-party Dialogue Models with Latent Discourse Inference


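Multi-party dialogues are harder for models to understand than one-to-one two-party dialogues, because they involve multiple interlocutors and interwoven replies and information flows.
Experiments on multiple downstream tasks show that the pre-trained model substantially outperforms strong baselines and achieves state-of-the-art (SOTA) results, justifying the effectiveness of our method.
The official implementation of this paper is available at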


Multi-party dialogues are harder for models to understand than one-to-one two-party dialogues because they involve multiple interlocutors, which results in interwoven reply-to relations and information flows. An effective way to overcome these obstacles is to pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying. However, because multi-party dialogue corpora lack explicitly annotated discourse labels, previous works have been unable to scale up pre-training and have left unlabeled multi-party conversational data unused. To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model using unsupervised latent variable inference methods. Experiments on multiple downstream tasks show that our pre-trained model outperforms strong baselines by large margins and achieves state-of-the-art (SOTA) results, justifying the effectiveness of our method. The official implementation of this paper is available at
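
To make the idea of a latent reply-to structure concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy word-overlap scorer (`jaccard`) and a hypothetical helper `reply_to_posterior` that turns those scores into a distribution over which earlier utterance a given utterance replies to. The scorer, the temperature, and the example dialogue are all illustrative assumptions. The abstract describes inferring such structures jointly with pre-training; this sketch covers only the inference step with a fixed scorer.

```python
import math


def jaccard(a: str, b: str) -> float:
    """Toy similarity: word-set overlap between two utterances (stand-in scorer)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def reply_to_posterior(utterances, i, temperature=0.1):
    """Distribution over which earlier utterance j < i utterance i replies to.

    The reply-to index is treated as a latent variable. Here the distribution
    is a softmax over toy similarity scores (an assumption for illustration,
    not the model used in the paper).
    """
    if i <= 0:
        raise ValueError("the first utterance has no candidate parent")
    scores = [jaccard(utterances[i], utterances[j]) / temperature for j in range(i)]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical four-utterance dialogue with two interleaved threads.
dialogue = [
    "anyone know how to install the nvidia driver",
    "my wifi keeps dropping after the update",
    "try the additional drivers tool for the nvidia driver",
    "did the wifi work before the update",
]

for i in range(1, len(dialogue)):
    post = reply_to_posterior(dialogue, i)
    best = max(range(i), key=post.__getitem__)
    print(f"utterance {i} most likely replies to utterance {best}")
```

In a full latent-variable training loop, distributions like these would be re-estimated as the model's parameters change, rather than computed once from a fixed heuristic.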


Authors: Yiyang Li, Xinting Huang, Wei Bi, Hai Zhao
Published: 2023-05-24 14:06:27+00:00


Category: cs.CL