End-to-end Spoken Language Understanding with Tree-constrained Pointer Generator

Abstract

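End-to-end spoken language understanding (SLU) suffers from the long-tail word problem.
This paper applies contextual biasing, a technique for improving speech recognition of rare words, to end-to-end SLU systems.
Specifically, it studies the tree-constrained pointer generator (TCPGen), a powerful and efficient biasing model component, which uses a slot shortlist with corresponding entities to extract biasing lists.
In addition, a slot probability biasing (SPB) mechanism is proposed, which computes a slot distribution from TCPGen in order to bias the output slot distribution of the SLU model.
Experiments on the SLURP dataset showed consistent SLU-F1 improvements with TCPGen and SPB, especially on unseen entities.
On a new split that holds out 5 slot types for testing, TCPGen with SPB achieved zero-shot learning with an SLU-F1 score above 50%, whereas the baselines cannot handle this setting.
Beyond slot filling, intent classification accuracy also improved.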

Abstract (original)

End-to-end spoken language understanding (SLU) suffers from the long-tail word problem. This paper exploits contextual biasing, a technique to improve the speech recognition of rare words, in end-to-end SLU systems. Specifically, a tree-constrained pointer generator (TCPGen), a powerful and efficient biasing model component, is studied, which leverages a slot shortlist with corresponding entities to extract biasing lists. Meanwhile, to bias the SLU model output slot distribution, a slot probability biasing (SPB) mechanism is proposed to calculate a slot distribution from TCPGen. Experiments on the SLURP dataset showed consistent SLU-F1 improvements using TCPGen and SPB, especially on unseen entities. On a new split by holding out 5 slot types for the test, TCPGen with SPB achieved zero-shot learning with an SLU-F1 score over 50% compared to baselines which can not deal with it. In addition to slot filling, the intent classification accuracy was also improved.
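
The abstract describes TCPGen only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes character-level tokens instead of wordpieces, a uniform pointer distribution over the tree-constrained candidates, and a hand-picked mixing weight; the names `build_prefix_tree`, `valid_next_tokens`, `interpolate`, and `p_gen` are hypothetical and chosen for illustration.

```python
from typing import Dict, List


def build_prefix_tree(entities: List[str]) -> dict:
    """Build a character-level prefix tree (nested dicts) from biasing entities."""
    root: dict = {}
    for entity in entities:
        node = root
        for ch in entity:
            node = node.setdefault(ch, {})
        node["<end>"] = {}  # marks a complete entity
    return root


def valid_next_tokens(tree: dict, prefix: str) -> List[str]:
    """Return the tokens that may follow `prefix` on some path in the tree."""
    node = tree
    for ch in prefix:
        if ch not in node:
            return []  # the prefix left the tree: no biasing candidates
        node = node[ch]
    return sorted(node.keys())


def interpolate(model_dist: Dict[str, float],
                pointer_dist: Dict[str, float],
                p_gen: float) -> Dict[str, float]:
    """Mix the model distribution with the pointer distribution.

    p_gen (hypothetical name) is the weight given to the pointer distribution.
    """
    tokens = set(model_dist) | set(pointer_dist)
    return {
        t: (1.0 - p_gen) * model_dist.get(t, 0.0) + p_gen * pointer_dist.get(t, 0.0)
        for t in tokens
    }


# Toy biasing list of two rare entities (illustrative data only).
tree = build_prefix_tree(["jazz", "java"])
candidates = valid_next_tokens(tree, "ja")  # tokens allowed after "ja"

# Toy model output distribution over the next token.
model_dist = {"v": 0.2, "z": 0.1, "n": 0.7}
# Assumption: uniform pointer distribution over the tree-constrained candidates.
pointer_dist = {t: 1.0 / len(candidates) for t in candidates}
final_dist = interpolate(model_dist, pointer_dist, p_gen=0.5)
```

In this toy setup, the tokens that continue an entity in the biasing list gain probability mass relative to the model's own prediction, which is the intuition behind biasing toward rare words. A real system would learn both the pointer distribution and the mixing weight instead of fixing them by hand.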

arXiv information

Authors	Guangzhi Sun, Chao Zhang, Philip C. Woodland
Published	2023-03-14 22:57:12+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
