Improving Dialectal Slot and Intent Detection with Auxiliary Tasks: A Multi-Dialectal Bavarian Case Study

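Summary

Reliable slot and intent detection (SID) is important for natural language understanding in applications such as digital assistants.
Encoder-only transformer models fine-tuned on high-resource languages usually perform well on SID.
However, they struggle with dialectal data, which has no standardized form and for which training data is scarce and costly to create.
We investigate zero-shot transfer learning for SID with a focus on multiple Bavarian dialects, and release a new dataset for the Munich dialect.
We evaluate models trained on auxiliary tasks in Bavarian and compare joint multi-task learning with intermediate-task training.
We also compare three types of auxiliary tasks: token-level syntactic tasks, named entity recognition (NER), and language modelling.
We find that the included auxiliary tasks benefit slot filling more than intent classification (with NER having the most positive effect), and that intermediate-task training yields more consistent performance gains.
Our best-performing approach improves intent classification performance on Bavarian dialects by 5.1 percentage points and slot filling F1 by 8.4 percentage points.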
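
The difference between the two training setups compared in the summary can be illustrated with a minimal sketch of the batch ordering alone. This is not the authors' implementation: the function names, the `"aux"`/`"sid"` labels, and the uniform shuffling used for the joint setup are assumptions made for illustration, and the paper's actual sampling strategy may differ. The sketch assumes auxiliary-task batches (for example, Bavarian NER) and SID batches from a high-resource language.

```python
import random


def joint_schedule(aux_batches, sid_batches, seed=0):
    """Joint multi-task learning (sketch): auxiliary and SID batches are
    interleaved within a single training phase. Uniform shuffling is an
    assumption; other mixing strategies are possible."""
    mixed = [("aux", b) for b in aux_batches] + [("sid", b) for b in sid_batches]
    random.Random(seed).shuffle(mixed)
    return mixed


def intermediate_schedule(aux_batches, sid_batches):
    """Intermediate-task training (sketch): the model first sees all
    auxiliary batches, and only afterwards all SID batches."""
    return [("aux", b) for b in aux_batches] + [("sid", b) for b in sid_batches]


if __name__ == "__main__":
    aux = ["a1", "a2"]          # hypothetical auxiliary-task batches
    sid = ["s1", "s2", "s3"]    # hypothetical SID batches
    # Intermediate-task order is deterministic: aux, aux, sid, sid, sid
    print([task for task, _ in intermediate_schedule(aux, sid)])
    # Joint order depends on the shuffle seed
    print([task for task, _ in joint_schedule(aux, sid)])
```

Both schedules expose the model to the same batches; only the order differs, which is the axis along which the two setups are compared.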

Abstract (original)

Reliable slot and intent detection (SID) is crucial in natural language understanding for applications like digital assistants. Encoder-only transformer models fine-tuned on high-resource languages generally perform well on SID. However, they struggle with dialectal data, where no standardized form exists and training data is scarce and costly to produce. We explore zero-shot transfer learning for SID, focusing on multiple Bavarian dialects, for which we release a new dataset for the Munich dialect. We evaluate models trained on auxiliary tasks in Bavarian, and compare joint multi-task learning with intermediate-task training. We also compare three types of auxiliary tasks: token-level syntactic tasks, named entity recognition (NER), and language modelling. We find that the included auxiliary tasks have a more positive effect on slot filling than intent classification (with NER having the most positive effect), and that intermediate-task training yields more consistent performance gains. Our best-performing approach improves intent classification performance on Bavarian dialects by 5.1 and slot filling F1 by 8.4 percentage points.

arXiv information

Authors	Xaver Maria Krückl, Verena Blaschke, Barbara Plank
Published	2025-01-07 15:21:07+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
