Spoken Language Understanding on Unseen Tasks With In-Context Learning

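Summary

Spoken language understanding (SLU) tasks involve diverse skills that probe a model's information extraction, classification, and/or generation capabilities.
In this setting, task-specific training data is not always available.
Traditional task-specific SLU models cannot meet this requirement, whereas speech-text large language models (LLMs) offer a promising alternative thanks to their emergent abilities.
Out of the box, however, the authors' evaluations indicate that the zero-shot and few-shot performance of prominent open-source speech-text LLMs on SLU tasks falls short.
The paper introduces a novel approach to robust, task-agnostic fine-tuning that uses randomized class labels.
With the proposed fine-tuning, the performance of speech-text LLMs on an unseen task improves significantly over standard approaches.
Critically, the proposed approach removes the need for task-specific data annotation when enabling new tasks in speech-text LLMs.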
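
The abstract does not spell out how the class labels are randomized, so the following is only a minimal sketch of one plausible reading: within each few-shot prompt, every class name is swapped for an arbitrary symbol, which forces a model to infer the label meaning from the in-context demonstrations rather than from the label text. The function names `randomize_labels` and `build_prompt`, the symbol pool, the prompt layout, and the use of text transcripts in place of audio are all assumptions made for illustration and are not taken from the paper.

```python
import random


def randomize_labels(examples, label_pool, rng):
    """Replace each class label with a distinct random symbol.

    The mapping is consistent within one prompt (episode), so examples that
    shared a class before still share a symbol afterwards.
    """
    classes = sorted({label for _, label in examples})
    if len(classes) > len(label_pool):
        raise ValueError("label_pool must contain at least one symbol per class")
    symbols = rng.sample(label_pool, len(classes))
    mapping = dict(zip(classes, symbols))
    remapped = [(text, mapping[label]) for text, label in examples]
    return remapped, mapping


def build_prompt(demos, query):
    """Format demonstrations plus an unlabeled query (hypothetical layout)."""
    blocks = [f"Input: {text}\nLabel: {label}" for text, label in demos]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)


# Text transcripts stand in for speech inputs in this sketch.
rng = random.Random(0)
demos = [
    ("play some jazz", "music"),
    ("set an alarm for 7", "alarm"),
    ("turn the volume up", "music"),
]
remapped, mapping = randomize_labels(demos, ["A", "B", "C", "D"], rng)
prompt = build_prompt(remapped, "wake me at six")
print(prompt)
```

Drawing a fresh mapping for every training prompt would keep any single symbol from becoming tied to a particular class, which is the intuition behind calling such fine-tuning task-agnostic.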

Abstract (original)

Spoken language understanding (SLU) tasks involve diverse skills that probe the information extraction, classification and/or generation capabilities of models. In this setting, task-specific training data may not always be available. While traditional task-specific SLU models are unable to cater to such requirements, the speech-text large language models (LLMs) offer a promising alternative with emergent abilities. However, out of-the-box, our evaluations indicate that the zero/few-shot performance of prominent open-source speech-text LLMs on SLU tasks are not up to the mark. In this paper, we introduce a novel approach to robust task-agnostic fine-tuning using randomized class labels. With this proposed fine-tuning, we illustrate that the performance of the speech-text LLMs on an unseen task is significantly improved over standard approaches. Critically, the proposed approach avoids the requirement of task-specific data annotations for enabling new tasks in speech-text LLMs.

arXiv information

Authors	Neeraj Agrawal, Sriram Ganapathy
Published	2025-05-12 16:38:43+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
