Illuminating Blind Spots of Language Models with Targeted Agent-in-the-Loop Synthetic Data

Summary

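Language models (LMs) achieve impressive accuracy across a variety of tasks but remain vulnerable to high-confidence misclassifications, also known as unknown unknowns (UUs).
These UUs cluster into blind spots in the feature space, creating significant risks in high-stakes applications.
This is particularly relevant for smaller, lightweight LMs, which are more susceptible to such errors.
While the identification of UUs has been studied extensively, their mitigation remains an open challenge, including how to use identified UUs to eliminate unseen blind spots.
In this work, we propose a novel approach to blind spot mitigation that uses intelligent agents (either humans or large LMs) as teachers to characterize UU-type errors.
By leveraging the generalization capabilities of intelligent agents, we identify patterns in high-confidence misclassifications and use them to generate targeted synthetic samples that improve model robustness and reduce blind spots.
We conduct an extensive evaluation of our method on three classification tasks and demonstrate its effectiveness in reducing the number of UUs while maintaining a similar level of accuracy.
We find that the effectiveness of human computation has a high ceiling but depends strongly on familiarity with the underlying task.
Moreover, the cost gap between humans and LMs exceeds an order of magnitude, as LMs attain human-like generalization and generation performance while being more scalable.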

Summary (original)

Language models (LMs) have achieved impressive accuracy across a variety of tasks but remain vulnerable to high-confidence misclassifications, also referred to as unknown unknowns (UUs). These UUs cluster into blind spots in the feature space, leading to significant risks in high-stakes applications. This is particularly relevant for smaller, lightweight LMs that are more susceptible to such errors. While the identification of UUs has been extensively studied, their mitigation remains an open challenge, including how to use identified UUs to eliminate unseen blind spots. In this work, we propose a novel approach to address blind spot mitigation through the use of intelligent agents — either humans or large LMs — as teachers to characterize UU-type errors. By leveraging the generalization capabilities of intelligent agents, we identify patterns in high-confidence misclassifications and use them to generate targeted synthetic samples to improve model robustness and reduce blind spots. We conduct an extensive evaluation of our method on three classification tasks and demonstrate its effectiveness in reducing the number of UUs, all while maintaining a similar level of accuracy. We find that the effectiveness of human computation has a high ceiling but is highly dependent on familiarity with the underlying task. Moreover, the cost gap between humans and LMs surpasses an order of magnitude, as LMs attain human-like generalization and generation performance while being more scalable.
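
The abstract defines UUs as high-confidence misclassifications. The following is a minimal sketch of how such samples might be flagged from a classifier's output probabilities. It is not the authors' implementation: the function name `find_unknown_unknowns`, the confidence threshold of 0.9, and the toy data are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import List, Sequence


def find_unknown_unknowns(
    probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    threshold: float = 0.9,  # assumed cutoff; the abstract gives no value
) -> List[int]:
    """Return indices of samples that are misclassified with confidence >= threshold."""
    uu_indices = []
    for i, (p, y) in enumerate(zip(probs, labels)):
        # Predicted class is the one with the highest probability.
        pred = max(range(len(p)), key=p.__getitem__)
        # A UU is a wrong prediction that the model is nevertheless confident about.
        if pred != y and p[pred] >= threshold:
            uu_indices.append(i)
    return uu_indices


# Hypothetical toy data: three samples, two classes.
probs = [
    [0.95, 0.05],  # confident and correct
    [0.08, 0.92],  # confident but wrong: counted as a UU
    [0.55, 0.45],  # wrong, but not confident: not a UU
]
labels = [0, 0, 1]
print(find_unknown_unknowns(probs, labels))  # prints [1]
```

In the approach the abstract describes, samples flagged this way would then be passed to an intelligent agent (a human or a large LM) that characterizes the error pattern and generates targeted synthetic samples. That step is not sketched here because the abstract gives no detail on it.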

arXiv information

Authors	Philip Lippmann,Matthijs T. J. Spaan,Jie Yang
Published	2024-11-04 15:59:19+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
