I Don’t Know: Explicit Modeling of Uncertainty with an [IDK] Token

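Abstract

Large language models are known to capture real-world knowledge, which lets them perform well on many downstream tasks.
Despite recent advances, these models remain prone to so-called hallucinations, which cause them to output unwanted, factually incorrect text.
This work proposes a new calibration method that can be used to combat hallucinations.
A special [IDK] ("I don't know") token is added to the model's vocabulary, and an objective function is introduced that shifts probability mass to the [IDK] token for incorrect predictions.
This approach lets the model express uncertainty in its output explicitly.
The proposed method is evaluated across multiple model architectures and factual downstream tasks.
Models trained with the method are able to express uncertainty in places where they previously made mistakes, while losing only a small amount of encoded knowledge.
The authors also run extensive ablation studies on multiple variants of the approach and provide a detailed analysis of the method's precision-recall tradeoff.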

Abstract (original)

Large Language Models are known to capture real-world knowledge, allowing them to excel in many downstream tasks. Despite recent advances, these models are still prone to what are commonly known as hallucinations, causing them to emit unwanted and factually incorrect text. In this work, we propose a novel calibration method that can be used to combat hallucinations. We add a special [IDK] (‘I don’t know’) token to the model’s vocabulary and introduce an objective function that shifts probability mass to the [IDK] token for incorrect predictions. This approach allows the model to express uncertainty in its output explicitly. We evaluate our proposed method across multiple model architectures and factual downstream tasks. We find that models trained with our method are able to express uncertainty in places where they would previously make mistakes while suffering only a small loss of encoded knowledge. We further perform extensive ablation studies of multiple variations of our approach and provide a detailed analysis of the precision-recall tradeoff of our method.
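The abstract does not spell out the objective function, so the following is only an illustrative sketch of the general idea: reserve one vocabulary index for [IDK] and, when the model's top prediction is wrong, train against a soft target that moves part of the gold token's probability mass to [IDK]. The function names, the `max_shift` parameter, and the specific shift formula are assumptions made for this example and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def idk_target(probs, gold, idk, max_shift=0.5):
    """Build a soft target distribution over the vocabulary.

    If the top prediction equals the gold token, the target is one-hot.
    Otherwise a share of the mass is moved from the gold token to [IDK].
    The share (an assumed, hypothetical formula) grows as the gold
    probability falls further behind the top probability.
    """
    predicted = max(range(len(probs)), key=probs.__getitem__)
    target = [0.0] * len(probs)
    if predicted == gold:
        target[gold] = 1.0
    else:
        shift = max_shift * (1.0 - probs[gold] / probs[predicted])
        target[gold] = 1.0 - shift
        target[idk] = shift
    return target


def idk_loss(logits, gold, idk, max_shift=0.5):
    """Cross-entropy between the soft [IDK] target and the model distribution."""
    probs = softmax(logits)
    target = idk_target(probs, gold, idk, max_shift)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0.0)


# Toy vocabulary of three entries; index 2 plays the role of [IDK].
IDK = 2
wrong_logits = [0.0, 2.0, 0.0]  # model prefers token 1, gold is token 0
print(idk_target(softmax(wrong_logits), gold=0, idk=IDK))
print(idk_loss(wrong_logits, gold=0, idk=IDK))
```

In this sketch a correct prediction reduces to ordinary cross-entropy, so the [IDK] token only receives training signal on examples the model currently gets wrong. In a real model the same soft target would be applied per position over the full vocabulary, after extending the vocabulary and embedding matrix by one entry.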

arXiv information

Authors	Roi Cohen, Konstantin Dobler, Eden Biran, Gerard de Melo
Published	2024-12-09 17:13:20+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
