Pitfalls of Scale: Investigating the Inverse Task of Redefinition in Large Language Models

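Abstract

Inverse tasks can reveal potential reasoning gaps as large language models (LLMs) scale up.
This work investigates the redefinition task, in which well-known physical constants and units of measure are assigned alternative values and LLMs are prompted to respond accordingly.
The findings show that model performance not only degrades with scale, but the models' false confidence also rises.
Moreover, although factors such as prompting strategy and response format are influential, they do not prevent LLMs from anchoring to memorized values.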

Abstract (original)

Inverse tasks can uncover potential reasoning gaps as Large Language Models (LLMs) scale up. In this work, we explore the redefinition task, in which we assign alternative values to well-known physical constants and units of measure, prompting LLMs to respond accordingly. Our findings show that not only does model performance degrade with scale, but its false confidence also rises. Moreover, while factors such as prompting strategies or response formatting are influential, they do not preclude LLMs from anchoring to memorized values.
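
As a rough illustration of what a redefinition query might look like, the sketch below builds a prompt that assigns an alternative value to a constant and then checks whether a numeric answer follows the redefinition or falls back on the memorized value. The choice of pi, the value 100, the prompt wording, and the helper names build_prompt and classify are all assumptions made for this example; they are not taken from the paper, and no model is called.

```python
import math

def build_prompt(constant, new_value, question):
    """Compose a redefinition prompt (hypothetical wording, not the paper's)."""
    return f"Redefine {constant} as {new_value}. {question}"

def classify(answer, redefined, memorized, rel_tol=1e-3):
    """Label a numeric answer by which reference value it matches."""
    if math.isclose(answer, redefined, rel_tol=rel_tol):
        return "follows_redefinition"
    if math.isclose(answer, memorized, rel_tol=rel_tol):
        return "anchored"  # the answer used the memorized constant
    return "other"

# Illustrative setup: pi is redefined to 100 (an assumed example value).
radius = 2.0
new_pi = 100.0
prompt = build_prompt(
    "pi", new_pi,
    f"What is the circumference of a circle with radius {radius}?",
)

redefined = 2 * new_pi * radius    # answer that respects the redefinition
memorized = 2 * math.pi * radius   # answer that anchors to the usual pi

print(prompt)
print(classify(400.0, redefined, memorized))
print(classify(12.566, redefined, memorized))
```

In an actual evaluation the numeric answer would be parsed from a model response; here the two calls to classify simply stand in for a response that follows the redefinition and one that anchors to the memorized value.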

arXiv information

Authors	Elena Stringli, Maria Lymperaiou, Giorgos Filandrianos, Athanasios Voulodimos, Giorgos Stamou
Published	2025-06-02 15:40:35+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
