StereoMap: Quantifying the Awareness of Human-like Stereotypes in Large Language Models

Summary

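Large language models (LLMs) have been observed to encode and perpetuate harmful associations present in their training data.
We propose StereoMap, a theoretically grounded framework for gaining insight into how LLMs perceive the way society views demographic groups.
The framework is grounded in the Stereotype Content Model (SCM), a well-established theory from psychology.
According to the SCM, stereotypes are not all alike; instead, the dimensions of Warmth and Competence delineate the nature of a stereotype.
Building on the SCM, StereoMap maps LLMs' perceptions of social groups (defined by socio-demographic features) along the Warmth and Competence dimensions.
The framework also supports examining the keywords and verbalized reasoning behind LLMs' judgments, to uncover the underlying factors that influence their perceptions.
Our results show that LLMs hold diverse perceptions of these groups, characterized by mixed evaluations along the Warmth and Competence dimensions.
In addition, analysis of the models' reasoning indicates that LLMs show an awareness of social disparities, often citing statistical data and research findings in support of their reasoning.
This study contributes to understanding how LLMs perceive and represent social groups, shedding light on their potential biases and the perpetuation of harmful associations.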
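
The mapping step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the 1-to-5 rating scale, the midpoint of 3.0, the function name scm_quadrant, and the placeholder groups "group_a" and "group_b" with their ratings are all assumptions made for illustration, not data or results from the paper. The sketch averages Warmth and Competence ratings for a group and reports which quadrant of the two-dimensional map the group falls into.

```python
from statistics import mean


def scm_quadrant(warmth_ratings, competence_ratings, midpoint=3.0):
    """Average the ratings and label the Warmth/Competence quadrant.

    Assumption: ratings lie on a 1-5 scale, so 3.0 separates "low" from "high".
    """
    w = mean(warmth_ratings)
    c = mean(competence_ratings)
    w_label = "high" if w >= midpoint else "low"
    c_label = "high" if c >= midpoint else "low"
    return w, c, f"{w_label} warmth / {c_label} competence"


# Hypothetical placeholder ratings, not results from the paper.
ratings = {
    "group_a": {"warmth": [4, 5, 4], "competence": [2, 2, 3]},
    "group_b": {"warmth": [2, 1, 2], "competence": [5, 4, 4]},
}

for name, r in ratings.items():
    w, c, label = scm_quadrant(r["warmth"], r["competence"])
    print(f"{name}: warmth={w:.2f}, competence={c:.2f} -> {label}")
```

A group with high Warmth but low Competence, or the reverse, is the kind of mixed evaluation the summary refers to.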

Abstract (original)

Large Language Models (LLMs) have been observed to encode and perpetuate harmful associations present in the training data. We propose a theoretically grounded framework called StereoMap to gain insights into their perceptions of how demographic groups have been viewed by society. The framework is grounded in the Stereotype Content Model (SCM); a well-established theory from psychology. According to SCM, stereotypes are not all alike. Instead, the dimensions of Warmth and Competence serve as the factors that delineate the nature of stereotypes. Based on the SCM theory, StereoMap maps LLMs’ perceptions of social groups (defined by socio-demographic features) using the dimensions of Warmth and Competence. Furthermore, the framework enables the investigation of keywords and verbalizations of reasoning of LLMs’ judgments to uncover underlying factors influencing their perceptions. Our results show that LLMs exhibit a diverse range of perceptions towards these groups, characterized by mixed evaluations along the dimensions of Warmth and Competence. Furthermore, analyzing the reasonings of LLMs, our findings indicate that LLMs demonstrate an awareness of social disparities, often stating statistical data and research findings to support their reasoning. This study contributes to the understanding of how LLMs perceive and represent social groups, shedding light on their potential biases and the perpetuation of harmful associations.

arXiv information

Authors	Sullam Jeoung, Yubin Ge, Jana Diesner
Published	2023-10-20 17:22:30+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
