GOAT-Bench: Safety Insights to Large Multimodal Models through Meme-Based Social Abuse

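Summary

The exponential growth of social media has profoundly changed how information is created, disseminated, and absorbed, on a scale without precedent in the digital age.
Unfortunately, this explosion has also brought a significant increase in the online abuse of memes.
Assessing the negative impact of memes is particularly challenging, because their meaning is often subtle and implicit and is not conveyed directly through the overt text and image.
Against this background, large multimodal models (LMMs) have become a focus of interest for their remarkable capabilities across diverse multimodal tasks.
In response, our paper aims to examine thoroughly how well various LMMs (e.g., GPT-4o) can discern and respond to the nuanced aspects of social abuse manifested in memes.
We introduce GOAT-Bench, a comprehensive meme benchmark of over 6K varied memes covering themes such as implicit hate speech, sexism, and cyberbullying. Using GOAT-Bench, we probe the ability of LMMs to accurately assess hatefulness, misogyny, offensiveness, sarcasm, and harmful content.
Extensive experiments across a range of LMMs show that current models still lack safety awareness and are insensitive to various forms of implicit abuse.
We posit that this shortfall is a critical obstacle to realizing safe artificial intelligence.
GOAT-Bench and its accompanying resources are publicly available at https://goatlmm.github.io/, contributing to ongoing research in this important field.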

Summary (original)

The exponential growth of social media has profoundly transformed how information is created, disseminated, and absorbed, exceeding any precedent in the digital age. Regrettably, this explosion has also spawned a significant increase in the online abuse of memes. Evaluating the negative impact of memes is notably challenging, owing to their often subtle and implicit meanings, which are not directly conveyed through the overt text and image. In light of this, large multimodal models (LMMs) have emerged as a focal point of interest due to their remarkable capabilities in handling diverse multimodal tasks. In response to this development, our paper aims to thoroughly examine the capacity of various LMMs (e.g., GPT-4o) to discern and respond to the nuanced aspects of social abuse manifested in memes. We introduce the comprehensive meme benchmark, GOAT-Bench, comprising over 6K varied memes encapsulating themes such as implicit hate speech, sexism, and cyberbullying, etc. Utilizing GOAT-Bench, we delve into the ability of LMMs to accurately assess hatefulness, misogyny, offensiveness, sarcasm, and harmful content. Our extensive experiments across a range of LMMs reveal that current models still exhibit a deficiency in safety awareness, showing insensitivity to various forms of implicit abuse. We posit that this shortfall represents a critical impediment to the realization of safe artificial intelligence. The GOAT-Bench and accompanying resources are publicly accessible at https://goatlmm.github.io/, contributing to ongoing research in this vital field.

arXiv information

Authors	Hongzhan Lin, Ziyang Luo, Bo Wang, Ruichao Yang, Jing Ma
Published	2025-02-28 15:13:46+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
