Calling a Spade a Heart: Gaslighting Multimodal Large Language Models via Negation

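Summary

Multimodal large language models (MLLMs) have made remarkable progress in integrating different modalities and excel at complex understanding and generation tasks.
Despite this success, MLLMs remain vulnerable to adversarial conversational inputs, particularly negation arguments.
This paper systematically evaluates state-of-the-art MLLMs across diverse benchmarks and reveals significant performance drops when negation arguments are introduced against initially correct responses.
Notably, it introduces GaslightingBench, the first benchmark specifically designed to evaluate the vulnerability of MLLMs to negation arguments.
GaslightingBench consists of multiple-choice questions curated from existing datasets, together with generated negation prompts, across 20 diverse categories.
In extensive evaluations, proprietary models such as Gemini-1.5-flash, GPT-4o, and Claude-3.5-Sonnet show better resilience than open-source counterparts such as Qwen2-VL and LLaVA.
However, all evaluated MLLMs struggle to maintain logical consistency under negation arguments during conversation.
The findings provide critical insights for improving the robustness of MLLMs against negation inputs and contribute to the development of more reliable and trustworthy multimodal AI systems.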

Summary (original)

Multimodal Large Language Models (MLLMs) have exhibited remarkable advancements in integrating different modalities, excelling in complex understanding and generation tasks. Despite their success, MLLMs remain vulnerable to conversational adversarial inputs, particularly negation arguments. This paper systematically evaluates state-of-the-art MLLMs across diverse benchmarks, revealing significant performance drops when negation arguments are introduced to initially correct responses. Notably, we introduce the first benchmark GaslightingBench, specifically designed to evaluate the vulnerability of MLLMs to negation arguments. GaslightingBench consists of multiple-choice questions curated from existing datasets, along with generated negation prompts across 20 diverse categories. Throughout extensive evaluation, we find that proprietary models such as Gemini-1.5-flash, GPT-4o and Claude-3.5-Sonnet demonstrate better resilience compared to open-source counterparts like Qwen2-VL and LLaVA. However, all evaluated MLLMs struggle to maintain logical consistency under negation arguments during conversation. Our findings provide critical insights for improving the robustness of MLLMs against negation inputs, contributing to the development of more reliable and trustworthy multimodal AI systems.
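The protocol described in the abstract (dispute an initially correct answer, then check whether the model keeps it) can be sketched as a small scoring routine. This is a minimal illustration, not the authors' evaluation code: the `Trial` record, the `negation_report` function, and the metric names are hypothetical, and the sketch assumes answers are compared as exact multiple-choice option labels.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One multiple-choice question (hypothetical record layout)."""
    correct: str          # ground-truth option label, e.g. "B"
    initial: str          # model's answer before the negation turn
    after_negation: str   # model's answer after the user disputes it


def negation_report(trials):
    """Summarize how often initially correct answers survive a negation turn."""
    total = len(trials)
    if total == 0:
        raise ValueError("need at least one trial")

    # Only initially correct answers are challenged with a negation argument.
    initially_correct = [t for t in trials if t.initial == t.correct]
    kept = [t for t in initially_correct if t.after_negation == t.correct]

    acc_before = len(initially_correct) / total
    acc_after = len(kept) / total
    # Share of initially correct answers that the model abandoned.
    flip_rate = (
        (len(initially_correct) - len(kept)) / len(initially_correct)
        if initially_correct
        else 0.0
    )
    return {
        "accuracy_before": acc_before,
        "accuracy_after": acc_after,
        "drop": acc_before - acc_after,
        "flip_rate": flip_rate,
    }


# Toy data, invented purely for illustration.
example_trials = [
    Trial(correct="A", initial="A", after_negation="A"),  # held its ground
    Trial(correct="B", initial="B", after_negation="C"),  # flipped under negation
    Trial(correct="C", initial="C", after_negation="C"),  # held its ground
    Trial(correct="D", initial="A", after_negation="A"),  # wrong from the start
]
example_report = negation_report(example_trials)
```

Reporting the flip rate alongside the accuracy drop separates two effects: how many questions the model could answer at all, and how readily it abandons a correct answer when contradicted.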

arXiv information

Authors	Bin Zhu, Huiyan Qi, Yinxuan Gui, Jingjing Chen, Chong-Wah Ngo, Ee-Peng Lim
Published	2025-03-10 13:50:13+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
