Private Attribute Inference from Images with Vision-Language Models

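Summary

As large language models (LLMs) come into use throughout everyday tasks and digital interactions, the associated privacy risks are drawing increasing attention.
LLM privacy research has focused mainly on leakage of model training data, but it has recently been shown that improved model capabilities allow LLMs to make accurate, privacy-infringing inferences from previously unseen text.
With the rise of multimodal vision-language models (VLMs), which can understand both images and text, a pertinent question is whether such results carry over to the previously unexplored domain of benign images posted online.
To investigate the risks associated with the image reasoning capabilities of newly emerging VLMs, we compile an image dataset with human-annotated labels of the image owner's personal attributes.
To understand the additional privacy risk that VLMs pose beyond traditional human attribute recognition, our dataset consists of images in which the inferable private attributes do not stem from direct depictions of humans.
Evaluating the inferential capabilities of 7 state-of-the-art VLMs on this dataset, we find that they can infer various personal attributes at up to 77.6% accuracy.
Concerningly, we observe that accuracy scales with the general capabilities of the models, which implies that future models could be misused as stronger adversaries and makes the development of adequate defenses imperative.

Summary (original)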

As large language models (LLMs) become ubiquitous in our daily tasks and digital interactions, associated privacy risks are increasingly in focus. While LLM privacy research has primarily focused on the leakage of model training data, it has recently been shown that the increase in models’ capabilities has enabled LLMs to make accurate privacy-infringing inferences from previously unseen texts. With the rise of multimodal vision-language models (VLMs), capable of understanding both images and text, a pertinent question is whether such results transfer to the previously unexplored domain of benign images posted online. To investigate the risks associated with the image reasoning capabilities of newly emerging VLMs, we compile an image dataset with human-annotated labels of the image owner’s personal attributes. In order to understand the additional privacy risk posed by VLMs beyond traditional human attribute recognition, our dataset consists of images where the inferable private attributes do not stem from direct depictions of humans. On this dataset, we evaluate the inferential capabilities of 7 state-of-the-art VLMs, finding that they can infer various personal attributes at up to 77.6% accuracy. Concerningly, we observe that accuracy scales with the general capabilities of the models, implying that future models can be misused as stronger adversaries, establishing an imperative for the development of adequate defenses.

arXiv information

Authors	Batuhan Tömekçe, Mark Vero, Robin Staab, Martin Vechev
Published	2024-04-16 14:42:49+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
