Summary
Text-to-image (T2I) models are increasingly used in impactful real-life applications.
As such, there is a growing need to audit these models to ensure that they generate desirable, task-appropriate images.
However, systematically inspecting the associations between prompts and generated content in a human-understandable way remains challenging.
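To address this, we propose Concept2Concept, a framework where we characterize conditional distributions of vision language models using interpretable concepts and metrics that can be defined in terms of these concepts.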
This characterization allows us to use our framework to audit models and prompt-datasets.
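To demonstrate, we investigate several case studies of conditional distributions of prompts, such as user-defined distributions or empirical, real-world distributions.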
Lastly, we implement Concept2Concept as an open-source interactive visualization tool to facilitate use by non-technical end-users.
A demo is available at https://tinyurl.com/Concept2ConceptDemo.
Summary (original)
Text-to-image (T2I) models are increasingly used in impactful real-life applications. As such, there is a growing need to audit these models to ensure that they generate desirable, task-appropriate images. However, systematically inspecting the associations between prompts and generated content in a human-understandable way remains challenging. To address this, we propose Concept2Concept, a framework where we characterize conditional distributions of vision language models using interpretable concepts and metrics that can be defined in terms of these concepts. This characterization allows us to use our framework to audit models and prompt-datasets. To demonstrate, we investigate several case studies of conditional distributions of prompts, such as user-defined distributions or empirical, real-world distributions. Lastly, we implement Concept2Concept as an open-source interactive visualization tool to facilitate use by non-technical end-users. A demo is available at https://tinyurl.com/Concept2ConceptDemo.
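The abstract describes characterizing the conditional distribution of generated content in terms of interpretable concepts. As a rough illustration only, and not the authors' implementation, the sketch below assumes that some concept detector has already produced a list of concept labels for each image generated from one prompt. The function names `concept_distribution` and `cooccurrence` and the sample labels are hypothetical.

```python
from collections import Counter


def concept_distribution(detections):
    """Estimate P(concept | prompt) as the fraction of images containing each concept.

    `detections` is a list with one entry per generated image; each entry is a
    list of concept labels (assumed to come from an external detector).
    """
    n = len(detections)
    if n == 0:
        return {}
    counts = Counter()
    for concepts in detections:
        counts.update(set(concepts))  # count each concept at most once per image
    return {concept: k / n for concept, k in counts.items()}


def cooccurrence(detections, a, b):
    """Fraction of images in which concepts `a` and `b` both appear."""
    if not detections:
        return 0.0
    both = sum(1 for concepts in detections if a in concepts and b in concepts)
    return both / len(detections)


# Hypothetical detector output for four images generated from a single prompt.
detections = [
    ["person", "tie", "desk"],
    ["person", "tie"],
    ["person", "laptop"],
    ["dog"],
]

dist = concept_distribution(detections)
person_and_tie = cooccurrence(detections, "person", "tie")
```

Under these assumptions, `dist` maps each concept to its empirical frequency for the prompt, and `person_and_tie` is one example of a metric defined in terms of concepts. Auditing a prompt dataset would repeat this per prompt and compare the resulting distributions.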
arXiv information
Authors | Salma Abdel Magid, Weiwei Pan, Simon Warchol, Grace Guo, Junsik Kim, Mahia Rahman, Hanspeter Pfister |
Published | 2025-02-14 14:52:51+00:00 |
arXiv page | arxiv_id(pdf) |
Source, services used
arxiv.jp, Google