For a semiotic AI: Bridging computer vision and visual semiotics for computational observation of large scale facial image archives

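Summary

Social networks are creating a digital world in which the cognitive, emotional, and pragmatic value of images of human faces and bodies is arguably changing. Researchers in the digital humanities, however, are often ill-equipped to study these phenomena at scale. This work introduces FRESCO (Face Representation in E-Societies through Computational Observation), a framework designed to explore, at scale, the socio-cultural implications of images on social media platforms. FRESCO uses state-of-the-art computer vision techniques, aligned with the principles of visual semiotics, to decompose images into numerical and categorical variables. It analyzes images at three levels: the plastic level, which covers basic visual features such as lines and colors; the figurative level, which represents specific entities or concepts; and the enunciation level, which focuses in particular on constructing the point of view of the spectator and observer. Analyzing these levels makes it possible to discern deeper narrative layers within the images. Experimental validation confirms FRESCO's reliability and utility, and its consistency and precision are assessed across two public datasets. The work then introduces the FRESCO score, a metric derived from the framework's output that serves as a reliable measure of similarity in image content.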

Summary (original)

Social networks are creating a digital world in which the cognitive, emotional, and pragmatic value of the imagery of human faces and bodies is arguably changing. However, researchers in the digital humanities are often ill-equipped to study these phenomena at scale. This work presents FRESCO (Face Representation in E-Societies through Computational Observation), a framework designed to explore the socio-cultural implications of images on social media platforms at scale. FRESCO deconstructs images into numerical and categorical variables using state-of-the-art computer vision techniques, aligning with the principles of visual semiotics. The framework analyzes images across three levels: the plastic level, encompassing fundamental visual features like lines and colors; the figurative level, representing specific entities or concepts; and the enunciation level, which focuses particularly on constructing the point of view of the spectator and observer. These levels are analyzed to discern deeper narrative layers within the imagery. Experimental validation confirms the reliability and utility of FRESCO, and we assess its consistency and precision across two public datasets. Subsequently, we introduce the FRESCO score, a metric derived from the framework’s output that serves as a reliable measure of similarity in image content.

arXiv information

Authors	Lia Morra, Antonio Santangelo, Pietro Basci, Luca Piano, Fabio Garcea, Fabrizio Lamberti, Massimo Leone
Published	2024-07-03 16:57:38+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
