HOLMES: HOLonym-MEronym based Semantic inspection for Convolutional Image Classifiers

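Abstract

Convolutional Neural Networks (CNNs) are currently the model of choice in computer vision because they automate feature extraction in visual tasks.
However, the knowledge they acquire during training is entirely subsymbolic, which makes it difficult to understand and to explain to end users.
This paper proposes HOLMES (HOLonym-MEronym based Semantic inspection), a new technique that decomposes a label into a set of related concepts and provides component-level explanations for an image classification model.
Specifically, HOLMES leverages ontologies, web scraping, and transfer learning to automatically build meronym (part) detectors for a given holonym (class).
It then produces heatmaps at the meronym level and, finally, probes the holonym CNN with occluded images to highlight the importance of each part to the classification output.
Compared with state-of-the-art saliency methods, HOLMES goes a step further: it indicates both where the holonym CNN is looking and what it is looking at, without relying on densely annotated datasets and without forcing each concept to be associated with a single computational unit.
An extensive experimental evaluation on different categories of objects (animals, tools, and vehicles) shows the feasibility of the approach.
On average, HOLMES explanations include at least two meronyms, and ablating a single meronym roughly halves the holonym model's confidence.
The resulting heatmaps were evaluated quantitatively using deletion/insertion/preservation curves.
All metrics were comparable to those achieved by GradCAM, with the added advantage that the heatmap can be further decomposed into human-understandable concepts; this highlights both the relevance of meronyms to object classification and the ability of HOLMES to capture it.
The code is available at https://github.com/FrancesC0de/HOLMES.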
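
The occlusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed heatmap threshold, the fill value, and the toy stand-in for the holonym classifier are all assumptions made for illustration. It shows one plausible way to mask the region a meronym heatmap highlights and to measure the relative drop in holonym confidence.

```python
import numpy as np

def occlude(image, heatmap, threshold=0.5, fill_value=0.0):
    """Return a copy of `image` (H, W, C) with pixels whose heatmap value
    is >= threshold replaced by `fill_value` (hypothetical masking rule)."""
    mask = heatmap >= threshold      # (H, W) boolean mask
    occluded = image.copy()
    occluded[mask] = fill_value      # applies to every channel of masked pixels
    return occluded

def meronym_importance(image, heatmaps, holonym_confidence, threshold=0.5):
    """Relative confidence drop per meronym when its region is occluded.

    `heatmaps` maps a meronym name to an (H, W) array in [0, 1];
    `holonym_confidence` is any callable returning the class score in [0, 1].
    """
    base = holonym_confidence(image)
    drops = {}
    for name, heatmap in heatmaps.items():
        conf = holonym_confidence(occlude(image, heatmap, threshold))
        drops[name] = (base - conf) / base if base > 0 else 0.0
    return drops

# Toy stand-in for a holonym CNN (assumption): confidence = mean pixel value.
def toy_confidence(image):
    return float(image.mean())

image = np.ones((4, 4, 3))           # dummy 4x4 RGB image

# Hypothetical meronym heatmaps for a "bicycle" holonym.
wheel = np.zeros((4, 4))
wheel[:2, :2] = 1.0                  # covers 4 of 16 pixels
handlebar = np.zeros((4, 4))
handlebar[:2, :] = 1.0               # covers 8 of 16 pixels

drops = meronym_importance(
    image, {"wheel": wheel, "handlebar": handlebar}, toy_confidence
)
print(drops)  # {'wheel': 0.25, 'handlebar': 0.5}
```

In this toy setup a drop of 0.5 corresponds to the "roughly halves the confidence" effect reported in the abstract; with a real classifier, `holonym_confidence` would wrap a forward pass of the CNN and return the softmax score of the holonym class.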

Abstract (original)

Convolutional Neural Networks (CNNs) are nowadays the model of choice in Computer Vision, thanks to their ability to automatize the feature extraction process in visual tasks. However, the knowledge acquired during training is fully subsymbolic, and hence difficult to understand and explain to end users. In this paper, we propose a new technique called HOLMES (HOLonym-MEronym based Semantic inspection) that decomposes a label into a set of related concepts, and provides component-level explanations for an image classification model. Specifically, HOLMES leverages ontologies, web scraping and transfer learning to automatically construct meronym (parts)-based detectors for a given holonym (class). Then, it produces heatmaps at the meronym level and finally, by probing the holonym CNN with occluded images, it highlights the importance of each part on the classification output. Compared to state-of-the-art saliency methods, HOLMES takes a step further and provides information about both where and what the holonym CNN is looking at, without relying on densely annotated datasets and without forcing concepts to be associated to single computational units. Extensive experimental evaluation on different categories of objects (animals, tools and vehicles) shows the feasibility of our approach. On average, HOLMES explanations include at least two meronyms, and the ablation of a single meronym roughly halves the holonym model confidence. The resulting heatmaps were quantitatively evaluated using the deletion/insertion/preservation curves. All metrics were comparable to those achieved by GradCAM, while offering the advantage of further decomposing the heatmap in human-understandable concepts, thus highlighting both the relevance of meronyms to object classification, as well as HOLMES ability to capture it. The code is available at https://github.com/FrancesC0de/HOLMES.

arXiv information

Authors	Francesco Dibitonto, Fabio Garcea, André Panisson, Alan Perotti, Lia Morra
Published	2024-03-13 13:51:02+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
