Toward Auto-evaluation with Confidence-based Category Relation-aware Regression

Summary

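[Title] Toward Auto-evaluation with Confidence-based Category Relation-aware Regression

[Summary]

– Auto-evaluation aims to automatically evaluate a trained model on any test dataset without human annotations.
– Most existing methods use global statistics of the features extracted by the model as the representation of a dataset.
– Doing so ignores the influence of the classification head and loses the model's category-wise confusion information.
– However, the ratios of instances assigned to different categories, together with their confidence scores, reflect how many instances in which categories are difficult for the model to classify, and therefore carry significant indicators of both overall and category-wise performance.
– The paper proposes a Confidence-based Category Relation-aware Regression ($C^2R^2$) method.
– $C^2R^2$ divides all instances in a meta-set into different categories according to their confidence scores and extracts a global representation from them.
– For each category, $C^2R^2$ encodes its local confusion relations to the other categories into a local representation.
– The overall performance is regressed from the global representation, and the category-wise performance from the local representations.
– Extensive experiments show the effectiveness of the proposed method.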
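
The summary does not say how the two representations are actually built, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the model's softmax outputs are available as a list of per-instance probability vectors, that an instance is assigned to the category with the highest confidence, that the global representation is the per-category instance ratio concatenated with the per-category mean confidence, and that the local representation of a category is the mean probability vector of the instances assigned to it. The function names `global_representation` and `local_representations` are hypothetical.

```python
def _predicted_category(p):
    # Assumption: an instance belongs to the category with the highest confidence.
    return max(range(len(p)), key=p.__getitem__)


def global_representation(probs):
    """Return [ratio per category] + [mean confidence per category].

    probs: list of softmax vectors, one per instance in the meta-set.
    """
    num_classes = len(probs[0])
    counts = [0] * num_classes
    conf_sums = [0.0] * num_classes
    for p in probs:
        c = _predicted_category(p)
        counts[c] += 1
        conf_sums[c] += p[c]
    n = len(probs)
    ratios = [count / n for count in counts]
    # Categories that received no instances get a mean confidence of 0.0.
    mean_conf = [s / c if c else 0.0 for s, c in zip(conf_sums, counts)]
    return ratios + mean_conf


def local_representations(probs):
    """Return one vector per category: the mean softmax vector of the
    instances assigned to it (all zeros if none were assigned)."""
    num_classes = len(probs[0])
    counts = [0] * num_classes
    sums = [[0.0] * num_classes for _ in range(num_classes)]
    for p in probs:
        c = _predicted_category(p)
        counts[c] += 1
        for j, value in enumerate(p):
            sums[c][j] += value
    return [
        [v / counts[c] if counts[c] else 0.0 for v in sums[c]]
        for c in range(num_classes)
    ]


if __name__ == "__main__":
    toy_probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
    print(global_representation(toy_probs))
    print(local_representations(toy_probs))
```

In a setup like the one the summary describes, such vectors would be computed for many labeled meta-sets, and a regressor would then be fit to map the global vector to overall accuracy and each local vector to the accuracy of its category. The choice of regressor is left open here because the summary does not specify it.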

Abstract (original)

Auto-evaluation aims to automatically evaluate a trained model on any test dataset without human annotations. Most existing methods utilize global statistics of features extracted by the model as the representation of a dataset. This ignores the influence of the classification head and loses category-wise confusion information of the model. However, ratios of instances assigned to different categories together with their confidence scores reflect how many instances in which categories are difficult for the model to classify, which contain significant indicators for both overall and category-wise performances. In this paper, we propose a Confidence-based Category Relation-aware Regression ($C^2R^2$) method. $C^2R^2$ divides all instances in a meta-set into different categories according to their confidence scores and extracts the global representation from them. For each category, $C^2R^2$ encodes its local confusion relations to other categories into a local representation. The overall and category-wise performances are regressed from global and local representations, respectively. Extensive experiments show the effectiveness of our method.

arXiv information

Authors	Jiexin Wang, Jiahao Chen, Bing Su
Published	2023-04-17 14:00:29+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, OpenAI
