Frugal Reinforcement-based Active Learning

Abstract

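Most existing learning models, deep neural networks in particular, rely on large datasets whose manual labeling is expensive and time-consuming.
A current trend is to make the training of these models frugal and less dependent on large collections of labeled data.
Among existing solutions, deep active learning is attracting major interest; its goal is to train deep networks with as few labeled samples as possible.
However, the success of active learning depends heavily on how critical these samples are when training the model.
In this paper, we devise a novel active learning approach for label-efficient training.
The proposed method is iterative and aims to minimize a constrained objective function that combines diversity, representativity, and uncertainty criteria.
The approach is probabilistic and unifies all of these criteria in a single objective function, whose solution models the probability of relevance of each sample (i.e., how critical it is) when learning a decision function.
We also introduce a novel weighting mechanism based on reinforcement learning, which uses a particular stateless Q-learning model to adaptively balance these criteria at each training iteration.
Extensive experiments on staple image classification data, including Object-DOTA, show the effectiveness of the proposed model relative to several baselines, including random, uncertainty, and flat, as well as other work.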
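The abstract gives neither the constrained objective nor the exact Q-learning update, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a small discrete set of candidate weightings (`ACTIONS`), the common stateless update Q(a) ← Q(a) + α(r − Q(a)), and a plain normalized weighted sum standing in for the paper's objective. The names `ACTIONS`, `StatelessQLearner`, and `relevance_scores` are hypothetical.

```python
import random

# Hypothetical action set: each action is one weighting of
# (diversity, representativity, uncertainty). Not taken from the paper.
ACTIONS = [
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
    (1 / 3, 1 / 3, 1 / 3),
]


class StatelessQLearner:
    """Epsilon-greedy Q-learning with a single implicit state."""

    def __init__(self, n_actions, lr=0.5, epsilon=0.1, seed=0):
        self.q = [0.0] * n_actions      # one Q-value per action
        self.lr = lr                    # learning rate (alpha)
        self.epsilon = epsilon          # exploration probability
        self.rng = random.Random(seed)

    def select(self):
        # Explore with probability epsilon, otherwise pick the best action.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q))
        return max(range(len(self.q)), key=lambda a: self.q[a])

    def update(self, action, reward):
        # No next-state term, because there is only one state.
        self.q[action] += self.lr * (reward - self.q[action])


def relevance_scores(diversity, representativity, uncertainty, weights):
    """Normalized weighted sum of per-sample criteria.

    Assumption: criteria values are non-negative. This stands in for the
    paper's constrained objective, which the abstract does not specify.
    """
    w_d, w_r, w_u = weights
    raw = [w_d * d + w_r * r + w_u * u
           for d, r, u in zip(diversity, representativity, uncertainty)]
    total = sum(raw)
    if total <= 0:
        # Fall back to a uniform distribution when every score is zero.
        return [1.0 / len(raw)] * len(raw)
    return [x / total for x in raw]
```

In an active learning loop built on this sketch, each iteration would call `select()` to pick a weighting, score the unlabeled pool with `relevance_scores`, label the highest-scoring samples, retrain, and pass some reward to `update()`. Using the change in validation accuracy as that reward is one plausible choice, and also an assumption here.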

Abstract (original)

Most of the existing learning models, particularly deep neural networks, are reliant on large datasets whose hand-labeling is expensive and time demanding. A current trend is to make the learning of these models frugal and less dependent on large collections of labeled data. Among the existing solutions, deep active learning is currently witnessing a major interest and its purpose is to train deep networks using as few labeled samples as possible. However, the success of active learning is highly dependent on how critical are these samples when training models. In this paper, we devise a novel active learning approach for label-efficient training. The proposed method is iterative and aims at minimizing a constrained objective function that mixes diversity, representativity and uncertainty criteria. The proposed approach is probabilistic and unifies all these criteria in a single objective function whose solution models the probability of relevance of samples (i.e., how critical) when learning a decision function. We also introduce a novel weighting mechanism based on reinforcement learning, which adaptively balances these criteria at each training iteration, using a particular stateless Q-learning model. Extensive experiments conducted on staple image classification data, including Object-DOTA, show the effectiveness of our proposed model w.r.t. several baselines including random, uncertainty and flat as well as other work.

arXiv information

Authors	Sebastien Deschamps, Hichem Sahbi
Published	2022-12-09 14:17:45+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
