Approaches of large-scale images recognition with more than 50,000 categories

Summary

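Current CV models achieve high accuracy on small-scale image classification datasets with hundreds or thousands of categories, but on large-scale datasets with more than 50,000 categories many of them become infeasible in terms of computation or memory consumption.
In this paper, we present a viable solution for classifying a large-scale species dataset that uses traditional CV techniques such as feature extraction and processing and BOVW (Bag of Visual Words), together with statistical learning techniques such as Mini-Batch K-Means and SVM, and then combines them with a neural network model.
In applying these techniques, we optimized time and memory consumption so that they remain practical on large-scale datasets.
We also use several techniques to reduce the impact of mislabeled data.
We use a dataset with more than 50,000 categories, and all operations run on an ordinary computer with 16 GB of RAM and a 3.0 GHz CPU.
Our contributions are as follows: 1) We analyze the problems that can arise during training and present several feasible ways to solve them.
2) We combine traditional CV models with neural network models to provide feasible scenarios for training on large-scale classification datasets within time and space constraints.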
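The BOVW and Mini-Batch K-Means steps named in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`minibatch_kmeans`, `bovw_histogram`, `synthetic_batches`), the codebook size, and the synthetic random descriptors are all assumptions made for illustration. The sketch shows why mini-batch updates suit a memory-limited machine: only one batch of local descriptors is held at a time while the visual-word codebook is refined, and each image is then reduced to a fixed-length histogram.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers, v):
    """Index of the center closest to descriptor v."""
    return min(range(len(centers)), key=lambda k: sq_dist(centers[k], v))


def minibatch_kmeans(batches, k, seed=0):
    """Learn k visual words from a stream of descriptor batches.

    Only one batch is held in memory at a time. Each center moves toward
    its assigned points with a per-center step size of 1 / count.
    """
    rng = random.Random(seed)
    centers, counts = None, [0] * k
    for batch in batches:
        if centers is None:
            # Assumption: initialise from k random descriptors of the first batch.
            centers = [list(v) for v in rng.sample(batch, k)]
        # Assign the whole batch first, with the centers held fixed.
        assigned = [nearest(centers, v) for v in batch]
        for v, c in zip(batch, assigned):
            counts[c] += 1
            eta = 1.0 / counts[c]
            centers[c] = [(1 - eta) * m + eta * x for m, x in zip(centers[c], v)]
    return centers


def bovw_histogram(descriptors, centers):
    """Normalised histogram of visual-word occurrences for one image."""
    hist = [0.0] * len(centers)
    for v in descriptors:
        hist[nearest(centers, v)] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def synthetic_batches(n_batches, batch_size, dim, seed=1):
    """Hypothetical stand-in for descriptors read from disk batch by batch."""
    rng = random.Random(seed)
    for _ in range(n_batches):
        yield [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(batch_size)]


# Build a 16-word codebook from 20 batches of 64 eight-dimensional descriptors.
codebook = minibatch_kmeans(synthetic_batches(20, 64, 8), k=16)

# Encode one "image" (40 descriptors) as a 16-bin histogram.
image_descriptors = next(synthetic_batches(1, 40, 8, seed=2))
histogram = bovw_histogram(image_descriptors, codebook)
```

In a pipeline like the one described, the resulting histogram would serve as the fixed-length input to a classifier such as an SVM. That step is omitted here, and the paper's actual descriptor type, codebook size, and classifier settings are not stated in this summary.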

Summary (original)

Though current CV models have been able to achieve high levels of accuracy on small-scale image classification datasets with hundreds or thousands of categories, many models become infeasible in computation or space consumption when it comes to large-scale datasets with more than 50,000 categories. In this paper, we provide a viable solution for classifying large-scale species datasets using traditional CV techniques such as feature extraction and processing and BOVW (Bag of Visual Words), and some statistical learning techniques like Mini-Batch K-Means and SVM, which are then mixed with a neural network model. When applying these techniques, we have done some optimization in time and memory consumption, so that they are feasible for large-scale datasets. We also use some techniques to reduce the impact of mislabeled data. We use a dataset with more than 50,000 categories, and all operations are done on a common computer with 16GB RAM and a 3.0GHz CPU. Our contributions are: 1) we analyze what problems may be met in the training process and present several feasible ways to solve them; 2) we combine traditional CV models with neural network models to provide some feasible scenarios for training large-scale classification datasets within the constraints of time and spatial resources.

arXiv information

Authors	Wanhong Huang, Rui Geng
Published	2024-07-09 16:36:23+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
