PP-OCRv3: More Attempts for the Improvement of Ultra Lightweight OCR System

Summary

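Optical character recognition (OCR) technology is widely used in a variety of scenarios, as shown in Figure 1. Designing a practical OCR system remains a meaningful but challenging task.
In previous work, with both efficiency and accuracy in mind, we proposed a practical ultra-lightweight OCR system (PP-OCR) and an optimized version, PP-OCRv2.
To further improve the performance of PP-OCRv2, this paper proposes a more robust OCR system, PP-OCRv3.
Building on PP-OCRv2, PP-OCRv3 upgrades the text detection model and the text recognition model in nine aspects.
For the text detector, we introduce LK-PAN, a PAN module with a large receptive field; RSE-FPN, an FPN module with a residual attention mechanism; and the DML distillation strategy.
For the text recognizer, the base model is changed from CRNN to SVTR, and we introduce the lightweight text recognition network SVTR LCNet, attention-guided training of CTC, the data augmentation strategy TextConAug, a better pre-trained model obtained with the self-supervised TextRotNet, UDML, and UIM, which together accelerate the model and improve its effectiveness.
Experiments on real data show that the hmean of PP-OCRv3 is 5% higher than that of PP-OCRv2 at comparable inference speed.
All of the models above are open source, and the code is available in the GitHub repository PaddleOCR, which is powered by PaddlePaddle.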
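
The abstract reports results as hmean without defining it. In OCR evaluation, hmean conventionally denotes the harmonic mean of precision and recall (the F1 score). The sketch below illustrates that conventional definition only; the function names and the example counts are hypothetical, and the paper's exact evaluation protocol (for example, how a predicted text box is matched to a ground-truth box) is not described on this page.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (equivalent to the F1 score)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing was matched
    return 2 * precision * recall / (precision + recall)


def detection_hmean(true_positives: int, num_predicted: int, num_ground_truth: int) -> float:
    """Compute hmean from raw match counts (hypothetical helper for illustration)."""
    precision = true_positives / num_predicted if num_predicted else 0.0
    recall = true_positives / num_ground_truth if num_ground_truth else 0.0
    return hmean(precision, recall)


if __name__ == "__main__":
    # Made-up counts: 8 correct matches, 10 predictions, 16 ground-truth items.
    print(f"{detection_hmean(8, 10, 16):.4f}")
```

Because the harmonic mean is dominated by the smaller of its two inputs, a system cannot raise hmean much by trading recall for precision or the reverse.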

Abstract (original)

Optical character recognition (OCR) technology has been widely used in various scenes, as shown in Figure 1. Designing a practical OCR system is still a meaningful but challenging task. In previous work, considering the efficiency and accuracy, we proposed a practical ultra lightweight OCR system (PP-OCR), and an optimized version PP-OCRv2. In order to further improve the performance of PP-OCRv2, a more robust OCR system PP-OCRv3 is proposed in this paper. PP-OCRv3 upgrades the text detection model and text recognition model in 9 aspects based on PP-OCRv2. For text detector, we introduce a PAN module with large receptive field named LK-PAN, a FPN module with residual attention mechanism named RSE-FPN, and DML distillation strategy. For text recognizer, the base model is replaced from CRNN to SVTR, and we introduce lightweight text recognition network SVTR LCNet, guided training of CTC by attention, data augmentation strategy TextConAug, better pre-trained model by self-supervised TextRotNet, UDML, and UIM to accelerate the model and improve the effect. Experiments on real data show that the hmean of PP-OCRv3 is 5% higher than PP-OCRv2 under comparable inference speed. All the above mentioned models are open-sourced and the code is available in the GitHub repository PaddleOCR which is powered by PaddlePaddle.

arXiv information

Authors	Chenxia Li, Weiwei Liu, Ruoyu Guo, Xiaoting Yin, Kaitao Jiang, Yongkun Du, Yuning Du, Lingfeng Zhu, Baohua Lai, Xiaoguang Hu, Dianhai Yu, Yanjun Ma
Published	2022-06-14 12:00:10+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
