AttentionHTR: Handwritten Text Recognition Based on Attention Encoder-Decoder Networks

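Abstract

This work proposes an attention-based sequence-to-sequence model for handwritten word recognition and investigates transfer learning for data-efficient training of HTR systems.
To overcome the scarcity of training data, it uses models pre-trained on scene text images as a starting point for tailoring handwriting recognition models.
A ResNet feature extraction stage and a bidirectional LSTM-based sequence modeling stage together form the encoder.
The prediction stage consists of a decoder and a content-based attention mechanism.
The effectiveness of the proposed end-to-end HTR system is evaluated empirically on Imgur5K, a novel multi-writer dataset, and on the IAM dataset.
The experimental results assess the performance of the HTR framework and are further supported by an in-depth analysis of the error cases.
Source code and pre-trained models are available at https://github.com/dmitrijsk/AttentionHTR.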

Abstract (original)

This work proposes an attention-based sequence-to-sequence model for handwritten word recognition and explores transfer learning for data-efficient training of HTR systems. To overcome training data scarcity, this work leverages models pre-trained on scene text images as a starting point towards tailoring the handwriting recognition models. ResNet feature extraction and bidirectional LSTM-based sequence modeling stages together form an encoder. The prediction stage consists of a decoder and a content-based attention mechanism. The effectiveness of the proposed end-to-end HTR system has been empirically evaluated on a novel multi-writer dataset Imgur5K and the IAM dataset. The experimental results evaluate the performance of the HTR framework, further supported by an in-depth analysis of the error cases. Source code and pre-trained models are available at https://github.com/dmitrijsk/AttentionHTR.
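
The abstract names a content-based attention mechanism in the prediction stage but does not give its scoring function. As a minimal sketch, assuming an additive (Bahdanau-style) score, the snippet below shows how a decoder state could be compared against each encoder output to produce attention weights and a context vector. The function names, the weight matrices `W_s`, `W_h`, `v`, and the toy dimensions are all hypothetical and chosen for illustration; they are not taken from the AttentionHTR code base.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def content_attention(decoder_state, encoder_states, W_s, W_h, v):
    """Additive content-based attention (an assumed scoring function).

    score_i = v . tanh(W_s @ decoder_state + W_h @ encoder_states[i])
    Returns (weights, context): the weights sum to 1, and the context
    is the weighted sum of the encoder states.
    """
    projected_state = matvec(W_s, decoder_state)
    scores = []
    for h in encoder_states:
        projected_h = matvec(W_h, h)
        hidden = [math.tanh(a + b) for a, b in zip(projected_state, projected_h)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, encoder_states)) for d in range(dim)
    ]
    return weights, context


def random_matrix(rows, cols, rng):
    """Small random matrix standing in for learned parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Toy sizes, chosen only for illustration.
rng = random.Random(0)
T, enc_dim, dec_dim, att_dim = 6, 4, 3, 5
encoder_states = random_matrix(T, enc_dim, rng)  # stand-in for BiLSTM outputs
decoder_state = [rng.uniform(-0.5, 0.5) for _ in range(dec_dim)]
W_s = random_matrix(att_dim, dec_dim, rng)
W_h = random_matrix(att_dim, enc_dim, rng)
v = [rng.uniform(-0.5, 0.5) for _ in range(att_dim)]

weights, context = content_attention(decoder_state, encoder_states, W_s, W_h, v)
print(len(weights), len(context))
```

In a full recognizer, the context vector would typically be fed to the decoder at each step to predict the next character; here the encoder outputs are random placeholders rather than real ResNet and BiLSTM features.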

arXiv information

Authors	Dmitrijs Kass, Ekta Vats
Published	2022-09-12 11:47:55+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
