Pho(SC)-CTC — A Hybrid Approach Towards Zero-shot Word Image Recognition

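Summary

Annotating words in a historical document image archive for word image recognition takes time and skilled people, such as historians and paleographers.
In real-world scenarios, it is also impractical to obtain sample images of every possible word.
However, zero-shot learning methods can be used to recognize unseen or out-of-lexicon words in such historical document images.
Building on Pho(SC)Net, the previous state-of-the-art method for zero-shot word recognition, the paper proposes Pho(SC)-CTC, a hybrid model based on the CTC framework.
The model takes the rich features learned by Pho(SC)Net and passes them to a connectionist temporal classification (CTC) framework, which performs the final classification.
Encouraging results on two publicly available historical document datasets and one synthetic handwritten dataset support the efficacy of Pho(SC)-CTC and Pho(SC)Net.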
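
To make the final CTC classification step more concrete, here is a minimal sketch of greedy (best-path) CTC decoding in plain Python. It is not the authors' implementation: the alphabet, the blank index, and the per-frame scores are hypothetical values chosen for illustration, and the abstract does not say which decoding strategy Pho(SC)-CTC uses.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (a common convention)


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring label index for every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]
    # Collapse runs of identical consecutive labels into a single label.
    collapsed = [label for label, _ in groupby(best_path)]
    # Remove blanks, then map the remaining indices to characters.
    return "".join(alphabet[label] for label in collapsed if label != blank)


# Hypothetical alphabet and per-frame scores (illustration only).
alphabet = ["-", "c", "a", "t"]
frame_scores = [
    [0.10, 0.70, 0.10, 0.10],  # c
    [0.10, 0.60, 0.20, 0.10],  # c (repeat, merged)
    [0.80, 0.10, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.70, 0.10],  # a
    [0.10, 0.10, 0.10, 0.70],  # t
    [0.10, 0.10, 0.10, 0.70],  # t (repeat, merged)
]
decoded = ctc_greedy_decode(frame_scores, alphabet)
print(decoded)
```

Because the decoder emits a character sequence rather than choosing from a fixed word list, this kind of output layer can in principle spell words that never appeared in training, which is the property a zero-shot setting needs.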

Abstract (original)

Annotating words in a historical document image archive for word image recognition purpose demands time and skilled human resource (like historians, paleographers). In a real-life scenario, obtaining sample images for all possible words is also not feasible. However, Zero-shot learning methods could aptly be used to recognize unseen/out-of-lexicon words in such historical document images. Based on previous state-of-the-art method for zero-shot word recognition Pho(SC)Net, we propose a hybrid model based on the CTC framework (Pho(SC)-CTC) that takes advantage of the rich features learned by Pho(SC)Net followed by a connectionist temporal classification (CTC) framework to perform the final classification. Encouraging results were obtained on two publicly available historical document datasets and one synthetic handwritten dataset, which justifies the efficacy of Pho(SC)-CTC and Pho(SC)Net.

arXiv information

Authors	Ravi Bhatt, Anuj Rai, Narayanan C. Krishnan, Sukalpa Chanda
Published	2022-12-21 08:21:05+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
