Handwritten Text Generation from Visual Archetypes

Abstract

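Generating synthetic images of handwritten text in a specific writer's style is challenging, particularly for unseen styles and new words, and all the more so when those words contain characters rarely encountered during training.
Generative models have recently been used to emulate a writer's style, but generalization to rare characters has been neglected.
In this work, we devise a Transformer-based model for few-shot styled handwritten text generation, focusing on obtaining robust and informative representations of both the text and the style.
In particular, we propose a novel representation of the textual content as a sequence of dense vectors obtained from images of the symbols rendered as standard GNU Unifont glyphs, which can be regarded as their visual archetypes.
This strategy is better suited to generating characters that, although rarely seen during training, may share visual details with frequently observed ones.
For the style, we obtain a robust representation of unseen writers' handwriting through dedicated pre-training on a large synthetic dataset.
Quantitative and qualitative results show that our proposal generates words in unseen styles and with rare characters more faithfully than existing approaches that rely on independent one-hot encodings of the characters.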

Abstract (original)

Generating synthetic images of handwritten text in a writer-specific style is a challenging task, especially in the case of unseen styles and new words, and even more when these latter contain characters that are rarely encountered during training. While emulating a writer’s style has been recently addressed by generative models, the generalization towards rare characters has been disregarded. In this work, we devise a Transformer-based model for Few-Shot styled handwritten text generation and focus on obtaining a robust and informative representation of both the text and the style. In particular, we propose a novel representation of the textual content as a sequence of dense vectors obtained from images of symbols written as standard GNU Unifont glyphs, which can be considered their visual archetypes. This strategy is more suitable for generating characters that, despite having been seen rarely during training, possibly share visual details with the frequently observed ones. As for the style, we obtain a robust representation of unseen writers’ calligraphy by exploiting specific pre-training on a large synthetic dataset. Quantitative and qualitative results demonstrate the effectiveness of our proposal in generating words in unseen styles and with rare characters more faithfully than existing approaches relying on independent one-hot encodings of the characters.
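
The contrast between one-hot encodings and glyph-based "visual archetypes" can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the 5x5 bitmaps are hand-drawn stand-ins for real GNU Unifont glyphs, the names `GLYPHS`, `glyph_pixels`, `make_projection`, and `encode_text` are hypothetical, and the fixed random projection merely stands in for whatever learned mapping the model uses to turn glyph images into dense vectors.

```python
import math
import random

# Hypothetical 5x5 bitmaps standing in for real GNU Unifont glyphs
# (hand-drawn for illustration; not copied from the font).
GLYPHS = {
    "O": ["01110", "10001", "10001", "10001", "01110"],
    "Q": ["01110", "10001", "10101", "10010", "01101"],
    "I": ["11111", "00100", "00100", "00100", "11111"],
}
ALPHABET = sorted(GLYPHS)


def glyph_pixels(char):
    """Flatten a glyph bitmap into a list of 0.0/1.0 pixel values."""
    return [float(bit) for row in GLYPHS[char] for bit in row]


def one_hot(char):
    """Independent one-hot encoding: every pair of characters is orthogonal."""
    return [1.0 if c == char else 0.0 for c in ALPHABET]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def make_projection(in_dim, out_dim, seed=0):
    """Fixed random matrix; an assumed stand-in for a learned linear layer."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(in_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
            for _ in range(out_dim)]


def encode_text(text, weights):
    """Map a string to a sequence of dense vectors, one per character."""
    encoded = []
    for char in text:
        pixels = glyph_pixels(char)
        encoded.append([sum(w * x for w, x in zip(row, pixels))
                        for row in weights])
    return encoded


if __name__ == "__main__":
    weights = make_projection(in_dim=25, out_dim=8)
    sequence = encode_text("OQI", weights)
    print(len(sequence), len(sequence[0]))
    print(cosine(one_hot("O"), one_hot("Q")))
    print(cosine(glyph_pixels("O"), glyph_pixels("Q")))
    print(cosine(glyph_pixels("O"), glyph_pixels("I")))
```

In this toy setup, one-hot vectors give no similarity between any two characters, while the pixel vectors place "O" closer to "Q" than to "I". That shared visual structure is the kind of information a model could draw on when it has rarely seen a character during training.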

arXiv information

Authors	Vittorio Pippi, Silvia Cascianelli, Rita Cucchiara
Published	2023-03-27 14:58:20+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
