LMRPA: Large Language Model-Driven Efficient Robotic Process Automation for OCR

Abstract

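This paper introduces LMRPA, a novel large model-driven robotic process automation (RPA) model designed to substantially improve the efficiency and speed of optical character recognition (OCR) tasks.
Traditional RPA platforms often hit performance bottlenecks when handling high-volume, repetitive processes such as OCR, which makes those processes less efficient and more time-consuming.
LMRPA integrates large language models (LLMs) to improve the accuracy and readability of the extracted text, addressing the challenges posed by ambiguous characters and complex text structures.
Extensive benchmarks compared LMRPA with leading RPA platforms, including UiPath and Automation Anywhere, using OCR engines such as Tesseract and DocTR.
The results show that LMRPA achieves superior performance, cutting processing time by up to 52%.
For instance, in Batch 2 of the Tesseract OCR task, LMRPA completed the process in 9.8 seconds, whereas UiPath took 18.1 seconds and Automation Anywhere took 18.7 seconds.
Similar improvements were observed with DocTR, where LMRPA completed the tasks in 12.7 seconds while competing tools running the same process took more than 20 seconds.
These findings highlight LMRPA's potential to transform OCR-driven automation, offering a more efficient and effective alternative to existing state-of-the-art RPA models.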
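
As a quick sanity check on the quoted timings, the relative time saved can be computed directly. The sketch below is illustrative only: the helper name `percent_reduction` is hypothetical and not from the paper, and it assumes "reduction" means time saved relative to the competing tool's time.

```python
def percent_reduction(baseline_s: float, improved_s: float) -> float:
    """Time saved relative to the baseline, in percent (assumed definition)."""
    return (baseline_s - improved_s) / baseline_s * 100.0


# Timings quoted in the abstract for Batch 2 of the Tesseract OCR task (seconds).
lmrpa_tesseract_s = 9.8
baselines_s = {"UiPath": 18.1, "Automation Anywhere": 18.7}

reductions = {
    name: round(percent_reduction(t, lmrpa_tesseract_s), 1)
    for name, t in baselines_s.items()
}
for name, value in reductions.items():
    print(f"LMRPA vs {name}: {value}% less time")

# DocTR: competitors took "over 20 seconds", so 20.0 s gives only a lower bound.
doctr_lower_bound = round(percent_reduction(20.0, 12.7), 1)
print(f"DocTR: more than {doctr_lower_bound}% less time")
```

Under that definition, the Batch 2 Tesseract figures work out to roughly 46% to 48%, and the DocTR figure to more than 36.5%. The "up to 52%" headline number therefore presumably comes from a batch that the abstract does not quote.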

Abstract (original)

This paper introduces LMRPA, a novel Large Model-Driven Robotic Process Automation (RPA) model designed to greatly improve the efficiency and speed of Optical Character Recognition (OCR) tasks. Traditional RPA platforms often suffer from performance bottlenecks when handling high-volume repetitive processes like OCR, leading to a less efficient and more time-consuming process. LMRPA allows the integration of Large Language Models (LLMs) to improve the accuracy and readability of extracted text, overcoming the challenges posed by ambiguous characters and complex text structures. Extensive benchmarks were conducted comparing LMRPA to leading RPA platforms, including UiPath and Automation Anywhere, using OCR engines like Tesseract and DocTR. The results are that LMRPA achieves superior performance, cutting the processing times by up to 52%. For instance, in Batch 2 of the Tesseract OCR task, LMRPA completed the process in 9.8 seconds, where UiPath finished in 18.1 seconds and Automation Anywhere finished in 18.7 seconds. Similar improvements were observed with DocTR, where LMRPA outperformed other automation tools conducting the same process by completing tasks in 12.7 seconds, while competitors took over 20 seconds to do the same. These findings highlight the potential of LMRPA to revolutionize OCR-driven automation processes, offering a more efficient and effective alternative solution to the existing state-of-the-art RPA models.

arXiv information

Authors	Osama Hosam Abdellaif, Abdelrahman Nader, Ali Hamdi
Published	2025-06-10 09:32:11+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
