Towards One-Stage End-to-End Table Structure Recognition with Parallel Regression for Diverse Scenarios

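Abstract

Table structure recognition aims to parse tables in unstructured data into machine-readable formats.
Recent methods address this problem with either a two-stage process or an optimized one-stage approach.
However, these methods either need several networks trained one after another and rely on slower sequential decoding, or depend on complex post-processing algorithms to parse the logical structure of a table.
As a result, they struggle to balance cross-scenario adaptability, robustness, and computational efficiency.
This paper proposes TableCenterNet, a one-stage, end-to-end network for table structure parsing.
For the first time, the network unifies the prediction of a table's spatial and logical structure into a single parallel regression task, and it implicitly learns the mapping between the spatial and logical locations of cells through a synergistic architecture of shared feature extraction layers and task-specific decoding.
Compared with two-stage methods, the proposed method is easier to train and faster at inference.
Experiments on benchmark datasets show that TableCenterNet parses table structures effectively in diverse scenarios and achieves state-of-the-art performance on the TableGraph-24k dataset.
Code is available at https://github.com/dreamy-xay/TableCenterNet.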

Abstract (original)

Table structure recognition aims to parse tables in unstructured data into machine-understandable formats. Recent methods address this problem through a two-stage process or optimized one-stage approaches. However, these methods either require multiple networks to be serially trained and perform more time-consuming sequential decoding, or rely on complex post-processing algorithms to parse the logical structure of tables. They struggle to balance cross-scenario adaptability, robustness, and computational efficiency. In this paper, we propose a one-stage end-to-end table structure parsing network called TableCenterNet. This network unifies the prediction of table spatial and logical structure into a parallel regression task for the first time, and implicitly learns the spatial-logical location mapping laws of cells through a synergistic architecture of shared feature extraction layers and task-specific decoding. Compared with two-stage methods, our method is easier to train and faster to infer. Experiments on benchmark datasets show that TableCenterNet can effectively parse table structures in diverse scenarios and achieve state-of-the-art performance on the TableGraph-24k dataset. Code is available at https://github.com/dreamy-xay/TableCenterNet.
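As a rough illustration of the idea of shared features feeding task-specific heads that regress spatial and logical structure in parallel, the toy sketch below may help. It is not the TableCenterNet implementation: the feature function, the linear heads, the hand-picked weights, and every name in it are hypothetical stand-ins chosen for a regular grid of 100 x 20 cells.

```python
from dataclasses import dataclass
from typing import List, Tuple

Feature = List[float]


def shared_features(center: Tuple[float, float]) -> Feature:
    """Hypothetical stand-in for a shared backbone: one feature vector per cell center."""
    x, y = center
    return [x, y, 1.0]  # trailing 1.0 acts as a bias term


def linear_head(weights: List[List[float]], feature: Feature) -> List[float]:
    """A task-specific head, here just a linear regression over the shared features."""
    return [sum(w * f for w, f in zip(row, feature)) for row in weights]


# Hand-picked toy weights (assumption: cells are 100 wide and 20 high).
# They are not learned and are not taken from the paper.
SPATIAL_W = [
    [1.0, 0.0, -50.0],  # x1 = x - 50
    [0.0, 1.0, -10.0],  # y1 = y - 10
    [1.0, 0.0, 50.0],   # x2 = x + 50
    [0.0, 1.0, 10.0],   # y2 = y + 10
]
LOGICAL_W = [
    [0.0, 0.05, -0.5],  # row index as a continuous value
    [0.01, 0.0, -0.5],  # column index as a continuous value
]


@dataclass
class Cell:
    box: Tuple[float, ...]  # (x1, y1, x2, y2)
    row: int
    col: int


def predict(centers: List[Tuple[float, float]]) -> List[Cell]:
    cells = []
    for center in centers:
        feat = shared_features(center)  # computed once, used by both heads
        box = linear_head(SPATIAL_W, feat)  # spatial head
        # Logical head: reads the same features and does not wait on the spatial output.
        row_f, col_f = linear_head(LOGICAL_W, feat)
        cells.append(Cell(tuple(box), round(row_f), round(col_f)))
    return cells


if __name__ == "__main__":
    for cell in predict([(50.0, 10.0), (150.0, 10.0), (50.0, 30.0), (150.0, 30.0)]):
        print(cell)
```

Because neither head consumes the other's output, the two regressions can run side by side on the same features, with no sequential decoding step and no post-processing pass to recover row and column indices.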

arXiv information

Authors	Anyi Xiao, Cihui Yang
Published	2025-04-24 13:03:13+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
