Aligning benchmark datasets for table structure recognition

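Summary

Benchmark datasets for table structure recognition (TSR) must be processed carefully so that their annotations are consistent.
Even when each dataset's annotations are self-consistent, significant inconsistencies can exist across datasets, and these can harm the performance of models trained and evaluated on them.
This work shows that aligning these benchmarks, that is, removing both the errors within them and the inconsistencies between them, significantly improves model performance.
The demonstration takes a data-centric approach: a single model architecture, the Table Transformer (TATR), is adopted and held fixed throughout.
Baseline exact match accuracy for TATR evaluated on the ICDAR-2013 benchmark is 65% when trained on PubTables-1M, 42% when trained on FinTabNet, and 69% when trained on the two combined.
After annotation mistakes and inter-dataset inconsistencies are reduced, accuracy on ICDAR-2013 rises substantially to 75% when trained on PubTables-1M, 65% when trained on FinTabNet, and 81% when trained on the two combined.
Ablations over the modification steps show that canonicalizing the table annotations has a significantly positive effect on performance, while the other choices balance necessary trade-offs that arise when deciding a benchmark dataset's final composition.
Overall, the authors believe the work has significant implications for benchmark design for TSR and potentially for other tasks as well.
All dataset processing and training code will be released.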
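
To put the reported accuracies side by side, the short Python sketch below computes the absolute gain (in percentage points) and the relative gain for each training configuration. The six accuracy values come from the abstract; the names `baseline`, `aligned`, and `gains` are illustrative only and are not taken from the authors' code.

```python
# Exact match accuracy on ICDAR-2013 (percent), as reported in the abstract.
baseline = {"PubTables-1M": 65, "FinTabNet": 42, "Combined": 69}
aligned = {"PubTables-1M": 75, "FinTabNet": 65, "Combined": 81}


def gains(before, after):
    """Return {name: (absolute gain in points, relative gain in percent)}."""
    return {
        name: (
            after[name] - before[name],
            round(100 * (after[name] - before[name]) / before[name], 1),
        )
        for name in before
    }


# Print one line per training configuration.
for name, (points, relative) in gains(baseline, aligned).items():
    print(f"{name}: +{points} points ({relative}% relative)")
```

Under this arithmetic, FinTabNet shows the largest change (23 points), which is consistent with the abstract's claim that reducing inter-dataset inconsistency matters most where the baseline is weakest.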

Summary (original)

Benchmark datasets for table structure recognition (TSR) must be carefully processed to ensure they are annotated consistently. However, even if a dataset’s annotations are self-consistent, there may be significant inconsistency across datasets, which can harm the performance of models trained and evaluated on them. In this work, we show that aligning these benchmarks, removing both errors and inconsistency between them, improves model performance significantly. We demonstrate this through a data-centric approach where we adopt a single model architecture, the Table Transformer (TATR), that we hold fixed throughout. Baseline exact match accuracy for TATR evaluated on the ICDAR-2013 benchmark is 65% when trained on PubTables-1M, 42% when trained on FinTabNet, and 69% combined. After reducing annotation mistakes and inter-dataset inconsistency, performance of TATR evaluated on ICDAR-2013 increases substantially to 75% when trained on PubTables-1M, 65% when trained on FinTabNet, and 81% combined. We show through ablations over the modification steps that canonicalization of the table annotations has a significantly positive effect on performance, while other choices balance necessary trade-offs that arise when deciding a benchmark dataset’s final composition. Overall we believe our work has significant implications for benchmark design for TSR and potentially other tasks as well. All dataset processing and training code will be released.

arXiv information

Authors	Brandon Smock, Rohith Pesala, Robin Abraham
Published	2023-03-01 18:20:24+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
