Unchecked and Overlooked: Addressing the Checkbox Blind Spot in Large Language Models with CheckboxQA

Summary

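Checkboxes are critical in real-world document processing, where the presence or absence of a tick directly informs data extraction and decision-making.
Yet despite the strong performance of Large Vision and Language Models across a wide range of tasks, these models struggle to interpret checkable content.
The problem is especially pressing in industries where a single overlooked checkbox can lead to costly regulatory or contractual oversights.
To address this gap, the authors introduce the CheckboxQA dataset, a targeted resource designed to evaluate and improve model performance on checkbox-related tasks.
It reveals the limitations of current models and serves as a tool for advancing document comprehension systems, with significant implications for applications in sectors such as legal tech and finance.
The dataset is publicly available at https://github.com/Snowflake-Labs/CheckboxQA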

Abstract (original)

Checkboxes are critical in real-world document processing where the presence or absence of ticks directly informs data extraction and decision-making processes. Yet, despite the strong performance of Large Vision and Language Models across a wide range of tasks, they struggle with interpreting checkable content. This challenge becomes particularly pressing in industries where a single overlooked checkbox may lead to costly regulatory or contractual oversights. To address this gap, we introduce the CheckboxQA dataset, a targeted resource designed to evaluate and improve model performance on checkbox-related tasks. It reveals the limitations of current models and serves as a valuable tool for advancing document comprehension systems, with significant implications for applications in sectors such as legal tech and finance. The dataset is publicly available at: https://github.com/Snowflake-Labs/CheckboxQA

arXiv information

Authors	Michał Turski, Mateusz Chiliński, Łukasz Borchmann
Published	2025-04-15 11:41:36+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
