Are Large Language Models Table-based Fact-Checkers?

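Abstract

Table-based fact verification (TFV) aims to extract the entailment relation between statements and structured tables.
Existing TFV methods built on small-scale models suffer from insufficient labeled data and weak zero-shot ability.
Recently, the emergence of large language models (LLMs) has attracted considerable attention across research fields.
LLMs have shown strong zero-shot and in-context learning abilities on several NLP tasks, but their potential for TFV is still unknown.
In this work, the authors conduct a preliminary study of whether LLMs are table-based fact-checkers.
Specifically, they design diverse prompts to explore how in-context learning helps LLMs on TFV, that is, their zero-shot and few-shot TFV capability.
In addition, they carefully design and construct TFV instructions to study the performance gain brought by instruction tuning of LLMs.
Experimental results show that LLMs can achieve acceptable results on zero-shot and few-shot TFV with prompt engineering, while instruction tuning significantly improves their TFV capability.
The study also reports several findings about the format of zero-shot prompts and the number of in-context examples.
Finally, the authors analyze possible directions for improving the accuracy of TFV with LLMs, which may benefit further research on table reasoning.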
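
To make the zero-shot and few-shot settings concrete, here is a minimal sketch of how a TFV prompt might be assembled. It is not the prompt format used in the paper: the pipe-separated table layout, the instruction wording, the label names `entailed` and `refuted`, and the helper names `linearize_table` and `build_prompt` are all assumptions made for illustration.

```python
def linearize_table(header, rows):
    """Render a table as pipe-separated text, one line per row (assumed layout)."""
    lines = [" | ".join(str(cell) for cell in header)]
    lines += [" | ".join(str(cell) for cell in row) for row in rows]
    return "\n".join(lines)


def build_prompt(header, rows, statement, examples=()):
    """Build a TFV prompt.

    `examples` is a sequence of (header, rows, statement, label) tuples.
    An empty sequence gives a zero-shot prompt; a non-empty one gives a
    few-shot prompt with the examples shown before the query.
    """
    # Hypothetical instruction wording, not taken from the paper.
    parts = [
        "Decide whether the statement is entailed or refuted by the table. "
        "Answer with 'entailed' or 'refuted'."
    ]
    for ex_header, ex_rows, ex_statement, ex_label in examples:
        parts.append(
            f"Table:\n{linearize_table(ex_header, ex_rows)}\n"
            f"Statement: {ex_statement}\n"
            f"Answer: {ex_label}"
        )
    # The query comes last and leaves the answer open for the model.
    parts.append(
        f"Table:\n{linearize_table(header, rows)}\n"
        f"Statement: {statement}\n"
        "Answer:"
    )
    return "\n\n".join(parts)


if __name__ == "__main__":
    header = ["player", "goals"]
    rows = [["ann", 3], ["bob", 1]]
    demo = [(header, rows, "bob scored 5 goals", "refuted")]
    print(build_prompt(header, rows, "ann scored 3 goals", examples=demo))
```

Changing the length of `examples` is one simple way to vary the number of in-context examples, and swapping `linearize_table` for another renderer is one way to vary the prompt format.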

Abstract (original)

Table-based Fact Verification (TFV) aims to extract the entailment relation between statements and structured tables. Existing TFV methods based on small-scaled models suffer from insufficient labeled data and weak zero-shot ability. Recently, the appearance of Large Language Models (LLMs) has gained lots of attraction in research fields. They have shown powerful zero-shot and in-context learning abilities on several NLP tasks, but their potential on TFV is still unknown. In this work, we implement a preliminary study about whether LLMs are table-based fact-checkers. In detail, we design diverse prompts to explore how the in-context learning can help LLMs in TFV, i.e., zero-shot and few-shot TFV capability. Besides, we carefully design and construct TFV instructions to study the performance gain brought by the instruction tuning of LLMs. Experimental results demonstrate that LLMs can achieve acceptable results on zero-shot and few-shot TFV with prompt engineering, while instruction-tuning can stimulate the TFV capability significantly. We also make some valuable findings about the format of zero-shot prompts and the number of in-context examples. Finally, we analyze some possible directions to promote the accuracy of TFV via LLMs, which is beneficial to further research of table reasoning.

arXiv information

Authors	Hanwen Zhang, Qingyi Si, Peng Fu, Zheng Lin, Weiping Wang
Published	2024-11-13 12:37:09+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
