Zero-shot Entailment of Leaderboards for Empirical AI Research

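Summary

We present a large-scale empirical investigation of the zero-shot learning phenomenon in a specific category of recognizing textual entailment (RTE) tasks: the automated mining of leaderboards for empirical AI research.
In the non-zero-shot setting, the previously reported state-of-the-art models for leaderboard extraction formulated as an RTE task look promising, with reported performance above 90%.
However, a central research question remains unexamined: did the models actually learn entailment?
The experiments in this paper therefore test two previously reported state-of-the-art models out of the box for their ability to generalize, or their capacity for entailment, given leaderboard labels that were unseen during training.
We hypothesize that if the models learned entailment, their zero-shot performance should also be moderately high; concretely, perhaps better than chance.
As a result of this work, a zero-shot labeled dataset is created via distant labeling, formulating leaderboard extraction as an RTE task.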

Summary (original)

We present a large-scale empirical investigation of the zero-shot learning phenomena in a specific recognizing textual entailment (RTE) task category, i.e. the automated mining of leaderboards for Empirical AI Research. The prior reported state-of-the-art models for leaderboards extraction formulated as an RTE task, in a non-zero-shot setting, are promising with above 90% reported performances. However, a central research question remains unexamined: did the models actually learn entailment? Thus, for the experiments in this paper, two prior reported state-of-the-art models are tested out-of-the-box for their ability to generalize or their capacity for entailment, given leaderboard labels that were unseen during training. We hypothesize that if the models learned entailment, their zero-shot performances can be expected to be moderately high as well–perhaps, concretely, better than chance. As a result of this work, a zero-shot labeled dataset is created via distant labeling formulating the leaderboard extraction RTE task.
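
The zero-shot condition described in the abstract, where test-time leaderboard labels are unseen during training, can be illustrated with a small sketch. This is not the authors' code: the function name `zero_shot_split`, the `(paper_id, label)` pair format, and the held-out fraction are all assumptions made for illustration.

```python
import random


def zero_shot_split(examples, held_out_fraction=0.2, seed=0):
    """Split (paper_id, label) pairs so that no test label occurs in training.

    Hypothetical helper: `label` stands in for a leaderboard label, and the
    held-out fraction is an arbitrary illustrative choice.
    """
    # Collect the distinct labels in a deterministic order before shuffling.
    labels = sorted({label for _, label in examples})
    rng = random.Random(seed)
    rng.shuffle(labels)

    # Hold out at least one label so the zero-shot test set is never empty.
    n_held_out = max(1, int(len(labels) * held_out_fraction))
    unseen = set(labels[:n_held_out])

    train = [ex for ex in examples if ex[1] not in unseen]
    test = [ex for ex in examples if ex[1] in unseen]
    return train, test, unseen


if __name__ == "__main__":
    # Toy data with made-up paper ids and label names.
    examples = [
        ("p1", "label_a"), ("p2", "label_b"), ("p3", "label_c"),
        ("p4", "label_a"), ("p5", "label_d"), ("p6", "label_e"),
    ]
    train, test, unseen = zero_shot_split(examples)
    train_labels = {label for _, label in train}
    test_labels = {label for _, label in test}
    # The defining property of the split: label sets do not overlap.
    print(train_labels.isdisjoint(test_labels))
```

Splitting by label rather than by example is what makes the setting zero-shot: a model evaluated on `test` cannot succeed by memorizing labels it saw in `train`.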

arXiv information

Authors	Salomon Kabongo, Jennifer D’Souza, Sören Auer
Published	2023-03-29 16:28:43+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google