RECKONING: Reasoning through Dynamic Knowledge Encoding


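Recent studies on transformer-based language models show that they can answer questions by reasoning over knowledge provided as part of the context (i.e., in-context reasoning).
Our method, RECKONING, is a bi-level learning algorithm that teaches language models to reason by updating their parametric knowledge through back-propagation, allowing them to answer questions using the updated parameters.
Experiments on two multi-hop reasoning datasets show that RECKONING improves over the in-context reasoning baseline (by up to 4.5%).
Compared to in-context reasoning, RECKONING also generalizes better to longer reasoning chains unseen during training, is more robust to distractors in the context, and is more computationally efficient when multiple questions are asked about the same knowledge.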


Recent studies on transformer-based language models show that they can answer questions by reasoning over knowledge provided as part of the context (i.e., in-context reasoning). However, since the available knowledge is often not filtered for a particular question, in-context reasoning can be sensitive to distractor facts, additional content that is irrelevant to a question but that may be relevant for a different question (i.e., not necessarily random noise). In these situations, the model fails to distinguish the knowledge that is necessary to answer the question, leading to spurious reasoning and degraded performance. This reasoning failure contrasts with the model’s apparent ability to distinguish its contextual knowledge from all the knowledge it has memorized during pre-training. Following this observation, we propose teaching the model to reason more robustly by folding the provided contextual knowledge into the model’s parameters before presenting it with a question. Our method, RECKONING, is a bi-level learning algorithm that teaches language models to reason by updating their parametric knowledge through back-propagation, allowing them to then answer questions using the updated parameters. During training, the inner loop rapidly adapts a copy of the model weights to encode contextual knowledge into its parameters. In the outer loop, the model learns to use the updated weights to reproduce and answer reasoning questions about the memorized knowledge. Our experiments on two multi-hop reasoning datasets show that RECKONING’s performance improves over the in-context reasoning baseline (by up to 4.5%). We also find that compared to in-context reasoning, RECKONING generalizes better to longer reasoning chains unseen during training, is more robust to distractors in the context, and is more computationally efficient when multiple questions are asked about the same knowledge.
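The bi-level structure described in the abstract can be illustrated with a deliberately small sketch. This is not the authors' implementation: a two-weight linear model stands in for the language model, squared error stands in for the language-modeling loss, and every name here (`inner_loop`, `outer_grad`, `make_episode`, `train`, the learning rates `alpha` and `beta`) is a hypothetical choice made for illustration. Each episode holds a few "facts" (input/output pairs that follow a hidden rule) and one "question" about the same rule. The inner loop encodes the facts into a copy of the weights by gradient descent; the outer loop updates the initial weights by differentiating the question loss through those inner steps.

```python
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def loss(w, data):
    """Mean squared error of the linear model w on (x, y) pairs."""
    return sum((dot(w, x) - y) ** 2 for x, y in data) / len(data)

def grad(w, data):
    """Gradient of the mean squared error with respect to w."""
    g = [0.0] * len(w)
    for x, y in data:
        err = dot(w, x) - y
        for i, xi in enumerate(x):
            g[i] += 2.0 * err * xi / len(data)
    return g

def hessian_vec(data, v):
    """Hessian-vector product; for squared error the Hessian is (2/n) * sum(x x^T)."""
    out = [0.0] * len(v)
    for x, _ in data:
        s = dot(x, v)
        for i, xi in enumerate(x):
            out[i] += 2.0 * s * xi / len(data)
    return out

def inner_loop(w, knowledge, alpha, steps):
    """Inner loop: adapt a copy of the weights so that it encodes the knowledge."""
    w = list(w)
    for _ in range(steps):
        g = grad(w, knowledge)
        w = [wi - alpha * gi for wi, gi in zip(w, g)]
    return w

def outer_grad(w, knowledge, question, alpha, steps):
    """Gradient of the post-adaptation question loss w.r.t. the initial weights."""
    adapted = inner_loop(w, knowledge, alpha, steps)
    v = grad(adapted, question)
    # Backpropagate through each inner step; its Jacobian is I - alpha * H.
    for _ in range(steps):
        hv = hessian_vec(knowledge, v)
        v = [vi - alpha * hvi for vi, hvi in zip(v, hv)]
    return v

def make_episode(rng, n_facts=3):
    """Hypothetical toy episode: facts and one question share a hidden linear rule."""
    base = [2.0, -1.0]
    rule = [b + rng.gauss(0.0, 0.3) for b in base]

    def sample():
        x = [rng.uniform(-1.0, 1.0) for _ in base]
        return x, dot(rule, x)

    return [sample() for _ in range(n_facts)], [sample()]

def train(episodes=300, alpha=0.1, beta=0.1, steps=2, seed=0):
    """Outer loop: learn initial weights that answer well after adaptation."""
    rng = random.Random(seed)
    w = [0.0, 0.0]
    for _ in range(episodes):
        knowledge, question = make_episode(rng)
        g = outer_grad(w, knowledge, question, alpha, steps)
        w = [wi - beta * gi for wi, gi in zip(w, g)]
    return w
```

In this toy setting, `inner_loop` needs to run only once per set of facts, and the adapted weights can then be reused for any number of questions about them, which mirrors the efficiency argument the abstract makes for repeated questions over the same knowledge. For a real language model the backward pass through the inner steps would normally be left to an automatic differentiation framework rather than written by hand as it is here.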


Authors: Zeming Chen, Gail Weiss, Eric Mitchell, Asli Celikyilmaz, Antoine Bosselut
Published: 2023-05-23 16:20:59+00:00

Categories: cs.AI, cs.CL, cs.LG