ExLM: Rethinking the Impact of [MASK] Tokens in Masked Language Models

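Summary

Masked language models (MLMs) have achieved remarkable success in many self-supervised representation learning tasks.
MLMs are trained by randomly replacing some tokens in the input sentence with [MASK] tokens and predicting the original tokens from the remaining context.
This paper investigates the impact of [MASK] tokens on MLMs.
Analytical studies show that masking tokens can cause a corrupted semantics problem, in which the corrupted context may convey multiple, ambiguous meanings.
This problem is also a key factor affecting the performance of MLMs on downstream tasks.
Based on these findings, the authors propose ExLM, a novel enhanced-context MLM.
Their approach expands the [MASK] tokens in the input context and models the dependencies between these expanded states.
This expansion increases the capacity of the context and lets the model capture richer semantic information, effectively mitigating the corrupted semantics problem during pre-training.
Experimental results show that ExLM achieves significant performance improvements on both text modeling and SMILES modeling tasks.
Further analysis confirms that ExLM strengthens semantic representations through context enhancement and effectively reduces the multimodality problem commonly observed in MLMs.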

Abstract (original)

Masked Language Models (MLMs) have achieved remarkable success in many self-supervised representation learning tasks. MLMs are trained by randomly replacing some tokens in the input sentences with [MASK] tokens and predicting the original tokens based on the remaining context. This paper explores the impact of [MASK] tokens on MLMs. Analytical studies show that masking tokens can introduce the corrupted semantics problem, wherein the corrupted context may convey multiple, ambiguous meanings. This problem is also a key factor affecting the performance of MLMs on downstream tasks. Based on these findings, we propose a novel enhanced-context MLM, ExLM. Our approach expands [MASK] tokens in the input context and models the dependencies between these expanded states. This expansion increases context capacity and enables the model to capture richer semantic information, effectively mitigating the corrupted semantics problem during pre-training. Experimental results demonstrate that ExLM achieves significant performance improvements in both text modeling and SMILES modeling tasks. Further analysis confirms that ExLM enhances semantic representations through context enhancement, and effectively reduces the multimodality problem commonly observed in MLMs.
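
The masking step and the idea of expanding [MASK] tokens can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give a masking rate, an expansion factor, or the mechanism used to model dependencies between expanded states. The function names `mask_tokens` and `expand_masks`, the 15% default masking probability, and the expansion factor `k` are all assumptions made for illustration, and the dependency modeling is not shown at all.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """Randomly replace tokens with [MASK] (standard MLM corruption).

    mask_prob=0.15 is an assumed default, not a value taken from the abstract.
    Returns the corrupted sequence and a dict mapping each masked position
    to the original token the model would be trained to predict.
    """
    if rng is None:
        rng = random.Random()
    corrupted, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            corrupted.append(MASK)
            targets[i] = tok
        else:
            corrupted.append(tok)
    return corrupted, targets


def expand_masks(corrupted, k=2):
    """Repeat every [MASK] k times so the input has more slots per masked token.

    k is a hypothetical expansion factor; how the resulting states interact
    is left out of this sketch.
    """
    expanded = []
    for tok in corrupted:
        if tok == MASK:
            expanded.extend([MASK] * k)
        else:
            expanded.append(tok)
    return expanded
```

For example, `expand_masks(["a", "[MASK]", "b"], k=2)` yields a four-token sequence with two adjacent [MASK] slots in place of the single masked position, which is the sense in which expansion gives the context more capacity.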

arXiv information

Authors	Kangjie Zheng, Junwei Yang, Siyue Liang, Bin Feng, Zequn Liu, Wei Ju, Zhiping Xiao, Ming Zhang
Published	2025-01-24 10:20:38+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
