RepCali: High Efficient Fine-tuning Via Representation Calibration in Latent Space for Pre-trained Language Models

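Summary

Fine-tuning pre-trained language models (PLMs) has become the dominant paradigm for applying PLMs to downstream tasks.
However, with limited fine-tuning, PLMs still struggle with the discrepancy between the representation produced by the PLM's encoder and the optimal input to the PLM's decoder.
This paper tackles this challenge by learning to calibrate PLM representations in the latent space.
The proposed representation calibration method (RepCali) integrates a specific calibration block into the latent space after the encoder and uses the calibrated output as the decoder input.
RepCali's merits include its universality across all PLMs with encoder-decoder architectures, its plug-and-play nature, and its ease of implementation.
Extensive experiments on 25 PLM-based models across 8 tasks (including both English and Chinese datasets) show that RepCali provides desirable enhancements to PLMs (including LLMs) and significantly improves downstream task performance.
Comparison experiments across 4 benchmark tasks show that RepCali outperforms representative fine-tuning baselines.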

Abstract (original)

Fine-tuning pre-trained language models (PLMs) has become a dominant paradigm in applying PLMs to downstream tasks. However, with limited fine-tuning, PLMs still struggle with the discrepancies between the representation obtained from the PLMs’ encoder and the optimal input to the PLMs’ decoder. This paper tackles this challenge by learning to calibrate the representation of PLMs in the latent space. In the proposed representation calibration method (RepCali), we integrate a specific calibration block to the latent space after the encoder and use the calibrated output as the decoder input. The merits of the proposed RepCali include its universality to all PLMs with encoder-decoder architectures, its plug-and-play nature, and ease of implementation. Extensive experiments on 25 PLM-based models across 8 tasks (including both English and Chinese datasets) demonstrate that the proposed RepCali offers desirable enhancements to PLMs (including LLMs) and significantly improves the performance of downstream tasks. Comparison experiments across 4 benchmark tasks indicate that RepCali is superior to the representative fine-tuning baselines.
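
The data flow the abstract describes (encoder, then a calibration block in the latent space, then decoder) can be illustrated with a minimal sketch. The abstract does not say what the calibration block computes, so everything inside `CalibrationBlock` below is an assumption made for illustration: a hypothetical learnable offset added to each latent vector and scaled by a hypothetical factor `lam`. The toy encoder and decoder are stand-ins as well, not real PLM components.

```python
from typing import Callable, List

Vector = List[float]
Latent = List[Vector]


class CalibrationBlock:
    """Hypothetical calibration block (an assumption, not the paper's design).

    Adds a learnable offset, scaled by `lam`, to every latent vector.
    """

    def __init__(self, dim: int, lam: float = 1.0) -> None:
        # In a real model this offset would be a trainable parameter
        # updated during fine-tuning; here it is a plain list.
        self.offset: Vector = [0.0] * dim
        self.lam = lam

    def __call__(self, latent: Latent) -> Latent:
        return [
            [h + self.lam * o for h, o in zip(vec, self.offset)]
            for vec in latent
        ]


def run_with_calibration(
    encoder: Callable[[List[int]], Latent],
    calibrate: Callable[[Latent], Latent],
    decoder: Callable[[Latent], List[float]],
    tokens: List[int],
) -> List[float]:
    """Encoder output is calibrated, then passed to the decoder."""
    latent = encoder(tokens)
    calibrated = calibrate(latent)
    return decoder(calibrated)


def toy_encoder(tokens: List[int]) -> Latent:
    # Stand-in encoder: maps each token id to a 2-dimensional vector.
    return [[float(t), float(t) * 2.0] for t in tokens]


def toy_decoder(latent: Latent) -> List[float]:
    # Stand-in decoder: reduces each latent vector to a single score.
    return [sum(vec) for vec in latent]


block = CalibrationBlock(dim=2, lam=0.5)
baseline = run_with_calibration(toy_encoder, block, toy_decoder, [1, 2])
block.offset = [2.0, 4.0]  # pretend fine-tuning has learned this offset
shifted = run_with_calibration(toy_encoder, block, toy_decoder, [1, 2])
```

Because the block sits between two callables and leaves the encoder and decoder untouched, it can be dropped into an existing pipeline without changing either side, which is the plug-and-play property the abstract refers to.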

arXiv information

Authors	Fujun Zhang, XiangDong Su
Published	2025-05-13 11:47:00+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
