SpikeBERT: A Language Spikformer Learned from BERT with Knowledge Distillation

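Summary

Spiking neural networks (SNNs) offer a promising way to implement deep neural networks more energy-efficiently.
However, existing SNN architectures for language tasks remain simple and relatively shallow, and deep architectures have not been fully explored, which leaves a significant performance gap relative to mainstream Transformer-based networks such as BERT.
To close this gap, we adapt the recently proposed spiking Transformer (Spikformer) so that it can process language tasks, and we propose a two-stage knowledge distillation method for training it.
The first stage is pre-training by distilling knowledge from BERT on a large collection of unlabelled texts; the second stage is fine-tuning on task-specific instances, again by distilling knowledge from a BERT fine-tuned on the same training examples.
Through extensive experiments, we show that models trained with our method, named SpikeBERT, outperform state-of-the-art SNNs and achieve results comparable to BERT on both English and Chinese text classification tasks with much less energy consumption.
Our code is available at https://github.com/Lvchangze/SpikeBERT.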

Abstract (original)

Spiking neural networks (SNNs) offer a promising avenue to implement deep neural networks in a more energy-efficient way. However, the network architectures of existing SNNs for language tasks are still simplistic and relatively shallow, and deep architectures have not been fully explored, resulting in a significant performance gap compared to mainstream transformer-based networks such as BERT. To this end, we improve a recently-proposed spiking Transformer (i.e., Spikformer) to make it possible to process language tasks and propose a two-stage knowledge distillation method for training it, which combines pre-training by distilling knowledge from BERT with a large collection of unlabelled texts and fine-tuning with task-specific instances via knowledge distillation again from the BERT fine-tuned on the same training examples. Through extensive experimentation, we show that the models trained with our method, named SpikeBERT, outperform state-of-the-art SNNs and even achieve comparable results to BERTs on text classification tasks for both English and Chinese with much less energy consumption. Our code is available at https://github.com/Lvchangze/SpikeBERT.
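The two-stage scheme described in the abstract can be illustrated with a generic logit-based distillation loss. This is only a minimal sketch under stated assumptions: the abstract does not specify the loss terms SpikeBERT actually uses, so the soft-target KL formulation, the function names, and the `temperature` and `alpha` parameters below are all hypothetical choices made for illustration. Calling the loss without a label loosely mirrors distilling from the teacher on unlabelled text (stage one), and calling it with a label mirrors task-specific fine-tuning against a fine-tuned teacher (stage two).

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_entropy(probs, label, eps=1e-12):
    """Negative log-likelihood of the gold label."""
    return -math.log(probs[label] + eps)


def distillation_loss(student_logits, teacher_logits, label=None,
                      temperature=2.0, alpha=0.5):
    """Generic soft-target distillation loss (an assumption, not the paper's loss).

    label=None  -> teacher signal only (no gold label available).
    label given -> weighted sum of the soft-target term and hard-label
                   cross-entropy.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # The T^2 factor keeps the soft-target term on a comparable scale
    # when the temperature changes.
    soft = (temperature ** 2) * kl_divergence(p_teacher, p_student)
    if label is None:
        return soft
    hard = cross_entropy(softmax(student_logits), label)
    return alpha * soft + (1 - alpha) * hard


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]   # hypothetical teacher (BERT) logits
    student = [1.0, 0.8, -0.5]   # hypothetical student (spiking model) logits
    print("teacher-only loss:", distillation_loss(student, teacher))
    print("teacher + label loss:", distillation_loss(student, teacher, label=0))
```

In a real training setup the student would be a spiking network trained with a gradient-capable framework; the pure-Python version here is only meant to make the shape of the objective concrete.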

arXiv information

Authors	Changze Lv, Tianlong Li, Jianhan Xu, Chenxi Gu, Zixuan Ling, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang
Published	2024-02-21 13:20:21+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
