Enhanced Temporal Processing in Spiking Neural Networks for Static Object Detection Using 3D Convolutions

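Abstract

Spiking neural networks (SNNs) are a class of network models that can process spatiotemporal information, and they offer event-driven operation and energy-efficiency advantages.
Recently, directly trained SNNs have shown the potential to match or exceed the performance of conventional artificial neural networks (ANNs) on classification tasks.
In object detection, however, directly trained SNNs still show a large performance gap relative to ANNs when tested on frame-based static object datasets such as COCO2017.
Closing this gap, so that directly trained SNNs reach ANN-comparable performance on these static datasets, has therefore become one of the key challenges in SNN development. To address it, this paper focuses on strengthening the SNN's distinctive ability to process spatiotemporal information.
Spiking neurons, the core components of SNNs, allow information to be exchanged between different temporal channels while they convert floating-point input data into binary spike signals.
Existing neuron models, however, are still limited in how well they convey temporal information.
Some studies even suggest that good training results can be obtained with backpropagation along the time dimension disabled during SNN training.
To improve how SNNs handle temporal information, the paper proposes replacing conventional 2D convolutions with 3D convolutions, so that temporal information is incorporated directly into the convolution itself.
In addition, a temporal information recurrence mechanism is introduced inside the neurons to further improve how efficiently they use temporal information. Experimental results show that the proposed method enables directly trained SNNs to reach performance comparable to ANNs on the COCO2017 and VOC datasets.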
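
As a rough illustration of how a spiking neuron turns floating-point input into binary spikes while carrying state across time steps, the sketch below implements a standard leaky integrate-and-fire (LIF) neuron in plain Python. The LIF model, the function name `lif_forward`, and the parameters `tau`, `v_threshold`, and `v_reset` are assumptions chosen for illustration; the abstract does not specify the neuron model or the recurrence mechanism the paper uses.

```python
def lif_forward(inputs, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Run one leaky integrate-and-fire neuron over a sequence of inputs.

    inputs: list of floats, one input value per time step.
    Returns a list of binary spikes (0 or 1), one per time step.
    All parameter names and defaults are illustrative assumptions.
    """
    v = v_reset  # membrane potential: the only state shared across time steps
    spikes = []
    for x in inputs:
        # Leaky integration: move v toward the input with time constant tau.
        v = v + (x - (v - v_reset)) / tau
        if v >= v_threshold:
            spikes.append(1)
            v = v_reset  # hard reset after emitting a spike
        else:
            spikes.append(0)
    return spikes


# A static input repeated at every time step, as with a frame-based image.
static_spikes = lif_forward([1.5, 1.5, 1.5, 1.5])
weak_spikes = lif_forward([0.5, 0.5, 0.5, 0.5])
```

With the same input value of 1.5 at all four steps, this sketch produces `[0, 1, 0, 1]`: the output varies over time only because the membrane potential is carried from one step to the next. In this simple model that membrane potential is the neuron's sole channel for temporal information.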

Abstract (original)

Spiking Neural Networks (SNNs) are a class of network models capable of processing spatiotemporal information, with event-driven characteristics and energy efficiency advantages. Recently, directly trained SNNs have shown potential to match or surpass the performance of traditional Artificial Neural Networks (ANNs) in classification tasks. However, in object detection tasks, directly trained SNNs still exhibit a significant performance gap compared to ANNs when tested on frame-based static object datasets (such as COCO2017). Therefore, bridging this performance gap and enabling directly trained SNNs to achieve performance comparable to ANNs on these static datasets has become one of the key challenges in the development of SNNs. To address this challenge, this paper focuses on enhancing the SNN’s unique ability to process spatiotemporal information. Spiking neurons, as the core components of SNNs, facilitate the exchange of information between different temporal channels during the process of converting input floating-point data into binary spike signals. However, existing neuron models still have certain limitations in the communication of temporal information. Some studies have even suggested that disabling backpropagation in the time dimension during SNN training can still yield good training results. To improve the SNN’s handling of temporal information, this paper proposes replacing traditional 2D convolutions with 3D convolutions, thus directly incorporating temporal information into the convolutional process. Additionally, a temporal information recurrence mechanism is introduced within the neurons to further enhance the neurons’ efficiency in utilizing temporal information. Experimental results show that the proposed method enables directly trained SNNs to achieve performance levels comparable to ANNs on the COCO2017 and VOC datasets.
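
To make the difference between per-step 2D convolution and 3D convolution concrete, the following plain-Python sketch applies both to a toy spike tensor of shape [T][H][W]. It is only a single-channel, unpadded, stride-1 illustration of the general idea; the function names `conv3d_valid` and `conv2d_per_step`, the kernel values, and the toy data are hypothetical and are not taken from the paper's architecture.

```python
def conv3d_valid(x, k):
    """Single-channel 3D cross-correlation with no padding and stride 1.

    x: nested list of shape [T][H][W]; k: nested list of shape [kT][kH][kW].
    Returns a nested list of shape [T-kT+1][H-kH+1][W-kW+1].
    """
    T, H, W = len(x), len(x[0]), len(x[0][0])
    kT, kH, kW = len(k), len(k[0]), len(k[0][0])
    out = []
    for t in range(T - kT + 1):
        plane = []
        for i in range(H - kH + 1):
            row = []
            for j in range(W - kW + 1):
                acc = 0.0
                for dt in range(kT):
                    for di in range(kH):
                        for dj in range(kW):
                            acc += x[t + dt][i + di][j + dj] * k[dt][di][dj]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def conv2d_per_step(x, k2d):
    """Apply the same 2D kernel to every time step independently."""
    # A temporal kernel depth of 1 means no mixing across time steps.
    return conv3d_valid(x, [k2d])


# Toy spike tensor: 4 time steps of a 2x2 frame; only step 1 contains spikes.
spikes = [
    [[0, 0], [0, 0]],
    [[1, 1], [1, 1]],
    [[0, 0], [0, 0]],
    [[0, 0], [0, 0]],
]
k2d = [[1.0, 1.0], [1.0, 1.0]]
k3d = [k2d, k2d, k2d]  # same spatial kernel stacked over 3 time steps

per_step = conv2d_per_step(spikes, k2d)  # shape [4][1][1]
across_time = conv3d_valid(spikes, k3d)  # shape [2][1][1]
```

Here `per_step` is nonzero only at step 1, since each frame is filtered in isolation, while both outputs in `across_time` are nonzero because each 3-step window includes the frame with spikes. In a deep learning framework the same contrast would typically be expressed with 2D and 3D convolution layers (for example `torch.nn.Conv2d` and `torch.nn.Conv3d` in PyTorch).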

arXiv information

Author	Huaxu He
Published	2024-12-23 15:32:26+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
