NeuroCLIP: Neuromorphic Data Understanding by CLIP and SNN

Abstract

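Neuromorphic vision sensors have recently attracted growing interest.
However, neuromorphic data consists of asynchronous event spikes, which makes it difficult to build a large benchmark for training a powerful general-purpose neural network model, and this in turn limits deep-learning-based understanding of neuromorphic data for "unseen" objects.
For frame images, by contrast, training data is easy to obtain, so zero-shot and few-shot learning on "unseen" tasks with the large Contrastive Vision-Language Pre-training (CLIP) model, which is pre-trained on large-scale 2D image-text pairs, has shown impressive performance.
The authors ask whether CLIP can be transferred to neuromorphic data recognition to handle the "unseen" problem.
To this end, the paper realizes the idea as NeuroCLIP.
NeuroCLIP consists of 2D CLIP and two modules designed specifically for neuromorphic data understanding.
The first is an event-frame module, which converts event spikes into sequential frame images using a simple discrimination strategy.
The second is an inter-timestep adapter, a simple fine-tuned adapter based on a spiking neural network (SNN) that operates on the sequential features from CLIP's visual encoder to improve few-shot performance.
Experiments on neuromorphic datasets including N-MNIST, CIFAR10-DVS, and ES-ImageNet demonstrate the effectiveness of NeuroCLIP.
The code is open-sourced at https://github.com/yfguo91/NeuroCLIP.git.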
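
The abstract names the two modules but does not spell out how they work, so the following is only a rough sketch of the general ideas, not the authors' implementation. It assumes events arrive as (x, y, t, polarity) tuples, bins them into frames by equal-width time windows (a common generic choice standing in for the paper's unspecified "simple discrimination strategy"), and runs a plain leaky integrate-and-fire (LIF) neuron over per-timestep feature vectors. The names events_to_frames and lif_over_timesteps, the parameters tau and threshold, and the hard reset are all hypothetical.

```python
from typing import List, Tuple

# Assumed event layout: (x, y, timestamp, polarity), with polarity +1 or -1.
Event = Tuple[int, int, float, int]


def events_to_frames(events: List[Event], height: int, width: int,
                     num_steps: int) -> List[List[List[int]]]:
    """Accumulate signed event polarities into num_steps frames.

    Events are assigned to frames by equal-width time windows. This is a
    generic binning scheme, not necessarily the strategy used in NeuroCLIP.
    """
    frames = [[[0] * width for _ in range(height)] for _ in range(num_steps)]
    if not events:
        return frames
    t_min = min(e[2] for e in events)
    t_max = max(e[2] for e in events)
    span = (t_max - t_min) or 1.0  # avoid division by zero for a single instant
    for x, y, t, p in events:
        # Map the timestamp to a window index; the last instant goes to the last frame.
        step = min(int((t - t_min) / span * num_steps), num_steps - 1)
        frames[step][y][x] += p
    return frames


def lif_over_timesteps(features: List[List[float]], tau: float = 2.0,
                       threshold: float = 1.0) -> List[List[int]]:
    """Run one LIF neuron per feature dimension across timesteps.

    features[t][i] is the input to neuron i at timestep t. The membrane
    potential carries over between timesteps, which is what lets the output
    at step t depend on earlier steps. Returns binary spikes per timestep.
    """
    if not features:
        return []
    membrane = [0.0] * len(features[0])
    spikes = []
    for feat in features:
        out = []
        for i, x in enumerate(feat):
            membrane[i] += (x - membrane[i]) / tau  # leaky integration
            if membrane[i] >= threshold:
                out.append(1)
                membrane[i] = 0.0  # hard reset after a spike (assumed)
            else:
                out.append(0)
        spikes.append(out)
    return spikes
```

In a pipeline of the kind the abstract describes, each frame would be passed through CLIP's visual encoder and the resulting feature sequence fed to the adapter; here the encoder is left out and any list of per-timestep feature vectors can serve as input to lif_over_timesteps.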

Abstract (original)

Recently, the neuromorphic vision sensor has received more and more interest. However, the neuromorphic data consists of asynchronous event spikes, which makes it difficult to construct a big benchmark to train a power general neural network model, thus limiting the neuromorphic data understanding for “unseen” objects by deep learning. While for the frame image, since the training data can be obtained easily, the zero-shot and few-shot learning for “unseen” task via the large Contrastive Vision-Language Pre-training (CLIP) model, which is pre-trained by large-scale image-text pairs in 2D, have shown inspirational performance. We wonder whether the CLIP could be transferred to neuromorphic data recognition to handle the “unseen” problem. To this end, we materialize this idea with NeuroCLIP in the paper. The NeuroCLIP consists of 2D CLIP and two specially designed modules for neuromorphic data understanding. First, an event-frame module that could convert the event spikes to the sequential frame image with a simple discrimination strategy. Second, an inter-timestep adapter, which is a simple fine-tuned adapter based on a spiking neural network (SNN) for the sequential features coming from the visual encoder of CLIP to improve the few-shot performance. Various experiments on neuromorphic datasets including N-MNIST, CIFAR10-DVS, and ES-ImageNet demonstrate the effectiveness of NeuroCLIP. Our code is open-sourced at https://github.com/yfguo91/NeuroCLIP.git.

arXiv information

Authors	Yufei Guo, Yuanpei Chen, Zhe Ma
Published	2023-12-29 02:44:44+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
