High-Level Parallelism and Nested Features for Dynamic Inference Cost and Top-Down Attention

Summary

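This paper introduces a new network topology that integrates dynamic inference cost with a top-down attention mechanism, addressing two major gaps in conventional deep learning models.
Drawing inspiration from human perception, we combine sequential processing of generic low-level features with parallel processing and nesting of high-level features.
This design reflects recent neuroscience findings on spatially and contextually distinct neural activations in the human cortex, and it also introduces a new "cutout" technique: the ability to selectively activate only the network segments that belong to task-relevant categories.
Restricting computation to those segments optimizes inference cost and eliminates the need for re-training.
We believe this paves the way for future network designs that are lightweight and adaptable, making them suitable for a wide range of applications, from compact edge devices to large-scale clouds.
Our proposed topology also has a built-in top-down attention mechanism, which allows processing to be influenced directly by enhancing or inhibiting category-specific high-level features, paralleling the selective attention mechanism observed in human cognition.
Using targeted external signals, we experimentally enhanced predictions across all tested models.
In terms of dynamic inference cost, our method can exclude up to $73.48\,\%$ of parameters and achieve $84.41\,\%$ fewer giga-multiply-accumulate (GMAC) operations.
Analysis against comparative baselines shows an average reduction of $40\,\%$ in parameters and $8\,\%$ in GMACs across the cases we evaluated.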
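
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' architecture. It assumes a toy model with one shared trunk (the sequential low-level stage) and one independent branch per category (the parallel high-level stage). The class `BranchedNet`, its parameters `active` and `gains`, and all layer sizes are hypothetical names chosen for illustration, and the nesting of high-level features is not modelled. The "cutout" is sketched as evaluating only the branches listed in `active`, and top-down attention as a per-category gain applied to the branch features.

```python
import random


def make_layer(n_in, n_out, rng):
    """Dense layer as a list of n_out weight rows (no bias, for brevity)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def apply_layer(weights, x):
    """Compute ReLU(W x) for a weight matrix stored as plain lists."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def count_params(weights):
    """Number of scalar weights in one layer."""
    return sum(len(row) for row in weights)


class BranchedNet:
    """Hypothetical toy model: shared trunk, then one branch per category."""

    def __init__(self, n_in, n_trunk, n_branch, categories, seed=0):
        rng = random.Random(seed)
        # Sequential stage: generic low-level features shared by all categories.
        self.trunk = make_layer(n_in, n_trunk, rng)
        # Parallel stage: category-specific high-level features.
        self.branches = {c: make_layer(n_trunk, n_branch, rng) for c in categories}
        # One linear read-out weight vector per category.
        self.heads = {c: make_layer(n_branch, 1, rng)[0] for c in categories}

    def forward(self, x, active=None, gains=None):
        """Score the categories in `active` (all categories if None).

        `gains` maps a category to a multiplicative factor on its branch
        features: values above 1 enhance it, values below 1 inhibit it.
        """
        active = list(self.branches) if active is None else list(active)
        gains = gains or {}
        shared = apply_layer(self.trunk, x)
        scores = {}
        for c in active:  # branches outside `active` are never evaluated
            feats = apply_layer(self.branches[c], shared)
            g = gains.get(c, 1.0)
            feats = [g * f for f in feats]
            scores[c] = sum(w * f for w, f in zip(self.heads[c], feats))
        return scores

    def param_count(self, active=None):
        """Weights needed to evaluate the trunk plus the selected branches."""
        active = list(self.branches) if active is None else list(active)
        total = count_params(self.trunk)
        for c in active:
            total += count_params(self.branches[c]) + len(self.heads[c])
        return total


# Usage: ten hypothetical categories, of which only two matter for the task.
categories = [f"class_{i}" for i in range(10)]
net = BranchedNet(n_in=8, n_trunk=16, n_branch=32, categories=categories)
x = [0.1 * i for i in range(8)]

task = ["class_0", "class_1"]
full_scores = net.forward(x)
cut_scores = net.forward(x, active=task)  # "cutout": skip the other branches
boosted = net.forward(x, active=task, gains={"class_0": 2.0})  # top-down gain

excluded = 1.0 - net.param_count(task) / net.param_count()
print(f"parameters excluded by the cutout: {excluded:.1%}")
```

Because the branches in this sketch do not share weights, dropping a branch leaves the scores of the remaining categories unchanged, which is why no re-training is needed in the toy setting. The savings printed here depend entirely on the arbitrary layer sizes and say nothing about the figures reported in the paper.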

Summary (original)

This paper introduces a novel network topology that seamlessly integrates dynamic inference cost with a top-down attention mechanism, addressing two significant gaps in traditional deep learning models. Drawing inspiration from human perception, we combine sequential processing of generic low-level features with parallelism and nesting of high-level features. This design not only reflects a finding from recent neuroscience research regarding spatially and contextually distinct neural activations in human cortex, but also introduces a novel 'cutout' technique: the ability to selectively activate only network segments of task-relevant categories to optimize inference cost and eliminate the need for re-training. We believe this paves the way for future network designs that are lightweight and adaptable, making them suitable for a wide range of applications, from compact edge devices to large-scale clouds. Our proposed topology also comes with a built-in top-down attention mechanism, which allows processing to be directly influenced by either enhancing or inhibiting category-specific high-level features, drawing parallels to the selective attention mechanism observed in human cognition. Using targeted external signals, we experimentally enhanced predictions across all tested models. In terms of dynamic inference cost, our methodology can achieve an exclusion of up to $73.48\,\%$ of parameters and $84.41\,\%$ fewer giga-multiply-accumulate (GMAC) operations. Analysis against comparative baselines shows an average reduction of $40\,\%$ in parameters and $8\,\%$ in GMACs across the cases we evaluated.

arXiv information

Authors	André Peter Kelm, Niels Hannemann, Bruno Heberle, Lucas Schmidt, Tim Rolff, Christian Wilms, Ehsan Yaghoubi, Simone Frintrop
Published	2024-03-07 16:03:20+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
