LIX: Implicitly Infusing Spatial Geometric Prior Knowledge into Visual Semantic Segmentation for Autonomous Driving

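Summary

Data-fusion networks with duplex encoders achieve impressive performance in visual semantic segmentation, but they become ineffective when spatial geometric data are unavailable.
Implicitly infusing the spatial geometric prior knowledge acquired by a data-fusion teacher network into a single-modal student network is a practical, albeit less explored, research avenue.
This article investigates this topic and uses knowledge distillation to address the problem.
We introduce the Learning to Infuse "X" (LIX) framework, with novel contributions to both logit distillation and feature distillation.
We present a mathematical proof that underscores the limitation of using a single, fixed weight in decoupled knowledge distillation, and we introduce a logit-wise dynamic weight controller as a solution.
Furthermore, we develop an adaptively recalibrated feature distillation algorithm comprising two novel techniques: feature recalibration via kernel regression and in-depth feature consistency quantification via centered kernel alignment.
Extensive experiments with intermediate-fusion and late-fusion networks across various public datasets provide both quantitative and qualitative evaluations and demonstrate the superior performance of the LIX framework compared with other state-of-the-art approaches.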
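
For context on decoupled knowledge distillation, here is a minimal NumPy sketch of the standard decoupled loss, which splits the classical distillation term into a target-class part (TCKD) and a non-target-class part (NCKD) and combines them with weights. It is not the LIX implementation. The function names, the default values of `alpha`, `beta`, and `temperature`, and the option of passing `beta` as a per-row array are assumptions made for illustration, and the logit-wise dynamic weight controller itself is not reproduced here.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def dkd_terms(student_logits, teacher_logits, target, temperature=4.0, eps=1e-12):
    """Return per-row TCKD, NCKD, and the teacher's target-class probability.

    Logits have shape (n, num_classes); target holds n integer class indices.
    """
    n = student_logits.shape[0]
    p_s = softmax(student_logits / temperature)
    p_t = softmax(teacher_logits / temperature)

    # Boolean mask selecting the target class in each row.
    mask = np.zeros(p_s.shape, dtype=bool)
    mask[np.arange(n), target] = True
    pt_s, pt_t = p_s[mask], p_t[mask]

    # TCKD: KL divergence between binary (target vs. non-target) distributions.
    b_s = np.stack([pt_s, 1.0 - pt_s], axis=1)
    b_t = np.stack([pt_t, 1.0 - pt_t], axis=1)
    tckd = np.sum(b_t * (np.log(b_t + eps) - np.log(b_s + eps)), axis=1)

    # NCKD: KL divergence between distributions renormalised over non-target classes.
    q_s = np.where(mask, 0.0, p_s)
    q_t = np.where(mask, 0.0, p_t)
    q_s = q_s / q_s.sum(axis=1, keepdims=True)
    q_t = q_t / q_t.sum(axis=1, keepdims=True)
    nckd = np.sum(q_t * (np.log(q_t + eps) - np.log(q_s + eps)), axis=1)
    return tckd, nckd, pt_t


def dkd_loss(student_logits, teacher_logits, target,
             alpha=1.0, beta=8.0, temperature=4.0):
    """Weighted decoupled loss: mean(alpha * TCKD + beta * NCKD) * T^2.

    beta may be a scalar (the single fixed weight) or an array of shape (n,).
    The array form is an assumption that only marks where a non-fixed weight
    would enter; it is not the paper's controller.
    """
    tckd, nckd, _ = dkd_terms(student_logits, teacher_logits, target, temperature)
    per_row = alpha * tckd + np.asarray(beta) * nckd
    return float(per_row.mean() * temperature ** 2)
```

In a segmentation setting, each row would presumably hold the class logits of one pixel, so a prediction of shape (H, W, C) would be flattened to (H*W, C) before calling `dkd_loss`. Classical distillation corresponds to weighting NCKD by one minus the teacher's target-class probability, which is the coupling that the decoupled form replaces with `beta`.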

Abstract (original)

Despite the impressive performance achieved by data-fusion networks with duplex encoders for visual semantic segmentation, they become ineffective when spatial geometric data are not available. Implicitly infusing the spatial geometric prior knowledge acquired by a data-fusion teacher network into a single-modal student network is a practical, albeit less explored research avenue. This article delves into this topic and resorts to knowledge distillation approaches to address this problem. We introduce the Learning to Infuse ”X” (LIX) framework, with novel contributions in both logit distillation and feature distillation aspects. We present a mathematical proof that underscores the limitation of using a single, fixed weight in decoupled knowledge distillation and introduce a logit-wise dynamic weight controller as a solution to this issue. Furthermore, we develop an adaptively-recalibrated feature distillation algorithm, including two novel techniques: feature recalibration via kernel regression and in-depth feature consistency quantification via centered kernel alignment. Extensive experiments conducted with intermediate-fusion and late-fusion networks across various public datasets provide both quantitative and qualitative evaluations, demonstrating the superior performance of our LIX framework when compared to other state-of-the-art approaches.
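
To make the centered kernel alignment (CKA) mentioned in the abstract concrete, the sketch below computes the widely used linear form of CKA between two feature matrices. The abstract does not say which kernel or which feature layout LIX uses, so the linear kernel, the function name `linear_cka`, and the assumption that feature maps are flattened to shape (n_samples, n_channels) are illustrative choices rather than the authors' method.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between feature matrices x (n, d1) and y (n, d2).

    Rows are samples (for example, spatial positions of a feature map);
    columns are feature channels. The two matrices may have different widths.
    """
    # Centre each feature column so the Gram matrices are centred.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)

    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))
```

Linear CKA lies between 0 and 1, equals 1 when the two feature sets match up to an orthogonal transform and isotropic scaling, and does not require teacher and student features to have the same channel count, which is one reason it is a common choice for comparing representations of different networks.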

arXiv information

Authors	Sicen Guo, Ziwei Long, Zhiyuan Wu, Qijun Chen, Ioannis Pitas, Rui Fan
Published	2025-03-14 09:24:22+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
