Poly Kernel Inception Network for Remote Sensing Detection

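Summary

Object detection in remote sensing images (RSIs) faces several growing challenges, including large variation in object scale and widely varying context.
Prior methods tried to address these challenges by enlarging the spatial receptive field of the backbone, through either large-kernel convolution or dilated convolution.
However, the former typically introduces considerable background noise, while the latter risks producing overly sparse feature representations.
This paper introduces the Poly Kernel Inception Network (PKINet) to address these challenges.
PKINet uses multi-scale convolution kernels without dilation to extract features of objects at varying scales and to capture local context.
In addition, a Context Anchor Attention (CAA) module is introduced in parallel to capture long-range contextual information.
Together, these two components improve PKINet's performance on four challenging remote sensing detection benchmarks: DOTA-v1.0, DOTA-v1.5, HRSC2016, and DIOR-R.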
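
The trade-off between large-kernel and dilated convolution mentioned in the summary can be made concrete with a little arithmetic. The sketch below is an illustration only, not code from the paper: the helper names `effective_extent` and `sampling_density` and the kernel settings are assumptions chosen for the example. It uses the standard relation that a kernel of size k with dilation d spans k + (k - 1)(d - 1) pixels per side while still reading only k x k input positions.

```python
def effective_extent(kernel_size: int, dilation: int = 1) -> int:
    """Side length, in pixels, of the window a square kernel spans."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def sampling_density(kernel_size: int, dilation: int = 1) -> float:
    """Fraction of pixels inside the spanned window that the kernel reads."""
    extent = effective_extent(kernel_size, dilation)
    return kernel_size ** 2 / extent ** 2


# Hypothetical settings chosen for illustration; not taken from the paper.
configs = {
    "dense 3x3": (3, 1),
    "dense 11x11 (large kernel)": (11, 1),
    "3x3 with dilation 5": (3, 5),
}

for name, (k, d) in configs.items():
    extent = effective_extent(k, d)
    density = sampling_density(k, d)
    print(f"{name}: extent={extent}, density={density:.3f}")
```

Under these assumed settings, the dense 11x11 kernel and the dilated 3x3 kernel span the same 11-pixel window, but the dilated kernel reads only 9 of its 121 positions. The dense kernel reads every pixel in the window, background included, which is consistent with the noise and sparsity concerns the summary attributes to the two approaches.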

Summary (original)

Object detection in remote sensing images (RSIs) often suffers from several increasing challenges, including the large variation in object scales and the diverse-ranging context. Prior methods tried to address these challenges by expanding the spatial receptive field of the backbone, either through large-kernel convolution or dilated convolution. However, the former typically introduces considerable background noise, while the latter risks generating overly sparse feature representations. In this paper, we introduce the Poly Kernel Inception Network (PKINet) to handle the above challenges. PKINet employs multi-scale convolution kernels without dilation to extract object features of varying scales and capture local context. In addition, a Context Anchor Attention (CAA) module is introduced in parallel to capture long-range contextual information. These two components work jointly to advance the performance of PKINet on four challenging remote sensing detection benchmarks, namely DOTA-v1.0, DOTA-v1.5, HRSC2016, and DIOR-R.

arXiv information

Authors	Xinhao Cai, Qiuxia Lai, Yuwei Wang, Wenguan Wang, Zeren Sun, Yazhou Yao
Published	2024-03-20 15:31:28+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
