Heterogeneous Binary Pixel Difference Networks for Remote Sensing Object Detection
Zhan, Jialei; Bai, Liang; Zhang, Jiehua; Liu, Tianpeng; Shi, Fan; Liu, Yongxiang; Liu, Li (2024-12-19)
IEEE
19.12.2024
J. Zhan et al., "Heterogeneous Binary Pixel Difference Networks for Remote Sensing Object Detection," in IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1-19, 2025, Art no. 5602619, doi: 10.1109/TGRS.2024.3520161
https://rightsstatements.org/vocab/InC/1.0/
© 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:oulu-202503101937
Abstract
Recent research in remote sensing object detection (RSOD) has significantly advanced the development of vision foundation models. However, deploying these models on resource-constrained edge devices is challenging because of their high computational demands. Binarized detectors, which have been studied extensively for generic object detection, use binary neural networks (BNNs) to achieve extreme compression by quantizing weights and activations to +1 or −1. In remote sensing images, however, the objects of interest typically exhibit weak responses, and the images often contain numerous unique local areas. Binarizing features in such images can cause a substantial loss of object contrast and scale prior information, which degrades detection performance significantly, particularly for small objects. To address these challenges, we propose a novel binarized detector for RSOD named the heterogeneous binary pixel difference network (HBiPiDiNet). First, we develop a binary pixel difference convolution (BiPDC) that integrates local binary patterns (LBPs), which capture local contrast information, with traditional binary convolution, thereby enhancing the representation of small objects. Second, we construct heterogeneous kernel fusion convolution blocks (HKFCB) from BiPDC and standard binary convolution. The HKFCB comprises multiple BiPDCs at different scales, effectively representing BiPDC under multiscale LBP and multiscale binary convolutions. Extensive experiments demonstrate that the proposed method significantly improves on state-of-the-art binary detection methods across three remote sensing datasets: AI-TOD, VisDrone2019, and DIOR. Our code and models are available at https://github.com/yuhua666/HBiPiDiNet/tree/main.
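The contrast-loss argument in the abstract can be illustrated with a small NumPy sketch. It is not the authors' BiPDC: the function names (`binarize`, `binary_conv`, `lbp_style_binary_conv`), the 3×3 kernel, and the choice to take the sign of each neighbor-minus-center difference are assumptions made here for illustration only. The sketch contrasts plain sign binarization, which discards everything but the sign of each activation, with an LBP-style response built from local differences, which is unaffected by a uniform brightness offset.

```python
import numpy as np

def binarize(x):
    """Map every entry to +1 or -1 (zero is mapped to +1)."""
    return np.where(x >= 0, 1.0, -1.0)

def binary_conv(x, w):
    """Binarize the input, then apply a 'valid' cross-correlation with binary weights w."""
    xb = binarize(x)
    kh, kw = w.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(xb[i:i + kh, j:j + kw] * w)
    return out

def lbp_style_binary_conv(x, w):
    """Hypothetical LBP-style variant: weight the signs of (neighbor - center).

    Assumes an odd-sized kernel; the center position contributes nothing.
    """
    kh, kw = w.shape
    ch, cw = kh // 2, kw // 2
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = x[i:i + kh, j:j + kw]
            signs = binarize(patch - patch[ch, cw])
            signs[ch, cw] = 0.0  # exclude the center from the sum
            out[i, j] = np.sum(signs * w)
    return out

rng = np.random.default_rng(0)
image = rng.normal(size=(6, 6))
weights = binarize(rng.normal(size=(3, 3)))

# A uniform brightness offset makes every binarized activation +1,
# so the plain binary convolution no longer sees any local structure.
offset_image = image + 8.0
plain_response = binary_conv(offset_image, weights)

# The difference-based response depends only on local contrast.
diff_response = lbp_style_binary_conv(image, weights)
diff_response_offset = lbp_style_binary_conv(offset_image, weights)
```

In this toy setting, `plain_response` is the same value at every position, while `diff_response` and `diff_response_offset` agree. This is one way to read the claim that local difference information complements standard binary convolution. The released repository is the reference for the actual BiPDC and HKFCB definitions.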