Abstract
Mainstream object detection methods for large aerial images usually divide each image into patches and then exhaustively run the detector on every patch, whether or not it contains any objects. This paradigm is effective but inefficient: the detector must process all patches, which severely limits inference speed. This paper presents an objectness activation network (OAN) that lets detectors focus on fewer patches while achieving faster inference and more accurate results, giving a simple and effective solution to object detection in large images. In brief, OAN is a lightweight fully convolutional network that judges whether each patch contains objects; it can be easily integrated into many object detectors and trained jointly with them end-to-end. We extensively evaluate OAN with five advanced detectors. With OAN, all five detectors run more than 30.0% faster on three large-scale aerial image datasets, with consistent accuracy improvements. On extremely large Gaofen-2 images (29200 × 27620 pixels), OAN improves the detection speed by 70.5%. Moreover, we extend OAN to driving-scene object detection and 4K video object detection, boosting the detection speed by 112.1% and 75.0%, respectively, without sacrificing accuracy.
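The patch-filtering idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper's OAN is a learned network trained jointly with the detector, whereas `toy_objectness`, `toy_detector`, `detect_with_filter`, and all parameter values below are hypothetical stand-ins chosen only to show the control flow (score each patch, skip those judged empty, run the detector on the rest).

```python
import numpy as np

def split_into_patches(image, patch, stride):
    """Yield (y, x, view) for patches on a regular grid; edge patches may be smaller."""
    h, w = image.shape[:2]
    for y in range(0, h, stride):
        for x in range(0, w, stride):
            yield y, x, image[y:y + patch, x:x + patch]

def toy_objectness(patch_pixels):
    # Hypothetical stand-in for a learned objectness score:
    # the fraction of non-zero pixels in the patch.
    return float(np.count_nonzero(patch_pixels)) / patch_pixels.size

def toy_detector(patch_pixels):
    # Hypothetical stand-in for a real detector:
    # reports the patch-local (row, col) of every non-zero pixel.
    return [tuple(int(v) for v in idx) for idx in np.argwhere(patch_pixels > 0)]

def detect_with_filter(image, detector, objectness, patch=4, stride=4, threshold=0.0):
    """Run `detector` only on patches whose objectness score exceeds `threshold`."""
    results, visited, total = [], 0, 0
    for y, x, view in split_into_patches(image, patch, stride):
        total += 1
        if objectness(view) <= threshold:
            continue  # patch judged empty: skip the expensive detector
        visited += 1
        # Map patch-local detections back to full-image coordinates.
        results.extend((y + dy, x + dx) for dy, dx in detector(view))
    return results, visited, total

if __name__ == "__main__":
    img = np.zeros((8, 8))
    img[1, 2] = 1.0
    img[6, 5] = 1.0
    dets, visited, total = detect_with_filter(img, toy_detector, toy_objectness)
    print(dets, f"detector ran on {visited} of {total} patches")
```

In this toy setting the saving comes entirely from the skipped patches, so the benefit depends on how cheap the objectness check is relative to the detector and on how many patches are actually empty.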
Xie, X., Cheng, G., Li, Q. et al. Fewer is more: efficient object detection in large aerial images. Sci. China Inf. Sci. 67, 112106 (2024). https://doi.org/10.1007/s11432-022-3718-5