Abstract
Object detection has gained popularity in recent years, mainly due to advances in deep learning. However, noise in low-light and adverse weather conditions still challenges computer vision algorithms. Existing methods either struggle to balance image enhancement against object detection or frequently overlook useful latent information. To address these issues, we propose the Low-light Detection Transformer (LDETR), a transformer-based method that enhances images adaptively for improved detection performance. LDETR discovers the intrinsic visual structure by encoding and decoding a realistic illumination-degrading transformation while accounting for the physical noise model. It uses an attention module to improve the signal-to-noise ratio for object detection in dark environments. LDETR can process images in both standard and adverse conditions and obtains 51.8% mAP on MS COCO, 55.85% mAP on DAWN, and 79.99% mAP on ExDark, outperforming state-of-the-art methods. These results demonstrate the effectiveness of LDETR in low-light scenarios and adverse weather conditions.
Availability of data and materials
The data are available in the public domain.
Funding
No funding was received.
Ethics declarations
Conflicts of interest
The authors do not have any conflict of interest.
About this article
Cite this article
Tiwari, A.K., Pattanaik, M. & Sharma, G.K. Low-light DEtection TRansformer (LDETR): object detection in low-light and adverse weather conditions. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-19087-x
DOI: https://doi.org/10.1007/s11042-024-19087-x