LKFormer: large kernel transformer for infrared image super-resolution

Qin, Feiwei; Yan, Kang; Wang, Changmiao; Ge, Ruiquan; Peng, Yong; Zhang, Kai

doi:10.1007/s11042-024-18409-3

Published: 06 February 2024

Multimedia Tools and Applications

Feiwei Qin¹, Kang Yan¹, Changmiao Wang², Ruiquan Ge¹ (ORCID: orcid.org/0000-0001-5713-5588), Yong Peng¹ & Kai Zhang³

Abstract

Infrared technology is applied across diverse fields, which has drawn growing attention to deep-learning-based super-resolution for infrared images. Current Transformer-based methods achieve impressive results in image super-resolution, but their reliance on the self-attention mechanism means that images are treated as one-dimensional sequences, neglecting their inherent two-dimensional structure. Moreover, infrared images exhibit a uniform pixel distribution and a limited gradient range, which makes it difficult for a model to capture effective feature information. To address these issues, we propose a Transformer model termed Large Kernel Transformer (LKFormer). Specifically, we design a Large Kernel Residual Depth-wise Convolutional Attention (LKRDA) module with linear complexity. It mainly employs depth-wise convolution with large kernels to perform non-local feature modeling, replacing the standard self-attention layer. Additionally, we devise a novel feed-forward network structure, the Gated-Pixel Feed-Forward Network (GPFN), to strengthen LKFormer's capacity to manage the information flow within the network. Comprehensive experimental results show that our method surpasses state-of-the-art techniques, achieving considerably better performance with fewer parameters.
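The abstract's contrast between self-attention and large-kernel depth-wise convolution can be illustrated with a rough sketch. The code below is not the authors' LKRDA module, whose internal design is not given here; it is a minimal pure-Python illustration that assumes a generic form in which the depth-wise convolution response weights the input and is added back through a residual. The function names, the kernel size of 13, and the channel count of 64 are all hypothetical choices made for the example.

```python
def self_attention_macs(height, width, channels):
    """Approximate multiply-accumulates for global self-attention over all pixels.

    Counts only the two dominant matrix products (Q @ K^T and weights @ V).
    """
    tokens = height * width
    return 2 * tokens * tokens * channels


def depthwise_conv_macs(height, width, channels, kernel_size):
    """Multiply-accumulates for a depth-wise convolution with 'same' padding."""
    return height * width * channels * kernel_size * kernel_size


def depthwise_conv2d(feature_map, kernels):
    """Depth-wise 2D cross-correlation with zero 'same' padding.

    feature_map: list of C channels, each an H x W list of floats.
    kernels: list of C odd-sized k x k kernels, one per channel.
    """
    output = []
    for channel, kernel in zip(feature_map, kernels):
        height, width = len(channel), len(channel[0])
        k = len(kernel)
        pad = k // 2
        out = [[0.0] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                acc = 0.0
                for dy in range(k):
                    for dx in range(k):
                        yy, xx = y + dy - pad, x + dx - pad
                        # Positions outside the image contribute zero.
                        if 0 <= yy < height and 0 <= xx < width:
                            acc += channel[yy][xx] * kernel[dy][dx]
                out[y][x] = acc
        output.append(out)
    return output


def large_kernel_attention(feature_map, kernels):
    """Hypothetical gating: input + input * (depth-wise conv response)."""
    context = depthwise_conv2d(feature_map, kernels)
    return [
        [[v + v * c for v, c in zip(row_v, row_c)]
         for row_v, row_c in zip(ch_v, ch_c)]
        for ch_v, ch_c in zip(feature_map, context)
    ]


if __name__ == "__main__":
    # Doubling the image side quadruples the pixel count.
    for side in (32, 64):
        print(side,
              self_attention_macs(side, side, 64),
              depthwise_conv_macs(side, side, 64, 13))
```

Under these counting assumptions, the self-attention cost grows with the square of the pixel count, while the depth-wise convolution cost grows in proportion to it for a fixed kernel size, which is the sense in which such a module has linear complexity.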

Data availability statement

The authors confirm that the data supporting the findings of this study are available in a public repository. These data were derived from the following resources available in the public domain (https://figshare.com/s/2121562561211c0a8101, https://github.com/rafariva/ThermalDatasets).

Funding

This work was supported by the National Key Research and Development Program of China (No. 2023YFE0114900), the Aeronautical Science Foundation of China (No. 2022Z0710T5001), the GuangDong Basic and Applied Basic Research Foundation (No. 2022A1515110570), the Innovation Teams of Youth Innovation in Science and Technology of High Education Institutions of Shandong Province (No. 2021KJ088), and the Open Project Program of the State Key Laboratory of CAD&CG, Zhejiang University (No. A2304). The authors would like to thank the reviewers in advance for their comments and suggestions.

Author information

Authors and Affiliations

School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
Feiwei Qin, Kang Yan, Ruiquan Ge & Yong Peng
Shenzhen Research Institute of Big Data, Shenzhen, China
Changmiao Wang
CVL, ETH Zurich, Zurich, Switzerland
Kai Zhang

Corresponding author

Correspondence to Ruiquan Ge.

Ethics declarations

Source code

The source code will be available at https://github.com/sad192/large-kernel-Transformer

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Qin, F., Yan, K., Wang, C. et al. LKFormer: large kernel transformer for infrared image super-resolution. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18409-3

Received: 14 November 2023
Revised: 18 December 2023
Accepted: 19 January 2024
Published: 06 February 2024
DOI: https://doi.org/10.1007/s11042-024-18409-3
