Supervised deep learning for content-aware image retargeting with Fourier Convolutions

Givkashi, MohammadHossein; Naderi, MohammadReza; Karimi, Nader; Shirani, Shahram; Samavi, Shadrokh

doi:10.1007/s11042-024-18876-8

Published: 19 March 2024

Multimedia Tools and Applications (2024)

MohammadHossein Givkashi¹,
MohammadReza Naderi¹,
Nader Karimi¹ (ORCID: orcid.org/0000-0001-8904-1607),
Shahram Shirani² &
Shadrokh Samavi¹,²,³

Abstract

Image retargeting aims to change the size of an image while preserving its important content. One of the main obstacles to training deep learning models for image retargeting is the need for a large labeled dataset, and no such dataset is available for this task. We therefore present a new supervised approach for training deep learning models. We use the original images as ground truth and create inputs for the model by resizing and cropping the original images. A second challenge is generating images of different sizes at inference time, because standard convolutional neural networks cannot generate images whose size differs from that of the input image. To address this issue, we introduce a mask-based method for supervised learning: a mask is generated to show the desired size and location of the object, and the mask and the input image are then fed to the network. A comparison with existing image retargeting methods demonstrates the model’s ability to produce high-quality retargeted images. We also compute image quality assessment scores for each output image using different techniques, which illustrates the effectiveness of our approach.
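
The data construction described in the abstract can be illustrated with a small sketch. It is one possible reading, not the authors' implementation: the abstract does not specify how the crop is chosen, how resizing is done, or how the mask is encoded. The function names, the `box` annotation format, the `margin` parameter, and the choice to append the mask as an extra channel are all assumptions made for illustration.

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize with plain NumPy indexing (illustrative only)."""
    in_h, in_w = img.shape[:2]
    rows = (np.arange(out_h) * in_h) // out_h
    cols = (np.arange(out_w) * in_w) // out_w
    return img[rows[:, None], cols[None, :]]

def make_training_sample(original, box, margin=8):
    """Build one (network input, ground truth) pair from a single image.

    original: H x W x C float array, used unchanged as the ground truth.
    box: (top, left, height, width) of the main object; a hypothetical
         annotation format, not taken from the paper.
    margin: assumed number of context pixels kept around the object.
    """
    h, w = original.shape[:2]
    top, left, bh, bw = box

    # Crop a window around the object, clipped to the image bounds.
    t0, l0 = max(top - margin, 0), max(left - margin, 0)
    t1, l1 = min(top + bh + margin, h), min(left + bw + margin, w)
    crop = original[t0:t1, l0:l1]

    # Resize the crop to the ground-truth size so input and target match.
    net_image = resize_nearest(crop, h, w)

    # Binary mask marking the desired size and location of the object.
    mask = np.zeros((h, w, 1), dtype=original.dtype)
    mask[top:top + bh, left:left + bw] = 1

    # Assumption: the mask is fed to the network as an additional channel.
    net_input = np.concatenate([net_image, mask], axis=-1)
    return net_input, original

if __name__ == "__main__":
    image = np.random.rand(64, 96, 3).astype(np.float32)
    x, y = make_training_sample(image, box=(16, 24, 20, 30))
    print(x.shape, y.shape)
```

Under these assumptions the network always sees an input whose spatial size equals that of the target, and the mask alone carries the requested object size and position.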

Data availability

The dataset used during the current study is available in the GitHub repository, https://github.com/givkashi/CAIR.

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, 84156-83111, Iran
MohammadHossein Givkashi, MohammadReza Naderi, Nader Karimi & Shadrokh Samavi
Department of Electrical and Computer Engineering, McMaster University, Hamilton, ON, L8S 4L8, Canada
Shahram Shirani & Shadrokh Samavi
Computer Science Department, Seattle University, Seattle, WA, 98122, USA
Shadrokh Samavi

Corresponding author

Correspondence to Nader Karimi.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Givkashi, M., Naderi, M., Karimi, N. et al. Supervised deep learning for content-aware image retargeting with Fourier Convolutions. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18876-8

Received: 18 June 2023
Revised: 16 January 2024
Accepted: 11 March 2024
Published: 19 March 2024
DOI: https://doi.org/10.1007/s11042-024-18876-8
