Abstract
Image retargeting aims to change the size of an image while taking its content into account. One of the main obstacles to training deep learning models for image retargeting is the need for a large labeled dataset, and no such dataset is available for this task. We therefore present a new supervised approach for training deep learning models: we use the original images as ground truth and create the model inputs by resizing and cropping those originals. A second challenge is generating images of different sizes at inference time, because conventional convolutional neural networks cannot produce an output whose size differs from that of the input image. To address this issue, we generate a mask that indicates the desired size and location of the object, and feed the mask together with the input image to the network. A comparison with existing image retargeting methods demonstrates the model’s ability to produce high-quality retargeted images. We also compute image quality assessment scores for each output image using several techniques, which illustrate the effectiveness of our approach.
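The pairing idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not specify how the crop is chosen, how it is resized, or how the mask is encoded. The function names, the `crop_box` and `scale` parameters, the centered placement, and the choice to stack the mask as a fourth input channel are all assumptions made for illustration, and nearest-neighbour resampling stands in for whatever resizer is actually used.

```python
import numpy as np


def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of an H x W x C array (stand-in resampler)."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h  # source row for each output row
    cols = np.arange(out_w) * w // out_w  # source column for each output column
    return img[rows][:, cols]


def make_training_pair(original, crop_box, scale):
    """Build (network_input, ground_truth) from one unlabeled image.

    original : H x W x 3 uint8 array, returned unchanged as the ground truth.
    crop_box : (top, left, height, width) of the content to keep
               (hypothetical parameter, must lie inside the image).
    scale    : factor in (0, 1] applied to the crop (hypothetical parameter).
    """
    H, W = original.shape[:2]
    top, left, h, w = crop_box
    crop = original[top:top + h, left:left + w]

    # Degrade the original by cropping and resizing it.
    new_h, new_w = max(1, int(h * scale)), max(1, int(w * scale))
    small = resize_nearest(crop, new_h, new_w)

    # Assumption: the degraded content is centered on a blank canvas that
    # already has the target (original) size.
    canvas = np.zeros_like(original)
    mask = np.zeros((H, W, 1), dtype=original.dtype)
    y0, x0 = (H - new_h) // 2, (W - new_w) // 2
    canvas[y0:y0 + new_h, x0:x0 + new_w] = small
    mask[y0:y0 + new_h, x0:x0 + new_w] = 255  # marks size and location

    # Assumption: image and mask are stacked channel-wise as one input.
    net_input = np.concatenate([canvas, mask], axis=-1)
    return net_input, original
```

In this sketch no manual labels are needed, because the untouched original serves as the target; the mask is what would let a fixed-size convolutional network be told which output size and placement is wanted.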
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-024-18876-8/MediaObjects/11042_2024_18876_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-024-18876-8/MediaObjects/11042_2024_18876_Fig2_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-024-18876-8/MediaObjects/11042_2024_18876_Fig3_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-024-18876-8/MediaObjects/11042_2024_18876_Fig4_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-024-18876-8/MediaObjects/11042_2024_18876_Fig5_HTML.png)
Data availability
The dataset used during the current study is available in the GitHub repository, https://github.com/givkashi/CAIR.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Givkashi, M., Naderi, M., Karimi, N. et al. Supervised deep learning for content-aware image retargeting with Fourier Convolutions. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18876-8
DOI: https://doi.org/10.1007/s11042-024-18876-8