Supervised deep learning for content-aware image retargeting with Fourier Convolutions

Multimedia Tools and Applications

Abstract

Image retargeting aims to change the size of an image while preserving its important content. One of the main obstacles to training deep learning models for image retargeting is the need for a large labeled dataset, and no such dataset is available for this task. We therefore present a new supervised approach for training deep learning models: we use the original images as ground truth and create the model inputs by resizing and cropping the original images. A second challenge is generating images of different sizes at inference time, because standard convolutional neural networks cannot produce an output whose size differs from that of the input image. To address this issue, we introduce a mask-based formulation of supervised learning. In our approach, a mask is generated to indicate the desired size and location of the object, and the mask and the input image are then fed to the network. A comparison between existing image retargeting methods and our proposed method demonstrates the model’s ability to produce high-quality retargeted images. We also compute image quality assessment scores for each output image using several techniques, which illustrate the effectiveness of our approach.
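
The data-construction idea in the abstract (original image as ground truth, a resized and cropped copy as input, plus a mask marking the desired size and location) can be illustrated with a small sketch. This is not the authors' pipeline: the function names `resize_nearest` and `make_training_sample`, the nearest-neighbour resize, the grayscale list-of-rows image format, and the mask convention (1 inside the crop's original footprint, 0 elsewhere) are all assumptions made for illustration. A real implementation would more likely use an array or tensor library.

```python
def resize_nearest(image, new_h, new_w):
    """Nearest-neighbour resize of a 2D image stored as a list of rows."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def make_training_sample(original, crop_box):
    """Build one hypothetical (network_input, target) pair.

    original : 2D list of pixel values, used unchanged as the ground truth.
    crop_box : (top, left, height, width) region cut out of the original.
    """
    top, left, h, w = crop_box
    full_h, full_w = len(original), len(original[0])

    # Degraded input: crop a region, then resize it back to the full canvas.
    crop = [row[left:left + w] for row in original[top:top + h]]
    model_input = resize_nearest(crop, full_h, full_w)

    # Assumed mask convention: 1 where the cropped content sits in the
    # ground truth (its desired size and location), 0 everywhere else.
    mask = [
        [1 if top <= r < top + h and left <= c < left + w else 0
         for c in range(full_w)]
        for r in range(full_h)
    ]

    # Pair each pixel with its mask value, mimicking an extra input channel.
    stacked = [
        list(zip(img_row, mask_row))
        for img_row, mask_row in zip(model_input, mask)
    ]
    return stacked, original


# Toy 4x4 "image" whose pixel value encodes its position.
original = [[r * 4 + c for c in range(4)] for r in range(4)]
network_input, target = make_training_sample(original, crop_box=(1, 1, 2, 2))
```

Because the mask has the same spatial size as the input, the two can be stacked channel-wise, which is one plausible way to tell a fully convolutional network where and how large the object should be in the output.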

Data availability

The dataset used during the current study is available in the GitHub repository, https://github.com/givkashi/CAIR.

Author information

Corresponding author

Correspondence to Nader Karimi.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Givkashi, M., Naderi, M., Karimi, N. et al. Supervised deep learning for content-aware image retargeting with Fourier Convolutions. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18876-8

  • DOI: https://doi.org/10.1007/s11042-024-18876-8
