
MCLGAN: a multi-style cartoonization method based on style condition information

Research article, published in The Visual Computer.

Abstract

Image cartoonization, a special kind of style transfer, is a challenging image processing task. Most existing cartoonization methods target a single style; achieving multiple styles requires training a separate model for each, which is time- and resource-consuming. Meanwhile, existing multi-style cartoonization methods based on generative adversarial networks require multiple discriminators to handle the different styles, which increases network complexity. To address these issues, this paper proposes MCLGAN, a multi-style image cartoonization method based on style condition information. The approach integrates two key components for promoting multi-style image cartoonization. First, we design a conditional generator and a multi-style learning discriminator that embed style condition information into the feature space, strengthening the model's ability to render different cartoon styles. Second, a new loss mechanism, the conditional contrastive loss, is applied to enlarge the differences between styles, thus effectively realizing multi-style image cartoonization. At the same time, MCLGAN simplifies cartoonization across styles: the model needs to be trained only once, which significantly improves efficiency. Extensive experiments verify the validity of our method and demonstrate its superiority over previous methods.
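The abstract names two mechanisms: injecting style condition information into the generator and discriminator, and a conditional contrastive loss that separates styles in feature space. The paper's code is not reproduced here, so the following is a minimal, hypothetical PyTorch sketch of how such components are commonly built: style conditioning via conditional instance normalization (one standard way to embed a style label into feature maps) and a ContraGAN-style conditional contrastive loss. All names and hyperparameters below (ConditionalInstanceNorm2d, conditional_contrastive_loss, the temperature of 0.1) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConditionalInstanceNorm2d(nn.Module):
    """Instance normalization whose affine parameters are looked up from a
    per-style embedding, so a single generator can serve several styles."""

    def __init__(self, num_features: int, num_styles: int):
        super().__init__()
        self.norm = nn.InstanceNorm2d(num_features, affine=False)
        self.embed = nn.Embedding(num_styles, num_features * 2)

    def forward(self, x: torch.Tensor, style: torch.Tensor) -> torch.Tensor:
        # style: (N,) integer style labels selecting per-style gamma/beta
        gamma, beta = self.embed(style).chunk(2, dim=1)
        out = self.norm(x)
        return gamma.view(-1, x.size(1), 1, 1) * out + beta.view(-1, x.size(1), 1, 1)


def conditional_contrastive_loss(feats, labels, class_embed, temperature=0.1):
    """Conditional contrastive (2C-style) loss: pull each feature toward its
    style's proxy embedding and toward same-style samples, and push features
    of other styles away (cf. ContraGAN)."""
    feats = F.normalize(feats, dim=1)                       # (N, D)
    proxies = F.normalize(class_embed(labels), dim=1)       # (N, D)

    sim = feats @ feats.t() / temperature                   # sample-sample similarities
    proxy_sim = (feats * proxies).sum(dim=1) / temperature  # sample-proxy similarities

    n = feats.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=feats.device)
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask

    # Numerator: proxy term plus same-style pairs; denominator: all pairs
    # except self-similarity, plus the proxy term.
    exp_sim = torch.exp(sim).masked_fill(self_mask, 0.0)
    numerator = torch.exp(proxy_sim) + (exp_sim * pos_mask.float()).sum(dim=1)
    denominator = torch.exp(proxy_sim) + exp_sim.sum(dim=1)
    return -torch.log(numerator / denominator).mean()


if __name__ == "__main__":
    # Toy usage: 3 cartoon styles, a batch of 4 samples.
    cin = ConditionalInstanceNorm2d(num_features=64, num_styles=3)
    x = torch.randn(4, 64, 32, 32)
    styles = torch.tensor([0, 1, 2, 0])
    y = cin(x, styles)                                      # (4, 64, 32, 32)

    proxy_table = nn.Embedding(3, 128)                      # learned style proxies
    feats = torch.randn(4, 128)                             # discriminator features
    loss = conditional_contrastive_loss(feats, styles, proxy_table)
    print(y.shape, loss.item())
```

Under this kind of setup, a generator trained once can be steered to any learned style at inference time simply by changing the style index, which matches the efficiency claim in the abstract.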


Data availability

The datasets used and/or analyzed during the current study are publicly available; the corresponding papers are cited accordingly.


Acknowledgements

We would like to express our sincere appreciation to the anonymous reviewers for their insightful comments, which have greatly aided us in improving the quality of the paper. This work was supported in part by the Science and Technology Planning Project of Henan Province under Grant 242102211003 and in part by the National Natural Science Foundation of China under Grant 62302297 and 61972157.

Author information

Authors and Affiliations

Authors

Contributions

Canlin Li: Writing - original draft, Writing - review & editing, Methodology, Validation, Supervision, Project administration, Funding acquisition. Xinyue Wang: Writing - original draft, Writing - review & editing, Conceptualization, Methodology, Validation, Visualization, Software. Ran Yi: Writing - review & editing, Methodology, Formal analysis, Funding acquisition. Wenjiao Zhang: Investigation, Validation. Lihua Bi: Data curation, Resources. Lizhuang Ma: Supervision, Funding acquisition.

Corresponding author

Correspondence to Canlin Li.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest related to this work.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


Cite this article

Li, C., Wang, X., Yi, R. et al. MCLGAN: a multi-style cartoonization method based on style condition information. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03550-9

