
MCLGAN: a multi-style cartoonization method based on style condition information

Research article, published in The Visual Computer.

Abstract

Image cartoonization, a special kind of style transfer, is a challenging image processing task. Most existing cartoonization methods target a single style; achieving multiple styles requires training a separate model for each, which is time- and resource-consuming. Meanwhile, existing multi-style cartoonization methods based on generative adversarial networks require multiple discriminators to handle the different styles, which increases network complexity. To address these issues, this paper proposes MCLGAN, a multi-style image cartoonization method based on style condition information. The approach integrates two key components for promoting multi-style image cartoonization. First, we design a conditional generator and a multi-style learning discriminator that embed style condition information into the feature space, strengthening the model's ability to render different cartoon styles. Second, a new loss mechanism, the conditional contrastive loss, is applied to enlarge the differences between styles, thus effectively realizing multi-style image cartoonization. At the same time, MCLGAN simplifies cartoonization across styles: the model needs to be trained only once, which significantly improves efficiency. Extensive experiments verify the validity of our method and demonstrate its superiority over previous methods.
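The abstract names two mechanisms: injecting style condition information into the generator and discriminator, and a conditional contrastive loss that separates styles in feature space. The paper's code is not reproduced here, so the following is a minimal, hypothetical PyTorch sketch of how such components are commonly built: style conditioning via conditional instance normalization (one standard way to embed a style label into feature maps) and a ContraGAN-style conditional contrastive loss. All names and hyperparameters below (ConditionalInstanceNorm2d, conditional_contrastive_loss, the temperature of 0.1) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConditionalInstanceNorm2d(nn.Module):
    """Instance normalization whose affine parameters are looked up from a
    per-style embedding, so a single generator can serve several styles."""

    def __init__(self, num_features: int, num_styles: int):
        super().__init__()
        self.norm = nn.InstanceNorm2d(num_features, affine=False)
        self.embed = nn.Embedding(num_styles, num_features * 2)

    def forward(self, x: torch.Tensor, style: torch.Tensor) -> torch.Tensor:
        # style: (N,) integer style labels selecting per-style gamma/beta
        gamma, beta = self.embed(style).chunk(2, dim=1)
        out = self.norm(x)
        return gamma.view(-1, x.size(1), 1, 1) * out + beta.view(-1, x.size(1), 1, 1)


def conditional_contrastive_loss(feats, labels, class_embed, temperature=0.1):
    """Conditional contrastive (2C-style) loss: pull each feature toward its
    style's proxy embedding and toward same-style samples, and push features
    of other styles away (cf. ContraGAN)."""
    feats = F.normalize(feats, dim=1)                       # (N, D)
    proxies = F.normalize(class_embed(labels), dim=1)       # (N, D)

    sim = feats @ feats.t() / temperature                   # sample-sample similarities
    proxy_sim = (feats * proxies).sum(dim=1) / temperature  # sample-proxy similarities

    n = feats.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=feats.device)
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask

    # Numerator: proxy term plus same-style pairs; denominator: all pairs
    # except self-similarity, plus the proxy term.
    exp_sim = torch.exp(sim).masked_fill(self_mask, 0.0)
    numerator = torch.exp(proxy_sim) + (exp_sim * pos_mask.float()).sum(dim=1)
    denominator = torch.exp(proxy_sim) + exp_sim.sum(dim=1)
    return -torch.log(numerator / denominator).mean()


if __name__ == "__main__":
    # Toy usage: 3 cartoon styles, a batch of 4 samples.
    cin = ConditionalInstanceNorm2d(num_features=64, num_styles=3)
    x = torch.randn(4, 64, 32, 32)
    styles = torch.tensor([0, 1, 2, 0])
    y = cin(x, styles)                                      # (4, 64, 32, 32)

    proxy_table = nn.Embedding(3, 128)                      # learned style proxies
    feats = torch.randn(4, 128)                             # discriminator features
    loss = conditional_contrastive_loss(feats, styles, proxy_table)
    print(y.shape, loss.item())
```

Under this kind of setup, a generator trained once can be steered to any learned style at inference time simply by changing the style index, which matches the efficiency claim in the abstract.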


Data availability

The datasets used and/or analyzed during the current study are publicly available; the corresponding papers are cited accordingly.


Acknowledgements

We would like to express our sincere appreciation to the anonymous reviewers for their insightful comments, which have greatly aided us in improving the quality of the paper. This work was supported in part by the Science and Technology Planning Project of Henan Province under Grant 242102211003 and in part by the National Natural Science Foundation of China under Grant 62302297 and 61972157.

Author information

Authors and Affiliations

Authors

Contributions

Canlin Li: Writing - original draft, Writing - review & editing, Methodology, Validation, Supervision, Project administration, Funding acquisition. Xinyue Wang: Writing - original draft, Writing - review & editing, Conceptualization, Methodology, Validation, Visualization, Software. Ran Yi: Writing - review & editing, Methodology, Formal analysis, Funding acquisition. Wenjiao Zhang: Investigation, Validation. Lihua Bi: Data curation, Resources. Lizhuang Ma: Supervision, Funding acquisition.

Corresponding author

Correspondence to Canlin Li.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest related to this work.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


Cite this article

Li, C., Wang, X., Yi, R. et al. MCLGAN: a multi-style cartoonization method based on style condition information. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03550-9

