Real-time self-supervised achromatic face colorization

  • Original article
  • The Visual Computer

Abstract

Recent deep learning-based 2D face image colorization techniques have significantly improved colorization accuracy and detail preservation. However, these methods cannot generate a 3D counterpart, despite its extensive applications, and their long inference times pose a challenge for real-time use. Moreover, monocular 3D face reconstruction methods produce skin color consistent with the input, so an achromatic 2D face yields a gray-scale 3D face texture. We therefore propose a novel real-time Self-Supervised COoperative COlorizaTion of Achromatic Faces (COCOTA) framework, which estimates colored 3D faces from both monocular color and achromatic face images without introducing additional dependencies. The proposed network contains (1) a Chromatic Pipeline, which obtains 3D face alignment and geometric details for color face images, and (2) an Achromatic Pipeline, which recovers texture from achromatic images. The proposed dual-pipeline feature loss and parameter-sharing technique help the two COCOTA pipelines cooperate and facilitate knowledge transfer between them. We compare the color accuracy of our method with several 3D face reconstruction approaches on the challenging CelebA-test and FairFace datasets. COCOTA outperforms the current state-of-the-art method by a large margin, with improvements of 25.3%, 39.6%, and 17% on the perceptual error, 3D color-based error, and 2D pixel-level error metrics, respectively. We also show that the proposed method's inference time improves on that of 2D image colorization techniques, demonstrating its effectiveness.
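The idea of a feature loss between two pipelines that share parameters can be illustrated with a toy sketch. This is not the COCOTA implementation: the linear "encoder", the Rec. 601 luma conversion used to build the achromatic input, and the mean-squared feature distance are all assumptions chosen for illustration, and every function name below is hypothetical.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]  # (R, G, B), each in [0, 1]


def to_achromatic(image: Sequence[Pixel]) -> List[Pixel]:
    """Build an achromatic copy by replicating Rec. 601 luma over 3 channels.

    Assumption: the abstract does not say how achromatic inputs are produced.
    """
    out = []
    for r, g, b in image:
        y = 0.299 * r + 0.587 * g + 0.114 * b
        out.append((y, y, y))
    return out


def shared_encoder(image: Sequence[Pixel],
                   weights: Sequence[Sequence[float]]) -> List[float]:
    """Toy stand-in for a network encoder (hypothetical).

    Each feature is a weighted sum of the per-channel mean intensities.
    Passing the same `weights` to both pipelines models parameter sharing.
    """
    n = len(image)
    means = [sum(p[c] for p in image) / n for c in range(3)]
    return [sum(w * m for w, m in zip(row, means)) for row in weights]


def feature_loss(f_chromatic: Sequence[float],
                 f_achromatic: Sequence[float]) -> float:
    """Mean squared distance between the two pipelines' features."""
    diffs = [(a - b) ** 2 for a, b in zip(f_chromatic, f_achromatic)]
    return sum(diffs) / len(diffs)


# One set of weights, used by both pipelines (parameter sharing).
weights = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0]]

color_image = [(1.0, 0.0, 0.0)]          # a single pure-red pixel
gray_image = to_achromatic(color_image)  # its achromatic counterpart

loss = feature_loss(shared_encoder(color_image, weights),
                    shared_encoder(gray_image, weights))
```

In this sketch, minimizing `loss` with respect to the shared weights would pull the features of an achromatic image toward those of its color counterpart, which is one plausible reading of how knowledge could transfer between the pipelines.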

Availability of data and materials

All the data and materials are freely available in the public domain.

Funding

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Author information

Contributions

We summarize the authors' contributions as follows: (1) HT: conceptualization, data curation, formal analysis, investigation, methodology, project administration, validation, visualization, and writing the original draft; (2) VKS: analysis, investigation, supervision, validation, and draft revision; and (3) Y-SC: supervision.

Corresponding author

Correspondence to Hitika Tiwari.

Ethics declarations

Ethical approval

This research uses freely available face datasets. Therefore, ethical approval is not required.

Conflict of interest

The authors declare that they have no competing interests as defined by Springer, or other interests that might be perceived to influence the results and/or discussion reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (mp4 293 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tiwari, H., Subramanian, V.K. & Chen, YS. Real-time self-supervised achromatic face colorization. Vis Comput 39, 6521–6536 (2023). https://doi.org/10.1007/s00371-022-02746-1

  • DOI: https://doi.org/10.1007/s00371-022-02746-1
