Abstract
Facial expression recognition is a challenging problem in general, and especially so when the expression is weak. Deep neural networks have recently emerged as a powerful tool for expression recognition. However, owing to the lack of training samples, existing deep network-based methods cannot fully capture the critical and subtle details of weak expressions, which leads to unsatisfactory results. In this paper, we propose the Deeper Cascaded Peak-piloted Network (DCPN) for weak expression recognition. DCPN has three main aspects: (1) peak-piloted feature transformation, which uses the peak expression (easy samples) to supervise the non-peak expression (hard samples) of the same type and subject; (2) a specially designed back-propagation algorithm that drives the intermediate-layer feature maps of a non-peak expression close to those of the corresponding peak expression; and (3) a novel integrated training method, cascaded fine-tune, which is proposed to prevent the network from overfitting. Experimental results on two popular facial expression databases, CK+ and Oulu-CASIA, show the superiority of the proposed DCPN over state-of-the-art methods.
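The abstract does not state the loss function, so the following is only a minimal sketch of how peak-piloted supervision could look for a single peak/non-peak pair. It assumes a squared L2 feature-matching term, a weight `lam`, and a peak feature that is treated as a fixed target during back-propagation. The function names, the choice of distance, and the weighting are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample given raw class scores."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return -log_probs[label]


def peak_piloted_loss(feat_nonpeak, feat_peak, logits_nonpeak, logits_peak,
                      label, lam=0.5):
    """Hypothetical peak-piloted objective for one peak/non-peak pair.

    Returns the total loss and the gradients of the feature-matching term
    with respect to the non-peak and peak features.
    """
    # Feature-matching term: pull the non-peak feature toward the peak one.
    diff = feat_nonpeak - feat_peak
    match = 0.5 * np.sum(diff ** 2)

    # Both samples share the same expression label, so both are classified.
    ce = (softmax_cross_entropy(logits_nonpeak, label)
          + softmax_cross_entropy(logits_peak, label))
    total = ce + lam * match

    # Assumption: the matching term only updates the non-peak branch;
    # the peak feature acts as a constant target and gets a zero gradient.
    grad_nonpeak = lam * diff
    grad_peak = np.zeros_like(feat_peak)
    return total, grad_nonpeak, grad_peak


# Toy usage with made-up 2-D features and 3 expression classes.
total, g_nonpeak, g_peak = peak_piloted_loss(
    feat_nonpeak=np.array([1.0, 2.0]),
    feat_peak=np.array([0.0, 0.0]),
    logits_nonpeak=np.zeros(3),
    logits_peak=np.zeros(3),
    label=1,
)
```

Zeroing the gradient on the peak branch keeps the easy sample's representation from drifting toward the hard sample, which is one plausible reading of the "specially designed" back-propagation described above.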
Acknowledgements
The work of Qingshan Liu is supported by the National Natural Science Foundation of China (NSFC) under Grant 61532009. The work of Guangcan Liu is supported in part by NSFC under Grant 61622305 and Grant 61502238, and in part by the Natural Science Foundation of Jiangsu Province of China (NSFJPC) under Grant BK20160040.
Cite this article
Yu, Z., Liu, Q. & Liu, G. Deeper cascaded peak-piloted network for weak expression recognition. Vis Comput 34, 1691–1699 (2018). https://doi.org/10.1007/s00371-017-1443-0
DOI: https://doi.org/10.1007/s00371-017-1443-0