Abstract
Offline Handwritten Mathematical Expression Recognition (HMER) has advanced considerably in recent years through the use of tree decoders within the encoder-decoder framework. Although tree decoder-based methods treat an expression as a tree and parse its 2D spatial structure into a sequence of tree nodes, the performance of existing works remains limited by unavoidable errors in tree node prediction. In addition, they lack syntax rules to regulate the output expressions. In this paper, we propose a novel model called Spatial Attention and Syntax Rule Enhanced Tree Decoder (SS-TD), which is equipped with a spatial attention mechanism to alleviate prediction errors in the tree structure and uses syntax masks (obtained by transforming syntax rules) to constrain the occurrence of ungrammatical mathematical expressions. In this way, our model can effectively describe the tree structure and increase the accuracy of the output expression. Experiments show that SS-TD achieves better recognition performance than prior models on the CROHME 14/16/19 datasets, demonstrating the effectiveness of our model.
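The idea of a syntax mask can be illustrated with a minimal sketch. The abstract does not specify how SS-TD derives or applies its masks, so everything below is an assumption made for illustration: the toy vocabulary `VOCAB`, the rule table `ALLOWED_NEXT`, and the helper functions `syntax_mask`, `masked_softmax`, and `greedy_decode` are hypothetical, and the example decodes a flat symbol sequence rather than tree nodes. It only shows the general technique of zeroing out the probability of symbols that a rule forbids after the previous symbol.

```python
import math

# Hypothetical toy vocabulary (not taken from the paper).
VOCAB = ["x", "y", "+", "-", "<eos>"]

# Hypothetical syntax rules: which symbols may follow the previous symbol.
ALLOWED_NEXT = {
    "<sos>": {"x", "y"},        # an expression must start with an operand
    "x": {"+", "-", "<eos>"},   # an operand is followed by an operator or the end
    "y": {"+", "-", "<eos>"},
    "+": {"x", "y"},            # an operator must be followed by an operand
    "-": {"x", "y"},
}


def syntax_mask(prev):
    """Turn the rule for `prev` into a 0/1 mask over VOCAB."""
    allowed = ALLOWED_NEXT[prev]
    return [1.0 if tok in allowed else 0.0 for tok in VOCAB]


def masked_softmax(logits, mask):
    """Softmax over allowed positions only; masked positions get probability 0."""
    peak = max(l for l, m in zip(logits, mask) if m > 0)
    exps = [math.exp(l - peak) if m > 0 else 0.0 for l, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_decode(step_logits):
    """Pick the most probable allowed symbol at each step until <eos>."""
    prev, out = "<sos>", []
    for logits in step_logits:
        probs = masked_softmax(logits, syntax_mask(prev))
        prev = VOCAB[max(range(len(VOCAB)), key=probs.__getitem__)]
        if prev == "<eos>":
            break
        out.append(prev)
    return out


# Made-up decoder scores. Without the mask, step 1 would pick "+" and
# step 3 would pick "<eos>", both of which the toy rules forbid.
step_logits = [
    [1.0, 0.5, 3.0, 0.0, 0.0],
    [2.0, 0.0, 1.0, 0.5, 0.2],
    [0.1, 0.9, 2.0, 0.0, 3.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]
print(greedy_decode(step_logits))
```

In this sketch the mask is applied before normalisation, so the remaining probability mass is redistributed over the grammatical symbols and a forbidden symbol can never be selected.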
Acknowledgement
This work has been supported by the National Natural Science Foundation of China (Nos. 62176093 and 61673182), the Key Realm Research and Development Program of Guangzhou (No. 202206030001), and the Guangdong Basic and Applied Basic Research Foundation (No. 2021A1515012282).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Lin, Z. et al. (2022). Spatial Attention and Syntax Rule Enhanced Tree Decoder for Offline Handwritten Mathematical Expression Recognition. In: Porwal, U., Fornés, A., Shafait, F. (eds) Frontiers in Handwriting Recognition. ICFHR 2022. Lecture Notes in Computer Science, vol 13639. Springer, Cham. https://doi.org/10.1007/978-3-031-21648-0_15
DOI: https://doi.org/10.1007/978-3-031-21648-0_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21647-3
Online ISBN: 978-3-031-21648-0
eBook Packages: Computer Science (R0)