Hangul Fonts Dataset: A Hierarchical and Compositional Dataset for Investigating Learned Representations

Livezey, Jesse A.; Hwang, Ahyeon; Yeung, Jacob; Bouchard, Kristofer E.

doi:10.1007/978-3-031-06433-3_1

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13233)

Included in the following conference series:

International Conference on Image Analysis and Processing

Abstract

Hierarchy and compositionality are common latent properties in many natural and scientific image datasets. Determining when a deep network’s hidden activations represent hierarchy and compositionality is important both for understanding deep representation learning and for applying deep networks in domains where interpretability is crucial. However, current benchmark machine learning datasets either have little hierarchical or compositional structure, or their structure is not known. This gap impedes precise analysis of a network’s representations and thus hinders development of new methods that can learn such properties. To address this gap, we developed a new benchmark dataset with known hierarchical and compositional structure. The Hangul Fonts Dataset (HFD) comprises 35 fonts from the Korean writing system (Hangul), each with 11,172 blocks (syllables) composed from the product of initial, medial, and final glyphs. All blocks can be grouped into a few geometric types, which induces a hierarchy across blocks. In addition, each block is composed of individual glyphs with rotations, translations, scalings, and naturalistic style variation across fonts. We find that both shallow and deep unsupervised methods show only modest evidence of hierarchy and compositionality in their representations of the HFD compared to supervised deep networks. Thus, HFD enables the identification of shortcomings in existing methods, a critical first step toward developing new machine learning algorithms to extract hierarchical and compositional structure in the context of naturalistic variability.
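The figure of 11,172 blocks follows from the product structure the abstract describes. As a rough illustration, the sketch below uses the Unicode Hangul Syllables layout, which has 19 initial, 21 medial, and 28 final positions (the finals include "no final"). These counts and the code-point arithmetic come from the Unicode standard, not from the abstract, and the helper names `compose` and `decompose` are hypothetical; whether HFD indexes its blocks this way is an assumption.

```python
from typing import Tuple

# Counts from the Unicode Hangul Syllables block (assumed, not stated in the abstract).
N_INITIAL, N_MEDIAL, N_FINAL = 19, 21, 28  # the 28 finals include "no final"
SYLLABLE_BASE = 0xAC00  # code point of the first precomposed syllable


def compose(initial: int, medial: int, final: int) -> str:
    """Return the precomposed block for zero-based glyph indices."""
    if not (0 <= initial < N_INITIAL
            and 0 <= medial < N_MEDIAL
            and 0 <= final < N_FINAL):
        raise ValueError("glyph index out of range")
    return chr(SYLLABLE_BASE + (initial * N_MEDIAL + medial) * N_FINAL + final)


def decompose(block: str) -> Tuple[int, int, int]:
    """Recover the (initial, medial, final) indices of a precomposed block."""
    offset = ord(block) - SYLLABLE_BASE
    if not 0 <= offset < N_INITIAL * N_MEDIAL * N_FINAL:
        raise ValueError("not a precomposed Hangul syllable")
    initial, rest = divmod(offset, N_MEDIAL * N_FINAL)
    medial, final = divmod(rest, N_FINAL)
    return initial, medial, final


# Number of blocks per font: 19 * 21 * 28 = 11172, matching the abstract.
n_blocks = N_INITIAL * N_MEDIAL * N_FINAL

# Total block images if each of the 35 fonts renders every block.
n_images = 35 * n_blocks
```

Because every block maps to a unique index triple, the three glyph labels of any block are known exactly, which is the kind of ground-truth compositional structure the abstract refers to.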

Acknowledgements

JAL, AH, and KEB were supported by the Deep Learning for Science LBNL LDRD. We are grateful for the feedback on the project from the Neural Systems and Data Science Lab.

Author information

Authors and Affiliations

Biological Sciences and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Jesse A. Livezey, Jacob Yeung & Kristofer E. Bouchard
Redwood Center for Theoretical Neuroscience, University of California, Berkeley, CA, USA
Jesse A. Livezey & Kristofer E. Bouchard
Mathematical, Computational and Systems Biology, University of California, Irvine, CA, USA
Ahyeon Hwang
Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA
Kristofer E. Bouchard
Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Kristofer E. Bouchard

Corresponding author

Correspondence to Jesse A. Livezey.

Editor information

Editors and Affiliations

Boston University, Boston, MA, USA
Stan Sclaroff
National Research Council, Lecce, Italy
Cosimo Distante
National Research Council, Lecce, Italy
Marco Leo
University of Catania, Catania, Italy
Giovanni M. Farinella
Technische Universität München, Garching, Germany
Federico Tombari

1 Electronic supplementary material

Supplementary material 1 (pdf 2797 KB)

Cite this paper

Livezey, J.A., Hwang, A., Yeung, J., Bouchard, K.E. (2022). Hangul Fonts Dataset: A Hierarchical and Compositional Dataset for Investigating Learned Representations. In: Sclaroff, S., Distante, C., Leo, M., Farinella, G.M., Tombari, F. (eds) Image Analysis and Processing – ICIAP 2022. ICIAP 2022. Lecture Notes in Computer Science, vol 13233. Springer, Cham. https://doi.org/10.1007/978-3-031-06433-3_1

DOI: https://doi.org/10.1007/978-3-031-06433-3_1
Published: 15 May 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-06432-6
Online ISBN: 978-3-031-06433-3
eBook Packages: Computer Science, Computer Science (R0)
