Enhancing Representation Learning of EEG Data with Masked Autoencoders

  • Conference paper
Augmented Cognition (HCII 2024)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14695)


Abstract

Self-supervised learning has been a powerful training paradigm for representation learning. In this study, we design a masked autoencoder (MAE) that guides deep learning models to learn representations of electroencephalography (EEG) signals. Our MAE consists of an encoder and a decoder. A certain proportion of the input EEG signal is randomly masked before being fed to the MAE, whose goal is to recover the masked portions. After this self-supervised pre-training, the encoder is fine-tuned on downstream tasks. We evaluate our MAE on the EEGEyeNet gaze estimation task and find that it is an effective brain signal learner that also significantly improves learning efficiency. Compared to the model without MAE pre-training, the pre-trained one reaches equal performance in one-third of the training time and outperforms it in half the training time. Our study shows that self-supervised learning is as promising a research direction for EEG-based applications as it has proven to be in other fields (natural language processing, computer vision, robotics, etc.), and we therefore expect foundation models to succeed in the EEG domain as well.
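The masking-and-reconstruction objective described in the abstract can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the mask ratio, the zero-fill masking strategy, and the toy array shapes are assumptions for demonstration, and the actual model learns an encoder and decoder rather than the identity-style setup shown here.

```python
import numpy as np

def random_mask(eeg, mask_ratio=0.5, rng=None):
    """Randomly mask a proportion of time samples in an EEG segment.

    eeg: array of shape (channels, time). Returns a masked copy and the
    boolean mask (True = masked) used to score reconstruction later.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    _, n_time = eeg.shape
    n_masked = int(mask_ratio * n_time)
    idx = rng.choice(n_time, size=n_masked, replace=False)
    mask = np.zeros(n_time, dtype=bool)
    mask[idx] = True
    masked = eeg.copy()
    masked[:, mask] = 0.0  # zero-fill is one simple masking choice
    return masked, mask

def reconstruction_loss(pred, target, mask):
    """MSE computed only over the masked positions, MAE-style."""
    diff = (pred - target)[:, mask]
    return float(np.mean(diff ** 2))

# Toy usage: a 2-channel, 8-sample EEG segment.
eeg = np.arange(16, dtype=float).reshape(2, 8)
masked, mask = random_mask(eeg, mask_ratio=0.5)
loss = reconstruction_loss(masked, eeg, mask)  # nonzero: info was removed
```

In pre-training, a decoder would be trained to minimize this loss from the encoder's output on `masked`; afterwards only the encoder is kept and fine-tuned on the downstream task.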


Notes

  1. We also experimented with a mean squared error (MSE) loss function; the performance increase it brings is not obvious.

  2. Here “EEGViT” is equivalent to “EEGViT Pre-trained” in Table 4 of [29]. This applies to the following mentions as well.

References

  1. Altaheri, H., et al.: Deep learning techniques for classification of electroencephalogram (EEG) motor imagery (MI) signals: a review. Neural Comput. Appl. 35(20), 14681–14722 (2023)

  2. Bao, H., Dong, L., Piao, S., Wei, F.: BEiT: BERT pre-training of image transformers. arXiv preprint arXiv:2106.08254 (2021)

  3. Bashivan, P., Rish, I., Yeasin, M., Codella, N.: Learning representations from EEG with deep recurrent-convolutional neural networks. arXiv preprint arXiv:1511.06448 (2015)

  4. Bommasani, R., et al.: On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258 (2021)

  5. Brown, T., et al.: Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 33, 1877–1901 (2020)

  6. Chen, M., et al.: Generative pretraining from pixels. In: International Conference on Machine Learning, pp. 1691–1703. PMLR (2020)

  7. Chien, H.Y.S., Goh, H., Sandino, C.M., Cheng, J.Y.: MAEEG: masked auto-encoder for EEG representation learning. arXiv preprint arXiv:2211.02625 (2022)

  8. Craik, A., He, Y., Contreras-Vidal, J.L.: Deep learning for electroencephalogram (EEG) classification tasks: a review. J. Neural Eng. 16(3), 031001 (2019)

  9. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)

  10. Dosovitskiy, A., et al.: An image is worth 16\(\times \)16 words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020)

  11. Firoozi, R., et al.: Foundation models in robotics: applications, challenges, and the future. arXiv preprint arXiv:2312.07843 (2023)

  12. He, K., Chen, X., Xie, S., Li, Y., Dollár, P., Girshick, R.: Masked autoencoders are scalable vision learners. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16000–16009 (2022)

  13. Kastrati, A., et al.: EEGEyeNet: a simultaneous electroencephalography and eye-tracking dataset and benchmark for eye movement prediction. In: Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1) (2021)

  14. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL-HLT, pp. 4171–4186 (2019)

  15. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)

  16. Kostas, D., Aroca-Ouellette, S., Rudzicz, F.: BENDR: using transformers and a contrastive self-supervised learning task to learn from massive amounts of EEG data. Front. Hum. Neurosci. 15, 653659 (2021)

  17. Lawhern, V.J., Solon, A.J., Waytowich, N.R., Gordon, S.M., Hung, C.P., Lance, B.J.: EEGNet: a compact convolutional neural network for EEG-based brain-computer interfaces. J. Neural Eng. 15(5), 056013 (2018)

  18. Li, C., et al.: Multimodal foundation models: from specialists to general-purpose assistants. arXiv preprint arXiv:2309.10020 (2023)

  19. Mao, W., Fathurrahman, H., Lee, Y., Chang, T.: EEG dataset classification using CNN method. In: Journal of Physics: Conference Series, vol. 1456, p. 012017. IOP Publishing (2020)

  20. Murungi, N.K., Pham, M.V., Dai, X.C., Qu, X.: Empowering computer science students in electroencephalography (EEG) analysis: a review of machine learning algorithms for EEG datasets (2023)

  21. OpenAI: GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023)

  22. Peng, R., et al.: Wavelet2Vec: a filter bank masked autoencoder for EEG-based seizure subtype classification. In: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5. IEEE (2023)

  23. Pulver, D., Angkan, P., Hungler, P., Etemad, A.: EEG-based cognitive load classification using feature masked autoencoding and emotion transfer learning. In: Proceedings of the 25th International Conference on Multimodal Interaction, pp. 190–197 (2023)

  24. Radford, A., Narasimhan, K., Salimans, T., Sutskever, I.: Improving language understanding by generative pre-training (2018)

  25. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI Blog 1(8), 9 (2019)

  26. Roy, Y., Banville, H., Albuquerque, I., Gramfort, A., Falk, T.H., Faubert, J.: Deep learning-based electroencephalography analysis: a systematic review. J. Neural Eng. 16(5), 051001 (2019)

  27. Weng, N., Płomecka, M.B., Kaufmann, M., Kastrati, A., Wattenhofer, R., Langer, N.: An interpretable attention-based method for gaze estimation using electroencephalography (2023)

  28. Xiao, G., Shi, M., Ye, M., Xu, B., Chen, Z., Ren, Q.: 4D attention-based neural network for EEG emotion recognition. Cogn. Neurodyn. 1–14 (2022)

  29. Yang, R., Modesitt, E.: ViT2EEG: leveraging hybrid pretrained vision transformers for EEG data. arXiv preprint arXiv:2308.00454 (2023)

  30. Yang, S., Nachum, O., Du, Y., Wei, J., Abbeel, P., Schuurmans, D.: Foundation models for decision making: problems, methods, and opportunities. arXiv preprint arXiv:2303.04129 (2023)

  31. Yi, L., Qu, X.: Attention-based CNN capturing EEG recording’s average voltage and local change. In: Degen, H., Ntoa, S. (eds.) HCII 2022. LNCS, vol. 13336, pp. 448–459. Springer, Heidelberg (2022). https://doi.org/10.1007/978-3-031-05643-7_29

  32. Zhou, C., et al.: A comprehensive survey on pretrained foundation models: a history from BERT to ChatGPT. arXiv preprint arXiv:2302.09419 (2023)


Author information


Corresponding author

Correspondence to Yifei Zhou.


Ethics declarations

Disclosure of Interests

The authors have no competing interests to declare that are relevant to the content of this article.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zhou, Y., Liu, S. (2024). Enhancing Representation Learning of EEG Data with Masked Autoencoders. In: Schmorrow, D.D., Fidopiastis, C.M. (eds) Augmented Cognition. HCII 2024. Lecture Notes in Computer Science, vol. 14695. Springer, Cham. https://doi.org/10.1007/978-3-031-61572-6_7


  • DOI: https://doi.org/10.1007/978-3-031-61572-6_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-61571-9

  • Online ISBN: 978-3-031-61572-6

