Abstract
Medical image processing on edge devices is key to local and efficient data processing. In the last decade, convolutional neural networks (CNNs) have dominated and achieved top performance in various medical imaging applications. However, CNNs are limited by their inability to capture long-distance spatial relationships. The recently proposed vision transformer (ViT) learns long-distance spatial relationships in images through self-attention, but it requires large datasets for training. Hence, ViT-based architectures can be combined with CNNs to address this problem. Yet, their use on edge devices has barely been explored. In this work, we investigate compact convolutional transformers (CCTs) and their suitability for deployment on edge devices. Using strategic design decisions, we were able to deploy a CCT to Google Edge TPUs. Compared to a reference CNN (ResNet50) that was also deployed to the Edge TPU, we reduce the number of model parameters by a factor of 35 and obtain a 7× inference speed-up while achieving competitive accuracy.
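The self-attention mechanism referred to above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single head and identity query, key, and value projections, whereas ViT and CCT models use learned projections and several heads. The function names `softmax` and `self_attention` are chosen here for illustration only. The point is that each attention weight depends on token content, not on how far apart two tokens are in the image.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption for illustration: queries, keys, and values are the
    tokens themselves (identity projections). Returns the attended
    tokens and the attention weight matrix.
    """
    d = len(tokens[0])
    outputs, weights = [], []
    for q in tokens:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of all value vectors.
        outputs.append([sum(w_j * v[i] for w_j, v in zip(w, tokens)) for i in range(d)])
    return outputs, weights


# Four image tokens; the first and last share the same features.
tokens = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(tokens)
print([round(w, 3) for w in weights[0]])
```

In this toy input the first token attends to the distant last token exactly as strongly as to itself, and more strongly than to its direct neighbour, which is the kind of long-distance relationship a convolution with a small kernel cannot express in a single layer.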
Copyright information
© 2023 The Author(s), under exclusive license to Springer Fachmedien Wiesbaden GmbH, part of Springer Nature
About this paper
Cite this paper
Sun, Y., Kist, A.M. (2023). Compact Convolutional Transformers on Edge TPUs. In: Deserno, T.M., Handels, H., Maier, A., Maier-Hein, K., Palm, C., Tolxdorff, T. (eds) Bildverarbeitung für die Medizin 2023. BVM 2023. Informatik aktuell. Springer Vieweg, Wiesbaden. https://doi.org/10.1007/978-3-658-41657-7_32
DOI: https://doi.org/10.1007/978-3-658-41657-7_32
Publisher Name: Springer Vieweg, Wiesbaden
Print ISBN: 978-3-658-41656-0
Online ISBN: 978-3-658-41657-7
eBook Packages: Computer Science and Engineering (German Language)