Abstract
Event cameras are bio-inspired sensors. Compared with frame-based cameras, they offer a high dynamic range (120 dB vs. 60 dB), low latency, and no motion blur. These properties make event cameras well suited to challenging scenarios such as vision systems in self-driving cars, and they have been used for high-level computer vision tasks such as semantic segmentation and depth estimation. In this work, we address semantic segmentation with an event camera for self-driving cars. (i) We introduce a new event-based semantic segmentation network and evaluate it on the DDD17 dataset and the Event-Scape dataset, which was produced using the CARLA simulator. (ii) Event-based networks are robust to lighting conditions, but their accuracy is low compared with common frame-based networks. To boost accuracy, we propose a novel event-frame-based semantic segmentation network that uses both images and events. We also introduce a novel training method (blurring module). Results show that this training method improves the network's recognition of small and distant objects, and that the network still works when the images suffer from blurring.
GitHub Page: https://github.com/mehdighasemzadeh/Event-Frame-Based-Semantic-Segmentation.git and https://github.com/mehdighasemzadeh/Event-Based-Semantic-Segmentation.git.
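The abstract names a blurring module but does not say how it works. One plausible reading, offered only as an illustration, is a training-time augmentation that randomly blurs the image input while leaving the event input untouched, so that the network learns to rely on events when the frame is degraded. The sketch below follows that assumption. The function names (`box_blur`, `maybe_blur`, `make_training_input`), the box filter, the probability `p`, and the kernel sizes are all hypothetical and are not taken from the paper or its repositories.

```python
import numpy as np


def box_blur(image, k):
    """Blur an H x W x C image with a k x k box filter (k odd), using edge padding."""
    pad = k // 2
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros_like(image, dtype=np.float64)
    # Sum the k*k shifted copies of the padded image, then average.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / (k * k)


def maybe_blur(image, rng, p=0.5, kernel_sizes=(3, 5, 7)):
    """With probability p return a blurred copy; otherwise return the image unchanged.

    p and kernel_sizes are illustrative values, not settings from the paper.
    """
    if rng.random() < p:
        k = int(rng.choice(kernel_sizes))
        return box_blur(image, k)
    return image


def make_training_input(image, event_frame, rng, p=0.5):
    """Concatenate the (possibly blurred) image with the untouched event frame.

    Channel-wise concatenation is an assumption made for this sketch; the
    actual network may fuse the two modalities differently.
    """
    return np.concatenate([maybe_blur(image, rng, p), event_frame], axis=-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((64, 64, 3))        # stand-in for an RGB frame
    event_frame = np.zeros((64, 64, 2))    # stand-in for a 2-channel event representation
    x = make_training_input(image, event_frame, rng, p=1.0)
    print(x.shape)
```

Under this reading, blurring is applied only during training; at test time the image is passed through as captured.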
Notes
1. More Tests: https://youtu.be/AL911t6QpBA.
2. More Tests: https://youtu.be/o8nz3FxwzZg.
3. More Tests: https://youtu.be/Q1pNcZDNzos.
4. More Tests: https://youtu.be/K6tkeT32Yi8.
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ghasemzadeh, M., Shouraki, S.B. (2023). Semantic Segmentation Using Events and Combination of Events and Frames. In: Ghatee, M., Hashemi, S.M. (eds) Artificial Intelligence and Smart Vehicles. ICAISV 2023. Communications in Computer and Information Science, vol 1883. Springer, Cham. https://doi.org/10.1007/978-3-031-43763-2_10
DOI: https://doi.org/10.1007/978-3-031-43763-2_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43762-5
Online ISBN: 978-3-031-43763-2
eBook Packages: Computer Science (R0)