
High-resolution radar road segmentation using weakly supervised learning


An Author Correction to this article was published on 10 February 2021

This article has been updated

Abstract

Autonomous driving has recently gained considerable attention due to its disruptive potential and impact on the global economy; however, these high expectations are hindered by strict safety requirements for redundant sensing modalities that are each able to independently perform complex tasks to ensure reliable operation. At the core of an autonomous driving algorithmic stack is road segmentation, which is the basis for numerous planning and decision-making algorithms. Radar-based methods fail in many driving scenarios, mainly because common road delimiters barely reflect radar signals, analytical models for road delimiters are lacking and radar angular resolution is inherently limited. Our approach takes radar data in the form of a two-dimensional complex range-Doppler array as input to a deep neural network (DNN) that is trained to semantically segment the drivable area using weak supervision from a camera. Furthermore, guided backpropagation was used to analyse the radar data and to design a novel perception filter. Our approach enables road segmentation in common driving scenarios based solely on radar data, and we propose this method as an enabler of redundant sensing modalities for autonomous driving.
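The perception filter mentioned above is derived with guided backpropagation. A minimal sketch of that analysis is given below, assuming a PyTorch segmentation model that takes the range-Doppler array (stacked real/imaginary channels) and outputs drivable-area logits; the function and variable names are illustrative, not taken from the released code.

```python
import torch
import torch.nn as nn

def guided_backprop_saliency(model: nn.Module, rd_map: torch.Tensor) -> torch.Tensor:
    """Input-space saliency of the predicted drivable area (sketch only)."""
    handles = []
    for m in model.modules():
        if isinstance(m, nn.ReLU):  # assumes non-inplace ReLUs
            # Guided backpropagation: let only positive gradients flow back through ReLUs.
            handles.append(m.register_full_backward_hook(
                lambda mod, grad_in, grad_out: (grad_in[0].clamp(min=0.0),)))
    x = rd_map.detach().clone().requires_grad_(True)
    model.eval()
    logits = model(x)           # (N, 1, H, W) drivable-area logits
    logits.sum().backward()     # gradient of the mask score w.r.t. the radar input
    for h in handles:
        h.remove()
    return x.grad.abs()         # high values mark range-Doppler cells the DNN relies on
```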


Fig. 1: Sample frames from the collected dataset.
Fig. 2: Conventional CFAR filter limitation.
Fig. 3: Sample results from the validation dataset.
Fig. 4: Evaluation metrics used for assessing performance.
Fig. 5: Sensing modality failure cases.
Fig. 6: Radar perception filter and comparison to conventional CFAR-based filtering.


Data availability

The data generated to support the findings of this study are available from the corresponding author upon reasonable request and for non-commercial purposes only.

Code availability

The code that supports the findings of this study is available at https://doi.org/10.5281/zenodo.4318829

Change history


Acknowledgements

We thank H. Damari for assembling the dataset and H. Omer, Z. Iluz, Y. Avargel, L. Korkidi, M. Raifel, K. Twizer, P. Fidelman and N. Orr for their insights and advice.

Author information


Contributions

I.O. conceived the study and conducted training. All authors contributed to the design of the study, interpreting the results and writing the manuscript.

Corresponding author

Correspondence to Itai Orr.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Machine Intelligence thanks Zdenka Babić and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Additional sample results from the validation dataset.

Blurry images were caused by rain droplets on the camera lens. The left column shows range-Doppler maps in dB. The middle column shows the proposed radar-based DNN prediction overlaid on the camera image; values are confidence levels on a scale of (0,1). The right column shows the corresponding camera pseudo label generated by a camera-based DNN; values are confidence levels on a scale of (0,1).

Extended Data Fig. 2 Camera label projection to the radar coordinate frame.

A sample urban scene from the validation dataset showing the camera pseudo label projected onto Cartesian coordinates. (a) Camera image overlaid with the camera pseudo label; values are confidence levels on a scale of (0,1). (b) The associated radar data in Cartesian representation, with values in dB. (c) Radar data in Cartesian coordinates (values in dB) with the projected camera pseudo label overlaid in black. This sample frame further illustrates the lack of distinguishable features associated with common road delimiters such as sidewalks and curbstones. Note that the projected label's minimum range is 4.5 m owing to the camera's ground clearance.
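A label projection of this kind could be sketched as below, assuming a flat ground plane, known camera intrinsics K and camera-to-radar extrinsics (R, t), and a simple bird's-eye-view grid; the names, grid convention and 0.5 label threshold are illustrative assumptions, not details from the paper.

```python
import numpy as np

def project_label_to_bev(label_mask, K, R, t,
                         grid_res=0.1, grid_size=(200, 200), min_range=4.5):
    """Project a camera pseudo label (H, W) onto a bird's-eye-view grid in the
    radar frame, keeping only ground points beyond the camera's minimum range."""
    vs, us = np.nonzero(label_mask > 0.5)                 # labelled drivable pixels
    rays = R @ (np.linalg.inv(K) @ np.stack([us, vs, np.ones_like(us)], axis=0))
    scale = -t[2] / rays[2]                               # intersect rays with the ground plane z = 0
    pts = rays * scale + t[:, None]                       # (3, N) ground points in the radar frame
    keep = (scale > 0) & (np.hypot(pts[0], pts[1]) >= min_range)
    rows = (pts[0, keep] / grid_res).astype(int)          # forward-distance bins
    cols = (pts[1, keep] / grid_res).astype(int) + grid_size[1] // 2  # lateral bins, centred
    bev = np.zeros(grid_size, dtype=np.float32)
    ok = (rows >= 0) & (rows < grid_size[0]) & (cols >= 0) & (cols < grid_size[1])
    bev[rows[ok], cols[ok]] = 1.0
    return bev
```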

Extended Data Fig. 3 Filter correlation heatmap.

Comparison between conventional CFAR and the proposed perception filter. The results were averaged over the validation dataset. The y axis represents the conventional CFAR threshold in dB and the x axis represents the perception filter threshold on a scale of (0,1). High correlation between the two filters would have produced a diagonal heatmap with IoU values close to 1. Instead, the results show low correlation, with low IoU values between the two methods, which further suggests that the perception filter eliminates data based on context as well as SNR.
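The heatmap could be reproduced in outline as follows; this sketch substitutes a crude global noise-floor estimate for a true cell-averaging CFAR, and the array and threshold names are illustrative.

```python
import numpy as np

def iou(a, b):
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0

def filter_correlation(rd_db, perception_scores, cfar_offsets_db, filter_thresholds):
    """IoU matrix: rows index CFAR thresholds (dB above an estimated noise floor),
    columns index perception-filter thresholds in (0,1)."""
    noise_floor = np.median(rd_db)            # crude stand-in for a CA-CFAR noise estimate
    heat = np.zeros((len(cfar_offsets_db), len(filter_thresholds)))
    for i, off in enumerate(cfar_offsets_db):
        cfar_mask = rd_db > noise_floor + off
        for j, th in enumerate(filter_thresholds):
            heat[i, j] = iou(cfar_mask, perception_scores > th)
    return heat
```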

Extended Data Fig. 4 Training methodology.

A camera-based DNN is trained on a publicly available dataset and used to create pseudo labels for a radar-based DNN. The radar and camera data are temporally synchronized and spatially overlapping. Radar pre-processing includes windowing and a 2D FFT over the sweep and sample dimensions to create a complex 2D array of range-Doppler maps. The radar model is trained with a segmentation loss to identify the drivable area.
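A minimal sketch of the described pre-processing is given below, assuming a raw FMCW data cube laid out as (receive channel, sweep, sample) and a Hann window; both the layout and the window choice are assumptions rather than details from the paper.

```python
import numpy as np

def range_doppler_map(raw_cube):
    """raw_cube: complex array of shape (n_rx, n_sweeps, n_samples)."""
    n_rx, n_sweeps, n_samples = raw_cube.shape
    window = np.outer(np.hanning(n_sweeps), np.hanning(n_samples))  # suppress FFT side lobes
    cube = raw_cube * window
    rd = np.fft.fft(cube, axis=2)                          # range FFT over samples (fast time)
    rd = np.fft.fftshift(np.fft.fft(rd, axis=1), axes=1)   # Doppler FFT over sweeps (slow time)
    return rd   # complex range-Doppler map per receive channel, fed to the radar DNN
```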

Extended Data Fig. 5 Model architecture.

Based on an encoder-decoder U-Net architecture with a channel attention mechanism to encourage learnable cross-channel correlations.
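One plausible form of such a block, sketched below in PyTorch, is a squeeze-and-excitation-style channel attention layer; this may differ from the authors' exact implementation.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Reweight feature-map channels using a learned gating of global descriptors."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = x.mean(dim=(2, 3))                          # squeeze: global average pool per channel
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1)      # excite: per-channel weights, shape (N, C, 1, 1)
        return x * w                                    # rescale the feature map channel-wise
```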


About this article


Cite this article

Orr, I., Cohen, M. & Zalevsky, Z. High-resolution radar road segmentation using weakly supervised learning. Nat Mach Intell 3, 239–246 (2021). https://doi.org/10.1038/s42256-020-00288-6

