
High-resolution radar road segmentation using weakly supervised learning


An Author Correction to this article was published on 10 February 2021

This article has been updated

Abstract

Autonomous driving has recently gained considerable attention due to its disruptive potential and impact on the global economy; however, these high expectations are hindered by strict safety requirements for redundant sensing modalities that are each able to independently perform complex tasks to ensure reliable operation. At the core of an autonomous driving algorithmic stack is road segmentation, which is the basis for numerous planning and decision-making algorithms. Radar-based methods fail in many driving scenarios, mainly because common road delimiters barely reflect radar signals, analytical models for road delimiters are lacking and radar angular resolution is inherently limited. Our approach takes radar data in the form of a two-dimensional complex range-Doppler array as input to a deep neural network (DNN) that is trained to semantically segment the drivable area using weak supervision from a camera. Furthermore, guided backpropagation was used to analyse the radar data and to design a novel perception filter. Our approach enables road segmentation in common driving scenarios based solely on radar data, and we propose this method as an enabler of redundant sensing modalities for autonomous driving.
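The perception filter mentioned above is derived with guided backpropagation. A minimal sketch of that analysis is given below, assuming a PyTorch segmentation model that takes the range-Doppler array (stacked real/imaginary channels) and outputs drivable-area logits; the function and variable names are illustrative, not taken from the released code.

```python
import torch
import torch.nn as nn

def guided_backprop_saliency(model: nn.Module, rd_map: torch.Tensor) -> torch.Tensor:
    """Input-space saliency of the predicted drivable area (sketch only)."""
    handles = []
    for m in model.modules():
        if isinstance(m, nn.ReLU):  # assumes non-inplace ReLUs
            # Guided backpropagation: let only positive gradients flow back through ReLUs.
            handles.append(m.register_full_backward_hook(
                lambda mod, grad_in, grad_out: (grad_in[0].clamp(min=0.0),)))
    x = rd_map.detach().clone().requires_grad_(True)
    model.eval()
    logits = model(x)           # (N, 1, H, W) drivable-area logits
    logits.sum().backward()     # gradient of the mask score w.r.t. the radar input
    for h in handles:
        h.remove()
    return x.grad.abs()         # high values mark range-Doppler cells the DNN relies on
```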


Fig. 1: Sample frames from the collected dataset.
Fig. 2: Conventional CFAR filter limitation.
Fig. 3: Sample results from the validation dataset.
Fig. 4: Evaluation metrics used for assessing performance.
Fig. 5: Sensing modality failure cases.
Fig. 6: Radar perception filter and comparison to conventional CFAR-based filtering.


Data availability

The data generated to support the findings of this study are available from the corresponding author upon reasonable request and for non-commercial purposes only.

Code availability

The code that supports the findings of this study is available at https://doi.org/10.5281/zenodo.4318829

Change history


Acknowledgements

We thank H. Damari for assembling the dataset and H. Omer, Z. Iluz, Y. Avargel, L. Korkidi, M. Raifel, K. Twizer, P. Fidelman and N. Orr for their insights and advice.

Author information


Contributions

I.O. conceived the study and conducted training. All authors contributed to the design of the study, interpreting the results and writing the manuscript.

Corresponding author

Correspondence to Itai Orr.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Machine Intelligence thanks Zdenka Babić and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Additional sample results from the validation dataset.

Blurry images were caused by rain droplets on the camera lens. The left column shows range-Doppler maps in dB. The middle column shows the proposed radar-based DNN prediction overlaid on the camera image; values are confidence levels on a scale of (0,1). The right column shows the corresponding camera pseudo label generated by a camera-based DNN; values are confidence levels on a scale of (0,1).

Extended Data Fig. 2 Camera label projection to the radar coordinate frame.

A sample urban scene from the validation dataset showing the camera pseudo label projected onto Cartesian coordinates. (a) Camera image overlaid with the camera pseudo label; values are confidence levels on a scale of (0,1). (b) The associated radar data in Cartesian representation, with values in dB. (c) Radar data in Cartesian coordinates (values in dB) with the projected camera pseudo label overlaid in black. This sample frame further illustrates the lack of distinguishable features associated with common road delimiters such as sidewalks and curbstones. Note that the projected label's minimum range is 4.5 m owing to the camera's ground clearance.
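A label projection of this kind could be sketched as below, assuming a flat ground plane, known camera intrinsics K and camera-to-radar extrinsics (R, t), and a simple bird's-eye-view grid; the names, grid convention and 0.5 label threshold are illustrative assumptions, not details from the paper.

```python
import numpy as np

def project_label_to_bev(label_mask, K, R, t,
                         grid_res=0.1, grid_size=(200, 200), min_range=4.5):
    """Project a camera pseudo label (H, W) onto a bird's-eye-view grid in the
    radar frame, keeping only ground points beyond the camera's minimum range."""
    vs, us = np.nonzero(label_mask > 0.5)                 # labelled drivable pixels
    rays = R @ (np.linalg.inv(K) @ np.stack([us, vs, np.ones_like(us)], axis=0))
    scale = -t[2] / rays[2]                               # intersect rays with the ground plane z = 0
    pts = rays * scale + t[:, None]                       # (3, N) ground points in the radar frame
    keep = (scale > 0) & (np.hypot(pts[0], pts[1]) >= min_range)
    rows = (pts[0, keep] / grid_res).astype(int)          # forward-distance bins
    cols = (pts[1, keep] / grid_res).astype(int) + grid_size[1] // 2  # lateral bins, centred
    bev = np.zeros(grid_size, dtype=np.float32)
    ok = (rows >= 0) & (rows < grid_size[0]) & (cols >= 0) & (cols < grid_size[1])
    bev[rows[ok], cols[ok]] = 1.0
    return bev
```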

Extended Data Fig. 3 Filter correlation heatmap.

Comparison between conventional CFAR and the proposed perception filter. The results were averaged over the validation dataset. The y axis represents the conventional CFAR threshold in dB and the x axis represents the perception filter threshold on a scale of (0,1). High correlation between the two filters would have produced a diagonal heatmap with IoU values close to 1. Instead, the results show low correlation, with low IoU values between the two methods, which further suggests that the perception filter eliminates data based on context as well as SNR.
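The heatmap could be reproduced in outline as follows; this sketch substitutes a crude global noise-floor estimate for a true cell-averaging CFAR, and the array and threshold names are illustrative.

```python
import numpy as np

def iou(a, b):
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0

def filter_correlation(rd_db, perception_scores, cfar_offsets_db, filter_thresholds):
    """IoU matrix: rows index CFAR thresholds (dB above an estimated noise floor),
    columns index perception-filter thresholds in (0,1)."""
    noise_floor = np.median(rd_db)            # crude stand-in for a CA-CFAR noise estimate
    heat = np.zeros((len(cfar_offsets_db), len(filter_thresholds)))
    for i, off in enumerate(cfar_offsets_db):
        cfar_mask = rd_db > noise_floor + off
        for j, th in enumerate(filter_thresholds):
            heat[i, j] = iou(cfar_mask, perception_scores > th)
    return heat
```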

Extended Data Fig. 4 Training methodology.

A camera-based DNN is trained on a publicly available dataset and used to create pseudo labels for a radar-based DNN. The radar and camera data are temporally synchronized and spatially overlapping. Radar pre-processing includes windowing and a 2D FFT over the sweep and sample dimensions to create a complex 2D array of range-Doppler maps. The radar model is trained with a segmentation loss to identify the drivable area.
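A minimal sketch of the described pre-processing is given below, assuming a raw FMCW data cube laid out as (receive channel, sweep, sample) and a Hann window; both the layout and the window choice are assumptions rather than details from the paper.

```python
import numpy as np

def range_doppler_map(raw_cube):
    """raw_cube: complex array of shape (n_rx, n_sweeps, n_samples)."""
    n_rx, n_sweeps, n_samples = raw_cube.shape
    window = np.outer(np.hanning(n_sweeps), np.hanning(n_samples))  # suppress FFT side lobes
    cube = raw_cube * window
    rd = np.fft.fft(cube, axis=2)                          # range FFT over samples (fast time)
    rd = np.fft.fftshift(np.fft.fft(rd, axis=1), axes=1)   # Doppler FFT over sweeps (slow time)
    return rd   # complex range-Doppler map per receive channel, fed to the radar DNN
```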

Extended Data Fig. 5 Model architecture.

Based on an encoder-decoder U-Net architecture with a channel attention mechanism to encourage learnable cross-channel correlations.
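One plausible form of such a block, sketched below in PyTorch, is a squeeze-and-excitation-style channel attention layer; this may differ from the authors' exact implementation.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Reweight feature-map channels using a learned gating of global descriptors."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = x.mean(dim=(2, 3))                          # squeeze: global average pool per channel
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1)      # excite: per-channel weights, shape (N, C, 1, 1)
        return x * w                                    # rescale the feature map channel-wise
```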


About this article


Cite this article

Orr, I., Cohen, M. & Zalevsky, Z. High-resolution radar road segmentation using weakly supervised learning. Nat Mach Intell 3, 239–246 (2021). https://doi.org/10.1038/s42256-020-00288-6

