Multi-task Learning with Future States for Vision-Based Autonomous Driving

  • Conference paper
Computer Vision – ACCV 2020 (ACCV 2020)

Abstract

Human drivers consider both past and future driving environments to maintain stable control of a vehicle. To mimic this behavior, we propose a vision-based autonomous driving model, called the Future Actions and States Network (FASNet), which uses predicted future actions and generated future states in a multi-task learning manner. Future states are generated using an enhanced deep predictive-coding network and motion equations defined by the kinematic vehicle model. The final control values are determined by a weighted average of the predicted actions, yielding stable decisions. With these methods, the proposed FASNet has high generalization ability in unseen environments. To validate FASNet, we conducted several experiments, including ablation studies, in realistic three-dimensional simulations. FASNet achieves a higher Success Rate (SR) than state-of-the-art models on the recent CARLA benchmarks under several conditions.
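The two mechanisms named in the abstract can be illustrated in a few lines. The sketch below is not the authors' code: it assumes the standard kinematic bicycle model (as formulated by Kong et al.) for propagating future vehicle states, and a simple normalized weighted average for fusing per-step action predictions into a final control. All function names, wheelbase parameters, and weights are hypothetical.

```python
import math

def kinematic_bicycle_step(x, y, yaw, v, accel, steer,
                           lf=1.2, lr=1.4, dt=0.1):
    """Propagate one future vehicle state with the kinematic bicycle model.

    x, y: position [m]; yaw: heading [rad]; v: speed [m/s];
    accel: longitudinal acceleration [m/s^2]; steer: front-wheel angle [rad];
    lf, lr: distances from the center of gravity to the front/rear axles [m].
    """
    beta = math.atan(lr / (lf + lr) * math.tan(steer))  # slip angle at CoG
    x_next = x + v * math.cos(yaw + beta) * dt
    y_next = y + v * math.sin(yaw + beta) * dt
    yaw_next = yaw + (v / lr) * math.sin(beta) * dt
    v_next = v + accel * dt
    return x_next, y_next, yaw_next, v_next

def fuse_actions(actions, weights):
    """Combine predicted actions into one control by weighted average.

    actions: list of (steer, throttle) predictions, one per future step;
    weights: one scalar weight per prediction.
    """
    total = sum(weights)
    return tuple(sum(w * a[i] for w, a in zip(weights, actions)) / total
                 for i in range(len(actions[0])))

# Driving straight at 10 m/s for one 0.1 s step advances x by about 1 m.
print(kinematic_bicycle_step(0.0, 0.0, 0.0, 10.0, accel=0.0, steer=0.0))
# Averaging two equally weighted action predictions.
print(fuse_actions([(0.1, 0.5), (0.3, 0.7)], [1.0, 1.0]))
```

A chained application of `kinematic_bicycle_step` would roll the current state forward through the predicted action sequence, which is the role the motion equations play in generating future states in the paper.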

References

  1. Geiger, A., Lenz, P., Stiller, C., Urtasun, R.: Vision meets robotics: the KITTI dataset. Int. J. Robot. Res. 32, 1231–1237 (2013)

  2. Cordts, M., et al.: The Cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3213–3223 (2016)

  3. Bojarski, M., et al.: End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316 (2016)

  4. Hecker, S., Dai, D., Van Gool, L.: End-to-end learning of driving models with surround-view cameras and route planners. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018, Part VII. LNCS, vol. 11211, pp. 449–468. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01234-2_27

  5. Huang, Z., Zhang, J., Tian, R., Zhang, Y.: End-to-end autonomous driving decision based on deep reinforcement learning. In: 2019 5th International Conference on Control, Automation and Robotics (ICCAR), pp. 658–662. IEEE (2019)

  6. Codevilla, F., Müller, M., López, A., Koltun, V., Dosovitskiy, A.: End-to-end driving via conditional imitation learning. In: 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 1–9. IEEE (2018)

  7. Codevilla, F., Santana, E., López, A.M., Gaidon, A.: Exploring the limitations of behavior cloning for autonomous driving. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 9329–9338 (2019)

  8. Liang, X., Wang, T., Yang, L., Xing, E.: CIRL: controllable imitative reinforcement learning for vision-based self-driving. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018, Part VII. LNCS, vol. 11211, pp. 604–620. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01234-2_36

  9. Sauer, A., Savinov, N., Geiger, A.: Conditional affordance learning for driving in urban environments. arXiv preprint arXiv:1806.06498 (2018)

  10. Wang, Q., Chen, L., Tian, B., Tian, W., Li, L., Cao, D.: End-to-end autonomous driving: an angle branched network approach. IEEE Trans. Veh. Technol. (2019)

  11. Chen, D., Zhou, B., Koltun, V., Krähenbühl, P.: Learning by cheating. arXiv preprint arXiv:1912.12294 (2019)

  12. Li, Z., Motoyoshi, T., Sasaki, K., Ogata, T., Sugano, S.: Rethinking self-driving: multi-task knowledge for better generalization and accident explanation ability. arXiv preprint arXiv:1809.11100 (2018)

  13. Chowdhuri, S., Pankaj, T., Zipser, K.: MultiNet: multi-modal multi-task learning for autonomous driving. In: 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1496–1504. IEEE (2019)

  14. Lotter, W., Kreiman, G., Cox, D.: Deep predictive coding networks for video prediction and unsupervised learning. arXiv preprint arXiv:1605.08104 (2016)

  15. Kong, J., Pfeiffer, M., Schildbach, G., Borrelli, F.: Kinematic and dynamic vehicle models for autonomous driving control design. In: 2015 IEEE Intelligent Vehicles Symposium (IV), pp. 1094–1099. IEEE (2015)

  16. Zhang, Y., Yang, Q.: A survey on multi-task learning. arXiv preprint arXiv:1707.08114 (2017)

  17. Xu, H., Gao, Y., Yu, F., Darrell, T.: End-to-end learning of driving models from large-scale video datasets. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2174–2182 (2017)

  18. Chi, L., Mu, Y.: Deep steering: learning end-to-end driving model from spatial and temporal visual cues. arXiv preprint arXiv:1708.03798 (2017)

  19. Ohn-Bar, E., Prakash, A., Behl, A., Chitta, K., Geiger, A.: Learning situational driving. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11296–11305 (2020)

  20. Yu, A., Palefsky-Smith, R., Bedi, R.: Deep reinforcement learning for simulated autonomous vehicle control, pp. 1–7. Course Project Reports, Winter (2016)

  21. Toromanoff, M., Wirbel, E., Moutarde, F.: End-to-end model-free reinforcement learning for urban driving using implicit affordances. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7153–7162 (2020)

  22. Tai, L., Yun, P., Chen, Y., Liu, C., Ye, H., Liu, M.: Visual-based autonomous driving deployment from a stochastic and uncertainty-aware perspective. arXiv preprint arXiv:1903.00821 (2019)

  23. Yang, Z., Zhang, Y., Yu, J., Cai, J., Luo, J.: End-to-end multi-modal multi-task vehicle control for self-driving cars with visual perceptions. In: 2018 24th International Conference on Pattern Recognition (ICPR), pp. 2289–2294. IEEE (2018)

  24. Mathieu, M., Couprie, C., LeCun, Y.: Deep multi-scale video prediction beyond mean square error. arXiv preprint arXiv:1511.05440 (2015)

  25. Srivastava, N., Mansimov, E., Salakhutdinov, R.: Unsupervised learning of video representations using LSTMs. In: International Conference on Machine Learning, pp. 843–852 (2015)

  26. Liang, X., Lee, L., Dai, W., Xing, E.P.: Dual motion GAN for future-flow embedded video prediction. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1744–1752 (2017)

  27. Wei, H., Yin, X., Lin, P.: Novel video prediction for large-scale scene using optical flow. arXiv preprint arXiv:1805.12243 (2018)

  28. Ranjan, R., Patel, V.M., Chellappa, R.: HyperFace: a deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition. IEEE Trans. Pattern Anal. Mach. Intell. 41, 121–135 (2017)

  29. Du, L., Zhao, Z., Su, F., Wang, L., An, C.: Jointly predicting future sequence and steering angles for dynamic driving scenes. In: ICASSP 2019–2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4070–4074. IEEE (2019)

  30. Jin, X., et al.: Predicting scene parsing and motion dynamics in the future. In: Advances in Neural Information Processing Systems, pp. 6915–6924 (2017)

  31. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

  32. Song, G., Chai, W.: Collaborative learning for deep neural networks. In: Advances in Neural Information Processing Systems, pp. 1832–1841 (2018)

  33. Dosovitskiy, A., Ros, G., Codevilla, F., Lopez, A., Koltun, V.: CARLA: an open urban driving simulator. arXiv preprint arXiv:1711.03938 (2017)

  34. felipecode: CARLA 0.8.4 data collector (2018). https://github.com/carla-simulator/data-collector

Acknowledgments

This work was supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2014-0-00059, Development of Predictive Visual Intelligence Technology), (No. 2017-0-00897, Development of Object Detection and Recognition for Intelligent Vehicles) and (No. 2018-0-01290, Development of an Open Dataset and Cognitive Processing Technology for the Recognition of Features Derived From Unstructured Human Motions Used in Self-driving Cars).

Author information

Corresponding author

Correspondence to Inhan Kim.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 352 KB)


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Kim, I., Lee, H., Lee, J., Lee, E., Kim, D. (2021). Multi-task Learning with Future States for Vision-Based Autonomous Driving. In: Ishikawa, H., Liu, C.L., Pajdla, T., Shi, J. (eds.) Computer Vision – ACCV 2020. ACCV 2020. Lecture Notes in Computer Science, vol. 12624. Springer, Cham. https://doi.org/10.1007/978-3-030-69535-4_40

  • DOI: https://doi.org/10.1007/978-3-030-69535-4_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-69534-7

  • Online ISBN: 978-3-030-69535-4

  • eBook Packages: Computer Science; Computer Science (R0)
