A Thorough Review of Big Data Sources and Sets Used in Transportation Research

  • Conference paper
Reliability and Statistics in Transportation and Communication (RelStat 2017)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 36)

Abstract

The development of Information and Communications Technology (ICT) and the Internet provide Intelligent Transport Systems (ITS) with a huge amount of real-time data. These data, the so-called “Big Data”, can be collected, interpreted, managed and analyzed to improve knowledge of the transport system. The use of these technologies has greatly enhanced the efficiency and user friendliness of ITS, with significant economic and social impacts and a positive contribution to the management of sustainable mobility.

This paper presents the different sources of big data that have been used in ITS and discusses their advantages and limitations. Specifically, big data sources used within the last 10 years are identified. Current applications are then reviewed to identify the most used and most suitable data source for each case.

The aim of the present study is to improve knowledge about the use of big data in transport planning and to better support ITS by providing decision makers with a roadmap of big data collection methods.

Acknowledgements

This work has been supported by the ALLIANCE project (http://alliance-project.eu/) and has been funded within the European Commission’s H2020 Programme under contract number 692426. This paper expresses the opinions of the authors and not necessarily those of the European Commission. The European Commission is not liable for any use that may be made of the information contained in this paper.

Author information

Corresponding author

Correspondence to Maria Karatsoli.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Karatsoli, M., Nathanail, E. (2018). A Thorough Review of Big Data Sources and Sets Used in Transportation Research. In: Kabashkin, I., Yatskiv, I., Prentkovskis, O. (eds) Reliability and Statistics in Transportation and Communication. RelStat 2017. Lecture Notes in Networks and Systems, vol 36. Springer, Cham. https://doi.org/10.1007/978-3-319-74454-4_52

  • DOI: https://doi.org/10.1007/978-3-319-74454-4_52

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-74453-7

  • Online ISBN: 978-3-319-74454-4

  • eBook Packages: Engineering, Engineering (R0)
