Abstract
In recent years, and especially since the development of the smartphone, enormous amounts of data relevant to transportation have become available. These data have the potential to redefine how transportation system work (i.e., design, planning and operations) is done. While researchers in both academia and industry are making advances in using these data for transportation system ends (e.g., inferring information from collected data), little attention has been paid to four larger-scale challenges that will need to be overcome if the potential of Big Transportation Data is to be harnessed for transportation decision-making. This paper aims to raise awareness of these large-scale challenges and offers insight into how we believe they are likely to be met.
Funding
Funding was provided by Social Sciences and Humanities Research Council.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Badu-Marfo, G., Farooq, B. & Patterson, Z. A Perspective on the Challenges and Opportunities for Privacy-Aware Big Transportation Data. J. Big Data Anal. Transp. 1, 1–23 (2019). https://doi.org/10.1007/s42421-019-00001-z
DOI: https://doi.org/10.1007/s42421-019-00001-z