A Perspective on the Challenges and Opportunities for Privacy-Aware Big Transportation Data

  • Original Paper
  • Journal of Big Data Analytics in Transportation

Abstract

In recent years, and especially since the development of the smartphone, enormous amounts of data relevant to transportation have become available. These data hold the potential to redefine how transportation systems are designed, planned and operated. While researchers in both academia and industry are making advances in using these data for transportation system ends (e.g., inferring information from collected data), little attention has been paid to four larger-scale challenges that must be overcome if the potential of Big Transportation Data is to be harnessed for transportation decision-making. This paper aims to raise awareness of these large-scale challenges and offers insight into how we believe they are likely to be met.

Funding

Funding was provided by the Social Sciences and Humanities Research Council.

Author information

Corresponding author

Correspondence to Godwin Badu-Marfo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Badu-Marfo, G., Farooq, B. & Patterson, Z. A Perspective on the Challenges and Opportunities for Privacy-Aware Big Transportation Data. J. Big Data Anal. Transp. 1, 1–23 (2019). https://doi.org/10.1007/s42421-019-00001-z

  • DOI: https://doi.org/10.1007/s42421-019-00001-z
