What is Data Science?

Guide to Teaching Data Science

Abstract

Although many attempts have been made to define data science, no consensus definition has yet been reached. One reason a single, agreed-upon definition is hard to reach is the multifaceted nature of data science: it can be described as a science, a research method, a discipline, a workflow, or a profession. No single definition can capture this diversity. In this chapter, we first take an interdisciplinary perspective and review the background to the development of data science (Sect. 2.1). We then present data science from several perspectives: data science as a science (Sect. 2.2), as a research method (Sect. 2.3), as a discipline (Sect. 2.4), as a workflow (Sect. 2.5), and as a profession (Sect. 2.6). We conclude by highlighting three main characteristics of data science: interdisciplinarity, learner diversity, and its research-oriented nature (Sect. 2.7).

Notes

  1. The Earth image was originally posted to Flickr by DonkeyHotey at https://flickr.com/photos/47422005@N04/5679642883. It was reviewed on 4 December 2020 by FlickreviewR 2 and was confirmed to be licensed under the terms of cc-by-2.0.

Author information

Correspondence to Orit Hazzan.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Hazzan, O., Mike, K. (2023). What is Data Science? In: Guide to Teaching Data Science. Springer, Cham. https://doi.org/10.1007/978-3-031-24758-3_2

  • DOI: https://doi.org/10.1007/978-3-031-24758-3_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-24757-6

  • Online ISBN: 978-3-031-24758-3

  • eBook Packages: Computer Science, Computer Science (R0)
