Abstract
Data science can also be examined as a workflow. In this chapter, we elaborate on the data science workflow from an educational perspective. First, we present several approaches to the data science workflow (Sect. 10.1). We then elaborate on the pedagogical aspects of the different phases of the workflow: data collection (Sect. 10.2), data preparation (Sect. 10.3), exploratory data analysis (Sect. 10.4), modeling (Sect. 10.5), and communication and action (Sect. 10.6). We conclude with an interdisciplinary perspective on the data science workflow (Sect. 10.7).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Hazzan, O., Mike, K. (2023). The Data Science Workflow. In: Guide to Teaching Data Science. Springer, Cham. https://doi.org/10.1007/978-3-031-24758-3_10
DOI: https://doi.org/10.1007/978-3-031-24758-3_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24757-6
Online ISBN: 978-3-031-24758-3
eBook Packages: Computer Science, Computer Science (R0)