Towards Actionable Data Science: Domain Experts as End-Users of Data Science Systems

  • Research Article
Computer Supported Cooperative Work (CSCW)

Abstract

As a wider range of organizations explore using data science systems, data science research has given growing attention to the role of domain experts. Most of this research still views data science systems as centered on the development of statistical models or algorithms by technical data scientists, with domain experts limited to the role of informers. Our paper turns attention to how domain experts mediate whether data science models or algorithms lead to action through their situated data practices. Drawing on ethnographic fieldwork and a pilot machine learning project at a craft brewery, we identify situations where the brewers’ data practices led to unreliable, incomplete data, and unpack how such data limited the effectiveness of data science activities. Extending research in CSCW and STS on domain experts’ data practices to the data science context, we aim to inform the design of data science systems that are more actionable for their end-users.

Data Availability

Although this paper draws on data used for machine learning, it is an ethnographic analysis of that use of data and presents exclusively qualitative findings. We therefore do not consider a data availability statement applicable.

Author information

Corresponding author

Correspondence to Ju Yeon Jung.

Ethics declarations

Conflicts of interest/competing interests

This manuscript has not been published or presented elsewhere, in whole or in part, and is not under consideration by another journal. We have read and understood the journal’s policies, and we believe that neither the manuscript nor the study violates any of them. There are no conflicts of interest or competing interests to declare.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jung, J.Y., Steinberger, T. & So, C. Towards Actionable Data Science: Domain Experts as End-Users of Data Science Systems. Comput Supported Coop Work (2023). https://doi.org/10.1007/s10606-023-09475-6

  • DOI: https://doi.org/10.1007/s10606-023-09475-6
