From Coding To Curing. Functions, Implementations, and Correctness in Deep Learning

  • Research Article
  • Published in: Philosophy & Technology

Abstract

This paper sheds light on the shift taking place from the practice of ‘coding’, namely developing programs as conventional in the software community, to the practice of ‘curing’, an activity that has emerged in the last few years in Deep Learning (DL) and that amounts to curing the data regime to which a DL model is exposed during training. Initially, the curing paradigm is illustrated by means of a case study on autonomous vehicles. Subsequently, the shift from coding to curing is analysed in light of the epistemological notions, central to the philosophy of computer science, of function, implementation, and correctness. First, it is shown how, in the curing paradigm, the functions performed by the trained model depend far more on dataset curation than on the model’s algorithms, which, in contrast with the coding paradigm, do not comply with requested specifications. Second, it is highlighted how DL models cannot be considered implementations according to any of the available definitions of implementation that follow an intentional theory of functions. Finally, it is argued that DL models cannot be evaluated in terms of their correctness, but rather in terms of their experimental computational validity.

Figs. 1–5 (figure images not included in this text-only version)

Notes

  1. The LoAs listed here should not be taken to suggest a cascade software development model (Sommerville, 2021), proceeding from intention to execution, but rather to define a stratified ontology for computational systems. The different development methods, such as cascade, spiral, or agile, may go through some or all of those levels. See Angius et al. (2021) on this.

  2. Andrej Karpathy, former director of AI at Tesla, recently revealed that much of the self-driving dataset for Tesla cars is being collected using cameras only (https://www.youtube.com/watch?v=g6bOwQdCJrc).

  3. The Cityscapes dataset contains video recordings from 50 cities with 25,000 labelled frames, covering street scenes across distinct seasons. See https://www.cityscapes-dataset.com/. The graph in Fig. 2 is taken from https://paperswithcode.com/sota/semantic-segmentation-on-cityscapes.

  4. Karpathy’s keynote talk at the 2021 IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) is available at https://karpathy.ai/; unfortunately, no published paper or proceedings contribution is associated with the talk.

  5. The algorithm proceeds by first ranking snippets given a set of complexity measures and a task, then selecting the most interesting snippets (those with high ranking scores), and finally ensuring diversity of the snippet set, quantifying diversity as the difference in their complexity measures.
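As a rough sketch of this kind of ranking-and-diversification procedure (this is not the authors’ implementation; `select_snippets`, the complexity scorer, and the data are all illustrative):

```python
from itertools import combinations

def select_snippets(snippets, complexity, k):
    """Toy sketch of complexity-ranked, diversity-aware snippet selection.

    `snippets` maps a snippet id to its vector of complexity measures;
    `complexity` scores a snippet from its measures; `k` is the number
    of snippets to keep.
    """
    # 1. Rank snippets by their complexity score.
    ranked = sorted(snippets, key=lambda s: complexity(snippets[s]), reverse=True)
    # 2. Shortlist the highest-ranking candidates.
    shortlist = ranked[: 2 * k]
    # 3. Among the shortlist, pick the subset maximising pairwise
    #    differences in complexity measures (diversity).
    def diversity(subset):
        return sum(
            sum(abs(a - b) for a, b in zip(snippets[x], snippets[y]))
            for x, y in combinations(subset, 2)
        )
    best = max(combinations(shortlist, k), key=diversity)
    return list(best)
```

For instance, with four snippets scored by the sum of two complexity measures, the selection favours high-scoring snippets whose measures differ most from one another rather than near-duplicates.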

  6. Recall that a finite transition system \(TS=(S, A, T, I, F, AP, L)\) is a set-theoretic structure defined by a finite set of states \(S=\{s_0, \ldots , s_n\}\), a finite set of labels for transitions A, a transition relation \(T \subseteq S \times A \times S\), each transition having form \(s_i \xrightarrow {\alpha } s_j\), a set of initial states \(I \subseteq S\), a set of final states \(F \subseteq S\), a finite set of labels for states AP, and a state labelling function \(L: S \rightarrow 2^{AP}\). The example in Fig. 5 is inspired by a similar one in Baier and Katoen (2008, 100-102).
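The set-theoretic definition above translates directly into a data structure. The sketch below (illustrative, not taken from the paper) encodes a simplified two-traffic-light system in this format and checks, by exhaustive exploration of the reachable states, a mutual-exclusion safety property of the kind discussed in the text:

```python
from dataclasses import dataclass

@dataclass
class TransitionSystem:
    """Finite transition system TS = (S, A, T, I, F, AP, L)."""
    S: set    # states
    A: set    # transition labels (actions)
    T: set    # transitions (s_i, alpha, s_j)
    I: set    # initial states
    F: set    # final states
    AP: set   # atomic propositions labelling states
    L: dict   # state labelling function S -> 2^AP

# Interleaved composition of two traffic lights, each cycling red -> green.
# A state (x, y) records the colour of light 1 and light 2.
states = {("r", "r"), ("g", "r"), ("r", "g")}
ts = TransitionSystem(
    S=states,
    A={"switch1", "switch2"},
    T={
        (("r", "r"), "switch1", ("g", "r")),
        (("g", "r"), "switch1", ("r", "r")),
        (("r", "r"), "switch2", ("r", "g")),
        (("r", "g"), "switch2", ("r", "r")),
    },
    I={("r", "r")},
    F=set(),
    AP={"green1", "green2"},
    L={
        ("r", "r"): set(),
        ("g", "r"): {"green1"},
        ("r", "g"): {"green2"},
    },
)

def reachable(ts):
    """All states reachable from the initial states via T."""
    seen, frontier = set(ts.I), list(ts.I)
    while frontier:
        s = frontier.pop()
        for (si, _, sj) in ts.T:
            if si == s and sj not in seen:
                seen.add(sj)
                frontier.append(sj)
    return seen

# Safety check: no reachable state labels both lights green.
assert all(not {"green1", "green2"} <= ts.L[s] for s in reachable(ts))
```

Since no state labelled with both `green1` and `green2` is even included, the safety property holds on every reachable state.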

  7. Clearly, the case in which both traffic lights enter the green state together, infinitely often, is ruled out by requiring safety property S.

  8. Distinct cases should be distinguished for the two different forms of malfunction: dysfunction, when the system does not display required behaviours, and misfunction, when the system displays behaviours not requested by its specifications (Floridi et al., 2015).

  9. It follows from this analysis that any LoA can also be considered a specification level for its lower LoAs, in that it corresponds to the functional level of an artefact prescribing functions for its implementations.

  10. That is, if the focus is on a high-level language program as an implementation of an algorithm, the program is understood as a structural level; if the focus then shifts to the computational machine as an implementation of the high-level language program, the latter has to be reinterpreted as a functional level.

  11. Note that this is not a problem for biological systems in evolution theory, new functions caused by the same structure being known as exaptations (Gould & Vrba, 1982).

  12. It would be interesting at this point to analyse such a taxonomy in the light of the conceptual distinctions made, for instance, by Fresco and Primiero (2013), Primiero (2014), and Floridi et al. (2015), and to compare it with the empirical classification of traditional software errors carried out by Horner and Symons (2019). This, however, would go far beyond the scope of this paper.

  13. Notice that, in case the validation process requires fixing the model many times by modifying the so-called hyperparameters (such as the number of layers or the size of the layers), the model may also overfit to the validation dataset. In this case, it is said that information leakage from the validation set to the trained model takes place. For more details see Chollet (2021, pp. 97-100).
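A toy simulation of this point (illustrative only: the data, the threshold ‘hyperparameter’, and the split sizes are all made up) shows why repeated tuning against the validation set calls for a further held-out test set:

```python
import random

random.seed(0)

# Toy data: inputs x in [0, 1], label 1 iff x > 0.5, with 10% label noise.
data = [(x, int(x > 0.5) ^ (random.random() < 0.1))
        for x in [random.random() for _ in range(3000)]]
train, val, test = data[:2000], data[2000:2500], data[2500:]

def accuracy(threshold, split):
    return sum(int(x > threshold) == y for x, y in split) / len(split)

# Hyperparameter search: every candidate threshold is scored on the
# validation set, so the chosen one is (slightly) fitted to it.
candidates = [i / 100 for i in range(100)]
best = max(candidates, key=lambda t: accuracy(t, val))

# The validation score of the winner is an optimistic estimate;
# only the untouched test set gives an honest one.
print("val accuracy :", accuracy(best, val))
print("test accuracy:", accuracy(best, test))
```

Because the winning threshold was selected to maximise validation accuracy, its validation score tends to overstate its accuracy on fresh data: that is the information leakage from the validation set.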

  14. It should be noted that, as with common formal verification, applying those methods to real cases often requires using abstractions and approximations to allow computational tractability. However, abstractions and approximations lead to the incompleteness of the involved verification method: the algorithm may either terminate with an ‘unknown’ answer, or terminate with a false negative, that is, a counterexample showing a violation of the verified functional property to which no actual model computation corresponds.
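A minimal illustration of how abstraction buys tractability at the price of completeness, using interval bounds (a common abstraction in neural-network verification; see Liu et al., 2021). The two-ReLU ‘network’ below is purely illustrative:

```python
def relu_interval(lo, hi):
    """Image of an interval under ReLU."""
    return max(0.0, lo), max(0.0, hi)

def affine_interval(w, b, box):
    """Interval image of x -> w . x + b over a box of input intervals."""
    lo = hi = b
    for wi, (xlo, xhi) in zip(w, box):
        lo += min(wi * xlo, wi * xhi)
        hi += max(wi * xlo, wi * xhi)
    return lo, hi

# Tiny network: y = relu(x1 - x2) + relu(x2 - x1), inputs in [0, 1].
box = [(0.0, 1.0), (0.0, 1.0)]
h1 = relu_interval(*affine_interval([1, -1], 0.0, box))
h2 = relu_interval(*affine_interval([-1, 1], 0.0, box))
y_lo, y_hi = h1[0] + h2[0], h1[1] + h2[1]

# The abstraction reports y in [0, 2], so the property "y <= 1" appears
# violated; yet concretely y = |x1 - x2| <= 1 over the whole box, because
# the two hidden units can never be large at the same time. Any
# counterexample drawn from the abstract bound is therefore spurious.
print((y_lo, y_hi))
```

The interval abstraction discards the correlation between the two hidden units, so it over-approximates the reachable outputs: exactly the mechanism by which an abstract verifier can return a counterexample to which no actual computation corresponds.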

  15. Another constraint is given by the difficulty of controlling every parameter involved in a specific function, given the complex layered structure of contemporary networks (Gerasimou et al., 2020).

  16. A mathematical model is called robust by Primiero (2020) in case it is able to include all variables of interest and all the inferential properties of the simulated empirical system.

  17. There are indeed scientific contexts in which DL models used to predict the evolution of some empirical system also bear structural similarities with the target system. The convolutional DL model used by Monk (2018) to simulate parton showers, the neural model by Choudhary et al. (2020) for simulating the Hénon-Heiles potential, and the MetNet-2 (Meteorological Neural Network 2) DL model (Espeholt et al., 2021) are some examples.

Abbreviations

DL: Deep Learning

LoA: Level of abstraction

AV: Autonomous vehicle

XAI: Explainable artificial intelligence

References

  • Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., et al. (2015). TensorFlow: Large-scale machine learning on heterogeneous systems. Technical report, Google Brain Team.

  • Ackerman, E. (2016). Self-driving cars were just around the corner — in 1960. IEEE Spectrum, 31 Aug.

  • Alpern, B., & Schneider, F. B. (1985). Defining liveness. Information processing letters, 21(4), 181–185.

  • Alpern, B., & Schneider, F. B. (1987). Recognizing safety and liveness. Distributed Computing, 2(3), 117–126.

  • Alvarado, R. (2022a). AI as an epistemic technology. http://philsci-archive.pitt.edu/21243/

  • Alvarado, R. (2022). Computer simulations as scientific instruments. Foundations of Science, 27, 1–23.

  • Ammann, P., & Offutt, J. (2016). Introduction to software testing. Cambridge University Press.

  • Angius, N. (2013). Model-based abductive reasoning in automated software testing. Logic Journal of IGPL, 21(6), 931–942.

  • Angius, N., Primiero, G., and Turner, R. (2021). The Philosophy of Computer Science. In E. N. Zalta (Eds.), The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, Spring 2021 edition.

  • Angius, N., & Tamburrini, G. (2011). Scientific theories of computational systems in model checking. Minds and Machines, 21(2), 323–336.

  • Angius, N., & Tamburrini, G. (2017). Explaining engineered computing systems’ behaviour: the role of abstraction and idealization. Philosophy & Technology, 30(2), 239–258.

  • Baier, C. & Katoen, J.-P. (2008). Principles of model checking. MIT press.

  • Bengio, Y., Lamblin, P., Popovici, D., & Larochelle, H. (2007). Greedy layer-wise training of deep networks. In Advances in Neural Information Processing Systems (pp. 153–160).

  • Chalmers, D. J. (1996). Does a rock implement every finite-state automaton? Synthese, 108, 309–333.

  • Chollet, F. (2021). Deep learning with Python. Simon and Schuster.

  • Choudhary, A., Lindner, J. F., Holliday, E. G., Miller, S. T., Sinha, S., & Ditto, W. L. (2020). Physics-enhanced neural networks learn order and chaos. Physical Review E, 27, 217–236.

  • Clarke, E. M., Grumberg, O., & Peled, D. A. (1999). Model checking. MIT Press.

  • Colburn, T., & Shute, G. (2007). Abstraction in computer science. Minds and Machines, 17(2), 169–184.

  • Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., & Schiele, B. (2016). The cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3213–3223).

  • Cummins, R. (1975). Functional analysis. Journal of Philosophy, 72, 741–764.

  • Curtis-Trudel, A. (2021). Implementation as resemblance. Philosophy of Science, 88(5), 1021–1032.

  • de Villers, J., & Barnard, E. (1992). Backpropagation neural nets with one and two hidden layers. IEEE Transactions on Neural Networks, 4, 136–141.

  • Duchi, J., Hazan, E., & Singer, Y. (2011). Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12, 2121–2159.

  • Durán, J. M. (2018). Computer simulations in science and engineering: Concepts-Practices-Perspectives. Springer.

  • Durán, J. M., & Formanek, N. (2018). Grounds for trust: Essential epistemic opacity and computational reliabilism. Minds and Machines, 28, 645–666.

  • Espeholt, L., Agrawal, S., Sønderby, C., Kumar, M., Heek, J., Bromberg, C., Gazen, C., Hickey, J., Bell, A., & Kalchbrenner, N. (2021). Skillful twelve hour precipitation forecasts using large context neural networks. arXiv:2111.07472.

  • Fetzer, J. H. (1988). Program verification: The very idea. Communications of the ACM, 31(9), 1048–1063.

  • Floridi, L. (2008). The method of levels of abstraction. Minds and machines, 18(3), 303–329.

  • Floridi, L., Fresco, N., & Primiero, G. (2015). On malfunctioning software. Synthese, 192(4), 1199–1220.

  • Fresco, N., & Primiero, G. (2013). Miscomputation. Philosophy & Technology, 26, 253–272.

  • Gerasimou, S., Eniser, H. F., Sen, A., & Cakan, A. (2020). Importance-driven deep learning system testing. In 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE) (pp. 702–713). IEEE.

  • Gould, S. J., & Vrba, E. S. (1982). Exaptation–a missing term in the science of form. Paleobiology, 8(1), 4–15.

  • Hilpinen, R. (1992). On artifacts and works of art 1. Theoria, 58(1), 58–82.

  • Hinton, G. E., Krizhevsky, A., & Wang, S. D. (2011). Transforming auto-encoders. In International Conference on Artificial Neural Networks (pp. 44–51). Springer-Verlag.

  • Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 28, 504–507.

  • Hoare, C. A. R. (1969). An axiomatic basis for computer programming. Communications of the ACM, 12(10), 576–580.

  • Horner, J. K., & Symons, J. (2019). Understanding error rates in software engineering: Conceptual, empirical, and experimental approaches. Philosophy & Technology, 32, 363–378.

  • Humbatova, N., Jahangirova, G., Bavota, G., Riccio, V., Stocco, A., & Tonella, P. (2020). Taxonomy of real faults in deep learning systems. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering (pp. 1110–1121).

  • Humphreys, P. (2004). Extending ourselves: Computational science, empiricism, and scientific method. Oxford University Press.

  • Itkonen, J. & Rautiainen, K. (2005). Exploratory testing: a multiple case study. In 2005 International Symposium on Empirical Software Engineering, 2005. (pp. 10–pp). IEEE.

  • Janai, J., Güney, F., Behl, A., & Geiger, A. (2020). Computer vision for autonomous vehicles, problems, datasets and state of the art. Foundations and Trends in Computer Graphics and Vision, 12, 1–308.

  • Johnson, D. G. (2009). Computer ethics. London: Pearson.

  • Jones, C. B. (1990). Systematic software development using VDM. Prentice Hall International Series in Computer Science. Prentice Hall.

  • Joshi, A. J., Porikli, F., & Papanikolopoulos, N. (2009). Multi-class active learning for image classification. In 2009 IEEE conference on computer vision and pattern recognition (pp. 2372–2379). IEEE.

  • Ketkar, N. (2017). Introduction to PyTorch (pp. 195–208). Berkeley (CA): Apress.

  • Kim, T. K., Yi, P. H., Hager, G. D., & Lin, C. T. (2020). Refining dataset curation methods for deep learning-based automated tuberculosis screening. Journal of Thoracic Disease, 12, 5078–5085.

  • Kingma, D. P. & Ba, J. (2014). Adam: A method for stochastic optimization. In Proceedings of International Conference on Learning Representations.

  • Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pp. 1090–1098.

  • Kroes, P. (2012). Technical artefacts: Creations of mind and matter: A philosophy of engineering design (vol. 6). Springer Science & Business Media.

  • Kroes, P., & Meijers, A. (2006). The dual nature of technical artefacts. Studies in History and Philosophy of Science, 37(1), 1–4.

  • Kröger, F., & Merz, S. (2008). Temporal Logic and State Systems. Springer.

  • Kuutti, S., Fallah, S., Bowden, R., & Barber, P. (2019). Deep learning for autonomous vehicle control - algorithms, state-of-the-art, and future prospects. Synthesis Lectures on Advances in Automotive Technology, 3, 1–80.

  • Leeuwen, J. V. (1990). Handbook of Theoretical Computer Science, Volume B: Formal models and semantics. MIT Press.

  • Li, X. & Guo, Y. (2013). Adaptive active learning for image classification. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 859–866).

  • Lipton, Z. C. (2018). The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery. Queue, 16(3), 31–57.

  • Liu, C., Arnon, T., Lazarus, C., Strong, C., Barrett, C., Kochenderfer, M. J., et al. (2021). Algorithms for verifying deep neural networks. Foundations and Trends® in Optimization, 4(3-4), 244–404.

  • Ma, L., Juefei-Xu, F., Zhang, F., Sun, J., Xue, M., Li, B., Chen, C., Su, T., Li, L., Liu, Y., et al. (2018). Deepgauge: Multi-granularity testing criteria for deep learning systems. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering (pp. 120–131).

  • McLaughlin, P. (2001). What functions explain: Functional explanation and self-reproducing systems. Cambridge University Press.

  • Meng, J., Chen, P., Wahib, M., Yang, M., Zheng, L., Wei, Y., Feng, S., & Liu, W. (2022). Boosting the predictive performance with aqueous solubility dataset curation. Scientific Data, 9, 71.

  • Monk, J. W. (2018). Deep learning as a parton shower. Journal of High Energy Physics, 2018, 21.

  • Odena, A., Olsson, C., Andersen, D., & Goodfellow, I. (2019). Tensorfuzz: Debugging neural networks with coverage-guided fuzzing. In International Conference on Machine Learning (pp. 4901–4911). PMLR.

  • Pei, K., Cao, Y., Yang, J., & Jana, S. (2017). Deepxplore: Automated whitebox testing of deep learning systems. In Proceedings of the 26th Symposium on Operating Systems Principles (pp. 1–18).

  • Piccinini, G. (2007). Computing mechanisms. Philosophy of Science, 74(4), 501–526.

  • Plebe, A., & Grasso, G. (2019). The unbearable shallow understanding of deep learning. Minds and Machines, 29(4), 515–553.

  • Plebe, A., & Perconti, P. (2022). The Future of the Artificial Mind. Boca Raton: CRC Press.

  • Primiero, G. (2014). A taxonomy of errors for information systems. Minds and Machines, 24, 249–273.

  • Primiero, G. (2016). Information in the philosophy of computer science. In The Routledge handbook of philosophy of information (pp. 106–122). Routledge.

  • Primiero, G. (2020). On the foundations of computing. Oxford University Press.

  • Rapaport, W. J. (1999). Implementation is semantic interpretation. The Monist, 82(1), 109–130.

  • Rapaport, W. J. (2005). Implementation is semantic interpretation: further thoughts. Journal of Experimental & Theoretical Artificial Intelligence, 17(4), 385–417.

  • Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323, 533–536.

  • Rumelhart, D. E. & McClelland, J. L. (Eds.) (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition. MIT Press.

  • Sadat, A., Segal, S., Casas, S., Tu, J., Yang, B., Urtasun, R., & Yumer, E. (2021). Diverse complexity measures for dataset curation in self-driving. In 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 8609–8616). IEEE.

  • Salay, R., Queiroz, R., & Czarnecki, K. (2017). An analysis of ISO 26262: Using machine learning safely in automotive software. arXiv preprint arXiv:1709.02435.

  • Samek, W., Montavon, G., Vedaldi, A., Hansen, L. K., & Müller, K.-R. (2019). Explainable AI: interpreting, explaining and visualizing deep learning (vol. 11700). Springer Nature.

  • Searle, J. R. (1995). The construction of social reality. Free Press.

  • Settles, B. (2012). Active learning. Synthesis Lectures on Artificial Intelligence and Machine Learning, 6(1), 1–114.

  • Shapiro, S. (1997). Splitting the difference: the historical necessity of synthesis in software engineering. IEEE Annals of the History of Computing, 19(1), 20–54.

  • Sommerville, I. (2021). Software Engineering. London: Pearson.

  • Spivey, J. M. (1988). Understanding Z: a specification language and its formal semantics (vol. 3). Cambridge University Press.

  • Symons, J., & Horner, J. K. (2020). Why there is no general solution to the problem of software verification. Foundations of Science, 25, 541–557.

  • Thung, F., Wang, S., Lo, D., & Jiang, L. (2012). An empirical study of bugs in machine learning systems. In 2012 IEEE 23rd International Symposium on Software Reliability Engineering (pp. 271–280). IEEE.

  • Turing, A. (1936). On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 42, 230–265.

  • Turing, A. (1948). Intelligent machinery. Technical report, National Physical Laboratory, London. Reprinted in Ince, D. C. (ed.) Collected Works of A. M. Turing: Mechanical Intelligence, Elsevier Science Publishers, 1992.

  • Turner, R. (2009). Computable models (vol. 1193). Springer.

  • Turner, R. (2011). Specification. Minds & Machines, 21(2), 135–152.

  • Turner, R. (2018). Computational artefacts: Towards a philosophy of computer science. Springer.

  • Turner, R. (2020). Computational intention. Studies in Logic, Grammar and Rhetoric, 63(1), 19–30.

  • Winsberg, E. (1999). Sanctioning models: The epistemology of simulation. Science in Context, 12(2), 275–292.

  • Winsberg, E. (2010). Science in the age of computer simulation. University of Chicago Press.

  • Winsberg, E. (2022). Computer Simulations in Science. In E. N. Zalta, & U, Nodelman (Eds.), The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, Winter 2022 edition.

Funding

Nicola Angius was partially funded by the Research Project ANR-17-CE38-0003-01 (ANR - Agence Nationale de la Recherche) titled ‘What is a (computer) program: Historical and Philosophical Perspectives’.

Author information

Authors and Affiliations

Authors

Contributions

All authors contributed equally to the study conception and design of this paper. Material preparation and analysis were performed by both NA and AP. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Nicola Angius.

Ethics declarations

Conflict of interest

Nicola Angius and Alessio Plebe declare they have no financial interests.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Angius, N., Plebe, A. From Coding To Curing. Functions, Implementations, and Correctness in Deep Learning. Philos. Technol. 36, 47 (2023). https://doi.org/10.1007/s13347-023-00642-7
