Inductivism

On the Epistemology of Data Science

Part of the book series: Philosophical Studies Series ((PSSP,volume 148))

Abstract

In this chapter, data science is characterized as an inductivist approach, i.e. an approach that starts from the facts in order to infer increasingly general laws and theories. This perspective is corroborated first by a case study of successful scientific practice from the field of machine translation and second by an analysis of recent developments in statistics, in particular the shift from so-called data modeling to algorithmic modeling. Over the past century, inductivism has not been well regarded by many scientists and philosophers of science. Given that inductivism is generally considered to be a failed methodology, the fundamental epistemological problem of data science turns out to be the justification of inductivism. Some classic objections against inductivism are revisited, the most pertinent of which is the so-called problem of induction. Without a satisfying solution to the problem of induction, data science seems doomed to failure.

Notes

  1.

    Compare Pietsch (2017).

  2.

    That Newton stands in a Baconian tradition is widely recognized in the literature (e.g. Ducheyne 2005). Newton’s posthumous editors Colin Maclaurin, Roger Cotes, and Henry Pemberton played a crucial role in establishing the connection (Pérez-Ramos 1996, 319).

  3.

    Another version of the hypothetico-deductive doctrine is given by Richard Feynman, arguably the most influential physicist in the second half of the 20th century, in his Character of Physical Law: “In general we look for a new law by the following process. First we guess it. Then we compute the consequences of the guess to see what would be implied if this law that we guessed is right. Then we compare the result of the computation to nature, with experiment or experience, compare it directly with observation, to see if it works. If it disagrees with experiment it is wrong. In that simple statement is the key to science. It does not make any difference how beautiful your guess is. It does not make any difference how smart you are, who made the guess, or what his name is – if it disagrees with experiment it is wrong. That is all there is to it.” (Feynman 1967, 156)

  4.

    Norvig 2009, 240; cp. also Jelinek 2009, 492.

  5.

    ‘The Unreasonable Effectiveness of Data’, talk given by Peter Norvig at UBC, 23.9.2010 http://www.youtube.com/watch?v=yvDCzhbjYWs at 43:45.

  6.

    http://www-03.ibm.com/ibm/history/ibm100/us/en/icons/speechreco/team/, accessed 1.8.2013.

  7.

    For an interesting exchange between Noam Chomsky and Peter Norvig, representing a model-driven approach and a statistical approach to linguistics, respectively, compare Norvig (2017).

  8.

    For example, Google Translate switched in November 2016.

  9.

    http://www.statisticsviews.com/details/feature/5133141/Nate-Silver-What-I-need-from-statisticians.html

  10.

    http://magazine.amstat.org/blog/2010/09/01/statrevolution/ (accessed 31.1.2015).

  11.

    For a graphic illustration of this claim compare the terms ‘computer’ and ‘non-parametric’ on Google’s Ngram Viewer https://books.google.com/ngrams.

  12.

    Hastie & Tibshirani (1990) is a milestone; a useful overview can be found in Kauermann (2006); from a philosophical perspective, Sprenger (2011) discusses an interesting example of non-parametric modeling, bootstrap resampling, and argues for its epistemic significance.

  13.

    Here, parameters are not to be understood in terms of variables but of constants that determine the properties of a specific model: e.g. in the linear model y = ax + b, a and b are the model parameters.

  14.

    This curse of dimensionality does not automatically apply to all algorithms in data science and machine learning. On the contrary, it occasionally turns out to be helpful to artificially increase the dimensionality of the variable space in methods like decision trees or support vector machines (Breiman 2001, 208–209).

  15.

    Some authors deny that Whewell should be considered a deductivist (e.g. Snyder 2017, Sec. 2). But while his epistemological stance does not fulfill all the criteria laid out in Section 2.1, he can certainly be seen as a precursor to hypothetico-deductivism. In particular, his methodological approach has considerable rationalistic elements, stressing the importance of ideas, which are “not a consequence of experience, but a result of the particular constitution and activity of the mind, which is independent of all experience in its origin, though constantly combined with experience in its exercise” (Whewell 1858b, 91; cited in Snyder 2017, Sec. 2). This was a major point of contention in the debate with Mill. In a somewhat Kantian perspective, Whewell introduced the notion of ‘colligation of facts’ referring to a process which subsumes certain phenomena under a general idea, for example geometric phenomena under the concept of space (Whewell 1858a, Ch. II.IV).
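
The distinction drawn in note 13 between variables and model parameters can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit of the linear model y = ax + b; the function name `fit_line` and the sample data are hypothetical and serve only as illustration, not as a method taken from the chapter.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b; returns the model parameters (a, b).

    xs and ys hold observed values of the variables x and y.
    a and b are the constants that single out one specific linear model.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx            # slope parameter
    b = mean_y - a * mean_x  # intercept parameter
    return a, b


# Hypothetical data generated from y = 2x + 1 without noise
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_line(xs, ys)
print(a, b)
```

Here x and y vary from observation to observation, while a and b are fixed once the model has been fitted.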

References

  • Ampère, André-Marie. 1826/2012. Mathematical theory of electro-dynamic phenomena uniquely derived from experiments. Transl. Michael D. Godfrey. Paris: A. Hermann. https://archive.org/details/AmpereTheorieEn

  • Bacon, Francis. 1620/1994. Novum Organum. Chicago, IL: Open Court.

  • Bellman, Richard E. 1961. Adaptive Control Processes: A Guided Tour. Princeton: Princeton University Press.

  • Breiman, Leo. 2001. Statistical Modeling: The Two Cultures. Statistical Science 16 (3): 199–231.

  • Callebaut, Werner. 2012. Scientific perspectivism: A philosopher of science’s response to the challenge of big data biology. Studies in History and Philosophy of Biological and Biomedical Sciences 43 (1): 69–80.

  • Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.

  • Ducheyne, Steffen. 2005. Bacon’s Idea and Newton’s Practice of Induction. Philosophica 76: 115–128.

  • Duhem, Pierre. 1906/1962. The Aim and Structure of Physical Theory. New York: Atheneum.

  • Einstein, Albert. 1934. On the Method of Theoretical Physics. Philosophy of Science 1 (2): 163–169.

  • Feynman, Richard. 1967. The Character of Physical Law.

  • Frické, Martin. 2014. Big Data and Its Epistemology. Journal of the Association for Information Science and Technology 66 (4): 651–661.

  • Gillies, Donald. 1996. Artificial Intelligence and Scientific Method. Oxford: Oxford University Press.

  • Goodman, Nelson. 1954. Fact, Fiction, and Forecast. London: Athlone Press.

  • Halevy, Alon, Peter Norvig, and Fernando Pereira. 2009. The Unreasonable Effectiveness of Data. IEEE Intelligent Systems 24 (2): 8–12.

  • Hanson, Norwood Russell. 1958. Patterns of Discovery: An Inquiry into the Conceptual Foundations of Science. Cambridge: Cambridge University Press.

  • Harman, Gilbert, and Sanjeev Kulkarni. 2007. Reliable Reasoning: Induction and Statistical Learning Theory. Cambridge, MA: MIT Press.

  • Hastie, T., and R. Tibshirani. 1990. Generalized Additive Models. London: Chapman and Hall.

  • Hume, David. 1748. An Enquiry concerning Human Understanding. London: A. Millar.

  • Jelinek, Frederick. 2009. The Dawn of Statistical ASR and MT. Computational Linguistics 35 (4): 483–494.

  • Kauermann, Goeran. 2006. Nonparametric Models and their Estimation. In Modern Econometric Analysis, ed. Olaf Hübler and Joachim Frohn, 137–152. Berlin: Springer.

  • Kitchin, Rob. 2014. The Data Revolution. Los Angeles: Sage.

  • Lavoisier, Antoine. 1789/1890. Elements of Chemistry. Edinburgh: William Creech. http://www.gutenberg.org/files/30775/30775-h/30775-h.htm

  • Leonelli, Sabina. 2012. Introduction: Making sense of data-driven research in the biological and biomedical sciences. Studies in History and Philosophy of Biological and Biomedical Sciences 43 (1): 1–3.

  • Newton, Isaac. 1726/1999. Mathematical Principles of Natural Philosophy. Berkeley: University of California Press.

  • Norvig, Peter. 2009. Natural Language Corpus Data. In Beautiful Data, ed. T. Segaran and J. Hammerbacher, 219–242. Sebastopol: O’Reilly.

  • ———. 2017. On Chomsky and the Two Cultures of Statistical Learning. In Berechenbarkeit der Welt? Philosophie und Wissenschaft im Zeitalter von Big Data, ed. W. Pietsch, J. Wernecke, and M. Ott, 61–83. Wiesbaden: Springer.

  • Pérez-Ramos, Antonio. 1996. Bacon’s Forms and the Maker’s Knowledge. In The Cambridge Companion to Bacon, ed. Markku Peltonen, 99–120. Cambridge: Cambridge University Press.

  • Pietsch, Wolfgang. 2015. Aspects of Theory-Ladenness in Data-Intensive Science. Philosophy of Science 82 (5): 905–916.

  • ———. 2016. The Causal Nature of Modeling with Big Data. Philosophy & Technology 29 (2): 137–171.

  • ———. 2017. Causation, Probability, and all that: Data Science as a Novel Inductive Paradigm. In Frontiers in Data Science, ed. Matthias Dehmer and Frank Emmert-Streib, 329–353. Boca Raton: CRC Press.

  • ———. 2021. Big Data. Cambridge: Cambridge University Press.

  • Popper, Karl. 1935/2002. The Logic of Scientific Discovery. London: Routledge Classics.

  • ———. 1963. Conjectures and Refutations. Abingdon: Routledge.

  • Quine, Willard Van Orman. 1951. Two Dogmas of Empiricism. The Philosophical Review 60 (1): 20–43.

  • Russell, Stuart, and Peter Norvig. 2009. Artificial Intelligence: A Modern Approach. Upper Saddle River, NJ: Pearson.

  • Snyder, Laura J. 2017. William Whewell. Stanford Encyclopedia of Philosophy (Winter 2017 Edition). https://plato.stanford.edu/archives/win2017/entries/whewell/

  • Sprenger, Jan. 2011. Science without (parametric) models: the case of bootstrap resampling. Synthese 180 (1): 65–76.

  • Whewell, William. 1858a. Novum Organon Renovatum. 3rd ed. London: John W. Parker.

  • ———. 1858b. History of Scientific Ideas. Vol. I. London: John W. Parker.

Copyright information

© 2022 Springer Nature Switzerland AG

Cite this chapter

Pietsch, W. (2022). Inductivism. In: On the Epistemology of Data Science. Philosophical Studies Series, vol 148. Springer, Cham. https://doi.org/10.1007/978-3-030-86442-2_2
