Aligning artificial intelligence with moral intuitions: an intuitionist approach to the alignment problem

  • Original Research
  • Published in AI and Ethics

Abstract

As artificial intelligence (AI) continues to advance, a key challenge is ensuring that AI aligns with certain values. However, in today’s diverse and democratic societies, reaching a normative consensus is difficult. This paper addresses the methodological question of how AI ethicists can effectively determine which values AI should uphold. After reviewing the most influential methodologies, we detail an intuitionist research agenda that offers guidelines for aligning AI applications with a limited set of reliable moral intuitions, each underlying a refined cooperative view of AI. We discuss appropriate epistemic tools for collecting, filtering, and justifying moral intuitions with the aim of reducing cognitive and social biases. The proposed methodology facilitates broad collective participation in AI alignment while ensuring the reliability of the moral judgments considered.


Notes

  1. For example, the latest version of ChatGPT (GPT-4) performs better than the average human test-taker on many academic and professional exams [43].

  2. Jonker [31] calls this aspect “social alignment”, while distinguishing it from “value alignment”, which concerns the safety of AI. By contrast, we understand “value alignment” more broadly, as comprising social alignment.

  3. In a similar vein, Morley and colleagues [39] distinguish two aspects of AI ethics: the “what”, i.e., the ethical principles for good AI, and the “how”, i.e., the identification of the tools and methods for applying the principles. Also, Gabriel [24] discerns the “technical” and “normative” aspects of value alignment and examines the connections between the two.

  4. The alignment process is likely to be iterative [57]. Following value implementation, developers receive feedback from the use of the systems. This feedback may prompt a recalibration of value setting.

  5. This means that the reliability of a methodology can ultimately be assessed by the long-term consequences that AI produces for society. In the meantime, philosophers can debate the issue on the basis of rational expectations and predictions.

  6. Indeed, universal principles influenced the enactment of the first laws on AI in the EU [18] and the US [55].

  7. For example, the need to expand datasets to program fair, unbiased algorithms may conflict with individual privacy rights over personal information.

  8. General intuitions might be tested by qualitative methods that elicit reflection on ethical issues in AI (e.g., [16] and [40]). By contrast, particular intuitions may require quantitative measurement of moral judgments in response to specific scenarios involving AI (e.g., [20]).

  9. In support of these statements, see the already mentioned review by Jobin et al. [30]. For the claim that general intuitions tend to be more stable, see Dabbagh [14].

  10. For example, in the ethics of autonomous vehicles, the principle of the Institute of Electrical and Electronics Engineers “to treat fairly all persons and to not engage in acts of discrimination based on race, religion, gender, disability, age, national origin, sexual orientation, gender identity, or gender expression” [29] has been challenged by the particular intuition to prioritize the young over the elderly when faced with an unavoidable accident [2, 20].

  11. We disagree here with Huemer [28], according to whom general moral intuitions are less prone to biases.

  12. A recent study investigating algorithmic interpretability and transparency corroborates this hypothesis [59]. In the study, participants were asked to justify the implementation of an algorithm for allocating limited resources in different real-life scenarios; although the subjects opted for different solutions, moral concepts like “fairness” and “rightness” mostly guided their decisions.

  13. For instance, a person inclined toward the “authority” foundation might approve extensive data collection for security purposes, while a subject sensitive to “liberty” might not see that as a sufficient reason for the intrusion on privacy [54].

  14. For example, moral intuitions about a self-driving car’s decisions vary according to the weight given to the car’s driving style and reliability (A), its compliance with traffic norms (D), and whether the action results in an accident (C) [10] (see the illustrative sketch following these notes).

  15. We agree with Savulescu et al. [51] that an “overlapping consensus” between intuitions from different sources is desirable and should be given strong consideration in AI policy making. However, we assume here that consensus is not always possible, and our discussion focuses on cases of reasonable disagreement.
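
To make the weighting idea in note 14 concrete, the sketch below (in Python) shows one hypothetical way that component evaluations of an agent (A), deed (D), and consequence (C) could be combined into a single judgment score. The linear form, the weight values, and the numeric scale are illustrative assumptions only; they do not reproduce the model or the analysis reported in [10].

```python
# Illustrative sketch only: a hypothetical weighted aggregation of Agent (A),
# Deed (D), and Consequence (C) evaluations. The weights and the linear form
# are assumptions for exposition, not the model estimated in [10].

def adc_judgment(agent: float, deed: float, consequence: float,
                 w_agent: float = 0.3, w_deed: float = 0.3,
                 w_consequence: float = 0.4) -> float:
    """Combine component evaluations (each in [-1, 1], negative = blameworthy,
    positive = praiseworthy) into an overall moral acceptability score."""
    assert abs(w_agent + w_deed + w_consequence - 1.0) < 1e-9, "weights must sum to 1"
    return w_agent * agent + w_deed * deed + w_consequence * consequence

# Example: a reliable, norm-compliant driving style (A = 0.8, D = 0.6) that
# nonetheless ends in an accident (C = -0.9) yields a near-neutral overall
# score of about 0.06.
print(adc_judgment(0.8, 0.6, -0.9))
```

On this toy weighting, no single factor settles the judgment; shifting weight toward consequences (C) would push the same scenario toward a negative evaluation, which is the kind of variation note 14 describes.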

References

  1. Anderson, M., Anderson, S.L.: Case-supported principle-based behavior paradigm. In: Trappl, R. (ed.) A Construction Manual for Robots’ Ethical Systems: Requirements, Methods, Implementations, pp. 155–168. Springer, Cham (2015)

  2. Awad, E., Dsouza, S., Kim, R., Schulz, J., Henrich, J., Shariff, A., Bonnefon, J.-F., Rahwan, I.: The moral machine experiment. Nature 563(7729), 59–64 (2018)

  3. Baase, S., Henry, T.M.: A Gift of Fire: Social, Legal, and Ethical Issues for Computing Technology. Pearson, New York (2018)

  4. Bargh, J.A.: The ecology of automaticity: toward establishing the conditions needed to produce automatic processing effects. Am. J. Psychol. 105(2), 181–199 (1992)

  5. Baumer, E.P.S., Polletta, F., Pierski, N., Gay, G.K.: A simple intervention to reduce framing effects in perceptions of global climate change. Environ. Commun. 11(3), 289–310 (2017)

  6. Bengson, J.: The intellectual given. Mind 124(495), 707–760 (2015)

  7. Bonnefon, J.-F., Shariff, A., Rahwan, I.: The moral psychology of AI and the ethical opt-out problem. In: Liao, S.M. (ed.) Ethics of Artificial Intelligence, pp. 109–126. Oxford University Press, Oxford (2020)

  8. Bonnefon, J.-F., Shariff, A., Rahwan, I.: The social dilemma of autonomous vehicles. Science 352(6397), 36–37 (2016)

  9. Cecchini, D.: Moral intuition, strength, and metacognition. Philos. Psychol. 36(1), 4–28 (2023)

  10. Cecchini, D., Brantley, S., Dubljević, V.: Moral judgment in realistic traffic scenarios: moving beyond the trolley paradigm for ethics of autonomous vehicles. AI Soc. (2023). https://doi.org/10.1007/s00146-023-01813-y

  11. Christian, B.: The Alignment Problem: Machine Learning and Human Values. W.W. Norton & Company, New York (2020)

  12. Curry, O.S., Mullins, D.A., Whitehouse, H.: Is it good to cooperate? Testing the theory of morality-as-cooperation in 60 societies. Curr. Anthropol. 60(1), 47–69 (2019)

  13. Curry, O.S., Alfano, M., Brandt, M.J., Pelican, C.: Moral molecules: morality as a combinatorial system. Rev. Philos. Psychol. 13, 1039–1058 (2021)

  14. Dabbagh, H.: Intuitions about moral relevance—good news for moral intuitionism. Philos. Psychol. 34(7), 1047–1072 (2021)

  15. Dasgupta, N.: Implicit attitudes and beliefs adapt to situations: a decade of research on the malleability of implicit prejudice, stereotypes, and the self-concept. In: Devine, P., Plant, A. (eds.) Advances in Experimental Social Psychology, vol. 47, pp. 233–279. Academic Press, Burlington (2013)

  16. Dubljević, V., List, G., Milojevich, J., Ajmeri, N., Bauer, W.A., Singh, M.P., Bardaka, E., et al.: Toward a rational and ethical sociotechnical system of autonomous vehicles: a novel application of multi-criteria decision analysis. PLoS ONE 16(8), e0256224 (2021)

  17. Dung, L.: Current cases of AI misalignment and their implications for future risks. Synthese 202, 138 (2023)

  18. European Union: Artificialintelligenceact.eu. https://artificialintelligenceact.eu/. Accessed May 2024 (2024)

  19. Evans, J., Stanovich, K.: Dual-process theories of higher cognition: advancing the debate. Perspect. Psychol. Sci. 8(3), 223–241 (2013)

  20. Faulhaber, A.K., Dittmer, A., Blind, F., Wächter, M.A., Timm, S., Sütfeld, L.R., Stephan, A., Pipa, G.: Human decisions in moral dilemmas are largely described by utilitarianism: virtual car driving study provides guidelines for autonomous driving vehicles. Sci. Eng. Ethics 25, 399–418 (2019)

  21. Floridi, L.: The Ethics of Artificial Intelligence: Principles, Challenges, and Opportunities. Oxford University Press, Oxford (2023)

  22. Floridi, L., Cowls, J., Beltrametti, M., et al.: AI4People–an ethical framework for a good AI society: opportunities, risks, principles, and recommendations. Minds Mach. 28, 689–707 (2018)

  23. Forscher, P.S., Lai, C.K., Axt, J.R., Ebersole, C.R., Herman, M., Devine, P.G.: A meta-analysis of procedures to change implicit measures. J. Pers. Soc. Psychol. 117(3), 522–559 (2019)

  24. Gabriel, I.: Artificial intelligence, values, and alignment. Minds Mach. 30, 411–437 (2020)

  25. Hager, G.D., Drobnis, A., Fang, F., Ghani, R., Greenwald, A., Lyons, T., Parkes, D.C., Schultz, J., Saria, S., Smith, S.F.: Artificial intelligence for social good. arXiv:1901.05406 (2019)

  26. Haidt, J.: The moral emotions. In: Davidson, R.J., Scherer, K.R., Goldsmith, H.H. (eds.) Handbook of Affective Sciences, pp. 852–870. Oxford University Press, Oxford (2003)

  27. Hauser, M., Cushman, F., Young, L., Jin, K., Mikhail, J.: A dissociation between moral judgments and justifications. Mind Lang. 22(1), 1–21 (2007)

  28. Huemer, M.: Revisionary intuitionism. Soc. Philos. Policy 25(1), 368–392 (2007)

  29. IEEE: IEEE code of ethics. https://www.ieee.org/about/corporate/governance/p7-8.html. Accessed Jun 2023 (2020)

  30. Jobin, A., Ienca, M., Vayena, E.: The global landscape of AI ethics guidelines. Nat. Mach. Intell. 1, 389–399 (2019)

  31. Jonker, J.D.: Automation, alignment, and the cooperative interface. J. Ethics 1–22 (2023). https://doi.org/10.1007/s10892-023-09449-2

  32. Kneer, M., Skoczen, I.: Outcome effects, moral luck and the hindsight bias. Cognition 232, 1–21 (2023)

  33. Luetge, C., Rusch, H., Uhl, M.: Experimental Ethics: Toward an Empirical Moral Philosophy. Palgrave Macmillan, Houndmills, Basingstoke (2014)

  34. Machery, E.: Philosophy Within Its Proper Bounds. Oxford University Press, Oxford (2017)

  35. Mata, A.: Social metacognition in moral judgment: decisional conflict promotes perspective taking. J. Pers. Soc. Psychol. 117(6), 1061–1082 (2019)

  36. May, J.: Regard for Reason in the Moral Mind. Oxford University Press, Oxford (2018)

  37. Mercier, H., Sperber, D.: The Enigma of Reason. Harvard University Press, Cambridge (2017)

  38. Mittelstadt, B.: Principles alone cannot guarantee ethical AI. Nat. Mach. Intell. 1, 501–507 (2019)

  39. Morley, J., Elhalal, A., Garcia, F., Kinsey, L., Moekander, J., Floridi, L.: Ethics as a service: a pragmatic operationalisation of AI ethics. Minds Mach. 31, 239–256 (2021)

  40. Dubljević, V., Douglas, S., Milojevich, J., Ajmeri, N.: Moral and social ramifications of autonomous vehicles: a qualitative study of the perceptions of professional drivers. Behav. Inf. Technol. 42, 1271–1278 (2023). https://doi.org/10.1080/0144929X.2022.2070078

  41. Morling, B.: Research Methods in Psychology: Evaluating a World of Information. W.W. Norton & Company, New York (2018)

  42. O’Neil, C.: Weapons of Math Destruction. Crown, New York (2016)

  43. OpenAI: GPT-4 technical report. https://cdn.openai.com/papers/gpt-4.pdf. Accessed May 2023 (2023)

  44. Pflanzer, M., Traylor, Z., Lyons, J.B., Dubljević, V., Nam, C.S.: Ethics in human-AI teaming: principles and perspectives. AI Ethics 3, 917–935 (2022)

  45. Polonioli, A., Vega-Mendoza, M., Blankinship, B., Carmel, D.: Reporting in experimental philosophy: current standards and recommendations for future practice. Rev. Philos. Psychol. 12, 49–73 (2021)

  46. Rahwan, I.: Society-in-the-loop: programming the algorithmic social contract. Ethics Inf. Technol. 20, 5–14 (2018)

  47. Rini, R.: Debunking debunking: a regress challenge for psychological threats to moral judgment. Philos. Stud. 173, 675–697 (2016)

  48. Rosenthal, J.: Experimental philosophy is useful—but not in a specific way. In: Luetge, C., Rusch, H., Uhl, M. (eds.) Experimental Ethics: Toward an Empirical Moral Philosophy, pp. 211–226. Palgrave Macmillan, Houndmills, Basingstoke (2014)

  49. Russell, S.: Human Compatible: Artificial Intelligence and the Problem of Control. Penguin, New York (2019)

  50. Sauer, H.: Moral Judgments as Educated Intuitions. MIT Press, Cambridge (2017)

  51. Savulescu, J., Gyngell, C., Kahane, G.: Collective reflective equilibrium in practice (CREP) and controversial novel technologies. Bioethics 35, 652–663 (2021)

  52. Seligman, M.E.P.: Flourish: A Visionary New Understanding of Happiness and Well-Being. Free Press, New York (2011)

  53. Sterelny, K., Fraser, B.: Evolution and moral realism. Br. J. Philos. Sci. 68(4), 981–1006 (2016)

  54. Telkamp, J.B., Anderson, M.H.: The implications of diverse human moral foundations for assessing the ethicality of artificial intelligence. J. Bus. Ethics 178, 961–976 (2022)

  55. The White House: Blueprint for an AI bill of rights. https://www.whitehouse.gov/ostp/ai-bill-of-rights/. Accessed May 2024 (2022)

  56. Thompson, V., Turner, J.P., Pennycook, G.: Intuition, reason and metacognition. Cogn. Psychol. 63, 107–140 (2011)

  57. Umbrello, S., van de Poel, I.: Mapping value sensitive design onto AI for social good principles. AI Ethics 1, 283–296 (2021)

  58. Wallach, W., Allen, C.: Moral Machines: Teaching Robots Right from Wrong. Oxford University Press, Oxford (2009)

  59. Webb, H., Patel, M., Rovatsos, M., Davoust, A., Ceppi, S., Koene, A., Dowthwaite, L., Portillo, V.: “It would be pretty immoral to choose a random algorithm”: opening up algorithmic interpretability and transparency. J. Inf. Commun. Ethics Soc. 17(2), 210–228 (2019)

  60. Wong, D.: Moral Relativity. University of California Press, Berkeley (1984)

  61. Wright, J.C.: Tracking instability in our philosophical judgments: is it intuitive? Philos. Psychol. 26(4), 485–501 (2013)

Funding

Funding was provided by the NSF Division of Social and Economic Sciences (grant no. 2043612, awarded to V.D.).

Author information

Contributions

D.C.: ideation, conceptualization, and draft preparation. M.P.: review and editing. V.D.: supervision, review, and editing.

Corresponding author

Correspondence to Veljko Dubljević.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Cecchini, D., Pflanzer, M. & Dubljević, V. Aligning artificial intelligence with moral intuitions: an intuitionist approach to the alignment problem. AI Ethics (2024). https://doi.org/10.1007/s43681-024-00496-5
