Ought we align the values of artificial moral agents?

Original Research · AI and Ethics

A Correction to this article was published on 04 December 2023.

Abstract

In the near future, the capabilities of commonly used artificial systems will reach a level at which we will be able to permit them to make moral decisions autonomously as part of their proper daily functioning. Autonomous cars, personal assistants, household robots, stock-trading bots, and autonomous weapons are examples of the types of systems that will face simple to complex moral situations requiring some level of moral judgment. The research field of machine ethics distinguishes several types of artificial moral agents, each with a different level of moral agency. In this paper, we focus on the moral agency of Explicit and Full-blown artificial moral agents. We take a position on their level of moral agency and then examine whether it is morally right to align the values of (artificial) moral agents. If we assume, or are able to determine, that certain types of artificial agents are indeed moral agents, then we ought to examine whether it is morally right to construct them in such a way that they are “committed” to human values. We discuss an analogy to human moral agents and the implications of granting moral agency to artificial agents or withholding it from them.

Notes

  1. See [1, 2] for discussions of existential risks.

  2. Other (supplemental) approaches include the Capability Control Method. See for example [1, 3].

  3. See [2, 3, 4, 5].

    It is important to note that deployed AMAs will constantly evolve their moral understanding by receiving feedback from the environment (as current machine learning-based systems also do). They will probably make errors of moral judgment, as humans do, though probably not as often, and will have to learn from them and face the consequences (see the previous footnote).

  4. See [40].

  5. See the entry “Astroethics”, Scholarly Community Encyclopedia.

    https://encyclopedia.pub/item/revision/ff0a4b96955079658b08c3442f8f98d1. Accessed 02-11-2022

References

  1. Bostrom, N.: Superintelligence: paths, dangers, strategies. Oxford University Press, Oxford (2014)

  2. Russell, S.: Human compatible: artificial intelligence and the problem of control. Penguin Random House LLC (2019)

  3. Russell, S.: Provably beneficial artificial intelligence. Stuart Russell’s papers on Berkeley’s edu site. https://people.eecs.berkeley.edu/~russell/papers/russell-bbvabook17-pbai.pdf (2017). Accessed 05 Jan 2023

  4. Christiano, P.: Clarifying ‘AI alignment’. ai-alignment’s site. https://ai-alignment.com/clarifying-ai-alignment-cec47cd69dd6 (2018). Accessed 05 Jan 2023

  5. Yudkowsky, E.: AI Alignment: Why It’s Hard, and Where to Start. Machine Intelligence Research Institute website. https://intelligence.org/2016/12/28/ai-alignment-why-its-hard-and-where-to-start/ (2016). Accessed 05 Jan 2023

  6. Yampolskiy, R. V.: On Controllability of AI. arXiv abs/2008.04071 (2020)

  7. Gabriel, I.: Artificial intelligence, values, and alignment. Mind. Mach. 30, 411–437 (2020)

  8. Bostrom, N.: Ethical issues in advanced artificial intelligence. Nick Bostrom’s site. https://nickbostrom.com/ethics/ai. (2003). Accessed 05 Jan 2023

  9. Allen, C., Varner, G., Zinser, J.: Prolegomena to any future artificial moral agent. J. Exp. Theor. Artif. Intell. 12, 251–261 (2000)

  10. Allen, C., Wallach, W.: Moral machines: contradiction in terms, or abdication of human responsibility? In: Lin, P., Abney, K., Bekey, G. (eds.) Robot ethics: the ethical and social implications of robotics, pp. 55–68. MIT Press, Cambridge (2011)

  11. Moor, J.H.: The nature, importance, and difficulty of machine ethics. Intelligent Systems, IEEE 21(4), 18–21 (2006)

  12. Moor, J. H.: Four kinds of ethical robots. Philosophy Now (2009)

  13. Allen, C., Smit, I., Wallach, W.: Artificial morality: top-down, bottom-up, and hybrid approaches. Ethics Inf. Technol. 7, 149–155 (2005)

  14. Block, N.: Troubles with functionalism. In: Block, N. (ed.) Readings in the philosophy of psychology, vol. 1, pp. 268–305. Harvard University Press, Cambridge, MA (1980)

  15. Block, N.: Are absent qualia impossible? Philos. Rev. 89, 257–274 (1980)

  16. Shoemaker, S.: Functionalism and qualia. Philos. Stud. 27, 291–315 (1975)

  17. Jackson, F.: Epiphenomenal qualia. Philos. Quart. 32, 127–136 (1982)

  18. Chalmers, D.: The conscious mind: in search of a fundamental theory. Oxford University Press, New York and Oxford (1996)

  19. Lewis, D.: Mad pain and Martian pain. In: Block, N. (ed.) Readings in the philosophy of psychology, vol. I, pp. 216–222. Harvard University Press (1980)

  20. Behdadi, D., Munthe, C.: A normative approach to artificial moral agency. Mind. Mach. 30, 195–218 (2020). https://doi.org/10.1007/s11023-020-09525-8

  21. Everitt, T., Lea, G., Hutter, M.: AGI Safety Literature Review. In: International Joint Conference on Artificial Intelligence (IJCAI) (2018). arXiv:1805.01109

  22. Marcus, G., Davis, E.: Rebooting AI: building artificial intelligence we can trust. Vintage Books (2020)

  23. Marcus, G., Davis, E.: GPT-3, Bloviator: OpenAI’s language generator has no idea what it’s talking about. MIT Technology Review. https://www.technologyreview.com/2020/08/22/1007539/gpt3-openai-language-generator-artificial-intelligence-ai-opinion/ (2020). Accessed 11 May 2022

  24. Marcus, G.: The next decade in AI: four steps towards robust artificial intelligence. (2020). https://arxiv.org/abs/2002.06177

  25. Marcus, G.: Deep learning is hitting a wall. https://nautil.us/deep-learning-is-hitting-a-wall-14467/ (2022). Accessed 13 May 2022

  26. Scholkopf, B., et al.: Toward causal representation learning. Proc. IEEE 109, 612–663 (2021)

  27. Bengio, Y., et al.: A meta-transfer objective for learning to disentangle causal mechanisms (2020). arXiv abs/1901.10912

  28. Ramplin, S., Ayob, G.: Moral responsibility in psychopathy: a clinicophilosophical case discussion. BJPsych Advances 23(3), 187–195 (2017). https://doi.org/10.1192/apt.bp.115.015321

  29. Christian, B.: The alignment problem: machine learning and human values. WW Norton & Company (2020)

  30. Ng, A.Y., Russell, S.J.: Algorithms for inverse reinforcement learning. In: Proceedings of the Seventeenth International Conference on Machine Learning (ICML '00), pp. 663–670. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (2000)

  31. Koch, J., Langosco, L.: Discussion: Objective Robustness and Inner Alignment Terminology. AI Alignment Forum. https://www.alignmentforum.org/posts/pDaxobbB9FG5Dvqyv/discussion-objective-robustness-and-inner-alignment (2021). Accessed 13 Nov 2022

  32. Hubinger, E.: Inner Alignment, Outer Alignment, and Proposals for Building Safe Advanced AI. Podcast episode, Future of Life Institute. https://futureoflife.org/podcast/evan-hubinger-on-inner-alignment-outer-alignment-and-proposals-for-building-safe-advanced-ai/ (2020). Accessed 13 Nov 2022

  33. Asilomar AI Principles: principles developed in conjunction with the 2017 Asilomar conference [Benevolent AI 2017] (2017)

  34. Routley, R.: Against the inevitability of human chauvinism. In: Goodpater, K.E., Sayre, K.M. (eds.) Ethics and problems of the 21st century, pp. 36–59. University of Notre Dame Press (1979)

  35. Bostrom, N., Yudkowsky, E.: The ethics of artificial intelligence. In: Frankish, K., Ramsey, W. (eds.) The Cambridge handbook of artificial intelligence, pp. 316–334. Cambridge University Press, Cambridge (2014)

  36. Good, I.J.: Speculations concerning the first ultraintelligent machine. In: Alt, F.L., Rubinof, M. (eds.) Advances in computers 6. Academic Press, Cambridge, MA (1965)

  37. Vinge, V.: Technological Singularity. https://frc.ri.cmu.edu/~hpm/book98/com.ch1/vinge.singularity.html (1993). Accessed 26 Oct 2022

  38. Chalmers, D.: The singularity: a philosophical analysis. J. Conscious. Stud. 17(9–10), 7–65 (2010)

  39. Firt, E.: Motivational defeaters of self-modifying AGIs. J. Conscious. Stud. 24(5–6), 150–169 (2017)

  40. Carson, T.: The Golden Rule. International Encyclopedia of Ethics (2022). https://doi.org/10.1002/9781444367072.wbiee188.pub2

  41. Kant, I.: Groundwork of the metaphysic of morals (trans. Paton, H.J.). Harper, New York (1948)

  42. Wallach, W., Allen, C.: Moral machines: teaching robots right from wrong. Oxford University Press, Oxford (2008)

  43. Johnson, D.: Computer systems: moral entities but not moral agents. Ethics Inf. Technol. 8(4), 195–204 (2006)

  44. Floridi, L., Sanders, J.W.: On the morality of artificial agents. Mind. Mach. 14(3), 349–379 (2004)

  45. Walsh, E.: Moral emotions. In: Shackelford, T.K., Weekes-Shackelford, V.A. (eds.) Encyclopedia of evolutionary psychological science. Springer, Cham (2021). https://doi.org/10.1007/978-3-319-19650-3_650

Author information

Corresponding author

Correspondence to Erez Firt.

Ethics declarations

Conflict of interest

On behalf of all the authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The Author name was tagged incorrectly. The given name is “Erez” and the family name is “Firt”.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Firt, E.: Ought we align the values of artificial moral agents? AI Ethics 4, 273–282 (2024). https://doi.org/10.1007/s43681-023-00264-x
