Embodied Cognition and Multi-Agent Behavioral Emergence

  • Conference paper
Unifying Themes in Complex Systems IX (ICCS 2018)

Part of the book series: Springer Proceedings in Complexity (SPCOM)

Abstract

Autonomous systems embedded in our physical world need real-world interaction in order to function, but they also depend on it as a means to learn. This is the essence of artificial Embodied Cognition, in which machine intelligence is tightly coupled to sensors and effectors and where learning happens from continually experiencing the dynamic world as time-series data, received and processed from a situated and contextually relative perspective. From this stream, our engineered agents must perceptually discriminate, deal with noise and uncertainty, recognize the causal influence of their actions (sometimes with significant and variable temporal lag), pursue multiple and changing goals that are often incompatible with each other, and make decisions under time pressure. To further complicate matters, unpredictability caused by the actions of other adaptive agents makes this experiential data stochastic and statistically non-stationary. Reinforcement Learning approaches to these problems often oversimplify many of these aspects, e.g., by assuming stationarity, collapsing multiple goals into a single reward signal, using repetitive discrete training episodes, or removing real-time requirements. Because we are interested in developing dependable and trustworthy autonomy, we have been studying these problems by retaining all these inherent complexities and only simplifying the agent’s environmental bandwidth requirements. The Multi-Agent Research Basic Learning Environment (MARBLE) is a computational framework for studying the nuances of cooperative, competitive, and adversarial learning, where emergent behaviors can be better understood through carefully controlled experiments. In particular, we are using it to evaluate a novel reinforcement learning long-term memory data structure based on probabilistic suffix trees. Here, we describe this research methodology and report on the results of some early experiments.

Notes

  1. Posterior probabilities are computed from maximum entropy priors initialized by setting the alpha parameter in a multi-modal Dirichlet distribution.
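
As a rough illustration of the kind of update this note describes, the sketch below uses the standard conjugate Dirichlet-categorical rule with a single symmetric alpha. This is an assumption made for illustration, not the authors' implementation: the function name, the symbol set, and the choice alpha = 1.0 are all hypothetical. With no observations the prior is uniform (maximum entropy), and each observed symbol shifts the posterior mean toward the empirical frequencies.

```python
from collections import Counter


def dirichlet_posterior_mean(counts, symbols, alpha=1.0):
    """Posterior mean of categorical probabilities.

    Assumes a symmetric Dirichlet(alpha) prior over `symbols`
    (hypothetical choice; the paper does not specify these details).
    With empty counts this returns the uniform distribution.
    """
    total = sum(counts.get(s, 0) for s in symbols) + alpha * len(symbols)
    return {s: (counts.get(s, 0) + alpha) / total for s in symbols}


# Hypothetical action alphabet for a single agent.
symbols = ["left", "right", "stay"]

# No data yet: every symbol gets the same probability.
prior = dirichlet_posterior_mean(Counter(), symbols)

# After four observations, the estimate leans toward "left"
# but never assigns zero probability to an unseen symbol.
observed = Counter(["left", "left", "right", "left"])
posterior = dirichlet_posterior_mean(observed, symbols)
```

Smaller values of alpha let the observed counts dominate sooner, while larger values keep the estimate closer to uniform for longer.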

Acknowledgements and Disclaimer

The authors wish to thank Jason F. Kutarnia and Brittany A. Tracy for their assistance with this research. Approved for Public Release; Distribution Unlimited. Case Number 18-1473.

Author information

Corresponding author

Correspondence to Paul E. Silvey.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Silvey, P.E., Norman, M.D. (2018). Embodied Cognition and Multi-Agent Behavioral Emergence. In: Morales, A., Gershenson, C., Braha, D., Minai, A., Bar-Yam, Y. (eds) Unifying Themes in Complex Systems IX. ICCS 2018. Springer Proceedings in Complexity. Springer, Cham. https://doi.org/10.1007/978-3-319-96661-8_20
