Abstract
The worldwide deregulation of power systems has brought bidding behavior simulation to prominence. A crucial element of such studies is an accurate definition of each participant's reward function (or objective function). Because of the information barriers between market participants and researchers, reward functions are commonly built on theoretical assumptions, which inevitably causes deviations from real-world behavior. However, market data, especially historical bidding data, have become increasingly transparent in recent years, making it feasible to use data-driven methods to identify the individual reward functions hidden in raw bidding data. This chapter therefore proposes a data-driven reward function identification framework with three procedures. First, the bidding decision process of each participant is formulated as a standard Markov decision process. Second, a deep inverse reinforcement learning method based on maximum entropy is introduced to identify individual reward functions, whose high-dimensional nonlinearity is captured by multilayer perceptrons (MLPs). Third, a deep Q-network method is customized to simulate individual bidding behaviors based on the obtained MLP-based reward functions. The effectiveness and feasibility of the proposed framework and methods are tested on real market data from the Australian electricity market.
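The second procedure can be illustrated with a minimal sketch of maximum-entropy deep inverse reinforcement learning. It is not the chapter's implementation: the five-state chain MDP, the single expert demonstration, the one-hot state features, the bias-free one-hidden-layer network, and all function names and hyperparameters below are assumptions chosen only to keep the example small. The update follows the standard maximum-entropy form, in which the gradient with respect to the state rewards is the expert's state visitation counts minus the visitation counts expected under the current soft policy, and that signal is backpropagated through the MLP.

```python
import numpy as np

def soft_policy(P, r, gamma=0.9, n_iter=60):
    """Maximum-entropy (soft) policy pi[a, s] for state rewards r; P has shape (A, S, S)."""
    v = np.zeros(P.shape[1])
    for _ in range(n_iter):
        q = r[None, :] + gamma * (P @ v)                    # soft Q-values, shape (A, S)
        q_max = q.max(axis=0)
        v = q_max + np.log(np.exp(q - q_max).sum(axis=0))   # stable log-sum-exp over actions
    q = r[None, :] + gamma * (P @ v)
    pi = np.exp(q - q.max(axis=0))
    return pi / pi.sum(axis=0)                              # normalize over actions

def expected_visitation(P, pi, p0, horizon):
    """Expected state visit counts over `horizon` steps when following pi from p0."""
    d, total = p0.copy(), p0.copy()
    for _ in range(horizon - 1):
        d = np.einsum("s,as,ast->t", d, pi, P)              # propagate the state distribution
        total += d
    return total

def train_reward_mlp(P, X, mu_expert, p0, horizon,
                     hidden=8, lr=0.05, n_epochs=300, seed=0):
    """Fit r(s) = tanh(x_s W1) . w2 by gradient ascent on the MaxEnt likelihood."""
    rng = np.random.default_rng(seed)
    W1 = 0.5 * rng.standard_normal((X.shape[1], hidden))
    w2 = 0.5 * rng.standard_normal(hidden)
    for _ in range(n_epochs):
        h = np.tanh(X @ W1)                                 # hidden activations, shape (S, H)
        r = h @ w2                                          # current reward per state
        pi = soft_policy(P, r)
        g = mu_expert - expected_visitation(P, pi, p0, horizon)   # dL/dr
        grad_w2 = h.T @ g                                   # backpropagate dL/dr through the MLP
        grad_W1 = X.T @ (np.outer(g, w2) * (1.0 - h ** 2))
        w2 += lr * grad_w2
        W1 += lr * grad_W1
    return np.tanh(X @ W1) @ w2

# Toy chain MDP (an assumption, not the bidding MDP): action 0 moves left, action 1 moves right.
n_states, horizon = 5, 8
P = np.zeros((2, n_states, n_states))
for s in range(n_states):
    P[0, s, max(s - 1, 0)] = 1.0
    P[1, s, min(s + 1, n_states - 1)] = 1.0
X = np.eye(n_states)                                        # one-hot state features
p0 = np.zeros(n_states)
p0[0] = 1.0                                                 # every episode starts in state 0

demo = [0, 1, 2, 3, 4, 4, 4, 4]                             # one expert trajectory heading right
mu_expert = np.bincount(demo, minlength=n_states).astype(float)

reward = train_reward_mlp(P, X, mu_expert, p0, horizon)
policy = soft_policy(P, reward)
print("recovered reward per state:", np.round(reward, 2))
```

In this toy setting the recovered reward should rank the state the expert heads for above the starting state. In the framework described above, a learned reward of this kind would then serve as the reward signal for the deep Q-network in the third procedure.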
Copyright information
© 2021 Science Press
About this chapter
Cite this chapter
Chen, Q., Guo, H., Zheng, K., Wang, Y. (2021). Reward Function Identification of GENCOs. In: Data Analytics in Power Markets. Springer, Singapore. https://doi.org/10.1007/978-981-16-4975-2_13
DOI: https://doi.org/10.1007/978-981-16-4975-2_13
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-4974-5
Online ISBN: 978-981-16-4975-2
eBook Packages: Energy, Energy (R0)