Automated negotiation is a key form of interaction in systems composed of multiple autonomous agents with different preferences. Such interactions aim to reach agreements through an iterative exchange of offers. With the growth of Peer-to-Peer (P2P) energy markets, driven by the development and deployment of small-scale electricity generation and storage devices known as distributed energy resources (DERs), automated negotiation is seen as a technique that can improve the efficiency of energy distribution while taking the preferences of different entities into account. Opponent modeling is an essential ability of automated negotiation agents and can further improve negotiation outcomes. This project introduces a new opponent modeling technique that exploits two specific characteristics of P2P energy markets: a) two automated negotiation agents can negotiate with each other many times, and b) the preferences of the users of the agents are determined mainly by their energy consumption patterns, which usually do not fluctuate greatly over the year. The proposed opponent modeling method builds on the idea of modeling the policy of a Reinforcement Learning agent. It uses a neural network to approximate the bidding strategy of the opposing negotiation agent. The network is trained on observations of the offers exchanged in negotiations. With the trained network, the negotiation agent can predict the future actions of its opponent and make better decisions. We evaluated our opponent model with an existing automated negotiation system designed for off-grid energy trading. In the experiments, the introduced opponent model always performed better than a random-guess model when modeling basic bidding strategies. Its performance remained stable in dynamic environments where the opponent’s preferences and bidding strategy may change randomly.
The experiments also indicate that the introduced method has potential for further improvement when combined with advanced opponent modeling techniques that model the preference profile of the opponent. With our new opponent modeling method, automated negotiators taking part in P2P energy markets should be able to find better joint agreements that are preferred by both themselves and their opponents. In this setting, a better joint agreement means a more efficient distribution of energy.
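
To make the core idea concrete, the following is a minimal sketch of approximating an opponent's bidding strategy with a small neural network trained on observed offers, and comparing it against a random-guess baseline. It is not the system evaluated in this work. Everything specific in it is an assumption made for illustration: the hypothetical `opponent_offer` concession curve, the price range of 0.4 to 1.0, the use of normalized negotiation time as the only input, and the `TinyMLP` class with its hyperparameters.

```python
import math
import random


def opponent_offer(t, start=1.0, reserve=0.4):
    """Hypothetical time-dependent concession strategy (an assumption):
    the opponent starts at `start` and concedes toward `reserve` as t -> 1."""
    return start - (start - reserve) * t ** 2


class TinyMLP:
    """One-hidden-layer tanh network mapping normalized time t to an offer."""

    def __init__(self, hidden=8, seed=0):
        rng = random.Random(seed)
        self.w1 = [rng.uniform(-1, 1) for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
        self.b2 = 0.0

    def forward(self, t):
        h = [math.tanh(w * t + b) for w, b in zip(self.w1, self.b1)]
        y = sum(w * hi for w, hi in zip(self.w2, h)) + self.b2
        return y, h

    def predict(self, t):
        return self.forward(t)[0]

    def train_step(self, t, target, lr=0.05):
        """One stochastic gradient step on the loss 0.5 * (y - target)^2."""
        y, h = self.forward(t)
        err = y - target
        for i, hi in enumerate(h):
            # Backpropagate through tanh before the output weight is updated.
            grad_pre = err * self.w2[i] * (1.0 - hi * hi)
            self.w2[i] -= lr * err * hi
            self.w1[i] -= lr * grad_pre * t
            self.b1[i] -= lr * grad_pre
        self.b2 -= lr * err


def mse(predict, samples):
    return sum((predict(t) - y) ** 2 for t, y in samples) / len(samples)


rng = random.Random(1)
rounds = 20
# Offers observed from the opponent over one negotiation of 20 rounds.
observations = [(r / (rounds - 1), opponent_offer(r / (rounds - 1)))
                for r in range(rounds)]

model = TinyMLP()
for _ in range(2000):
    rng.shuffle(observations)
    for t, offer in observations:
        model.train_step(t, offer)

model_error = mse(model.predict, observations)
# Baseline: guess uniformly at random within the assumed price range.
random_error = mse(lambda t: rng.uniform(0.4, 1.0), observations)
print(f"model MSE: {model_error:.5f}, random-guess MSE: {random_error:.5f}")
```

In a repeated-negotiation setting, the observations would accumulate across sessions with the same opponent, which is what makes a learned model of this kind plausible in P2P energy markets. A fuller model would likely condition on the offer history as well as on time.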