The massive growth in terminal users has created an explosive demand for data residing at the edge of the network. Multiple Mobile Edge Computing (MEC) servers are deployed in or near base stations to meet this demand; however, optimally allocating these servers to multiple users in real time remains an open problem. Reinforcement Learning (RL), as a framework for solving interaction problems, is a promising solution. To apply an RL-based algorithm in a multi-agent environment, we propose an iterative scheme: users are selected by priority to interact with the environment one at a time. Building on this scheme, we further optimize overall system performance. To this end, we construct three objective system performance indicators: average processing cost, delay, and energy consumption. We improve the existing Deep Q-learning Network (DQN) by using the processing cost as the reward function and by replacing the fixed exploitation rate with a dynamic one associated with the reward and the episode time. To explore the performance potential of the proposed algorithm, we simulate it alongside the standard DQN algorithm and a greedy algorithm under varying numbers of users and data sizes. The results show that the proposed algorithm reduces the average system processing cost by at least 12% compared to the greedy algorithm, and it also significantly outperforms both the greedy and DQN algorithms in delay and energy consumption.
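The abstract does not specify the exact form of the dynamic exploitation rate, so the sketch below is only a plausible illustration, assuming an epsilon-greedy DQN in which the exploration probability decays linearly with episode time and shrinks further as the recent reward (e.g., the negative processing cost) improves. The function name `dynamic_epsilon` and all of its parameters are hypothetical, not taken from the paper.

```python
import numpy as np

def dynamic_epsilon(episode, total_episodes, recent_reward,
                    eps_min=0.05, eps_max=1.0, reward_scale=1.0):
    """Hypothetical schedule: exploration decays with episode time and
    is reduced further as the recent reward improves."""
    # Linear decay over training time: eps_max -> eps_min.
    time_term = eps_max - (eps_max - eps_min) * episode / total_episodes
    # Reward term in (0, 1): higher reward -> smaller factor -> less exploration.
    reward_term = 1.0 / (1.0 + np.exp(recent_reward / reward_scale))
    # Scale so a neutral reward (reward_term = 0.5) leaves the time decay unchanged.
    return float(np.clip(2.0 * time_term * reward_term, eps_min, eps_max))

# Example epsilon-greedy action selection with the dynamic rate.
rng = np.random.default_rng(0)
q_values = np.array([1.2, 0.4, 2.0])      # toy Q-values for 3 offloading actions
eps = dynamic_epsilon(episode=100, total_episodes=500, recent_reward=-3.2)
if rng.random() < eps:
    action = int(rng.integers(len(q_values)))  # explore
else:
    action = int(np.argmax(q_values))          # exploit
```

Under this sketch, a poor (strongly negative) reward pushes epsilon back toward its maximum, encouraging more exploration, while good rewards combined with later episodes drive it toward `eps_min`, consistent with the abstract's idea of tying the rate to both reward and episode time.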