Optimal control via reinforcement learning with symbolic policy approximation