Learning optimal gait parameters using the episodic Natural Actor-Critic method