Interval Q-Learning: Balancing Deep and Wide Exploration