Generalized Optimistic Q-Learning with Provable Efficiency