Efficient exploration with Double Uncertain Value Networks