Contrastive Learning of Visual Representations from Unlabeled Videos