From Supervised to Reinforcement Learning: an Inverse Optimization Approach