Maximizing Information Gain in Partially Observable Environments via Prediction Rewards