Neural networks are typically initialized such that the theoretical variance of the hidden pre-activations remains constant, in order to avoid the vanishing and exploding gradient problem. This condition is necessary for training very deep networks, but numerous analyses show that it is not sufficient. We explain this behavior by analyzing the empirical variance, which is the more meaningful quantity in the practical setting of data sets of finite size. We show that the discrepancy between the empirical and the theoretical variance grows with depth. Studying the output distribution of neural networks at initialization, we find that its kurtosis grows to infinity with increasing depth, even when the theoretical variance stays constant. As a result, the empirical variance vanishes: it converges in probability to zero as the depth grows. Our analysis focuses on fully connected ReLU networks with He initialization, but we hypothesize that many other random weight initialization schemes suffer from vanishing or exploding empirical variance. We support this hypothesis experimentally and demonstrate the failure of state-of-the-art random initialization methods in the very deep regime.
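The phenomenon can be observed directly in simulation. The following minimal sketch (not the paper's experimental protocol; width, depth, batch size, and the pooled variance estimate are illustrative choices) propagates a finite batch through a deep fully connected ReLU network with He initialization and records the empirical pre-activation variance per layer. While the theoretical variance is preserved exactly, the measured variance drifts away from it and tends to collapse at large depth.

```python
import numpy as np

rng = np.random.default_rng(0)

width, depth, batch = 512, 200, 256  # illustrative settings, not from the paper

# Finite batch of standard-normal inputs (the empirical, finite-data setting).
x = rng.standard_normal((batch, width))

for layer in range(1, depth + 1):
    # He initialization: W_ij ~ N(0, 2 / fan_in), which keeps the theoretical
    # pre-activation variance constant across ReLU layers.
    W = rng.standard_normal((width, width)) * np.sqrt(2.0 / width)
    z = x @ W                   # pre-activations
    x = np.maximum(z, 0.0)      # ReLU

    if layer % 50 == 0:
        # Empirical variance over the finite batch (all units pooled);
        # it fluctuates and typically decays with depth, unlike the
        # constant theoretical value.
        print(f"layer {layer:4d}: empirical pre-activation variance = {z.var():.4f}")
```

Rerunning with different seeds shows the empirical variance varying over orders of magnitude between runs, consistent with the heavy-tailed (high-kurtosis) output distribution described above.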