Predicting streamflow with LSTM networks using global datasets

Abstract

Streamflow prediction remains a challenge for poorly gauged and ungauged catchments. Recent research has shown that deep learning methods based on Long Short-Term Memory (LSTM) cells outperform process-based hydrological models for rainfall-runoff modeling, opening new possibilities for prediction in ungauged basins (PUB). These studies usually rely on local datasets for model development, whereas prediction in ungauged basins at a global scale requires training on global datasets. In this study, we develop LSTM models for over 500 catchments from the CAMELS-US dataset using global ERA5 meteorological forcing and global catchment characteristics retrieved with the HydroMT tool. Comparison against an LSTM trained with local datasets shows that the locally trained model generally performs better, owing to its higher-spatial-resolution meteorological forcing (overall median daily Nash-Sutcliffe efficiency, NSE, of 0.71 vs. 0.54 with ERA5). However, training with ERA5 results in a higher NSE in most catchments of the Western and North-Western US (median daily NSE of 0.83 vs. 0.78 with local forcing). No significant change in performance occurs when local data sources are replaced with global ones for deriving the catchment characteristics. These results encourage further research to develop LSTM models for worldwide prediction of streamflow in ungauged basins using available global datasets. Promising directions include training the models with streamflow data from different regions of the world and with higher-quality meteorological forcing.
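
The performance figures above are medians of per-catchment daily NSE values. As a minimal sketch of how such a summary could be computed, the snippet below uses the standard Nash-Sutcliffe efficiency definition, NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2). The function names `nse` and `median_nse` and the dictionary layout are assumptions made for illustration only; they are not taken from the study's code.

```python
from statistics import median


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency for one catchment (1.0 is a perfect fit)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Sum of squared errors between observed and simulated flow
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Sum of squared deviations of the observations from their mean
    obs_variance = sum((o - mean_obs) ** 2 for o in observed)
    if obs_variance == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - squared_error / obs_variance


def median_nse(catchments):
    """Median NSE over a dict {catchment_id: (observed, simulated)} (hypothetical layout)."""
    return median(nse(obs, sim) for obs, sim in catchments.values())
```

An NSE of 0 means the model is no better than predicting the observed mean, which is why a median across catchments is a common way to compare two forcing datasets.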