Stacked machine learning algorithms and bidirectional long short-term memory networks for multi-step ahead streamflow forecasting: A comparative study

Granata F.; Di Nunno F.; de Marinis G.
2022-01-01

Abstract

Predicting river flow rates is essential for both flood protection and optimal water resource management. The high uncertainty associated with basin characteristics, hydrological processes, and climatic factors affecting river flows makes streamflow prediction a very challenging problem. This difficulty, together with the increasingly wide availability of flow rate and rainfall data, frequently leads to a preference for data-driven models over physically based or conceptual forecasting models. This study presents an in-depth comparison of two daily streamflow prediction models: a novel, simpler model that stacks the Random Forest and Multilayer Perceptron algorithms, using the Elastic Net algorithm as meta-learner, and a more complex model based on bi-directional Long Short-Term Memory (LSTM) networks. Hyperparameters were selected by Bayesian optimization. The two models were compared on four case studies: the Bacchiglione River, the Raccoon River, the Wilson River, and the Trent River. The two models showed comparable forecasting capabilities. In the most favourable cases, both models achieved very high accuracy (R2 greater than 0.93, mean absolute percentage error of approximately 10%). In several cases the stacked model outperformed the bi-directional LSTM model in predicting peak flow rates, but it was less accurate in forecasting low flow rates. The stacked model also required significantly shorter computation times. The prediction accuracy of both models decreased as the forecast horizon increased. The length of the time series plays an essential role in developing models with satisfactory forecasting capabilities. In this study, the effectiveness of the forecasting models was not influenced by the river regime. However, a high variance in the input dataset and a large number of outliers in the time series can reduce the accuracy of prediction models.
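As an illustration of the two accuracy measures quoted in the abstract, the following minimal sketch computes R2 and the mean absolute percentage error for a pair of series. It assumes the common definitions (R2 as the coefficient of determination, 1 - SS_res / SS_tot, and MAPE as the mean of |observed - predicted| / |observed| in percent); the abstract does not state which definitions the study uses. The function names and the flow values are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def r2_score(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mape(observed, predicted):
    """Mean absolute percentage error in percent; observed values must be non-zero."""
    return 100.0 * fmean([abs((o - p) / o) for o, p in zip(observed, predicted)])


# Made-up daily flow rates in m^3/s, for illustration only (not data from the study).
observed = [10.0, 20.0, 40.0, 20.0, 10.0]
predicted = [11.0, 18.0, 44.0, 22.0, 9.0]

print(f"R2 = {r2_score(observed, predicted):.3f}")
print(f"MAPE = {mape(observed, predicted):.1f}%")
```

With these made-up values every prediction is off by 10% of the observed flow, so the MAPE is 10% and R2 is about 0.957, which is the order of accuracy the abstract reports for the most favourable cases.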
Files in this record:
File: HYDROL_128431.pdf (authorized users only)
Type: Pre-print document
License: Publisher's copyright
Size: 2.45 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11580/93422