Comparison of Machine Learning Models for Hazardous Gas Dispersion Prediction in Field Cases

Journal Article (2018)
Author(s)

Rongxiao Wang (National University of Defense Technology)

B. Chen (National University of Defense Technology)

S. Qiu (National University of Defense Technology, TU Delft - Web Information Systems)

Zhengqiu Zhu (National University of Defense Technology)

Yiduo Wang (National University of Defense Technology)

Yiping Wang (Naval 902 Factory)

Xiaogang Qiu (National University of Defense Technology)

Research Group
Web Information Systems
Copyright
© 2018 Rongxiao Wang, B. Chen, S. Qiu, Zhengqiu Zhu, Yiduo Wang, Yiping Wang, Xiaogang Qiu
DOI related publication
https://doi.org/10.3390/ijerph15071450
Publication Year
2018
Language
English
Issue number
7
Volume number
15
Pages (from-to)
1-19
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Dispersion prediction plays a significant role in the management of, and emergency response to, hazardous gas emissions and accidental leaks. Compared with conventional atmospheric dispersion models, machine learning (ML) models offer both high accuracy and high efficiency in prediction, especially in field cases. However, the choice of model type and of model inputs remains an open problem. To address this issue, this paper proposes two ML models, the back propagation (BP) network and support vector regression (SVR), each with two input selections: the original monitoring parameters and the integrated Gaussian parameters. To compare their performance in field cases, the models are evaluated on the Prairie Grass and Indianapolis field data sets. The influence of training set size on model performance is analyzed as well. Results demonstrate that the integrated Gaussian parameters improve prediction accuracy in the Prairie Grass case. However, they make little difference in the Indianapolis case because they adapt poorly to its complex terrain conditions. In addition, the SVR shows better generalization ability with relatively small training sets but tends to under-fit the training data. In contrast, the BP network has a stronger fitting ability but sometimes over-fits. The model and input selection presented in this paper should therefore help environmental and public health protection in real applications.
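
The abstract does not define the "integrated Gaussian parameters". As a rough illustration only, the sketch below assumes they are derived from the standard steady-state Gaussian plume equation with ground reflection, which is the conventional dispersion model such features would most plausibly come from. The function name, argument names, and units are hypothetical and are not taken from the paper; the dispersion coefficients sigma_y and sigma_z are passed in directly rather than computed from a stability class.

```python
import math


def gaussian_plume(q, u, y, z, h, sigma_y, sigma_z):
    """Concentration from the standard Gaussian plume equation.

    Hypothetical helper for illustration; not the paper's code.
    q        emission rate (e.g. g/s)
    u        mean wind speed at release height (m/s)
    y, z     crosswind and vertical receptor coordinates (m)
    h        effective release height (m)
    sigma_y  lateral dispersion coefficient at the receptor's downwind distance (m)
    sigma_z  vertical dispersion coefficient at the same distance (m)
    """
    if u <= 0 or sigma_y <= 0 or sigma_z <= 0:
        raise ValueError("u, sigma_y and sigma_z must be positive")

    # Crosswind spread of the plume around its centerline.
    lateral = math.exp(-(y ** 2) / (2 * sigma_y ** 2))

    # Vertical spread: the direct plume plus its image reflected at the ground.
    vertical = (
        math.exp(-((z - h) ** 2) / (2 * sigma_z ** 2))
        + math.exp(-((z + h) ** 2) / (2 * sigma_z ** 2))
    )

    return q / (2 * math.pi * u * sigma_y * sigma_z) * lateral * vertical


# Illustrative values only, not field data.
centerline = gaussian_plume(q=1.0, u=2.0, y=0.0, z=1.5, h=0.5, sigma_y=8.0, sigma_z=4.0)
off_axis = gaussian_plume(q=1.0, u=2.0, y=10.0, z=1.5, h=0.5, sigma_y=8.0, sigma_z=4.0)
```

Under this assumption, a value such as `centerline` (or its exponential terms) could be supplied to the BP network or SVR as a physics-informed input alongside, or instead of, the raw monitoring parameters. Because the equation presumes flat, uniform terrain, such a feature would carry less information over complex terrain, which is consistent with the limited benefit the abstract reports for the Indianapolis case.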