Detecting Fog using Machine Learning and Investigating the Possibilities of Generating Synthetic Data

Blom, Joris

Title

Detecting Fog using Machine Learning and Investigating the Possibilities of Generating Synthetic Data

Author

Blom, Joris (TU Delft Electrical Engineering, Mathematics and Computer Science)

Contributor

Fokkink, R.J. (mentor)
van Gijzen, M.B. (mentor)

Degree granting institution

Delft University of Technology

Programme

Applied Mathematics

Date

2023-05-17

Abstract

Fog is a major factor in chain collisions, so reliable fog detection is essential for the Dutch road authority to anticipate foggy weather conditions. Dozens of stations in the Netherlands can measure fog. However, fog can be a very local phenomenon, so more local measurements are needed. There are about 5,000 traffic cameras in the Netherlands, which could provide such measurements. Several studies have addressed fog detection on traffic camera images, and the most successful ones used machine learning classification models. The biggest challenges these studies face are the extreme imbalance, limited diversity, and limited accuracy of the dataset. Obtaining adequate precision is particularly difficult, since the extreme imbalance of the dataset significantly impacts precision. The main objective of this research is to improve the dataset and to compare a range of machine learning configurations. A second objective is to examine the possibilities of generating synthetic data.

This thesis applies a (re)labeling method that significantly improves the dataset's quality. However, the dataset still has its limitations: a large portion of the false positives are caused by labeling errors. A comparison of several machine learning models shows that a 9-layer ResNet performs best, and adding more layers does not result in better performance. Unexpectedly, initializing ResNet with pre-trained weights decreases performance. In addition, the effect of oversampling and of using a weighted binary cross-entropy loss is investigated. Oversampling alone leads to overfitting, and a weighted binary cross-entropy loss alone is not ideal either. The best performance is achieved by combining the weighted binary cross-entropy loss with oversampling. Decision threshold optimization substantially improved the results. Together, the experiments made it possible to select the best configuration, which substantially increased performance. The best-performing configuration achieved a Matthews correlation coefficient that indicates a strong correlation.
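
To make the loss and threshold terminology above concrete, the following is a minimal sketch in plain Python. It is not the thesis implementation: the function names (`weighted_bce`, `mcc`, `best_threshold`), the `pos_weight` parameter, and the candidate thresholds are illustrative assumptions. It shows a binary cross-entropy loss in which the rare fog class is up-weighted, and a decision threshold chosen by maximizing the Matthews correlation coefficient on held-out predictions.

```python
import math


def weighted_bce(y_true, y_prob, pos_weight, eps=1e-12):
    """Mean binary cross-entropy with the positive (fog) class up-weighted.

    pos_weight is a hypothetical parameter name; pos_weight > 1 penalizes
    missed fog images more heavily than false alarms.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def best_threshold(y_true, y_prob, candidates):
    """Return the candidate threshold with the highest MCC."""
    def score(threshold):
        y_pred = [1 if p >= threshold else 0 for p in y_prob]
        return mcc(y_true, y_pred)
    return max(candidates, key=score)


if __name__ == "__main__":
    # Toy validation set (made-up numbers, for illustration only).
    labels = [0, 0, 0, 1, 1]
    probs = [0.1, 0.2, 0.6, 0.7, 0.9]
    print(weighted_bce(labels, probs, pos_weight=3.0))
    print(best_threshold(labels, probs, [0.3, 0.5, 0.65, 0.8]))
```

In such a setup the threshold would be tuned on a validation set that keeps the original class imbalance, since oversampled data would distort the precision the threshold is meant to optimize.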

Finally, the possibilities of generating synthetic data are investigated. ADASYN and SMOTE seem attractive at first sight, but a recent study indicates that they do not work better than random oversampling. One of the most promising ideas for generating synthetic data is to add fog to clear images. In this thesis, a conceptual algorithm is designed to add artificial fog to clear images. Most generated images look convincing, but there is much room for improvement.
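
As an illustration of what adding fog to a clear image can look like, the sketch below uses the widely used atmospheric scattering model, I = J·t + A·(1 − t) with transmission t = exp(−β·d). This is an assumption chosen for illustration and is not necessarily the conceptual algorithm designed in the thesis. The function name `add_fog`, the per-pixel depth map, and the default values for `beta` and `airlight` are hypothetical.

```python
import math


def add_fog(image, depth, beta=0.05, airlight=0.9):
    """Blend a clear grayscale image toward a uniform fog color.

    image:    rows of pixel intensities in [0, 1] (the clear scene J)
    depth:    rows of distances to the camera in meters (assumed known)
    beta:     scattering coefficient; larger means denser fog
    airlight: intensity of the fog itself (A), in [0, 1]
    """
    foggy = []
    for image_row, depth_row in zip(image, depth):
        row = []
        for j, d in zip(image_row, depth_row):
            t = math.exp(-beta * d)  # fraction of scene light that survives
            row.append(j * t + airlight * (1.0 - t))
        foggy.append(row)
    return foggy


if __name__ == "__main__":
    clear = [[0.2, 0.2], [0.8, 0.8]]
    depth = [[0.0, 50.0], [0.0, 50.0]]  # left column near, right column far
    for row in add_fog(clear, depth):
        print([round(v, 3) for v in row])
```

Nearby pixels stay almost unchanged while distant pixels converge to the airlight value, which is the loss of contrast with distance that characterizes fog. A practical version would need a depth estimate for each camera view, which this sketch simply takes as given.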

Subject

Machine learning
Deep Learning
Neural Network

To reference this document use:

http://resolver.tudelft.nl/uuid:053a52e4-065a-4d60-b600-8d5a721ad9c2

Part of collection

Student theses

Document type

master thesis

Files

PDF

Master_Thesis.pdf

5.92 MB
