Detection and Classification of Faults in Residential PV Systems with a Synthetic PV Training Database

A Machine Learning-Based Approach Using the PVMD Toolbox to Generate Synthetic PV Yield Data

Abstract

This thesis proposes a new photovoltaic (PV) fault detection and classification method. It combines a synthetically generated PV training database with a machine learning model to detect and classify faults in small-scale residential PV systems. The database was generated in MATLAB, and the machine learning models were built with the scikit-learn library for Python. Of the quantities produced by the modeled PV systems, only the power yield is used as an indicator, combined with system age and meteorological conditions. Using these features, four types of machine learning models are used to detect malfunctioning PV systems and to classify short-circuit and open-circuit faults. This thesis also shows the benefit of a synthetic PV training database over alternative methods: performance improves because the balance of the database can be controlled.

The result of this thesis is a method for constructing a fault detection and classification model that is specific to a single residential PV system. Malfunctioning systems can be detected with an accuracy of 80.4% using a random forest algorithm. For fault type classification, an F1-score of 0.759 was achieved, also using a random forest.
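
As an illustration of the workflow described above, the following is a minimal sketch of training a random forest with scikit-learn on a class-balanced database and scoring it for detection and fault type classification. It is not the thesis implementation: the toy data generator, the feature order, the 3 kW system size, and the power loss factors per fault type are all assumptions made for this example, and the generator merely stands in for the database produced with the PVMD Toolbox.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Hypothetical class labels: 0 = healthy, 1 = short-circuit, 2 = open-circuit.
# The loss factors below are invented for illustration only.
LOSS_FACTOR = {0: 1.00, 1: 0.80, 2: 0.50}


def make_toy_database(n_per_class=300, seed=0):
    """Build a balanced toy database (a stand-in for the synthetic PV database)."""
    rng = np.random.default_rng(seed)
    rows, labels = [], []
    for label, factor in LOSS_FACTOR.items():
        irradiance = rng.uniform(100.0, 1000.0, n_per_class)  # W/m^2
        temperature = rng.uniform(0.0, 35.0, n_per_class)     # deg C (not used in the toy power model)
        age = rng.uniform(0.0, 25.0, n_per_class)             # years
        # Assumed 3 kW system with 0.5 % yearly degradation and 5 % noise.
        expected = 3.0 * irradiance / 1000.0 * (1.0 - 0.005 * age)
        power = expected * factor * rng.normal(1.0, 0.05, n_per_class)
        rows.append(np.column_stack([power, age, irradiance, temperature]))
        labels.append(np.full(n_per_class, label))
    return np.vstack(rows), np.concatenate(labels)


X, y = make_toy_database()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Detection: healthy versus any fault, scored with accuracy.
detection_accuracy = accuracy_score(y_test != 0, y_pred != 0)
# Classification: macro-averaged F1 over the two fault classes.
fault_f1 = f1_score(y_test, y_pred, labels=[1, 2], average="macro")
```

Because the generator draws the same number of samples for every class, the database balance is fixed by construction, which is the kind of control a synthetic database offers. Scores obtained on this toy data say nothing about the figures reported in the thesis.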