Outlier detection in non-Gaussian distributions

Abstract

In this thesis we study outlier detection methods and propose a new one. Classical outlier detection typically assumes that the data come from a Gaussian (normal) distribution. When the underlying distribution of a random sample is heavy-tailed, and therefore not normal, the sample is likely to contain some extreme observations that the classical procedure would identify as outliers. This thesis addresses this issue by proposing a procedure to identify real ‘outliers’ in heavy-tailed data sets. We first examine some existing methods: we see how they work, simulate them, and identify their shortcomings in the case of a heavy-tailed distribution. We then study Extreme Value Theory (EVT), which we use to set up our proposed method of detecting outliers. Once the proposed method is constructed, we simulate it and compare it with the existing methods. The goal is that, in the case of normality, the new method performs no worse than the existing ones, or at least not much worse, and that in the case of a heavy-tailed distribution it performs better.
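
The issue described above can be illustrated with a small simulation. The sketch below is not the method proposed in the thesis. It assumes, for illustration only, that the classical procedure is Tukey's boxplot rule (flag points beyond 1.5 interquartile ranges from the quartiles) and that the heavy-tailed data follow a Pareto distribution with shape parameter 2. The function name, sample size and seed are likewise hypothetical choices. Neither sample is contaminated, so every flagged point is a genuine draw from the model.

```python
import random
import statistics


def tukey_flag_rate(sample):
    """Fraction of points outside the Tukey fences
    [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR]."""
    q1, _, q3 = statistics.quantiles(sample, n=4)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return sum(x < lower or x > upper for x in sample) / len(sample)


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 10_000              # assumed sample size

# Uncontaminated samples: no point is an outlier by construction.
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
pareto_sample = [rng.paretovariate(2.0) for _ in range(n)]  # heavy tail

normal_rate = tukey_flag_rate(normal_sample)
pareto_rate = tukey_flag_rate(pareto_sample)

print(f"flagged under normality:   {normal_rate:.3%}")
print(f"flagged under heavy tails: {pareto_rate:.3%}")
```

Under these assumptions the rule flags a far larger share of the Pareto sample than of the normal sample, even though both samples are clean. This is the kind of false alarm a procedure designed for heavy-tailed data would need to avoid.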
