In this report we discuss the uniform deconvolution problem. This is a statistical problem in which the observations we would like to make are distorted by independent additive noise sampled from a standard uniform distribution. The sampling density is therefore the convolution of the density of interest with the standard uniform density. We, however, want to know the distribution of the data without the uniform noise; in other words, we want to infer the deconvolution of the distribution of the observables by the uniform density, based only on the noisy observations. We first define this problem mathematically and derive a relation between the distribution of the observed data and the distribution we would like to know. Having constructed this relation, we aim to estimate the unknown distribution based on the noisy observations.

This problem turns out to be related to the current status problem, in which patients are tested for a disease in order to learn the time of onset. The observations, however, only tell us whether a patient is infected at the time of the test, not when the disease set in. We would like to know the times of onset, since these can be used to estimate the distribution of the time of onset. To estimate this distribution, we will exploit the connection to the uniform deconvolution problem.

Once the relation between the distribution of the observations and the unknown distribution of the variables of interest has been established, we want to estimate the unknown distribution function using only finitely many noisy observations. Classical estimators such as the maximum likelihood estimator (MLE) or the method of moments (MoM) cannot be used here, since they require the family of possible distributions to be smoothly parameterized by a Euclidean parameter. In our case the family of possible distributions is much larger and does not satisfy this assumption, so we have to resort to nonparametric estimation methods.
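To make the setup concrete, the observation scheme can be sketched in a few lines of Python. The choice of an exponential distribution for the unobserved variable is purely illustrative; any distribution could play this role.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Illustrative assumption: the unobserved variable of interest X is
# exponentially distributed. The method does not depend on this choice.
x = rng.exponential(scale=1.0, size=n)

# Independent additive noise from the standard uniform distribution on [0, 1].
u = rng.uniform(0.0, 1.0, size=n)

# We only observe the noisy sums Y = X + U. The density of Y is the
# convolution of the density of X with the standard uniform density,
# and the goal is to recover the distribution of X from y alone.
y = x + u
```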
We present two such nonparametric methods: the nonparametric maximum likelihood estimator (NPMLE) and the kernel density estimator. For the NPMLE, we first construct the (log-)likelihood function of the problem and then maximize it over all distribution functions in the allowed family; the maximizing function is by definition the NPMLE. There is no general method for obtaining this maximizer, so we have to exploit specific properties of our problem: we use isotonic regression to compute the maximum likelihood estimator of the unknown distribution function.

The other nonparametric method is kernel density estimation, which constructs a density estimate more directly than the NPMLE. The intuition behind this estimator is simple: where many observations cluster, the estimated density should be high, and where observations are sparse, it should be low.

Finally, we simulate the uniform deconvolution problem and solve it with both methods, comparing their respective strengths and limitations. The kernel density estimator turns out to give a smoother and more accurate estimate, but it depends on a tuning parameter whose optimal value cannot be computed exactly in practice and differs from case to case. The nonparametric maximum likelihood estimator depends on no such parameter and is computed in the same way in every case, but it can only produce step functions.
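The isotonic regression step behind the NPMLE can be illustrated with the pool adjacent violators algorithm (PAVA). The sketch below is the generic least-squares version of PAVA, not the exact characterization of the NPMLE derived later in the report: it fits the closest nondecreasing sequence to a given input by merging adjacent blocks that violate monotonicity.

```python
def pava(y):
    """Pool Adjacent Violators Algorithm: least-squares fit of a
    nondecreasing sequence to the input values y."""
    # Each block stores its (weighted) mean value and its weight (count).
    values, weights = [], []
    for v in y:
        values.append(float(v))
        weights.append(1.0)
        # Merge the last two blocks while they violate monotonicity.
        while len(values) > 1 and values[-2] > values[-1]:
            w = weights[-2] + weights[-1]
            m = (values[-2] * weights[-2] + values[-1] * weights[-1]) / w
            values[-2:] = [m]
            weights[-2:] = [w]
    # Expand the block means back to the original length.
    fit = []
    for m, w in zip(values, weights):
        fit.extend([m] * int(w))
    return fit
```

For example, `pava([1, 3, 2, 4])` pools the violating pair 3, 2 into their average 2.5, yielding the nondecreasing sequence 1, 2.5, 2.5, 4.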
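The intuition behind the kernel density estimator can likewise be sketched directly. The following minimal implementation uses a Gaussian kernel; the bandwidth `h` is the tuning parameter whose optimal value, as noted above, cannot be computed exactly in practice.

```python
import numpy as np

def kde(data, grid, h):
    """Kernel density estimate with a Gaussian kernel.

    Averages a scaled kernel centered at each observation, so grid
    points near many observations receive a high estimated density.
    The bandwidth h controls the amount of smoothing.
    """
    data = np.asarray(data)[:, None]   # shape (n, 1)
    grid = np.asarray(grid)[None, :]   # shape (1, m)
    kernels = np.exp(-0.5 * ((grid - data) / h) ** 2) / (h * np.sqrt(2 * np.pi))
    return kernels.mean(axis=0)        # average over the n observations

rng = np.random.default_rng(1)
sample = rng.normal(size=500)                 # illustrative data
grid = np.linspace(-4.0, 4.0, 81)
estimate = kde(sample, grid, h=0.4)
```

Because each kernel integrates to one, the estimate itself is (up to truncation of the grid) a proper probability density.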