Determine and explain confidence in predicting violations on inland ships in the Netherlands

Abstract

For real-world problems, even the most complex machine learning models achieve only limited accuracy. This makes it important to understand why a specific prediction is made. Explanations can support human decision-making by allowing experts to assess both the reasoning of the model and the correctness of its predictions. Specifically, in this thesis, we consider the problem of predicting violations on inland ships in the Netherlands to help inspectors in the Dutch government decide which ship to inspect. The main contribution is determining confidence in a prediction separately from its probability, and using this confidence estimate both to decide which predicted violations to select and to explain them.
With a limited number of inspectors and a large number of inland ships in the Netherlands, global performance across all ships is less relevant. Instead, identifying the highest-quality predictions is more useful. Therefore, a measure of model confidence is determined to improve upon the traditional ranking based on probability. In the evaluation of this approach, no significant difference is found between the confidence-based ranking and the ranking based on probability for complex ensemble models. However, for simpler, more interpretable models, re-ranking by model confidence yields a significant improvement. The determination of confidence is further used to create explanations from the context of confidence. The goal of these explanations is to help an inspector decide whether to inspect an inland ship. This novel explanation approach justifies the confidence in a prediction by presenting the features that contribute to that confidence. We perform a human-grounded user study to evaluate task effectiveness, perceived usefulness, and user trust compared to explanations from the traditional context of probability. The results of the user study suggest that explanations of confidence are particularly useful for problems with lower accuracy.
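
As a rough illustration of the selection problem described above, the following minimal Python sketch picks the top k ships under limited inspection capacity, once ranked by predicted probability and once by a separate confidence score, and compares the precision of the two selections. It is only a sketch under stated assumptions: the abstract does not specify how confidence is computed, so the confidence values are treated as given inputs, and the function names (`top_k`, `precision_at_k`) and all numbers are hypothetical toy data, not results from the thesis.

```python
from typing import Sequence


def top_k(scores: Sequence[float], k: int) -> list[int]:
    """Return the indices of the k highest scores, highest first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def precision_at_k(selected: Sequence[int], has_violation: Sequence[int]) -> float:
    """Fraction of the selected ships that actually have a violation."""
    if not selected:
        return 0.0
    return sum(has_violation[i] for i in selected) / len(selected)


# Hypothetical toy data for five ships (NOT taken from the thesis).
probability = [0.90, 0.85, 0.80, 0.60, 0.55]    # predicted probability of a violation
confidence = [0.40, 0.90, 0.30, 0.80, 0.70]     # assumed separate confidence score
has_violation = [0, 1, 0, 1, 1]                 # ground truth, 1 = violation found

k = 2  # assumed inspection capacity: only two ships can be inspected

by_probability = top_k(probability, k)
by_confidence = top_k(confidence, k)

print("ranked by probability:", by_probability,
      "precision:", precision_at_k(by_probability, has_violation))
print("ranked by confidence: ", by_confidence,
      "precision:", precision_at_k(by_confidence, has_violation))
```

Because only the k selected ships are inspected, the comparison uses precision within the selection rather than accuracy over all ships, which mirrors the point that global performance is less relevant here. The toy numbers are chosen only to make the two rankings differ and say nothing about which ranking performs better in practice.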