Building a generalisable ML pipeline at ING

Master thesis (2022)

Authors

N. Bauman Electrical Engineering, Mathematics and Computer Science

Contributors

Luis Cruz Software Engineering - (mentor)

A. van Deursen Software Technology (graduation committee member)

J. Yang Web Information Systems - (graduation committee member)

Faculty

Electrical Engineering, Mathematics and Computer Science

To reference this document use:

http://resolver.tudelft.nl/uuid:35c850eb-1d03-4185-a8c5-4469b2112327

Published Date

08-08-2022

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Advances in data science have caused an increase in the use
of Artificial Intelligence (AI), specifically Machine Learning (ML), throughout
various fields. ML has been receiving increasing interest not only in research
but in industry as well. Many companies rely on ML models to
increase the efficiency of existing processes or offer new services and
products. The industry, however, is facing several additional challenges
compared to the academic context. One of those challenges is applying the
Development and Operations (DevOps) model to ML applications, a practice known as
MLOps. This thesis sets out to find the specific challenges that practitioners
encounter while operationalising ML models. To do so, we perform a single-case
study on an ML pipeline built by the Trade & Communication Surveillance
team at the ING bank. This case study consists of conducting a set of interviews
and performing a manual code inspection of the pipeline. The team faces
challenges ranging from having insufficient time for operationalising each ML
project individually to operating in the highly regulated fintech context. Their
pipeline is able to deploy a single ML model, but it does not generalise well to
other projects. We present the first version of an application that mitigates
these challenges. The application is able to deploy ML models to the
development environment at ING and can be operated by data scientists to reduce
the effort of operationalising an ML model.

Files

MSc_Thesis_Niels_Bauman.pdf

(.pdf | 0.811 Mb)