Building a generalisable ML pipeline at ING

Master Thesis (2022)
Author(s)

N. Bauman (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Luis Cruz – Mentor (TU Delft - Software Engineering)

A. van Deursen – Graduation committee member (TU Delft - Software Technology)

Jie Yang – Graduation committee member (TU Delft - Web Information Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2022 Niels Bauman
More Info
Publication Year
2022
Language
English
Graduation Date
08-08-2022
Awarding Institution
Delft University of Technology
Programme
Computer Science | Software Technology
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Advances in data science have caused an increase in the use
of Artificial Intelligence (AI), specifically Machine Learning (ML), throughout
various fields. ML has been receiving increasing interest not only in research
but in industry as well. Many companies rely on ML models to
increase the efficiency of existing processes or offer new services and
products. The industry, however, is facing several additional challenges
compared to the academic context. One of those challenges is applying the
Development Operations (DevOps) model to an ML application, also referred to as
MLOps. This thesis sets out to find the specific challenges that practitioners
encounter while operationalising ML models. To do so, we perform a single-case
case study on an ML pipeline built by the Trade & Communication Surveillance
team at the ING bank. This case study consists of conducting a set of interviews
and performing a manual code inspection of the pipeline. The team faces
challenges ranging from having insufficient time for operationalising each ML
project individually to operating in the highly regulated fintech context. Their
pipeline is able to deploy a single ML model but it does not generalise well to
other projects. We present the first version of an application that mitigates
these challenges. The application is able to deploy ML models to the
development environment at ING and can be operated by data scientists to reduce
the effort of operationalising an ML model.



Files

MSc_Thesis_Niels_Bauman.pdf
(pdf | 0.811 Mb)
License info not available