The State of Data Streaming Practices at ING

Master Thesis (2021)

Author(s)

K.P. Kanya Paramita Koesoemo (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

A. Katsifodimos – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

G. Siachamis – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

M. Fragkoulis – Coach (TU Delft - Electrical Engineering, Mathematics and Computer Science)

A. van Deursen – Coach (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Jerry Brons – Coach (ING Bank)

Faculty

Electrical Engineering, Mathematics and Computer Science

Big Data, Literature Review, Stream Processing, Streaming Analytics, Industry Practices, Survey Research

To reference this document use

https://resolver.tudelft.nl/uuid:273adbff-ed2c-407c-b50d-22bbf7311dc1

Publication Year

2021

Language

English

Graduation Date

26-11-2021

Awarding Institution

Delft University of Technology

Project

ING AI For Fintech

Programme

Computer Science, Data Science and Technology

Downloads counter

361

Collections

thesis

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Data stream processing has become a key theme in the database and distributed systems communities worldwide, as data volumes have grown across a range of industries over the last several years. Because data stream processing is a relatively new development in data-driven approaches, several teams at ING are investigating its possibilities. This thesis therefore aims to provide insight into data stream processing practices at ING using survey research methodology. We conducted an extensive study comprising a review of academic publications on data streaming, an online questionnaire distributed to 45 practitioners at ING, and in-depth interviews with 5 streaming practitioners. Our survey research aimed at understanding: (i) the use cases of data streaming; (ii) the types of streamed data users have; (iii) the streaming tasks and computations users run on their streams; (iv) the machine learning tasks users perform on their streams; and (v) the streaming software and tools used to process their streams. The results of the academic review formed the basis for designing the questionnaire. We discuss the participants' answers to the questionnaire by highlighting common trends and the challenges they faced. Through the interviews, we obtained more detailed answers to some of our questions. Our research yielded several observations regarding data stream processing in practice. In particular: real-time monitoring and event categorization are popular use cases for data streaming; data contained in streams represents a diverse range of entities and is homogeneous in format, type, and category; machine learning in streaming environments is prevalent; Apache Kafka is a commonly used stream processing engine; and the complexity of implementing data streaming is the challenge most often expressed by our participants.

Files

MScThesis_KanyaParamitaKoesoem... (pdf)

(pdf | 4.33 Mb)

License info not available