A plug-in infrastructure for the CodeFeedr project

Bachelor Thesis (2018)
Author(s)

J.C. Kuijpers (TU Delft - Electrical Engineering, Mathematics and Computer Science)

J.J.R. Quist (TU Delft - Electrical Engineering, Mathematics and Computer Science)

W.D. Zorgdrager (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

T.E.P.M.F. Abeel – Mentor

Gousios Georgios – Graduation committee member

He Wang – Graduation committee member

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2018 Jos Kuijpers, Joris Quist, Wouter Zorgdrager
Publication Year
2018
Language
English
Graduation Date
02-07-2018
Awarding Institution
Delft University of Technology
Programme
Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

CodeFeedr is a research project at the software engineering division of the Delft University of Technology in collaboration with the Software Improvement Group. The research focuses on a software infrastructure that helps software practitioners make data-driven decisions. Frameworks such as Apache Flink already support high-performance data streaming. However, they require considerable setup, and adding new streaming queries is time-consuming. They are also limited in combining real-time data with historical data and in aggregating streams from multiple sources. The developed product is a plug-in framework on top of Apache Flink that provides a pipelining system for streaming queries. It includes abstractions for well-known sources such as GitHub, Travis CI, and Twitter, as well as support for historical data in MongoDB. With this framework, users can spend their effort on writing streaming queries instead of setting up environments, input sources, and output destinations. The product also includes orchestration tools for running streaming jobs on a distributed system.
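
To make the pipelining idea concrete, the following is a minimal sketch in plain Python, not the CodeFeedr or Apache Flink API. It assumes a pipeline is a chain of pluggable stages, each consuming and producing a stream of events. Every name in it (Pipeline, github_push_source, non_empty_pushes) and the event fields are hypothetical and chosen only for illustration.

```python
from typing import Callable, Iterable, Iterator, List

# Hypothetical names throughout: this is not the CodeFeedr or Apache Flink API.
# A stage takes a stream of events and returns a new stream of events.
Stage = Callable[[Iterable[dict]], Iterator[dict]]


def github_push_source(_: Iterable[dict]) -> Iterator[dict]:
    """Stand-in for an input plug-in; yields a fixed set of made-up events."""
    yield {"repo": "example/a", "commits": 3}
    yield {"repo": "example/b", "commits": 0}
    yield {"repo": "example/a", "commits": 2}


def non_empty_pushes(events: Iterable[dict]) -> Iterator[dict]:
    """The user-written query: keep only pushes that contain commits."""
    return (e for e in events if e["commits"] > 0)


class Pipeline:
    """Chains stages so each one consumes the previous stage's output."""

    def __init__(self) -> None:
        self.stages: List[Stage] = []

    def append(self, stage: Stage) -> "Pipeline":
        self.stages.append(stage)
        return self

    def run(self) -> List[dict]:
        stream: Iterable[dict] = ()  # the first stage ignores its input
        for stage in self.stages:
            stream = stage(stream)
        return list(stream)


result = Pipeline().append(github_push_source).append(non_empty_pushes).run()
```

In this sketch the query author writes only non_empty_pushes, while the source and the chaining are reusable parts, which mirrors the division of effort the abstract describes.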

Files

BachelorProjectReport.pdf
(pdf | 2.28 Mb)
License info not available