Relating big data and data quality in financial service organizations

Conference Paper (2018)
Author(s)

Agung Wahyudi (TU Delft - Information and Communication Technology)

Adiska Farhani (Student TU Delft)

Marijn Janssen (TU Delft - Information and Communication Technology)

Research Group
Information and Communication Technology
Copyright
© 2018 A. Wahyudi, Adiska Farhani, M.F.W.H.A. Janssen
DOI related publication
https://doi.org/10.1007/978-3-030-02131-3_45
Publication Year
2018
Language
English
Bibliographical Note
Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Pages (from-to)
504-519
ISBN (print)
9783030021306
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Today’s financial service organizations face a data deluge. A number of V’s are often used to characterize big data, whereas traditional data quality is characterized by a number of dimensions. Our objective is to investigate the complex relationship between big data and data quality. We do this by comparing the big data characteristics with data quality dimensions. Data quality has been researched for decades, and its well-defined dimensions were adopted, whereas big data was characterized by eleven V’s. A literature review and ten cases in financial service organizations were investigated to analyze the relationship between data quality and big data. Whereas big data characteristics and data quality have been viewed as separate domains, our findings show that these domains are intertwined and closely related. Findings from this study suggest that variety is the most dominant big data characteristic, relating to most data quality dimensions, such as accuracy, objectivity, believability, understandability, interpretability, consistent representation, accessibility, ease of operations, relevance, completeness, timeliness, and value-added. Not surprisingly, the most dominant data quality dimension is value-added, which relates to variety, validity, visibility, and vast resources. The most mentioned pair of big data characteristic and data quality dimension is Velocity-Timeliness. Our findings suggest that the term ‘big data’ is misleading, because volume (‘big’) was mostly not an issue, and variety, validity and veracity were found to be more important.

Files

Wahyudi2018_Chapter_RelatingBi... (pdf)
(pdf | 1.37 Mb)
- Embargo expired in 12-04-2019
License info not available