Data pipeline quality: development and validation of a quality assessment tool for data-driven algorithms and artificial intelligence in healthcare

Journal Article (2026)
Author(s)

Eris van Twist (Erasmus MC)

Brian van Winden (Erasmus MC)

Rogier de Jonge (Erasmus MC)

H. Rob Taal (Erasmus MC)

Matthijs de Hoog (Erasmus MC)

Alfred Schouten (TU Delft - Biomechanical Engineering)

David Tax (TU Delft - Pattern Recognition and Bioinformatics)

Jan Willem Kuiper (Erasmus MC)

Department
Biomechanical Engineering
DOI related publication
https://doi.org/10.1136/bmjhci-2025-101608
Publication Year
2026
Language
English
Issue number
1
Volume number
33
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

OBJECTIVES: To develop and validate a tool for standardised quality assessment of data-driven algorithms in healthcare, focusing on the underlying data pipeline.

METHODS: The Data Assessment Tool for Algorithm Critical Appraisal and Robust Evidence (DATA-CARE) was iteratively developed from the established Quality In Prognosis Studies framework, which was selected after reviewing 10 existing quality assessment tools for observational and artificial intelligence studies. DATA-CARE evaluates five quality domains of the data pipeline: study population, data, algorithm, outcome and report transparency. Each domain comprises three to five quality criteria. With a maximum total score of 75 points, study quality is categorised as low (<45), moderate (45-59) or high (≥60). DATA-CARE was validated during a systematic review of data-driven algorithms using continuous physiological monitoring data within the paediatric intensive care unit. Two independent reviewers assessed the quality of the included studies using DATA-CARE. Tool validity was evaluated using inter-rater agreement and the intraclass correlation coefficient (ICC).

RESULTS: DATA-CARE demonstrated robust inter-rater agreement (93.5%) with an ICC of 0.98 (95% CI 0.96 to 0.99). Of 3858 screened studies, 31 were reviewed in the use case; these described diverse algorithms. Studies were predominantly of low (32.3%) or moderate (41.9%) quality, with 25.8% of high quality.

DISCUSSION: The predominance of low-to-moderate quality studies reveals critical barriers to clinical implementation of data-driven algorithms, including low-quality data capture and processing, inadequate validation strategies and non-transparent reporting of findings.

CONCLUSIONS: DATA-CARE allows standardised and reliable critical appraisal of a wide variety of algorithms, addressing current gaps in standardised and reproducible algorithm development.
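The scoring thresholds given in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `categorise` and `category_shares` and the example scores are hypothetical, and the sketch assumes that 75 is the maximum total score and that the category boundaries fall at 45 and 60, as the abstract suggests.

```python
from collections import Counter

MAX_SCORE = 75  # total DATA-CARE score stated in the abstract (assumed to be the maximum)


def categorise(score):
    """Map a total score to the quality category described in the abstract."""
    if not 0 <= score <= MAX_SCORE:
        raise ValueError(f"score must be between 0 and {MAX_SCORE}, got {score}")
    if score < 45:
        return "low"        # low: <45
    if score < 60:
        return "moderate"   # moderate: 45-59
    return "high"           # high: >=60


def category_shares(scores):
    """Return the percentage of studies in each category, rounded to one decimal."""
    if not scores:
        raise ValueError("at least one score is required")
    counts = Counter(categorise(s) for s in scores)
    return {
        category: round(100 * counts[category] / len(scores), 1)
        for category in ("low", "moderate", "high")
    }


# Hypothetical total scores, for illustration only (not data from the study).
example_scores = [38, 44, 45, 59, 60, 72]
print([categorise(s) for s in example_scores])
print(category_shares(example_scores))
```

The boundary values 45 and 60 fall into the moderate and high categories respectively, matching the ranges quoted in the abstract.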