J.H. Krijthe | TU Delft Repository

Hip Morphology–Based Osteoarthritis Risk Prediction Models

Development and External Validation Using Individual Participant Data From the World COACH Consortium

Journal article (2026) - Myrthe A. van den Berg, Fleur Boel, Michiel M.A. van Buuren, Noortje S. Riedstra, Jinchi Tang, Harbeer Ahedi, Nigel K. Arden, J.H. Krijthe, Rintje Agricola, and more authors

Objective
This study aims to develop hip morphology-based radiographic hip osteoarthritis (RHOA) risk prediction models and to investigate the added predictive value of hip morphology measurements and the models' generalizability to different populations.

Methods
We combined data from nine prospective cohort studies participating in the Worldwide Collaboration on OsteoArthritis prediCtion for the Hip (World COACH) consortium. RHOA grades were harmonized, and incident RHOA was defined as hips without definite RHOA at baseline that developed definite RHOA within four to eight years. Baseline hip morphology was quantified with automatically and uniformly determined lateral center edge angle and alpha angle measurements on anteroposterior radiographs. Discriminative performance of generalized linear mixed model (GLMM) definitions with and without hip morphology measurements was determined with stratified cross-validation. With leave-one-cohort-out cross-validation, the generalizability to unseen populations of hip morphology–based GLMMs and random forest (RF) models was evaluated.

Results
Of the 35,984 included hips without definite RHOA at baseline, 4.7% developed incident RHOA within four to eight years. The GLMM with cohort-specific intercept, considering baseline demographics, RHOA grade, and hip morphology measurements, showed a mean area under the receiver operating characteristic curve (AUC) of 0.80 (±0.01) in stratified cross-validation. Using a marginal intercept decreased performance by 0.1 in AUC. Similar results were found for a GLMM without hip morphology measurements. Leave-one-cohort-out cross-validation showed comparable discrimination (AUC between 0.56 and 0.88) and calibration performance for hip morphology-based GLMMs and RF models.

Conclusion
In hips free of definite RHOA, the incident RHOA models showed good discriminative performance (AUC) in similar populations. However, the added predictive value of the morphology measurements was small, and model performance was heterogeneous in leave-one-cohort-out cross-validation.
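
The leave-one-cohort-out evaluation mentioned in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy data, the `fit_rate_table` stand-in model, and all function names are hypothetical, and a GLMM or random forest would take the place of the lookup table in practice. The sketch only shows the mechanics of holding out one whole cohort at a time and computing an AUC on it.

```python
from collections import defaultdict


def auc(scores, labels):
    """AUC as the probability that a random case outranks a random
    non-case (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def leave_one_cohort_out(cohorts):
    """Yield (held_out_cohort, train_indices, test_indices) per cohort."""
    for held_out in sorted(set(cohorts)):
        train = [i for i, c in enumerate(cohorts) if c != held_out]
        test = [i for i, c in enumerate(cohorts) if c == held_out]
        yield held_out, train, test


def fit_rate_table(features, outcomes):
    """Toy stand-in for a model: observed event rate per feature value."""
    counts = defaultdict(lambda: [0, 0])  # value -> [events, total]
    for x, y in zip(features, outcomes):
        counts[x][0] += y
        counts[x][1] += 1
    overall = sum(outcomes) / len(outcomes)
    table = {x: e / n for x, (e, n) in counts.items()}
    # Fall back to the overall rate for feature values unseen in training.
    return lambda x: table.get(x, overall)


# Hypothetical toy data: (cohort, baseline grade, incident outcome).
rows = [
    ("A", 0, 0), ("A", 0, 0), ("A", 1, 1), ("A", 1, 0),
    ("B", 0, 0), ("B", 1, 1), ("B", 1, 1), ("B", 0, 1),
    ("C", 0, 0), ("C", 0, 0), ("C", 1, 0), ("C", 1, 1),
]
cohorts = [r[0] for r in rows]
grades = [r[1] for r in rows]
outcomes = [r[2] for r in rows]

auc_by_cohort = {}
for held_out, train, test in leave_one_cohort_out(cohorts):
    # Fit on all other cohorts, evaluate on the cohort never seen in training.
    predict = fit_rate_table([grades[i] for i in train],
                             [outcomes[i] for i in train])
    scores = [predict(grades[i]) for i in test]
    auc_by_cohort[held_out] = auc(scores, [outcomes[i] for i in test])

for cohort, value in auc_by_cohort.items():
    print(f"held-out cohort {cohort}: AUC = {value:.2f}")
```

Reporting one AUC per held-out cohort, rather than a single pooled value, is what makes heterogeneity across populations visible.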

Prediction of Platelet Transfusion Outcomes in Preterm Newborns With Severe Thrombocytopenia

Journal article (2026) - Jim M. Smit, David M. Kent, Jesse H. Krijthe

Tacrolimus Exposure is Associated with Acute Rejection in the Early Phase After Kidney Transplantation

A Joint Modeling Approach

Journal article (2025) - Maaike R. Schagen, Alvaro Assis de Souza, Karin Boer, Jesse H. Krijthe, Rachida Bouamar, Andrew P. Stubbs, Dennis A. Hesselink, Brenda C.M. de Winter

Background: Reports regarding the relationship between tacrolimus exposure and the risk of acute kidney allograft rejection are conflicting. This may be explained by the previous use of methodological approaches that disregarded important factors in the analysis of longitudinal ...

The Risks of Risk Assessment

Causal Blind Spots When Using Prediction Models for Treatment Decisions

Journal article (2025) - Nan van Geloven, Ruth H. Keogh, Wouter van Amsterdam, Giovanni Cinà, Jesse H. Krijthe, Niels Peek, Kim Luijken, Sara Magliacane, Paweł Morzywołek, and more authors

Clinicians increasingly rely on prediction models to guide treatment choices. Most prediction models, however, are developed using observational data that include some patients who have already received the treatment the prediction model is meant to inform. Special attention to t ...

C-reactive protein-guided treatment in pneumonia

Charting a personalised approach – Authors’ reply

Journal article (2025) - Jim M. Smit, Jesse H. Krijthe, Gianfranco U. Meduri, Pierre François Dequin, Harin Karunajeewa, Antoni Torres, Marcel J.T. Reinders, Henrik Endeman, Philip A. Van Der Zee

We appreciate the opportunity to further clarify our findings in response to the insightful comments from Shota Yamamoto and colleagues and Luis Felipe Reyes and Ignacio Martin-Loeches regarding our recent community-acquired pneumonia (CAP) study. [...]

Switching from controlled to assisted mechanical ventilation

A multi-center retrospective study (SWITCH)

Journal article (2025) - Jim M. Smit, Jasper Van Bommel, Diederik A.M.P.J. Gommers, Marcel J.T. Reinders, Michel E. Van Genderen, Jesse H. Krijthe, Annemijn H. Jonkman

Background
Switching from controlled to assisted ventilation is crucial in the trajectory of intensive care unit (ICU) stay, but no guidelines exist. We described current practices, analyzed patient characteristics associated with switch success or failure, and explored the feasibility of predicting switch failure.

Methods
In this retrospective study, we obtained highly granular longitudinal ICU data sets from three medical centers, covering demographics, severity scores, vital signs, ventilation, and laboratory parameters. The primary endpoint was switch success, considering a switch attempt to be successful if a patient did not return to controlled ventilation for the next 72 h while alive, and to be failed otherwise. We compared the characteristics of patients with successful vs. failed first switch attempts at ICU admission, immediately before, and 3 h after the attempt. We trained LASSO logistic regression models to predict switch failure.

Results
In 4524/6715 (67%) patients attempting a switch, the first attempt failed. The first switch attempt, regardless of success or failure, was generally made at normalized PaCO2 and pH levels, with PEEP < 10 cmH2O and PaO2/FiO2 indicating mild injury. Despite very similar baseline disease severity, switch failure was associated with significantly worse outcomes, including a 28-day mortality of 27% vs. 16% and median ventilator-free days of 16 vs. 22 (p < 0.001). Failed attempts were initiated significantly earlier than successful ones (median 1.8 vs. 1.3 days, p < 0.001). Before the switch, PaO2/FiO2, if measured at PEEP > 10 cmH2O, and respiratory system compliance were lower in patients with switch failure (median 185 vs. 205 mmHg, p < 0.001; 39 vs. 41 mL/cmH2O, p = 0.001). After the switch, patients with switch failure experienced greater deterioration in gas exchange and minimal improvement in ventilatory parameters. Contrary to our hypotheses, patient characteristics for failed vs. successful switches were surprisingly similar, resulting in prediction models with limited discriminative performance.

Conclusions
Approximately two-thirds of attempts to switch patients to assisted ventilation fail, and failed attempts are associated with significantly worse clinical outcomes despite similar baseline disease severity. Contrary to our hypotheses, patients with successful and failed attempts showed similar characteristics, making switch failure difficult to predict. These findings underscore the importance of preventing switch failures and, given the retrospective nature of this study, highlight the need for prospective studies to better understand the reasons for switch failure and when spontaneous breathing can be safely initiated.

Are Interactive Visualizations in Machine Learning Education Helping Students?

Conference paper (2025) - Ilinca Rențea, Gosia Migut, Jesse Krijthe

With the fast integration of Machine Learning (ML) across industries, effective pedagogical strategies are essential for teaching this complex and evolving field. Machine Learning is now widely integrated into various university programs and introduced at earlier educational stag ...

Falsification of Unconfoundedness by Testing Independence of Causal Mechanisms

Journal article (2025) - Rickard K.A. Karlsson, Jesse H. Krijthe

A major challenge in estimating treatment effects in observational studies is the reliance on untestable conditions such as the assumption of no unmeasured confounding. In this work, we propose an algorithm that can falsify the assumption of no unmeasured confounding in a setting ...

A comparative study of methods for dynamic survival analysis

Journal article (2025) - Wieske K. de Swart, Marco Loog, Jesse H. Krijthe

Introduction: Dynamic survival analysis has become an effective approach for predicting time-to-event outcomes based on longitudinal data in neurology, cognitive health, and other health-related domains. With advancements in machine learning, several new methods have been introdu ...

Analyzing PaO₂/FiO₂?

Mind the interaction with PEEP!

Journal article (2025) - J. M. Smit, J. H. Krijthe, J. Van Bommel, M. E. Van Genderen, M. J.T. Reinders, A. H. Jonkman

Passive Monitoring of Parkinson Tremor in Daily Life

A Prototypical Network Approach

Journal article (2025) - Luc J.W. Evers, Yordan P. Raykov, Tom M. Heskes, Jesse H. Krijthe, Bastiaan R. Bloem, Max A. Little

Objective and continuous monitoring of Parkinson’s disease (PD) tremor in free-living conditions could benefit both individual patient care and clinical trials, by overcoming the snapshot nature of clinical assessments. To enable robust detection of tremor in the context of limit ...

Predicting benefit from adjuvant therapy with corticosteroids in community-acquired pneumonia

A data-driven analysis of randomised trials

Journal article (2025) - Jim M. Smit, Philip A. Van Der Zee, Dominic Snijders, Wim G. Boersma, Paola Confalonieri, Francesco Salton, Diederik A.M.P.J. Gommers, Marcel J.T. Reinders, Jesse H. Krijthe, and more authors

Background
Despite several randomised controlled trials (RCTs) on the use of adjuvant treatment with corticosteroids in patients with community-acquired pneumonia (CAP), the effect of this intervention on mortality remains controversial. We aimed to evaluate heterogeneity of treatment effect (HTE) of adjuvant treatment with corticosteroids on 30-day mortality in patients with CAP.

Methods
In this individual patient data meta-analysis, we included RCTs published before July 1, 2024, comparing adjuvant treatment with corticosteroids versus placebo in patients hospitalised with CAP. The primary endpoint was 30-day all-cause mortality, collected across all trials, and analyses followed the intention-to-treat principle. We analysed HTE using risk and effect modelling. For risk modelling, patients were classified as having less severe or severe CAP based on the pneumonia severity index (PSI), comparing PSI class I–III versus class IV–V. For effect modelling, we trained a corticosteroid-effect model on six trials and externally validated it using data from two trials, received after model preregistration. This model classified patients into two groups: no predicted benefit and predicted benefit from adjuvant treatment with corticosteroids. The literature search was registered on PROSPERO, CRD42022380746.

Findings
We included eight RCTs with 3224 patients. Across all eight trials, 246 (7·6%) patients died within 30 days (106 [6·6%] of 1618 in the corticosteroid group vs 140 [8·7%] of 1606 in the placebo group; odds ratio [OR] 0·72 [95% CI 0·56–0·94], p=0·017). The corticosteroid-effect model, which selected C-reactive protein (CRP), showed significant HTE during external validation in the two most recent trials. In these trials, 154 (11·4%) of 1355 patients died within 30 days (88 [13·1%] of 671 in the placebo group vs 66 [9·6%] of 684 in the corticosteroid group; OR 0·71 [95% CI 0·50–0·99], p=0·044). Among patients predicted to have no benefit (CRP ≤204 mg/L, n=725), no significant effect was observed (OR 0·98 [95% CI 0·63–1·50]), whereas for those with predicted benefit (CRP >204 mg/L, n=630), 39 (13·0%) of 301 patients died in the placebo group compared with 20 (6·1%) of 329 in the corticosteroid group (OR 0·43 [95% CI 0·25–0·76], p_interaction=0·026). No significant HTE was found between less severe CAP (PSI class I–III, n=229) and severe CAP (PSI class IV–V, n=1126). Corticosteroid therapy significantly increased hyperglycaemia risk (44 [12·8%] of 344 in the placebo group vs 84 [24·8%] of 339 in the corticosteroid group; OR 2·50 [95% CI 1·63–3·83], p<0·0001) and hospital re-admission risk (30 [3·7%] of 814 in the placebo group vs 57 [7·0%] of 819 in the corticosteroid group; OR 1·95 [95% CI 1·24–3·07], p=0·0038).

Interpretation
Overall, adjuvant therapy with corticosteroids significantly reduces 30-day mortality in patients hospitalised with CAP. The treatment effect varied significantly among subgroups based on CRP concentrations, with a substantial mortality reduction observed only in patients with high baseline CRP.

Funding
None.
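
As a worked illustration of how an odds ratio relates to the counts quoted above, the sketch below computes a crude (unadjusted) odds ratio with a Woolf-type 95% confidence interval for the high-CRP subgroup (20 deaths among 329 corticosteroid patients, 39 among 301 placebo patients). The function name is hypothetical, and this is an assumption about one simple way to do the arithmetic: the abstract does not state how the published estimates were obtained, and they may come from models that account for trial, so crude values for other comparisons need not match the reported ones.

```python
import math


def crude_odds_ratio(events_treated, n_treated, events_control, n_control,
                     z=1.96):
    """Crude odds ratio with a Woolf (log-scale) confidence interval.

    Returns (odds_ratio, lower, upper). All four cells of the 2x2 table
    must be non-zero for the log-scale standard error to be defined.
    """
    a = events_treated                # events, treated
    b = n_treated - events_treated    # non-events, treated
    c = events_control                # events, control
    d = n_control - events_control    # non-events, control
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells of the 2x2 table must be positive")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))


# Counts for patients with CRP > 204 mg/L, taken from the abstract.
odds_ratio, lower, upper = crude_odds_ratio(20, 329, 39, 301)
print(f"crude OR {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

For this subgroup the crude calculation lands close to the reported estimate, which makes it a convenient sanity check on the quoted counts.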

Causal clarity in statistical software

Journal article (2025) - Maurice N. Korf, Nan Van Geloven, Jesse H. Krijthe, Jeremy A. Labrecque

High-dimensional machine learning models for prediction of heart failure in more than 400 000 men and women from the UK Biobank

Journal article (2025) - Thomas F. Kok, Navin Suthahar, Jesse H. Krijthe, Rudolf A. De Boer, Eric Boersma, Isabella Kardys

Aims
We aimed to compare performances of conventional survival models with machine learning (ML) survival models for incident heart failure (HF) in men and women without prevalent HF, cardiomyopathy (CM) or ischaemic heart disease (IHD), and to identify potential high-risk precursors overlooked by conventional survival models.

Methods and results
We predicted 10-year risk of incident HF in 266 306 women (2894 events) and 212 061 men (4213 events). We constructed multivariable Cox models, first using ∼400 baseline characteristics, and subsequently only those remaining after LASSO stability selection. We also used Random Survival Forest (RSF) and eXtreme Gradient Survival Boosting (XGBoost). Performances were assessed using internal cross validation and hold-out sets, with C-indices, calibration curves and net-benefit analyses. Model performances were comparable during internal validation: XGBoost (C-index ± SE) (men: 0.79 ± 0.0040, women: 0.83 ± 0.0023) showed similar performance to the multivariable Cox model (men: 0.80 ± 0.0031, women: 0.83 ± 0.0022) and Cox models after LASSO stability selection, while RSF showed numerically slightly lower performance (men: 0.78 ± 0.0025, women: 0.81 ± 0.0015). Findings were similar in the hold-out sets. Age, cystatin-C, lifetime treatments/medications, other heart disease, systolic blood pressure, and spirometry measures were identified as high-risk factors in both model types for both sexes. Additionally, sex-specific and model-specific risk factors were identified.

Conclusion
Machine learning models and Cox proportional hazard models performed well and similarly for 10-year incident HF risk prediction in the general population. However, sex-specific and model-specific risk predictors were found. Spirometry measures, rarely included in existing models, were identified as important risk factors. Our results suggest that ML models for HF prediction in the general population reveal insights that would otherwise remain unnoticed.
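
The C-index used to compare the survival models above can be sketched in a few lines. This is a minimal version of Harrell's concordance index under stated assumptions: tied event times are ignored, ties in predicted risk count one half, and the toy data and function name are hypothetical. The abstract does not say which estimator or software the authors used.

```python
def concordance_index(times, events, risks):
    """Harrell's C: among comparable pairs, the fraction in which the
    subject with the earlier observed event has the higher predicted risk.

    A pair (i, j) is comparable when subject i had an event and
    times[i] < times[j]; censored subjects can only serve as the later
    member of a pair.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up time, event indicator, predicted risk.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.5, 0.1]

c_index = concordance_index(times, events, risks)
print(f"C-index = {c_index:.2f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of roughly 0.78 to 0.83 should be read.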

When accurate prediction models yield harmful self-fulfilling prophecies

Journal article (2025) - Wouter A.C. van Amsterdam, Nan van Geloven, Jesse H. Krijthe, Rajesh Ranganath, Giovanni Cinà

Prediction models are popular in medical research and practice. Many expect that by predicting patient-specific outcomes, these models have the potential to inform treatment decisions, and they are frequently lauded as instruments for personalized, data-driven healthcare. We show ...

Sub-phenotyping in critical care

A valuable strategy or methodologically fragile path?

Journal article (2025) - Jim M. Smit, Annemijn H. Jonkman, Jesse H. Krijthe

In their pioneering work, Calfee et al. [1] addressed the clinical and biological heterogeneity of acute respiratory distress syndrome (ARDS), a factor likely contributing to the poor track record of randomized trials (RCTs) in this patient population. Using latent class (or profil ...

Risk-Based Decision Making

Estimands for Sequential Prediction Under Interventions

Journal article (2024) - Kim Luijken, Paweł Morzywołek, Wouter van Amsterdam, Giovanni Cinà, Jeroen Hoogland, Ruth Keogh, Jesse H. Krijthe, Sara Magliacane, Nan van Geloven, and more authors

Prediction models are used, among other things, to inform medical decisions on interventions. Typically, individuals with high risks of adverse outcomes are advised to undergo an intervention while those at low risk are advised to refrain from it. Standard prediction models do not always ...

Causal inference using observational intensive care unit data

A scoping review and recommendations for future practice

Review (2023) - J. M. Smit, J. H. Krijthe, W. M.R. Kant, J. A. Labrecque, M. Komorowski, D. A.M.P.J. Gommers, J. van Bommel, M. J.T. Reinders, M. E. van Genderen

This scoping review focuses on the essential role of models for causal inference in shaping actionable artificial intelligence (AI) designed to aid clinicians in decision-making. The objective was to identify and evaluate the reporting quality of studies introducing models for ca ...

Putting Causal Identification to the Test: Falsification using Multi-Environment Data

Preprint (2023) - R.K.A. Karlsson, S. Creastă, J.H. Krijthe

We study the problem of falsifying the assumptions behind a set of broadly applied causal identification strategies: namely back-door adjustment, front-door adjustment, and instrumental variable estimation. While these assumptions are untestable from observational data in general ...

Also for k-means

More data does not imply better performance

Journal article (2023) - Marco Loog, Jesse H. Krijthe, Manuele Bicego

Arguably, a desirable feature of a learner is that its performance gets better with an increasing amount of training data, at least in expectation. This issue has received renewed attention in recent years and some curious and surprising findings have been reported on. In essence ...