M.J.T. Reinders
210 records found
Advancing protein design is crucial for breakthroughs in medicine and biotechnology. Traditional approaches for protein sequence representation often rely solely on the 20 canonical amino acids, limiting the representation of non-canonical amino acids and residues that undergo post-translational modifications. This work explores discrete diffusion models for generating novel protein sequences using the all-atom chemical representation SELFIES. By encoding the atomic composition of each amino acid in the protein, this approach expands the design possibilities beyond standard sequence representations. Using a modified ByteNet architecture within the discrete diffusion D3PM framework, we evaluate the impact of this all-atom representation on protein quality, diversity, and novelty, compared to conventional amino acid-based models. To this end, we develop a comprehensive assessment pipeline to determine whether generated SELFIES sequences translate into valid proteins containing both canonical and non-canonical amino acids. Additionally, we examine the influence of two noise schedules within the diffusion process—uniform (random replacement of tokens) and absorbing (progressive masking)—on generation performance. While models trained on the all-atom representation struggle to consistently generate fully valid proteins, the successfully generated proteins show improved novelty and diversity compared to their amino acid-based model counterparts. Furthermore, the all-atom representation achieves structural foldability results comparable to those of amino acid-based models. Lastly, our results highlight the absorbing noise schedule as the most effective for both representations. Data and code are available at https://github.com/Intelligent-molecular-systems/All-Atom-Protein-Sequence-Generation.
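As a rough illustration of the two noise schedules compared in this abstract, the sketch below corrupts a token sequence in a single step: the uniform variant swaps a token for a random vocabulary token, and the absorbing variant swaps it for a mask token. This is not the authors' implementation. The function name `corrupt`, the `[MASK]` symbol, the toy vocabulary, and the single corruption rate are all assumptions made for illustration, whereas D3PM applies such corruption over many timesteps through transition matrices.

```python
import random

MASK = "[MASK]"  # hypothetical absorbing-state token, not taken from the paper


def corrupt(tokens, rate, vocab, schedule, rng=None):
    """Corrupt each token independently with probability `rate`.

    schedule="uniform":   replace the token with a random vocabulary token.
    schedule="absorbing": replace the token with the MASK token.
    """
    if schedule not in ("uniform", "absorbing"):
        raise ValueError("schedule must be 'uniform' or 'absorbing'")
    if rng is None:
        rng = random.Random(0)  # fixed seed so the toy example is repeatable
    noisy = []
    for tok in tokens:
        if rng.random() < rate:
            noisy.append(MASK if schedule == "absorbing" else rng.choice(vocab))
        else:
            noisy.append(tok)
    return noisy


# Toy SELFIES-like symbols; a real alphabet would be far larger.
vocab = ["[C]", "[N]", "[O]", "[=O]"]
seq = ["[N]", "[C]", "[C]", "[=O]", "[O]"]

print(corrupt(seq, 0.5, vocab, "uniform"))
print(corrupt(seq, 0.5, vocab, "absorbing"))
```

With a rate of 0 the sequence is returned unchanged, and with a rate of 1 the absorbing variant yields a fully masked sequence, which corresponds to the end state of progressive masking.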
BACKGROUND: Alternative splicing contributes to molecular diversity across brain cell types. RNA-binding proteins (RBPs) regulate splicing, but the genome-wide mechanisms underlying cell-type-specific splicing remain poorly understood. RESULTS: Here, we aim to unravel cell-type-specific splicing mechanisms by using RBP binding sites and/or the genomic sequence to predict exon inclusion in neurons and glia, as measured by long-read single-cell data in the human hippocampus and frontal cortex. We found that inclusion of variable exons is harder to predict in neurons than in glia in both brain regions. Comparing neurons and glia, the positions of RBP binding sites in alternatively spliced exons in neurons differ more from those in non-variable exons, indicating distinct splicing mechanisms. Model interpretation pinpointed RBPs, including QKI, potentially regulating alternative splicing between neurons and glia. Finally, we accurately predict and prioritize the effect of splicing QTLs. CONCLUSIONS: Our results indicate that the splicing mechanisms in variable exons in neurons diverged more from the standard mechanisms. Splicing in neurons might be less sequence-dependent and influenced more by, for instance, chromatin accessibility or methylation. Taken together, these results provide new insights into the mechanisms regulating cell-type-specific alternative splicing in the brain.
Background: Nutritional weight-loss interventions are known to reduce bone mineral density (BMD), which can be prevented by adding (resistance) exercise training. However, this combined effect is not well studied in non-obese adults. In addition, the association of biomarkers and metabolite-based composite health markers with changes in BMD in such an intervention has not been studied as thoroughly. Objective: The aims of the current study were to investigate the effect of a combined nutritional and activity lifestyle intervention on lumbar spine and total body BMD in healthy middle-aged to older adults, and to relate these effects to a selection of immune-metabolic biomarkers, muscle mass and fat mass measurements, and two composite metabolite-based health scores. Methods: In this ancillary study of the single-arm Growing Old TOgether (GOTO) trial (trial registration number GOTNL3301 [https://onderzoekmetmensen.nl/nl/trial/27183], NL-OMON27183), 134 participants (mean age 62.9 years, 49% female) undertook a 13-week lifestyle modification, incorporating 12.5% caloric restriction and a 12.5% increase in physical activity. The impact on lumbar spine and total body BMD was evaluated using dual-energy X-ray absorptiometry (DEXA). The intervention effect on BMD was related to changes in immune-metabolic biomarkers and two metabolite-based immune-metabolic health scores. Results: The trial significantly reduced body weight, by 3.3 kg in males (fdr < 0.001) and 3.4 kg in females (fdr < 0.001). Lean mass decreased by 1.4 kg in males (fdr < 0.001) and 1.1 kg in females (fdr < 0.001), whereas total body fat% decreased significantly, by 1.5% (fdr < 0.001) in males and 1.5% (fdr < 0.001) in females. In males, lumbar spine BMD increased by 3.0% (fdr < 0.001) and total body BMD by 0.7% (fdr = 0.002). In females, lumbar spine BMD showed an upward trend (1.2%, fdr = 0.09) and total body BMD remained stable (0.4%, fdr = 0.07). In males, the increase in lumbar spine BMD was significantly associated with decreased weight (fdr = 0.001), with decreased body and trunk fat% (fdr = 0.001, fdr = 0.001), and with improved immune-metabolic health (fdr = 0.02). Males with higher BMD but a poor metabolite-based health score at baseline had a stronger increase in lumbar spine BMD (fdr = 0.03). Conclusions: A combined nutritional and activity lifestyle intervention significantly improved BMD of males with good bone health at baseline while at the same time improving metabolic health. Nutritional weight-loss interventions may not harm BMD when combined with exercise.
Rheumatic Digital Twin
Proposed Machine Learning–Based Multimodal Framework to Inform Clinical Decision-Making
Rheumatic diseases are chronic, immune-mediated conditions characterized by significant heterogeneity in presentation and disease course. However, current clinical approaches often rely on snapshot-based assessments that fail to capture the complex longitudinal evolution of these conditions. To address these limitations and support the implementation of precision medicine, we present the design for the Rheumatic Digital Twin, a novel, modular conceptual framework intended to integrate heterogeneous multimodal data, ranging from electronic health records and clinical notes to imaging and omics, into a dynamic, computational representation of the patient journey. Our theoretical architecture addresses challenges related to data silos and variable availability of data modalities through a multistage approach that envisions the use of domain-specific foundation models to independently process distinct data modalities. To effectively model the temporal progression inherent in chronic diseases, the proposed design utilizes Transformer architectures, leveraging self-attention mechanisms to treat patient events, such as lab results or medication changes, as sequential data tokens. We describe how these unimodal representations would subsequently be fused via joint embedding techniques to construct a shared, multimodal representational space. Envisioned to function analogously to a recommender system, the Rheumatic Digital Twin framework is modeled to map patients into a latent space where proximity reflects clinical and biological similarity. By identifying “nearest neighbors,” historical patients with comparable trajectories, the system aims to enable in silico cohorting, theoretically allowing clinicians to forecast key clinical events, predict treatment responses, and identify likely disease courses based on the outcomes of similar peers.
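The "nearest neighbors" idea in this abstract can be sketched as a similarity search over patient embeddings. The example below is a minimal illustration only: the abstract describes a conceptual framework and does not specify a similarity measure, so cosine similarity, the function names, and the two-dimensional toy embeddings are assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_patients(query, cohort, k=2):
    """Return the ids of the k historical patients most similar to `query`."""
    ranked = sorted(
        cohort.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [patient_id for patient_id, _ in ranked[:k]]


# Hypothetical fused multimodal embeddings (real ones would be high-dimensional).
cohort = {
    "patient_A": [1.0, 0.0],
    "patient_B": [0.9, 0.1],
    "patient_C": [0.0, 1.0],
}
new_patient = [1.0, 0.0]

print(nearest_patients(new_patient, cohort, k=2))
```

In a setup like this, the outcomes recorded for the returned neighbors would serve as the basis for forecasting the new patient's likely course.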
Many molecular aging biomarkers have been developed to capture heterogeneity in individual aging rates. Yet, systematic comparison of the modeling choices underlying these biomarkers has been limited. In this study, we trained aging biomarkers on the Rockwood frailty index (FI) and all-cause mortality using UK Biobank Olink proteomics and metabolomics (1H-NMR) data (n = 40,696). We systematically established the impact of model choice, target outcome, and molecular data source on several age-related outcomes. From this, we developed two aging biomarkers, ProteinFrailty (ProtFI) and ProteinMortality (ProtMort), which are both ElasticNet models that use a minimal set of proteins to predict FI and mortality, respectively. In particular, ProtFI outperformed established aging biomarkers in relation to diverse outcomes, including incident cardiovascular disease, handgrip strength, and self-rated health, both in internal validation and two Dutch external cohorts (n = 995, n = 500). Our findings show that an efficient frailty-trained proteomic biomarker robustly predicts age-related decline.
Work smarter, not harder
Achieve expert-level diagnosis extraction from medical records with optimal prompting of large language models
Traditional statistical approaches have advanced our understanding of the genetics of complex diseases, yet are limited to linear additive models. Here we applied machine learning (ML) to genome-wide data from 41,686 individuals in the largest European consortium on Alzheimer’s disease (AD) to investigate the effectiveness of various ML algorithms in replicating known findings, discovering novel loci, and predicting individuals at risk. We utilised Gradient Boosting Machines (GBMs), biological pathway-informed Neural Networks (NNs), and Model-based Multifactor Dimensionality Reduction (MB-MDR) models. ML approaches successfully captured all genome-wide significant genetic variants identified in the training set and 22% of associations from larger meta-analyses. They highlight 6 novel loci which replicate in an external dataset, including variants which map to ARHGAP25, LY6H, COG7, SOD1 and ZNF597. They further identify a novel association in AP4E1, refining the genetic landscape of the known SPPL2A locus. Our results demonstrate that machine learning methods can achieve predictive performance comparable to classical approaches in genetic epidemiology and have the potential to uncover novel loci that remain undetected by traditional GWAS. These insights provide a complementary avenue for advancing the understanding of AD genetics.
Switching from controlled to assisted mechanical ventilation
A multi-center retrospective study (SWITCH)
Switching from controlled to assisted ventilation is crucial in the trajectory of intensive care unit (ICU) stay, but no guidelines exist. We described current practices, analyzed patient characteristics associated with switch success or failure, and explored the feasibility of predicting switch failure.
Methods
In this retrospective study, we obtained highly granular longitudinal ICU data sets from three medical centers, covering demographics, severity scores, vital signs, ventilation, and laboratory parameters. The primary endpoint was switch success, considering a switch attempt to be successful if a patient did not return to controlled ventilation for the next 72 h while alive, and to be failed otherwise. We compared the characteristics of patients with successful vs. failed first switch attempts at ICU admission, immediately before, and 3 h after the attempt. We trained LASSO logistic regression models to predict switch failure.
Results
In 4524/6715 (67%) patients attempting a switch, the first attempt failed. The first switch attempt, regardless of success or failure, was generally made at normalized PaCO2 and pH levels, with PEEP < 10 cmH2O and PaO2/FiO2 indicating mild injury. Despite very similar baseline disease severity, switch failure was associated with significantly worse outcomes, including a 28-day mortality of 27% vs. 16% and median ventilator-free days of 16 vs. 22 (p < 0.001). Failed attempts were initiated significantly earlier than successful ones (median 1.8 vs. 1.3 days, p < 0.001). Before the switch, PaO2/FiO2, if measured at PEEP > 10 cmH2O, and respiratory system compliance were lower in patients with switch failure (median 185 vs. 205 mmHg, p < 0.001; 39 vs. 41 mL/cmH2O, p = 0.001). After the switch, patients with switch failure experienced greater deterioration in gas exchange and minimal improvement in ventilatory parameters. Contrary to our hypotheses, patient characteristics for failed vs. successful switches were surprisingly similar, resulting in prediction models with limited discriminative performance.
Conclusions
Approximately two-thirds of attempts to switch patients to assisted ventilation fail, and these failures are associated with significantly worse clinical outcomes, despite similar baseline disease severity. Contrary to our hypotheses, patients with successful and failed attempts showed similar characteristics, making switch failure difficult to predict. These findings underscore the importance of preventing switch failures and, given the retrospective nature of this study, highlight the need for prospective studies to better understand the reasons for switch failure and when spontaneous breathing can be safely initiated.
Federated learning is an upcoming machine learning paradigm which allows data from multiple sources to be used for training classifiers without the data leaving the source where it originally resides. This can be highly valuable for use cases such as medical research, where gathering data at a central location can be complicated by privacy and legal concerns. In such cases, federated learning has the potential to vastly speed up the research cycle. Although federated and central learning have been compared from a theoretical perspective, an extensive experimental comparison of performance and learning behavior is still lacking. We have performed a comprehensive experimental comparison between federated and centralized learning. We evaluated various classifiers on various datasets, exploring the influence of different sample distributions as well as different class distributions across the clients. The results show similar performance between the federated and central learning strategies under a wide variety of settings. Federated learning is able to deal with various imbalances in the data distributions. It is sensitive to batch effects between different datasets when they coincide with location, similar to central learning, but this setting might go unobserved more easily. Federated learning seems to be robust to various challenges such as skewed data distributions, high data dimensionality, multiclass problems, and complex models. Taken together, the insights from our comparison give much promise for applying federated learning as an alternative to sharing data. Code for reproducing the results in this work can be found at: https://github.com/swiergarst/FLComparison.
Rheumatoid arthritis (RA) is a heterogeneous disease with variable symptoms, prognosis, and treatment response, necessitating refined patient classification. We applied multimodal deep learning and clustering to identify distinct RA phenotypes using baseline clinical data from 1,387 patients in the Leiden Rheumatology clinic. Four Joint Involvement Patterns (JIP) emerged: foot-predominant arthritis, seropositive oligoarticular disease, seronegative hand arthritis, and polyarthritis. Findings were validated in clinical trial data (n = 307) and an independent secondary care cohort (n = 515). Clusters showed high stability and significant differences in remission rates (P = 0.007) and methotrexate failure (P < 0.001). JIP-hand patients had superior outcomes (particularly in ACPA-positive patients) versus JIP-foot (HR:0.37, P < 0.001) and JIP-poly (HR:0.33, P = 0.005), independent of baseline disease activity and clinical markers. Synovial histology analysis (n = 194) revealed distinct inflammatory patterns across clusters, hinting at different underlying biological mechanisms. These validated RA phenotypes based on joint involvement patterns may enable targeted research into disease mechanisms and personalized treatment strategies.
C-reactive protein-guided treatment in pneumonia
Charting a personalised approach – Authors’ reply
Background: Fractional exhaled nitric oxide (FeNO) measurement is a noninvasive method to determine the degree of airway inflammation. Handheld devices such as the Vivatmo Me are used for home monitoring. Differences were found between the Vivatmo Me and standard measurements with the NIOX VERO. Therefore, we aimed to determine the accuracy of the Vivatmo Me for FeNO measurements. Methods: Adult patients with an appointment for FeNO measurement as part of regular care were invited to perform the FeNO measurement with both devices. From these measurements the FeNO values were compared, and the device user-friendliness was determined. Results: One hundred and sixty-four patients were included. The number of attempts needed for a successful measurement and the failure rate were higher with the Vivatmo Me. Although the measurements were highly correlated, a significant difference (p < 0.001) was found between FeNO values measured with the two devices. Of the Vivatmo measurements, 32% did not fall within the claimed accuracy ranges. A linear correction on the FeNO values reduced this number. Conclusion: Our findings indicate that the Vivatmo Me does not comply with the claimed accuracy of clinical FeNO measurements and that the measurement is challenging to perform. Applying the proposed correction improves the comparative validity of the FeNO measurement and therefore its clinical usefulness.
PredLyP
A computational tool for predicting tissue-specific (phago-)lysosomal post-digestion peptides
Peptides are versatile tools in immunotherapy, serving as vaccines and targets for specific immunotherapeutic strategies. Peptides engage immune cells like macrophages and T cells, enabling precise modulation of immune responses. In this context, we highlight the utility of macrophages, innate immune cells involved in constant surveillance, for detecting their phagolysosomal content as a minimally-invasive biomarker strategy. Analyzing proteolytic patterns in phagolysosomes offers a high-sensitivity approach to assess tissue homeostasis and tissue disruption, such as in cancer. Despite their potential, a major challenge lies in the lack of comprehensive tools for predicting cutting sites across phagolysosomal proteases. Therefore, we developed the computational tool PredLyP (abbreviation for “prediction of lysosomal proteases”) to identify cutting sites of phagolysosomal proteases, which are essential enzymes involved in protein degradation within (phago)lysosomes, to predict the potential peptides generated from the input proteins. Unlike existing tools, PredLyP utilizes Position Specific Scoring Matrices derived from amino acid sequences, physical (charge and hydropathy) and structural (secondary structure and solvent accessibility) features. Moreover, it incorporates a sequential cutter functionality that mimics the ordered action of proteases, providing predictive insights into substrate fragment generation. Comparisons with other tools demonstrate the superior sensitivity of PredLyP, enabling accurate prediction of complete and partial digestion fragments, a critical requirement for real-world applications in proteomics, antibody development, and immune system research. Overall, PredLyP represents a robust tool for advancing our understanding of proteolytic processes in phagolysosomes and their implications in health and disease.
BADDADAN
Mechanistic modelling of time-series gene module expression
Plants respond to stresses like drought and heat through complex gene regulatory networks (GRNs). To improve resilience, understanding these is crucial, but large-scale GRNs (>100 genes) are difficult to model using ordinary differential equations (ODEs) due to the high number of parameters that have to be estimated. Here we solve this problem by introducing BADDADAN, which uses machine learning to identify gene modules—groups of co-expressed and/or co-regulated genes—and constructs an ODE model that predicts gene module dynamics under stress. By integrating time-series gene expression data with prior co-expression data it finds modules that are both coherent and interpretable. We demonstrate BADDADAN on heat and drought datasets of A. thaliana, modelling over 1,000 genes, recovering known mechanistic insights, and proposing new hypotheses. By combining machine learning with mechanistic modelling, BADDADAN deepens our understanding of stress-related GRNs in plants and potentially other organisms.
Genetic testing of common and rare variants in dementia patients from a memory clinic
Dementia-related genetic testing in memory clinic
Many types of dementia have high heritability, which creates opportunities for DNA diagnostics. Clinicians sporadically test for causal genetic variants. However, in addition to causal genetic mutations, an increasing number of both common and rare risk factors are being identified, especially for Alzheimer’s disease (AD). Here, we describe and evaluate diagnostic performance of combining genetic risk factors for AD to assist memory clinic clinicians.
Methods
A retrospective analysis of 998 consecutive patients (mean age 62.1, 40.3% females, 63.3% dementia) was conducted over 2.5 years in a Dutch memory clinic. The patients underwent a complete genetic risk assessment, including whole-exome sequencing and array genotyping. We examined known pathogenic genetic variants for all dementia types and their correlation with clinical diagnoses. We evaluated a combined genetic score (GS) based on all genetic risk factors for AD, namely APOE genotypes, candidate rare risk variants in 11 genes, and a polygenic risk score (PRS) based on 82 common variants. Then, we analyzed the discriminatory characteristics of the GS.
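For readers unfamiliar with polygenic risk scores, the sketch below shows the conventional form of a PRS: a weighted sum of risk-allele dosages. The abstract does not state how the PRS or the combined GS was computed in this study, so the function name, variant identifiers, and weights here are hypothetical, and the step that combines the PRS with APOE genotype and rare variants is deliberately not shown.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per variant).

    `weights` maps a variant id to its effect size (e.g. a log odds ratio);
    `dosages` maps the same ids to one individual's allele counts.
    """
    missing = set(weights) - set(dosages)
    if missing:
        raise KeyError(f"missing dosages for: {sorted(missing)}")
    return sum(weights[variant] * dosages[variant] for variant in weights)


# Hypothetical effect sizes and genotypes, for illustration only.
weights = {"variant_a": 0.10, "variant_b": 0.25, "variant_c": -0.05}
patient = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_risk_score(patient, weights)
print(round(score, 3))
```

A real score would sum over all 82 common variants mentioned above and would typically be standardized against a reference population before use.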
Results
Causal pathogenic variants were rare, present in 3.4% of individuals, but genetic testing would have altered the diagnosis in over half of the carriers. Candidate rare risk variants were more common, identified in 31.6% of patients. Both APOE genotypes and the PRS were independently associated with AD, and a gene-specific interaction was found between TREM2 and AD-PRS (β = -1.16, p = 0.015). Patients with a high GS were 7 times more likely to receive an AD diagnosis compared to those with a low GS (p = 2.5E-07).
Conclusion
Overall, this study highlights the potential of integrating genetic risk factors into clinical practice to enhance AD diagnosis, though the improvement in diagnostic accuracy was moderate. The findings underscore the importance of genetic testing in diagnosis while also recognizing its limitations.
Background and Objectives: Identifying genetic causes of dementia in patients visiting memory clinics is important for patient care and family planning. Traditional clinical selection criteria for genetic testing may miss carriers of pathogenic variants in dementia-related genes. This study aimed to identify how many carriers are being missed and to optimize criteria for selecting patients for genetic counseling in memory clinics. Methods: In this clinical cohort study, we retrospectively genetically tested patients who visited the Alzheimer Center Amsterdam, a specialized memory clinic, during 2.5 years (2010-2012). Genetic tests consisted of a 54-gene dementia panel, focusing on Class IV/V variants per American College of Medical Genetics and Genomics guidelines, including APP duplications and the C9ORF72 repeat expansion. We determined the prevalence of pathogenic variants and propose new eligibility criteria for genetic testing in memory clinics. The eligibility criteria were prospectively applied for 1 year (2021-2022), and results were compared with the retrospective cohort. Results: Genetic tests were retrospectively performed in 1,022 of 1,138 patients (90%) who consecutively visited the memory clinic. Among these 1,022 patients (mean age 62.1 ± 8.9 years; 40.4% female), 34 pathogenic variant carriers were identified (3.3%), with 24 being symptomatic. Previous clinical criteria would have identified only 15 carriers (44% of all carriers, 65% of symptomatic carriers). The proposed criteria increased identification to 22 carriers (62.5% of all carriers, 91% of symptomatic carriers). In the prospective cohort, 148 (28.7%) of 515 patients were eligible for testing under the new criteria. Of the 90 eligible patients who consented to testing, 13 pathogenic carriers were identified, representing a 73% increase compared with the previous criteria. Discussion: We found that patients who visit a memory clinic and carry a pathogenic genetic variant are often not eligible for genetic testing. The proposed new criteria improve the identification of patients with a genetic cause for their cognitive complaints. In systems without practical or financial barriers to genetic testing, the new criteria can enhance personalized care. In countries with different health care systems and in other genetic ancestry groups, the performance of the criteria may differ.