D. de Ridder | TU Delft Repository

PCADD

SNV prioritisation in Sus scrofa

Journal article (2020) - Christian Groß, Martijn Derks, Hendrik Jan Megens, Mirte Bosse, Martien A.M. Groenen, Marcel Reinders, Dick de Ridder

Background: In animal breeding, identification of causative genetic variants is of major importance and high economic value. Usually, the number of candidate variants exceeds the number of variants that can be validated. One way of prioritizing probable candidates is by evaluating their potential to have a deleterious effect, e.g. by predicting their consequence. Because variants that do not cause an amino-acid substitution are difficult to evaluate experimentally, other prioritization methods are needed. For human genomes, the prediction of deleterious genomic variants has taken a step forward with the introduction of the combined annotation dependent depletion (CADD) method. In theory, this approach can be applied to any species. Here, we present pCADD (p for pig), a model to score single nucleotide variants (SNVs) in pig genomes.
Results: To evaluate whether pCADD captures sites with biological meaning, we used transcripts from miRNAs and introns, sequences from genes that are specific for a particular tissue, and the different sites of codons, to test how well pCADD scores differentiate between functional and non-functional elements. Furthermore, we assessed examples of non-coding and coding SNVs that are causal for changes in phenotypes. Our results show that pCADD scores discriminate between functional and non-functional sequences and prioritize functional SNVs, and that pCADD is able to score the different positions in a codon relative to their redundancy. Taken together, these results indicate that based on pCADD scores, regions with biological relevance can be identified and distinguished according to their rate of adaptation.
Conclusions: We present the ability of pCADD to prioritize SNVs in the pig genome with respect to their putative deleteriousness, in accordance with the biological significance of the region in which they are located. We created scores for all possible SNVs, coding and non-coding, for all autosomes and the X chromosome of the pig reference sequence Sscrofa11.1, proposing a toolbox to prioritize variants and evaluate sequences to highlight new sites of interest to explain biological functions that are relevant to animal breeding.
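
The statement that codon positions differ in their redundancy can be made concrete with the standard genetic code. The sketch below is not pCADD and is not taken from the article. It only counts, for each of the three codon positions, how many single-nucleotide substitutions in sense codons leave the encoded amino acid unchanged. The function name `synonymous_counts` is hypothetical, and stop codons are excluded as starting codons by assumption.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
# (third base varies fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_counts():
    """Count synonymous single-base substitutions per codon position.

    Only the 61 sense codons are used as starting points (an assumption
    made for this illustration); a substitution is synonymous when the
    mutant codon encodes the same amino acid.
    """
    counts = [0, 0, 0]
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == aa:
                    counts[pos] += 1
    return counts


counts = synonymous_counts()
total = 61 * 3  # 61 sense codons, 3 alternative bases per position
for pos, n in enumerate(counts, start=1):
    print(f"codon position {pos}: {n}/{total} substitutions are synonymous")
```

Under these assumptions the third position is by far the most redundant and the second position is not redundant at all, which is the ordering a deleteriousness score would be expected to reflect.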

Global DNA Compaction in Stationary-Phase Bacteria Does Not Affect Transcription

Journal article (2018) - Richard Janissen, Mathia M.A. Arens, Nynke H. Dekker, Elio A. Abbondanzieri, Anne S. Meyer, Natalia N. Vtyurina, Zaïda Rivai, Nicholas D. Sunday, Behrouz Eslami-Mossallam, Alexey A. Gritsenko, Liedewij Laan, Dick de Ridder, Irina Artsimovitch

In stationary-phase Escherichia coli, Dps (DNA-binding protein from starved cells) is the most abundant protein component of the nucleoid. Dps compacts DNA into a dense complex and protects it from damage. Dps has also been proposed to act as a global regulator of transcription. Here, we directly examine the impact of Dps-induced compaction of DNA on the activity of RNA polymerase (RNAP). Strikingly, deleting the dps gene decompacted the nucleoid but did not significantly alter the transcriptome and only mildly altered the proteome during stationary phase. Complementary in vitro assays demonstrated that Dps blocks restriction endonucleases but not RNAP from binding DNA. Single-molecule assays demonstrated that Dps dynamically condenses DNA around elongating RNAP without impeding its progress. We conclude that Dps forms a dynamic structure that excludes some DNA-binding proteins yet allows RNAP free access to the buried genes, a behavior characteristic of phase-separated organelles. Despite markedly condensing the bacterial chromosome, the nucleoid-structuring protein Dps selectively allows access by RNA polymerase and transcription factors at normal rates while excluding other factors such as restriction endonucleases. ...

A survey of functional genomic variation in domesticated chickens

Journal article (2018) - Martijn F.L. Derks, Hendrik-Jan Megens, Martien A.M. Groenen, Mirte Bosse, Jeroen Visscher, Katrijn Peeters, Marco C.A.M. Bink, Addie Vereijken, Christian Gross, Dick de Ridder, Marcel Reinders

Background: Deleterious genetic variation can increase in frequency as a result of mutations, genetic drift, and genetic hitchhiking. Although individual effects are often small, the cumulative effect of deleterious genetic variation can impact population fitness substantially. In this study, we examined the genome of commercial purebred chicken lines for deleterious and functional variations, combining genotype and whole-genome sequence data.
Results: We analysed over 22,000 animals that were genotyped on a 60 K SNP chip from four purebred lines (two white egg and two brown egg layer lines) and two crossbred lines. We identified 79 haplotypes that showed a significant deficit in homozygous carriers. This deficit was assumed to stem from haplotypes that potentially harbour lethal recessive variations. To identify potentially deleterious mutations, a catalogue of over 10 million variants was derived from 250 whole-genome sequenced animals from three purebred white-egg layer lines. Out of 4219 putative deleterious variants, 152 mutations were identified that likely induce embryonic lethality in the homozygous state. Inferred deleterious variation showed evidence of purifying selection and deleterious alleles were generally overrepresented in regions of low recombination. Finally, we found evidence that mutations that were inferred to be evolutionarily intolerant likely have positive effects in commercial chicken populations.
Conclusions: We present a comprehensive genomic perspective on deleterious and functional genetic variation in egg layer breeding lines, which are under intensive selection and characterized by a small effective population size. We show that deleterious variation is subject to purifying selection and that there is a positive relationship between recombination rate and purging efficiency. In addition, multiple putative functional coding variants were discovered in selective sweep regions, which are likely under positive selection. Together, this study provides a unique molecular perspective on functional and deleterious variation in commercial egg-laying chickens, which can enhance current genomic breeding practices to lower the frequency of undesirable variants in the population.
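
The idea of a "deficit in homozygous carriers" can be illustrated with a simplified calculation. The sketch below assumes random mating, so that a haplotype with frequency p is expected to be homozygous in a fraction p² of animals, and uses an exact binomial lower-tail probability. The abstract does not state which test the authors used, so this is one possible formulation rather than their method. The function names and the numbers in the usage lines are hypothetical.

```python
def expected_homozygotes(n_animals: int, haplotype_freq: float) -> float:
    """Expected homozygous carriers under random mating (N * p^2)."""
    return n_animals * haplotype_freq ** 2


def deficit_p_value(n_animals: int, haplotype_freq: float,
                    observed_homozygotes: int) -> float:
    """Probability of seeing at most the observed number of homozygotes.

    Binomial lower tail with success probability q = p^2, built up term
    by term to avoid huge binomial coefficients.
    Assumes 0 < haplotype_freq < 1.
    """
    q = haplotype_freq ** 2
    term = (1.0 - q) ** n_animals  # P(X = 0)
    total = term
    for k in range(observed_homozygotes):
        # P(X = k + 1) derived from P(X = k)
        term *= (n_animals - k) / (k + 1) * q / (1.0 - q)
        total += term
    return total


# Hypothetical numbers, chosen only for illustration.
n, p = 5000, 0.05
print("expected homozygotes:", expected_homozygotes(n, p))
print("P(no homozygotes observed):", deficit_p_value(n, p, 0))
```

A very small probability for an observed count of zero is the kind of signal that would flag a haplotype as a candidate carrier of a recessive lethal variant.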

Predicting variant deleteriousness in non-human species

Applying the CADD approach in mouse

Journal article (2018) - Christian Groß, Dick de Ridder, Marcel Reinders

Background: Predicting the deleteriousness of observed genomic variants has taken a step forward with the introduction of the Combined Annotation Dependent Depletion (CADD) approach, which trains a classifier on the wealth of available human genomic information. This raises the question of whether it can be done with less data for non-human species. Here, we investigate the prerequisites to construct a CADD-based model for a non-human species. Results: Performance of the mouse model is competitive with that of the human CADD model and better than established methods like PhastCons conservation scores and SIFT. As in the human case, performance varies for different genomic regions and is best for coding regions. We also show the benefits of generating a species-specific model over lifting variants to a different species or applying a generic model. With fewer genomic annotations, performance on the test set as well as on the three validation sets is still good. Conclusions: It is feasible to construct species-specific CADD models even when annotations such as epigenetic markers are not available. The minimal requirement for these models is the availability of a set of genomes of closely related species that can be used to infer an ancestor genome and substitution rates for the data generation. ...

Sequence features of viral and human Internal Ribosome Entry Sites predictive of their activity

Journal article (2017) - Alexey A. Gritsenko, Shira Weingarten-Gabbay, Shani Elias-Kirma, Ronit Nir, Dick de Ridder, Eran Segal

Translation of mRNAs through Internal Ribosome Entry Sites (IRESs) has emerged as a prominent mechanism of cellular and viral initiation. It supports cap-independent translation of select cellular genes under normal conditions, and in conditions when cap-dependent translation is inhibited. IRES structure and sequence are believed to be involved in this process. However, due to the small number of IRESs known, there have been no systematic investigations of the determinants of IRES activity. With the recent discovery of thousands of novel IRESs in humans and viruses, the next challenge is to decipher the sequence determinants of IRES activity. We present the first in-depth computational analysis of a large body of IRESs, exploring RNA sequence features predictive of IRES activity. We identified predictive k-mer features resembling IRES trans-acting factor (ITAF) binding motifs across human and viral IRESs, and found that their effect on expression depends on their sequence, number and position. Our results also suggest that the architecture of retroviral IRESs differs from that of other viruses, presumably due to their exposure to the nuclear environment. Finally, we measured IRES activity of synthetically designed sequences to confirm our prediction of increasing activity as a function of the number of short IRES elements. ...
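
As a rough illustration of what k-mer features with "sequence, number and position" might look like, the sketch below records every position at which each k-mer occurs in an RNA sequence, so that both the count and the locations are available as features. It is a generic k-mer scan, not the feature pipeline of the article; the function name `kmer_positions` and the toy sequence are hypothetical.

```python
from collections import defaultdict


def kmer_positions(sequence: str, k: int) -> dict:
    """Map each k-mer to the list of 0-based start positions where it occurs."""
    positions = defaultdict(list)
    for start in range(len(sequence) - k + 1):
        positions[sequence[start:start + k]].append(start)
    return dict(positions)


# Toy sequence, invented for illustration only.
features = kmer_positions("UUUUAUUUU", k=4)
for kmer, starts in sorted(features.items()):
    print(kmer, "count:", len(starts), "positions:", starts)
```

Keeping positions rather than only counts makes it possible to ask whether a motif matters more near one end of the sequence than the other.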

Classification, Parameter Estimation and State Estimation

An Engineering Approach Using MATLAB, 2nd Edition

Book (2017) - Bangjun Lei, Guangzhu Xu, Ming Feng, Yaobin Zou, Ferdinand van der Heijden, D. de Ridder, David Tax

A practical introduction to intelligent computer vision theory, design, implementation, and technology.

The past decade has witnessed epic growth in image processing and intelligent computer vision technology. Advancements in machine learning methods, especially among AdaBoost varieties and particle filtering methods, have made machine learning in intelligent computer vision more accurate and reliable than ever before. The need for expert coverage of the state of the art in this burgeoning field has never been greater, and this book satisfies that need. Fully updated and extensively revised, this 2nd Edition of the popular guide provides designers, data analysts, researchers and advanced post-graduates with a fundamental yet wholly practical introduction to intelligent computer vision. The authors walk you through the basics of computer vision, past and present, and they explore the more subtle intricacies of intelligent computer vision, with an emphasis on intelligent measurement systems. Using many timely, real-world examples, they explain and vividly demonstrate the latest developments in image and video processing techniques and technologies for machine learning in computer vision systems, including:

PRTools5 software for MATLAB, especially the latest representation and generalization software toolbox for PRTools5
Machine learning applications for computer vision, with detailed discussions of contemporary state estimation techniques versus older content on particle filter methods
The latest techniques for classification and supervised learning, with an emphasis on Neural Network, Genetic State Estimation and other particle filter and AI state estimation methods
All-new coverage of AdaBoost and its implementation in PRTools5

A valuable working resource for professionals and an excellent introduction for advanced-level students, this 2nd Edition features a wealth of illustrative examples, ranging from basic techniques to advanced intelligent computer vision system implementations. Additional examples and tutorials, as well as a question and solution forum, can be found on a companion website.

Single-molecule protein sequencing through fingerprinting: computational assessment

Journal article (2015) - Y Yao, MW Docter, J van Ginkel, D de Ridder, C Joo

Proteins are vital in all biological systems as they constitute the main structural and functional components of cells. Recent advances in mass spectrometry have brought the promise of complete proteomics by helping draft the human proteome. Yet, this commonly used protein sequencing technique has fundamental limitations in sensitivity. Here we propose a method for single-molecule (SM) protein sequencing. A major challenge lies in the fact that proteins are composed of 20 different amino acids, which demands 20 molecular reporters. We computationally demonstrate that it suffices to measure only two types of amino acids to identify proteins and suggest an experimental scheme using SM fluorescence. When achieved, this highly sensitive approach will result in a paradigm shift in proteomics, with major impact in the biological and medical sciences. ...
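
The claim that reading out only two types of amino acids can suffice for identification can be sketched as follows. Each protein is reduced to the ordered sequence of the two labeled residue types, and a protein counts as identifiable when no other database entry shares that reduced sequence. The abstract does not name the two residue types or the exact readout; cysteine (C) and lysine (K) and an order-only fingerprint are assumptions here, and the toy sequences are invented for illustration.

```python
from collections import Counter


def fingerprint(sequence: str, labeled: str = "CK") -> str:
    """Keep only the labeled residue types, preserving their order."""
    return "".join(residue for residue in sequence if residue in labeled)


def uniquely_identifiable(proteins: dict, labeled: str = "CK") -> set:
    """Return names of proteins whose fingerprint is unique in the database."""
    prints = {name: fingerprint(seq, labeled) for name, seq in proteins.items()}
    occurrences = Counter(prints.values())
    return {name for name, fp in prints.items() if occurrences[fp] == 1}


# Toy database, invented for illustration only.
toy_proteins = {
    "P1": "MKTCAYKLC",
    "P2": "MCCKAAKLL",
    "P3": "AKKLCGGCK",
    "P4": "MKACGGKVC",
}

for name, seq in toy_proteins.items():
    print(name, fingerprint(seq))
print("identifiable:", sorted(uniquely_identifiable(toy_proteins)))
```

In this toy database P1 and P4 collapse to the same fingerprint and cannot be told apart, while P2 and P3 remain identifiable; the fraction of a real proteome that stays unique under such a reduction is the quantity a computational assessment would estimate.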

Data-driven codon optimization in Saccharomyces cerevisiae

Poster (2013) - Alexey Gritsenko, Frank Koopman, Marcel Reinders, Jean Marc Daran, Dick de Ridder

Conditional random fields for protein function prediction

Conference paper (2013) - T Gehrmann, M Loog, MJT Reinders, D de Ridder

Genome duplication and mutations in ACE2 cause multicellular, fast-sedimenting phenotypes in evolved Saccharomyces cerevisiae

Conference paper (2013) - B Oud, VG Guadalupe Medina, K Nijkamp, Dick de Ridder, JT Pronk, AJA van Maris, JM Daran

Laboratory evolution of the yeast Saccharomyces cerevisiae in bioreactor batch cultures yielded variants that grow as multicellular, fast-sedimenting clusters. Knowledge of the molecular basis of this phenomenon may contribute to the understanding of natural evolution of multicellularity and to manipulating cell sedimentation in laboratory and industrial applications of S. cerevisiae. Multicellular, fast-sedimenting lineages obtained from a haploid S. cerevisiae strain in two independent evolution experiments were analyzed by whole genome resequencing. The two evolved cell lines showed different frameshift mutations in a stretch of eight adenosines in ACE2, which encodes a transcriptional regulator involved in cell cycle control and mother-daughter cell separation. Introduction of the two ace2 mutant alleles into the haploid parental strain led to slow-sedimenting cell clusters that consisted of just a few cells, thus representing only a partial reconstruction of the evolved phenotype. In addition to single-nucleotide mutations, a whole-genome duplication event had occurred in both evolved multicellular strains. Construction of a diploid reference strain with two mutant ace2 alleles led to complete reconstruction of the multicellular, fast-sedimenting phenotype. This study shows that whole-genome duplication and a frameshift mutation in ACE2 are sufficient to generate a fast-sedimenting, multicellular phenotype in S. cerevisiae. The nature of the ace2 mutations and their occurrence in two independent evolution experiments encompassing fewer than 500 generations of selective growth suggest that switching between unicellular and multicellular phenotypes may be relevant for competitiveness of S. cerevisiae in natural environments. ...

Predicting functional effect of human missense mutations

Poster (2013) - Bastiaan van den Berg, JM Thornton, Marcel Reinders, Dick de Ridder, TAP Beer

Our aim is to prioritize human missense mutations by their probability of being disease causing. Such a computational method could be used to obtain a reduced set of mutations with a relatively large fraction of disease related mutations, thereby aiding in the search for this type of mutation within a large mutation set.

Whereas a range of methods is available for this purpose, only a few use the 1000G data to obtain a set of neutral mutations. The novelty of our approach is the use of separate classifiers, each trained on the subset of mutations from one amino acid to any other amino acid. Combined, these classifiers show improved performance compared to the often-used prediction method PolyPhen2. ...

Using predictive models to engineer biology: a case study in codon optimization

Conference paper (2013) - A Gritsenko, MJT Reinders, D de Ridder

Local topological signatures for network-based prediction of biological function

Conference paper (2013) - W Winterbach, PFA Van Mieghem, MJT Reinders, H Wang, D de Ridder

A rational protein redesign method for improved secretion yields in Aspergillus niger

Poster (2012) - Bastiaan van den Berg, Marcel Reinders, J.M. van der Laan, J.A. Roubos, Dick de Ridder

Proposal for an enzyme redesign method to improve production rates in Aspergillus niger

Poster (2012) - Bastiaan van den Berg, Marcel Reinders, HJ Pel, J.A. Roubos, Dick de Ridder

Relating sequence properties to protein secretion

Poster (2011) - Bastiaan van den Berg, Marcel Reinders, HJ Pel, L Wu, J.A. Roubos, Dick de Ridder

Aspergillus niger is widely used for industrial enzyme production. Knowledge of high-level protein secretion could be useful to improve production rates. We used sequence-based classification methods to identify important properties for successful high-level secretion, which will be used to redesign proteins for improved secretion. ...

Relating amino acid patterns to successful high-level protein secretion in Aspergillus niger

Poster (2011) - Bastiaan van den Berg, Marc Hulsman, Marcel Reinders, L Wu, HJ Pel, J.A. Roubos, Dick de Ridder

DNA Microarray studies of Hematopoietic Subpopulations

Book chapter (2009) - K Pike-Overzet, D de Ridder, T Schonewille, FJT Staal

Robust manifold learning

Report (2003) - D de Ridder, V Franc

Texture description by independent components

Conference paper (2002) - D de Ridder, RPW Duin, J Kittler

A model for probabilistic independent component subspace analysis is developed and applied to texture description. Experiments show it to perform comparably to a Gaussian model, and to be useful mainly for problems in which the detection of rarely occurring, high-frequency image elements is important. ...