Nanopore sequencing enables near-complete de novo assembly of Saccharomyces cerevisiae reference strain CEN.PK113-7D

Journal Article (2017)
Authors

Alex Salazar (Broad Institute of MIT and Harvard, TU Delft - Pattern Recognition and Bioinformatics)

A.R. Gorter de Vries (TU Delft - BT/Industriele Microbiologie)

Marcel van den Broek (TU Delft - BT/Industriele Microbiologie)

Melanie Wijsman (TU Delft - BT/Industriele Microbiologie)

Pilar de la Torre (TU Delft - BT/Industriele Microbiologie)

A. Brickwedde (TU Delft - BT/Industriele Microbiologie)

Nick Brouwers (TU Delft - BT/Industriele Microbiologie)

Jean-Marc Daran (TU Delft - BT/Industriele Microbiologie)

Thomas Abeel (TU Delft - Pattern Recognition and Bioinformatics, Broad Institute of MIT and Harvard)

Research Group
Pattern Recognition and Bioinformatics
Copyright
© 2017 A.N. Salazar, A.R. Gorter de Vries, M.A. van den Broek, M. Wijsman, P. de la Torre, A. Brickwedde, N. Brouwers, J.G. Daran, T.E.P.M.F. Abeel
To reference this document use:
https://doi.org/10.1093/femsyr/fox074
Publication Year
2017
Language
English
Issue number
7
Volume number
17
Pages (from-to)
1-11
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The haploid Saccharomyces cerevisiae strain CEN.PK113-7D is a popular model system for metabolic engineering and systems biology research. Current genome assemblies are based on short-read sequencing data that were scaffolded by homology to strain S288C. However, these assemblies contain large sequence gaps, particularly in subtelomeric regions, and the assumption of perfect homology to S288C for scaffolding introduces bias. In this study, we obtained a near-complete genome assembly of CEN.PK113-7D using only Oxford Nanopore Technologies' MinION sequencing platform. Fifteen of the 16 chromosomes, the mitochondrial genome and the 2-μm plasmid are assembled in single contigs, and all but one chromosome starts or ends in a telomere repeat. Compared with the previous assembly of CEN.PK113-7D, this improved genome assembly contains 770 kbp of added sequence carrying 248 gene annotations. Many of these genes encode functions determining fitness in specific growth conditions and are therefore highly relevant for various industrial applications. Furthermore, we discovered a translocation between chromosomes III and VIII that caused misidentification of a MAL locus in the previous CEN.PK113-7D assembly. This study demonstrates the power of long-read sequencing by providing a high-quality reference assembly and annotation of CEN.PK113-7D, and places a caveat on the assumed genome stability of microorganisms.