Introduction of a human- and keyboard-friendly N-glycan nomenclature

Journal Article (2024)
Author(s)

F. Altmann (BOKU-University of Natural Resources and Life Sciences)

Johannes Helm (BOKU-University of Natural Resources and Life Sciences)

M. Pabst (TU Delft - BT/Environmental Biotechnology)

Johannes Stadlmann (BOKU-University of Natural Resources and Life Sciences)

Research Group
BT/Environmental Biotechnology
DOI related publication
https://doi.org/10.3762/bjoc.20.53
Publication Year
2024
Language
English
Volume number
20
Pages (from-to)
607-620
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

In the beginning was the word. But there were no words for N-glycans, at least, no simple words. Next to chemical formulas, the IUPAC code can be regarded as the best, most reliable and yet immediately comprehensible annotation of oligosaccharide structures of any type from any source. When it comes to N-glycans, however, the venerable IUPAC code has been widely supplanted by highly simplified terms that count the number of antennae or of certain components such as galactoses, sialic acids and fucoses, and that leave only limited room for exact structure description. The highly illustrative, and fortunately now standardized, cartoon depictions have gained much ground in recent years. By their very nature, however, cartoons can be neither written nor spoken. The underlying machine codes (e.g., GlycoCT, WURCS) are definitely not intended for direct use in human communication. So, one might feel the need for a simple, yet intelligible and precise system for alphanumeric descriptions of the hundreds and thousands of N-glycan structures. Here, we present a system that describes N-glycans by defining their terminal elements. To minimize redundancy and the length of terms, the common elements of N-glycans are taken for granted. The preset reading order facilitates the definition of positional isomers. Combination with elements of the condensed IUPAC code makes it possible to describe even rather complex structural elements. Thus, this “proglycan” coding could be the missing link between drawn structures and software-oriented representations of N-glycan structures. In addition, it may greatly facilitate keyboard-based mining for glycan substructures in glycan repositories.