Evolution, substrate specificity and subfamily classification of glycoside hydrolase family 5 (GH5)

Title: Evolution, substrate specificity and subfamily classification of glycoside hydrolase family 5 (GH5)
Publication Type: Journal Article
Year of Publication: 2012
Authors: Aspeborg H, Coutinho PM, Wang Y, Brumer H, Henrissat B
Date Published: SEP 20
Type of Article: Article

Background: The large Glycoside Hydrolase family 5 (GH5) groups together a wide range of enzymes acting on beta-linked oligo- and polysaccharides, and glycoconjugates from a large spectrum of organisms. The long and complex evolution of this family of enzymes and its broad sequence diversity limit functional prediction. With the objective of improving the differentiation of enzyme specificities in a knowledge-based context, and to obtain new evolutionary insights, we present here a new, robust subfamily classification of family GH5.

Results: About 80% of the current sequences were assigned to 51 subfamilies in a global analysis of all publicly available GH5 sequences and associated biochemical data. Examination of subfamilies with catalytically active members revealed that one third are monospecific (containing a single enzyme activity), although new functions may be discovered with biochemical characterization in the future. Furthermore, twenty subfamilies presently have no characterization whatsoever and many others have only limited structural and biochemical data. Mapping of functional knowledge onto the GH5 phylogenetic tree revealed that the sequence space of this historical and industrially important family is far from well dispersed, highlighting targets in need of further study. The analysis also uncovered a number of GH5 proteins which have lost their catalytic machinery, indicating evolution towards novel functions.

Conclusion: Overall, the subfamily division of GH5 provides an actively curated resource for large-scale protein sequence annotation for glycogenomics; the subfamily assignments are openly accessible via the Carbohydrate-Active Enzyme database at