
Lecture time
October 22, 2025 16:00-17:00
Lecture location
Conference Room 314, Building 1, Jinfeng Laboratory
Lecture title
Towards intelligent computing to understand carbohydrate-active enzymes (CAZymes)
Introduction to the speaker

Zheng Jinfang is an associate researcher and independent PI at Zhijiang Laboratory. He has long worked on biological computing, bioinformatics algorithm development, and the construction of biological databases. His tools, databases, and models have been published in leading journals such as Nature Genetics, Cell Reports, and Nature Communications, for a total of 28 articles with 1,261 citations and a cumulative impact factor of 200. The tools and algorithms he has developed in the CAZyme field, dbCAN among them, are widely used by researchers. The dbCAN online platform handles an average of 30,000 computing tasks per year and ranks 80th of 5,980 entries in Database Commons, a global catalog of biological databases released by the China National Center for Bioinformation. His recent work focuses on AI-driven biological discovery and innovation, and he has developed several large models of proteins and cells to uncover evolutionary principles in biology.
Lecture summary
In recent years, the gut microbiome has come to be recognized as "humanity's second genome." It is closely tied to the host's nutritional metabolism, immune regulation, and inflammatory response. In the interaction between microorganisms and the host, carbohydrate-active enzymes (CAZymes) play a crucial role. They degrade, modify, and synthesize polysaccharide structures, determine how microorganisms utilize dietary fiber, and affect host health through metabolites such as short-chain fatty acids. However, CAZymes span many families with diverse structures and complex functions, and traditional manual curation or single-method statistical analysis can no longer meet the demand for large-scale, accurate analysis. With the rise of deep learning and large-model technologies, AI offers new ways to understand and predict microbial functions.
This talk presents a systematic research framework built around three levels, "data-algorithm-model", that aims to use AI to analyze CAZyme functions in depth and promote their translation into health applications. We established and continue to iterate on the integrated dbCAN3 platform ecosystem (including CGC-Finder, dbCAN-PUL, eCAMI/dbCAN-sub, and dbCAN-seq), systematically annotated approximately 500,000 CAZymes and 170,000 CAZyme gene clusters (CGCs) across 9,421 gut metagenome-assembled genomes (MAGs), and proposed a CGC substrate prediction strategy of "homology search + majority voting". In addition, a structure-based deep graph learning method that combines AlphaFold2 with GCN/attention mechanisms achieves fine-grained classification and function prediction of CAZyme subfamilies, significantly outperforming traditional methods, especially for remote homologs. Building on these algorithms, we constructed a standardized workflow from sequencing data -> functional annotation -> abundance assessment -> visualization, providing a reproducible toolchain for microbiome functional analysis and precision nutritional intervention.
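For readers unfamiliar with the "homology search + majority voting" idea mentioned above, the voting step can be illustrated with a minimal Python sketch. This is not the dbCAN implementation: the function name `predict_substrate`, the input format (a list of substrate labels taken from a CGC's homology hits), and the strict-majority threshold `min_fraction` are all assumptions made for illustration.

```python
from collections import Counter


def predict_substrate(hit_substrates, min_fraction=0.5):
    """Pick a substrate for one CGC by majority vote (illustrative sketch).

    hit_substrates: substrate labels of the homologous hits found for the
        genes of a single CGC (hypothetical input format).
    min_fraction: the winning label must hold strictly more than this share
        of the votes (assumed threshold, not taken from dbCAN).
    Returns the winning label, or None when no label reaches a majority.
    """
    if not hit_substrates:
        return None  # no homology hits, so no prediction
    counts = Counter(hit_substrates)
    label, votes = counts.most_common(1)[0]
    if votes / len(hit_substrates) <= min_fraction:
        return None  # ties and weak pluralities yield no prediction
    return label


# Two of three hits point to xylan, so xylan wins the vote.
print(predict_substrate(["xylan", "xylan", "pectin"]))
```

Requiring a strict majority is one possible design choice: it makes the sketch abstain when hits disagree evenly rather than guess a substrate.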
Everyone is welcome to attend and join the discussion.