
Yuanqiang Zou:
Cultivation and sequencing of commensal bacteria of the human gut through culture-based and metagenomics analyses

Date: 30-04-2023    Supervisor: Karsten Kristiansen




Trillions of bacteria colonizing the human gastrointestinal tract are increasingly recognized as key players in human health and disease, and these commensal bacteria co-evolve with their hosts throughout the lifespan. The number of microorganisms in the human gut is close to 10^14, which is 1.3 times the number of human somatic cells and equivalent to 60% of the dry weight of stool. With the advances in sequencing technology over the past decade, the composition of the gut microbiome and its causal links to complex metabolic traits have become well recognized. Metagenomic approaches that reveal the taxonomic and functional diversity of the human microbiota have increased our understanding of the links between specific microbes and host health. Reference genomes are essential for bacterial taxonomic annotation in reference-based metagenomic analysis. Many studies have successfully recovered metagenome-assembled genomes (MAGs) from metagenomic data, expanding the known microbial diversity and improving the taxonomic annotation rate by providing genomes for numerous uncultivated organisms. However, the quality and authenticity of MAGs limit the accuracy of high-resolution analysis of the gut microbiome.

Bacterial reference genomes and cultured isolates enable high-resolution taxonomic characterization in reference-based metagenomic analysis and a better understanding of the functions of the gut microbiome, for example through causal studies of host-microbiota interactions based on experimental validation. Culture-based methods can therefore provide important new insights for the exploration of the gut microbiota. Recent studies have shown that many bacteria in the human gastrointestinal tract that were once considered unculturable can in fact be cultured, and the number of cultured bacterial species keeps increasing through the efforts of culture-based microbiologists. Furthermore, high-quality genomes sequenced from cultured bacteria will continue to expand the collection of reference genomes and fill the gap between MAGs and cultured genomes.

We conducted large-scale bacterial cultivation and genome sequencing from fecal samples collected from healthy donors. In total, over 20,000 bacteria were successfully isolated, and an expanded reference genome catalogue comprising 4,000 bacterial genomes was constructed. Of these, 3,324 genomes qualified as high-quality. We further classified these high-quality genomes into 527 species-level clusters based on average nucleotide identity (ANI), using a value of 95% as the species demarcation. The 527 clusters belong to 8 known phyla, and 222 of them are taxonomically novel. We compared these clusters with the Unified Human Gastrointestinal Genome (UHGG) collection, an integrated genome collection comprising 4,644 prokaryotic species from the human gut, 70% of which lack cultured representatives. The comparison revealed that 126 MAGs could be matched by our cultivated genomes. Functional insights derived from the genome data, including carbohydrate-active enzymes (CAZymes), HMO-degrading CAZymes, and secondary metabolite biosynthetic gene clusters, highlight the ecological importance of these bacteria in the human gastrointestinal tract. We also found that these genomes resolve the vast majority of new virus–bacteria linkages and improve the accuracy with which interactions between bacteriophages and bacteria can be uncovered. Unexpectedly, the explored diversity of Collinsella aerofaciens showed high genomic and functional variation across populations.
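The species-level grouping described above can be illustrated with a minimal Python sketch. It assumes that pairwise ANI values have already been computed by some external tool and that genomes are merged by single linkage whenever their ANI is at least 95%. The abstract does not state which ANI tool or clustering algorithm was used, so the function name `cluster_by_ani`, the linkage rule, and the genome labels and ANI values in the example are all hypothetical.

```python
from typing import Dict, List, Tuple


def cluster_by_ani(
    genomes: List[str],
    pairwise_ani: Dict[Tuple[str, str], float],
    threshold: float = 95.0,
) -> List[List[str]]:
    """Group genomes into species-level clusters by single linkage.

    Two genomes end up in the same cluster if they are connected by a
    chain of pairs whose ANI is >= threshold (an assumed rule, not
    necessarily the one used in the thesis).
    """
    # Union-find structure: every genome starts as its own cluster root.
    parent = {g: g for g in genomes}

    def find(g: str) -> str:
        # Follow parent links to the root, shortening the path as we go.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    # Merge the clusters of every pair that meets the ANI threshold.
    for (a, b), ani in pairwise_ani.items():
        if ani >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    # Collect the members of each cluster under its root.
    clusters: Dict[str, List[str]] = {}
    for g in genomes:
        clusters.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical genome labels and ANI values, for illustration only.
example_genomes = ["g1", "g2", "g3", "g4"]
example_ani = {
    ("g1", "g2"): 98.7,  # same species-level cluster
    ("g1", "g3"): 82.1,
    ("g2", "g3"): 81.5,
    ("g3", "g4"): 96.2,  # same species-level cluster
}
example_clusters = cluster_by_ani(example_genomes, example_ani)
print(example_clusters)
```

With these made-up values, g1 and g2 fall into one cluster and g3 and g4 into another. Single linkage is only one possible choice; centroid-based or complete-linkage clustering can give different cluster counts on the same ANI matrix.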

Moreover, a polyphasic taxonomic study was conducted for 4 novel species: strain TF01-11 as a novel species within Lachnospiraceae, AF73-05CM02 as a novel species of Christensenella, and AF52-21 and CM04-06 as two different novel species of Faecalibacterium, which can be used as potential novel probiotics for the prevention of disease. Based on phenotypic and chemotaxonomic characteristics, 16S rRNA gene sequences, and whole-genome sequences, we found several significant differences between our novel strains and the type strains of the related known species. Most importantly, our novel strains shared low genomic similarity with the existing type strains, with ANI values below 95%. Finally, we proposed the names of these novel species as Butyribacter intestini gen. nov., sp. nov. (type strain TF01-11T = CGMCC 1.5203T = DSM 105140T), Christensenella intestinihominis sp. nov. (type strain AF73-05CM02T = CGMCC 1.5207T = DSM 103477T), Faecalibacterium butyricigenerans sp. nov. (type strain AF52-21T = CGMCC 1.5206T = DSM 103434T), and Faecalibacterium longum sp. nov. (type strain CM04-06T = CGMCC 1.5208T = DSM 103432T), respectively.

The expanded genomic resource provides essential resolution for a better understanding of the diversity of the gut microbiome and will enable accurate exploration of microbial functions through genomic analyses and experimental validation. The novel species identified here will be used as a new generation of probiotics and as a biotherapeutic approach for the treatment of diseases.