Sequencing introduced false positive rare taxa lead to biased microbial community diversity, assembly, and interaction interpretation in amplicon studies

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Sequencing introduced false positive rare taxa lead to biased microbial community diversity, assembly, and interaction interpretation in amplicon studies. / Jia, Yangyang; Zhao, Shengguo; Guo, Wenjie; Peng, Ling; Zhao, Fang; Wang, Lushan; Fan, Guangyi; Zhu, Yuanfang; Xu, Dayou; Liu, Guilin; Wang, Ruoqing; Fang, Xiaodong; Zhang, He; Kristiansen, Karsten; Zhang, Wenwei; Chen, Jianwei.

In: Environmental Microbiomes, Vol. 17, 43, 2022.

Harvard

Jia, Y, Zhao, S, Guo, W, Peng, L, Zhao, F, Wang, L, Fan, G, Zhu, Y, Xu, D, Liu, G, Wang, R, Fang, X, Zhang, H, Kristiansen, K, Zhang, W & Chen, J 2022, 'Sequencing introduced false positive rare taxa lead to biased microbial community diversity, assembly, and interaction interpretation in amplicon studies', Environmental Microbiomes, vol. 17, 43. https://doi.org/10.1186/s40793-022-00436-y

APA

Jia, Y., Zhao, S., Guo, W., Peng, L., Zhao, F., Wang, L., Fan, G., Zhu, Y., Xu, D., Liu, G., Wang, R., Fang, X., Zhang, H., Kristiansen, K., Zhang, W., & Chen, J. (2022). Sequencing introduced false positive rare taxa lead to biased microbial community diversity, assembly, and interaction interpretation in amplicon studies. Environmental Microbiomes, 17, [43]. https://doi.org/10.1186/s40793-022-00436-y

Vancouver

Jia Y, Zhao S, Guo W, Peng L, Zhao F, Wang L et al. Sequencing introduced false positive rare taxa lead to biased microbial community diversity, assembly, and interaction interpretation in amplicon studies. Environmental Microbiomes. 2022;17:43. https://doi.org/10.1186/s40793-022-00436-y

Author

Jia, Yangyang ; Zhao, Shengguo ; Guo, Wenjie ; Peng, Ling ; Zhao, Fang ; Wang, Lushan ; Fan, Guangyi ; Zhu, Yuanfang ; Xu, Dayou ; Liu, Guilin ; Wang, Ruoqing ; Fang, Xiaodong ; Zhang, He ; Kristiansen, Karsten ; Zhang, Wenwei ; Chen, Jianwei. / Sequencing introduced false positive rare taxa lead to biased microbial community diversity, assembly, and interaction interpretation in amplicon studies. In: Environmental Microbiomes. 2022 ; Vol. 17.

Bibtex

@article{f7a713c91daa454185e31f8194cbdedb,
title = "Sequencing introduced false positive rare taxa lead to biased microbial community diversity, assembly, and interaction interpretation in amplicon studies",
abstract = "Background: A growing number of studies have demonstrated potentially disproportionate functional and ecological contributions of rare taxa in microbial communities. However, the study of the microbial rare biosphere is hampered by the inherent scarcity of these taxa and the limitations of currently available techniques. Sample-wise cross-contamination can be introduced by sample index misassignment in metabarcoding amplicon sequencing, the most widely used approach. Although downstream bioinformatic quality control and clustering or denoising algorithms can remove sequencing errors and non-biological artifact reads, no algorithm can eliminate high-quality reads arising from sample-wise cross-contamination introduced by index misassignment, making it difficult to distinguish between bona fide rare taxa and potential false positives in metabarcoding studies. Results: We thoroughly evaluated the rate of index misassignment of the widely used NovaSeq 6000 and DNBSEQ-G400 sequencing platforms using both commercial and customized mock communities, and observed a significantly lower fraction of potential false positive reads for DNBSEQ-G400 than for NovaSeq 6000 (0.08% vs. 5.68%). Significant batch effects could be caused by stochastically introduced false positive or false negative rare taxa. These false detections could also inflate the alpha diversity of relatively simple microbial communities and lead to underestimation of that of complex ones. A further test using a set of cow rumen samples showed that the two sequencing platforms reported different rare taxa. Correlation analysis of the rare taxa detected by each sequencing platform demonstrated that the rare taxa identified by the DNBSEQ-G400 platform were much more likely to be correlated with the physicochemical properties of rumen fluid than those identified by the NovaSeq 6000 platform. Community assembly mechanism and microbial network correlation analyses indicated that false positive or false negative rare taxa detection could lead to biased inference of community assembly mechanisms and to the identification of spurious keystone species. Conclusions: We strongly recommend proper positive, negative, and blank controls, technical replicates, and careful sequencing platform selection in future amplicon studies, especially those focusing on the microbial rare biosphere.",
keywords = "Amplicon sequencing, Community assembly, Index misassignment, Keystone species, Microbial rare taxa",
author = "Yangyang Jia and Shengguo Zhao and Wenjie Guo and Ling Peng and Fang Zhao and Lushan Wang and Guangyi Fan and Yuanfang Zhu and Dayou Xu and Guilin Liu and Ruoqing Wang and Xiaodong Fang and He Zhang and Karsten Kristiansen and Wenwei Zhang and Jianwei Chen",
note = "Publisher Copyright: {\textcopyright} 2022, The Author(s).",
year = "2022",
doi = "10.1186/s40793-022-00436-y",
language = "English",
volume = "17",
journal = "Environmental Microbiomes",
issn = "1944-3277",
publisher = "BioMed Central Ltd.",

}

RIS

TY - JOUR

T1 - Sequencing introduced false positive rare taxa lead to biased microbial community diversity, assembly, and interaction interpretation in amplicon studies

AU - Jia, Yangyang

AU - Zhao, Shengguo

AU - Guo, Wenjie

AU - Peng, Ling

AU - Zhao, Fang

AU - Wang, Lushan

AU - Fan, Guangyi

AU - Zhu, Yuanfang

AU - Xu, Dayou

AU - Liu, Guilin

AU - Wang, Ruoqing

AU - Fang, Xiaodong

AU - Zhang, He

AU - Kristiansen, Karsten

AU - Zhang, Wenwei

AU - Chen, Jianwei

N1 - Publisher Copyright: © 2022, The Author(s).

PY - 2022

Y1 - 2022

AB - Background: A growing number of studies have demonstrated potentially disproportionate functional and ecological contributions of rare taxa in microbial communities. However, the study of the microbial rare biosphere is hampered by the inherent scarcity of these taxa and the limitations of currently available techniques. Sample-wise cross-contamination can be introduced by sample index misassignment in metabarcoding amplicon sequencing, the most widely used approach. Although downstream bioinformatic quality control and clustering or denoising algorithms can remove sequencing errors and non-biological artifact reads, no algorithm can eliminate high-quality reads arising from sample-wise cross-contamination introduced by index misassignment, making it difficult to distinguish between bona fide rare taxa and potential false positives in metabarcoding studies. Results: We thoroughly evaluated the rate of index misassignment of the widely used NovaSeq 6000 and DNBSEQ-G400 sequencing platforms using both commercial and customized mock communities, and observed a significantly lower fraction of potential false positive reads for DNBSEQ-G400 than for NovaSeq 6000 (0.08% vs. 5.68%). Significant batch effects could be caused by stochastically introduced false positive or false negative rare taxa. These false detections could also inflate the alpha diversity of relatively simple microbial communities and lead to underestimation of that of complex ones. A further test using a set of cow rumen samples showed that the two sequencing platforms reported different rare taxa. Correlation analysis of the rare taxa detected by each sequencing platform demonstrated that the rare taxa identified by the DNBSEQ-G400 platform were much more likely to be correlated with the physicochemical properties of rumen fluid than those identified by the NovaSeq 6000 platform. Community assembly mechanism and microbial network correlation analyses indicated that false positive or false negative rare taxa detection could lead to biased inference of community assembly mechanisms and to the identification of spurious keystone species. Conclusions: We strongly recommend proper positive, negative, and blank controls, technical replicates, and careful sequencing platform selection in future amplicon studies, especially those focusing on the microbial rare biosphere.

KW - Amplicon sequencing

KW - Community assembly

KW - Index misassignment

KW - Keystone species

KW - Microbial rare taxa

U2 - 10.1186/s40793-022-00436-y

DO - 10.1186/s40793-022-00436-y

M3 - Journal article

C2 - 35978448

AN - SCOPUS:85136152665

VL - 17

JO - Environmental Microbiomes

JF - Environmental Microbiomes

SN - 1944-3277

M1 - 43

ER -