SOAPMetaS: profiling large metagenome datasets efficiently on distributed clusters
Publication: Contribution to journal › Journal article › Research › peer-reviewed
Standard
SOAPMetaS: profiling large metagenome datasets efficiently on distributed clusters. / He, Shixu; Huang, Zhibo; Wang, Xiaohan; Fang, Lin; Li, Shengkang; Zhang, Yong; Zhang, Gengyun.
In: Bioinformatics, Vol. 37, No. 7, 2021, pp. 1021-1023.
RIS
TY - JOUR
T1 - SOAPMetaS
T2 - profiling large metagenome datasets efficiently on distributed clusters
AU - He, Shixu
AU - Huang, Zhibo
AU - Wang, Xiaohan
AU - Fang, Lin
AU - Li, Shengkang
AU - Zhang, Yong
AU - Zhang, Gengyun
PY - 2021
Y1 - 2021
N2 - Rapid increase of the data size in metagenome researches has raised the demand for new tools to process large datasets efficiently. To accelerate the metagenome profiling process in the scenario of big data, we developed SOAPMetaS, a marker gene-based multiple-sample metagenome profiling tool built on Apache Spark. SOAPMetaS demonstrates high performance and scalability to process large datasets. It can process 80 samples of FASTQ data, summing up to 416 GiB, in around half an hour; and the accuracy of species profiling results of SOAPMetaS is similar to that of MetaPhlAn2. SOAPMetaS can deal with a large volume of metagenome data more efficiently than common-used single-machine tools.
AB - Rapid increase of the data size in metagenome researches has raised the demand for new tools to process large datasets efficiently. To accelerate the metagenome profiling process in the scenario of big data, we developed SOAPMetaS, a marker gene-based multiple-sample metagenome profiling tool built on Apache Spark. SOAPMetaS demonstrates high performance and scalability to process large datasets. It can process 80 samples of FASTQ data, summing up to 416 GiB, in around half an hour; and the accuracy of species profiling results of SOAPMetaS is similar to that of MetaPhlAn2. SOAPMetaS can deal with a large volume of metagenome data more efficiently than common-used single-machine tools.
U2 - 10.1093/bioinformatics/btaa697
DO - 10.1093/bioinformatics/btaa697
M3 - Journal article
C2 - 32766813
VL - 37
SP - 1021
EP - 1023
JO - Bioinformatics (Online)
JF - Bioinformatics (Online)
SN - 1367-4811
IS - 7
ER -
ID: 272641995