SOAPMetaS: profiling large metagenome datasets efficiently on distributed clusters

Publication: Contribution to journal > Journal article > Research > Peer-reviewed

Standard

SOAPMetaS: profiling large metagenome datasets efficiently on distributed clusters. / He, Shixu; Huang, Zhibo; Wang, Xiaohan; Fang, Lin; Li, Shengkang; Zhang, Yong; Zhang, Gengyun.

In: Bioinformatics, Vol. 37, No. 7, 2021, pp. 1021-1023.

Harvard

He, S, Huang, Z, Wang, X, Fang, L, Li, S, Zhang, Y & Zhang, G 2021, 'SOAPMetaS: profiling large metagenome datasets efficiently on distributed clusters', Bioinformatics, vol. 37, no. 7, pp. 1021-1023. https://doi.org/10.1093/bioinformatics/btaa697

APA

He, S., Huang, Z., Wang, X., Fang, L., Li, S., Zhang, Y., & Zhang, G. (2021). SOAPMetaS: profiling large metagenome datasets efficiently on distributed clusters. Bioinformatics, 37(7), 1021-1023. https://doi.org/10.1093/bioinformatics/btaa697

Vancouver

He S, Huang Z, Wang X, Fang L, Li S, Zhang Y et al. SOAPMetaS: profiling large metagenome datasets efficiently on distributed clusters. Bioinformatics. 2021;37(7):1021-1023. https://doi.org/10.1093/bioinformatics/btaa697

Author

He, Shixu ; Huang, Zhibo ; Wang, Xiaohan ; Fang, Lin ; Li, Shengkang ; Zhang, Yong ; Zhang, Gengyun. / SOAPMetaS: profiling large metagenome datasets efficiently on distributed clusters. In: Bioinformatics. 2021 ; Vol. 37, No. 7. pp. 1021-1023.

Bibtex

@article{8eede65313714487ab46c1b208328924,
title = "SOAPMetaS: profiling large metagenome datasets efficiently on distributed clusters",
abstract = "Rapid increase of the data size in metagenome researches has raised the demand for new tools to process large datasets efficiently. To accelerate the metagenome profiling process in the scenario of big data, we developed SOAPMetaS, a marker gene-based multiple-sample metagenome profiling tool built on Apache Spark. SOAPMetaS demonstrates high performance and scalability to process large datasets. It can process 80 samples of FASTQ data, summing up to 416 GiB, in around half an hour; and the accuracy of species profiling results of SOAPMetaS is similar to that of MetaPhlAn2. SOAPMetaS can deal with a large volume of metagenome data more efficiently than common-used single-machine tools.",
author = "Shixu He and Zhibo Huang and Xiaohan Wang and Lin Fang and Shengkang Li and Yong Zhang and Gengyun Zhang",
year = "2021",
doi = "10.1093/bioinformatics/btaa697",
language = "English",
volume = "37",
pages = "1021--1023",
journal = "Bioinformatics (Online)",
issn = "1367-4811",
publisher = "Oxford University Press",
number = "7",
}

RIS

TY - JOUR

T1 - SOAPMetaS

T2 - profiling large metagenome datasets efficiently on distributed clusters

AU - He, Shixu

AU - Huang, Zhibo

AU - Wang, Xiaohan

AU - Fang, Lin

AU - Li, Shengkang

AU - Zhang, Yong

AU - Zhang, Gengyun

PY - 2021

Y1 - 2021

N2 - Rapid increase of the data size in metagenome researches has raised the demand for new tools to process large datasets efficiently. To accelerate the metagenome profiling process in the scenario of big data, we developed SOAPMetaS, a marker gene-based multiple-sample metagenome profiling tool built on Apache Spark. SOAPMetaS demonstrates high performance and scalability to process large datasets. It can process 80 samples of FASTQ data, summing up to 416 GiB, in around half an hour; and the accuracy of species profiling results of SOAPMetaS is similar to that of MetaPhlAn2. SOAPMetaS can deal with a large volume of metagenome data more efficiently than common-used single-machine tools.

AB - Rapid increase of the data size in metagenome researches has raised the demand for new tools to process large datasets efficiently. To accelerate the metagenome profiling process in the scenario of big data, we developed SOAPMetaS, a marker gene-based multiple-sample metagenome profiling tool built on Apache Spark. SOAPMetaS demonstrates high performance and scalability to process large datasets. It can process 80 samples of FASTQ data, summing up to 416 GiB, in around half an hour; and the accuracy of species profiling results of SOAPMetaS is similar to that of MetaPhlAn2. SOAPMetaS can deal with a large volume of metagenome data more efficiently than common-used single-machine tools.

U2 - 10.1093/bioinformatics/btaa697

DO - 10.1093/bioinformatics/btaa697

M3 - Journal article

C2 - 32766813

VL - 37

SP - 1021

EP - 1023

JO - Bioinformatics (Online)

JF - Bioinformatics (Online)

SN - 1367-4811

IS - 7

ER -