Joint identification of sex and sex-linked scaffolds in non-model organisms using low depth sequencing data
Publication: Contribution to journal › Journal article › Research › peer-reviewed
Standard
Joint identification of sex and sex-linked scaffolds in non-model organisms using low depth sequencing data. / Nursyifa, Casia; Brüniche-Olsen, Anna; Garcia-Erill, Genis; Heller, Rasmus; Albrechtsen, Anders.
In: Molecular Ecology Resources, Vol. 22, No. 2, 2022, pp. 458-467.
RIS
TY - JOUR
T1 - Joint identification of sex and sex-linked scaffolds in non-model organisms using low depth sequencing data
AU - Nursyifa, Casia
AU - Brüniche-Olsen, Anna
AU - Garcia-Erill, Genis
AU - Heller, Rasmus
AU - Albrechtsen, Anders
PY - 2022
Y1 - 2022
N2 - Being able to assign sex to individuals and identify autosomal and sex-linked scaffolds is essential in most population genomic analyses. Non-model organisms often have genome assemblies at scaffold level and lack characterization of sex-linked scaffolds. Previous methods to identify sex and sex-linked scaffolds have relied either on synteny between the non-model organism and a closely related species or on prior knowledge about the sex of the samples. In the latter case, the difference in depth of coverage between the autosomes and the sex chromosomes is used. Here, we present “sex assignment through coverage” (SATC), a method to assign sex to samples and identify sex-linked scaffolds from next-generation sequencing (NGS) data. The method works for species with a homogametic/heterogametic sex determination system and only requires a scaffold-level reference assembly and sampling of both sexes with whole-genome sequencing (WGS) data. We use the sequencing depth distribution across scaffolds to jointly identify: (i) male and female individuals, and (ii) sex-linked scaffolds. This is achieved by projecting the scaffold depths into a low-dimensional space using principal component analysis (PCA), followed by Gaussian mixture clustering. We demonstrate the applicability of our method using data from five mammal species and a bird species complex. The method is freely available at https://github.com/popgenDK/SATC as R code and a graphical user interface (GUI).
U2 - 10.1111/1755-0998.13491
DO - 10.1111/1755-0998.13491
M3 - Journal article
C2 - 34431216
VL - 22
SP - 458
EP - 467
JO - Molecular Ecology Resources
JF - Molecular Ecology Resources
SN - 1755-098X
IS - 2
ER -
ID: 279348408
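
The abstract describes projecting per-scaffold sequencing depths into a low-dimensional space with PCA and then clustering the samples with a Gaussian mixture. The Python sketch below illustrates that general idea on a tiny made-up data set; it is not the SATC implementation (which is distributed as R code at the URL above). Everything in it is an assumption made for illustration: the function names, the toy depth values, the use of only the first principal component, the hand-written two-component EM, and the 0.75 depth-ratio cutoff used to flag scaffolds.

```python
import math

def pc1_scores(rows, iters=100):
    """Score of each sample on the first principal component (power iteration)."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    X = [[r[j] - means[j] for j in range(m)] for r in rows]  # centre each scaffold
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iters):
        w = [sum(x * vj for x, vj in zip(row, v)) for row in X]        # X v
        u = [sum(X[i][j] * w[i] for i in range(n)) for j in range(m)]  # X^T (X v)
        norm = math.sqrt(sum(c * c for c in u))
        if norm == 0.0:
            break  # no variance at all in the data
        v = [c / norm for c in u]
    return [sum(x * vj for x, vj in zip(row, v)) for row in X]

def two_cluster_em(x, iters=100, min_var=1e-6):
    """Hard labels (0/1) from a two-component 1-D Gaussian mixture fitted by EM."""
    n = len(x)
    mean = sum(x) / n
    var0 = max(sum((xi - mean) ** 2 for xi in x) / n, min_var)
    mu, var, weight = [min(x), max(x)], [var0, var0], [0.5, 0.5]
    resp = [[0.5, 0.5] for _ in x]
    for _ in range(iters):
        for i, xi in enumerate(x):  # E-step: responsibilities per sample
            p = [weight[k] * math.exp(-0.5 * (xi - mu[k]) ** 2 / var[k])
                 / math.sqrt(2 * math.pi * var[k]) for k in (0, 1)]
            total = sum(p)
            if total > 0.0:
                resp[i] = [p[0] / total, p[1] / total]
        for k in (0, 1):  # M-step: update mean, variance and weight
            nk = sum(r[k] for r in resp)
            if nk > 1e-12:
                mu[k] = sum(r[k] * xi for r, xi in zip(resp, x)) / nk
                var[k] = max(sum(r[k] * (xi - mu[k]) ** 2
                                 for r, xi in zip(resp, x)) / nk, min_var)
                weight[k] = nk / n
    return [0 if r[0] >= r[1] else 1 for r in resp]

def assign_sex(depth, ratio_cutoff=0.75):
    """Cluster samples on PC1, then flag scaffolds whose depth differs by cluster."""
    names = sorted(depth)
    rows = [depth[s] for s in names]
    labels = two_cluster_em(pc1_scores(rows))
    if len(set(labels)) < 2:
        raise ValueError("only one cluster found; both sexes may not be sampled")

    def cluster_mean(k, j):
        vals = [r[j] for r, lab in zip(rows, labels) if lab == k]
        return sum(vals) / len(vals)

    sex_linked = []
    for j in range(len(rows[0])):
        a, b = cluster_mean(0, j), cluster_mean(1, j)
        if min(a, b) / max(a, b) < ratio_cutoff:  # hypothetical cutoff
            sex_linked.append(j)
    hetero = None
    if sex_linked:  # cluster with lower depth on flagged scaffolds
        totals = [sum(cluster_mean(k, j) for j in sex_linked) for k in (0, 1)]
        hetero = 0 if totals[0] < totals[1] else 1
    return {
        "cluster": dict(zip(names, labels)),
        "sex_linked_scaffolds": sex_linked,
        "heterogametic": sorted(s for s, lab in zip(names, labels) if lab == hetero),
    }

# Made-up normalized depths (sample -> depth per scaffold); scaffold 3 is X-like.
depth = {
    "F1": [1.02, 0.98, 1.00, 1.01, 0.99],
    "F2": [0.99, 1.01, 1.02, 0.97, 1.00],
    "F3": [1.00, 1.00, 0.98, 1.03, 1.01],
    "M1": [1.01, 0.99, 1.01, 0.51, 1.00],
    "M2": [0.98, 1.02, 0.99, 0.49, 1.01],
    "M3": [1.00, 1.00, 1.00, 0.52, 0.98],
}
result = assign_sex(depth)
print(result)
```

In this toy setting the samples with roughly half depth on the flagged scaffold are reported as the heterogametic group. Real low-depth data would need depth normalization per sample and filtering of short or poorly mapped scaffolds, neither of which is shown here.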