Zilong Li:
Efficient statistical and computational methods for large scale sequencing data

Date: 03-11-2023    Supervisor: Anders Albrechtsen

This thesis presents statistical and computational methods for analyzing large-scale sequencing data in genomics and population genetics. Since the beginning of modern genomics, data size has been growing exponentially. Because the relevant methods in the field incur a high computational cost when dealing with large data sets, this thesis focuses particularly on the computational efficiency and scalability of the proposed algorithms and implementations.

The thesis consists of four manuscripts covering topics in principal component analysis (PCA), genotype imputation and phasing, population structure, and software development for genetic data.