TrustGWAS: A full-process workflow for encrypted GWAS using multi-key homomorphic encryption and pseudorandom number perturbation
Research output: Contribution to journal › Journal article › Research › peer-review
The statistical power of genome-wide association studies (GWASs) depends on the effective sample size. However, privacy and security concerns associated with individual-level genotype data pose great challenges for cross-institutional cooperation. Full-process cryptographic solutions are in demand but not yet available, particularly for the essential principal-component analysis (PCA) step. Here, we present TrustGWAS, a complete solution for secure, large-scale GWAS that reproduces PLINK's gold-standard results without compromising privacy. It supports basic PLINK steps including quality control (QC), linkage disequilibrium (LD) pruning, PCA, chi-square test, Cochran-Armitage trend test, covariate-supported logistic regression and linear regression, and their sequential combinations. TrustGWAS uses pseudorandom number perturbations for PCA and a multiparty scheme of multi-key homomorphic encryption for all other modules. TrustGWAS can evaluate 100,000 individuals with 1 million variants and complete a QC-LD-PCA-regression workflow within 50 h. We further discover gene loci associated with fasting blood glucose, consistent with the findings of the ChinaMAP project.
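To make one of the listed association steps concrete, the following is a minimal plaintext sketch of an allelic chi-square test on counts pooled from several institutions. It is not TrustGWAS code and involves no encryption; the function names (`allelic_chi_square`, `pool_counts`) and the allele counts are hypothetical, and the sketch assumes that per-site count tables can simply be summed, as they can in plaintext.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square (1 df) on a 2x2 allele-count table; returns (chi2, p)."""
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero margin means the table carries no association signal.
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


def pool_counts(site_tables):
    """Element-wise sum of per-site (case_alt, case_ref, ctrl_alt, ctrl_ref) tuples."""
    return tuple(sum(col) for col in zip(*site_tables))


# Hypothetical allele counts for one variant at two institutions (illustration only).
site_a = (30, 70, 20, 80)
site_b = (25, 75, 15, 85)
pooled = pool_counts([site_a, site_b])
chi2, p = allelic_chi_square(*pooled)
print(pooled, chi2, p)
```

The point of the sketch is only that the test statistic depends on summed counts, which is why pooling across institutions increases the effective sample size; how TrustGWAS computes such statistics under encryption is described in the article itself.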
Original language | English |
---|---|
Journal | Cell Systems |
Volume | 13 |
Issue number | 9 |
Pages (from-to) | 752-767.e6 |
Number of pages | 23 |
ISSN | 2405-4712 |
Publication status | Published - 2022 |
Bibliographical note
Publisher Copyright:
© 2022 Elsevier Inc.
Research areas
- CKKS, genome privacy, GWAS, multi-key homomorphic encryption, privacy-preserving computation, pseudorandom number, TrustGWAS
ID: 333435982