TrustGWAS: A full-process workflow for encrypted GWAS using multi-key homomorphic encryption and pseudorandom number perturbation

Research output: Contribution to journal › Journal article › Research › peer-review

  • Meng Yang
  • Chuwen Zhang
  • Xiaoji Wang
  • Xingmin Liu
  • Shisen Li
  • Jianye Huang
  • Zhimin Feng
  • Xiaohui Sun
  • Fang Chen
  • Shuang Yang
  • Ming Ni
  • Lin Li
  • Yanan Cao
  • Feng Mu

The statistical power of genome-wide association studies (GWASs) is affected by the effective sample size. However, the privacy and security concerns associated with individual-level genotype data pose great challenges for cross-institutional cooperation. The full-process cryptographic solutions are in demand but have not been covered, especially the essential principal-component analysis (PCA). Here, we present TrustGWAS, a complete solution for secure, large-scale GWAS, recapitulating gold standard results against PLINK without compromising privacy and supporting basic PLINK steps including quality control, linkage disequilibrium pruning, PCA, chi-square test, Cochran-Armitage trend test, covariate-supported logistic regression and linear regression, and their sequential combinations. TrustGWAS leverages pseudorandom number perturbations for PCA and multiparty scheme of multi-key homomorphic encryption for all other modules. TrustGWAS can evaluate 100,000 individuals with 1 million variants and complete QC-LD-PCA-regression workflow within 50 h. We further successfully discover gene loci associated with fasting blood glucose, consistent with the findings of the ChinaMAP project.
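Of the association tests listed above, the Cochran-Armitage trend test is compact enough to sketch. The following is a minimal plaintext illustration of the standard statistic for a 2x3 genotype table with additive weights (0, 1, 2). It is not the TrustGWAS implementation and performs no encryption; the function name `cochran_armitage` and its parameters are hypothetical, chosen only for this example.

```python
import math


def cochran_armitage(cases, controls, weights=(0, 1, 2)):
    """Return (chi2, p) for a 2x3 genotype table (hypothetical helper).

    cases / controls: counts of individuals carrying 0, 1 and 2 copies
    of the tested allele. weights: trend scores (additive by default).
    """
    n = [r + s for r, s in zip(cases, controls)]   # column totals
    R, S = sum(cases), sum(controls)               # row totals
    N = R + S                                      # grand total

    sum_wr = sum(w * r for w, r in zip(weights, cases))
    sum_wn = sum(w * k for w, k in zip(weights, n))
    sum_w2n = sum(w * w * k for w, k in zip(weights, n))

    # Trend statistic T = sum_i w_i * (r_i * S - s_i * R), simplified.
    t = N * sum_wr - R * sum_wn
    # N times the variance of T under the null hypothesis.
    var_scaled = R * S * (N * sum_w2n - sum_wn ** 2)
    if var_scaled == 0:
        # Monomorphic variant or an empty group: the test is undefined,
        # so report "no evidence" rather than dividing by zero.
        return 0.0, 1.0

    chi2 = N * t * t / var_scaled
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


chi2, p = cochran_armitage(cases=(10, 20, 30), controls=(30, 20, 10))
print(f"chi2={chi2:.3f}, p={p:.3g}")
```

In an encrypted workflow the per-genotype counts would presumably be aggregated across institutions under homomorphic encryption before any such statistic is derived; the sketch only shows the arithmetic that a plaintext tool performs on the pooled counts.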

Original language: English
Journal: Cell Systems
Volume: 13
Issue number: 9
Pages (from-to): 752-767.e6
Number of pages: 23
ISSN: 2405-4712
Publication status: Published - 2022

Bibliographical note

Publisher Copyright:
© 2022 Elsevier Inc.

    Research areas

  • CKKS, genome privacy, GWAS, multi-key homomorphic encryption, privacy-preserving computation, pseudorandom number, TrustGWAS

ID: 333435982