Organism- and disease-specific atlases of transcription start sites using Cap Analysis of Gene Expression (CAGE)

Research output: Book/Report, Ph.D. thesis, Research

Modern high-throughput assays have enabled the study of gene expression and its regulation on an unprecedented, genome-wide scale. Cap Analysis of Gene Expression (CAGE) is one of the major assays for studying transcriptional regulation. CAGE can detect and quantify both transcription start sites (TSSs) and enhancers independently of reference gene annotation. This means that a single CAGE experiment can assay both gene expression and regulatory activity in intergenic regions.

CAGE has greatly benefited from work by the FANTOM consortium, which has produced a large number of CAGE datasets and developed new methods for analyzing them. Based on these advances, it is now possible to use CAGE to study transcriptional regulation in diseases and additional organisms.

In this thesis, we analyzed transcriptional regulation using CAGE in the eukaryotic model organism fission yeast and in the chronic disease inflammatory bowel disease (IBD), and developed general tools for analyzing CAGE data.

Using CAGE data from 15 samples of fission yeast growing under a wide range of conditions, we generated an accurate, genome-wide atlas of TSSs. We showed that this atlas improves and expands existing gene models, and how it can be used as an accurate starting point for analyzing many other types of genetic and epigenetic data. We identified TSSs that change expression between conditions, including cases where genes use alternative TSSs in a condition-dependent manner.

We showed how CAGE can be used in large clinical studies by analyzing a dataset composed of 94 colonic biopsies obtained from patients suffering from IBD. We generated an accurate IBD-specific atlas of TSSs and enhancers, and used this atlas to describe the biological processes that distinguish subtypes of IBD. We showed how enhancers can be used to interpret the regulatory function of intergenic regions, and that enhancers are highly enriched for genetic variants associated with IBD. We further used the IBD-specific atlas of TSSs and enhancers to select a small set of biomarkers that can be used to classify IBD patients in a clinical setting with high accuracy.

Finally, we developed a novel tool for the analysis of CAGE data that enables and empowers future CAGE studies.

Original language: English
Publisher: Department of Biology, Faculty of Science, University of Copenhagen
Publication status: Published - 2018

ID: 201233032