RNA degradation is an integral part of RNA metabolism and plays an important role in determining cellular steady-state RNA levels. Here, we focus on one RNA degradation machine, the RNA exosome. It is a highly conserved 3’-5’ ribonucleolytic protein complex and a main player in eukaryotic nuclear RNA turnover. In the nucleus, its substrates comprise a variety of RNA species, including a multitude of long non-coding RNAs (lncRNAs), such as PROMoter uPstream Transcripts (PROMPTs) and enhancer RNAs (eRNAs), as well as several stable nuclear RNAs, such as ribosomal RNAs (rRNAs) and small nuclear RNAs (snRNAs). In the nucleoplasm of mammalian cells, the substrate specificity of the RNA exosome is achieved through two major adaptors: the nuclear exosome targeting (NEXT) complex and the poly(A) exosome targeting (PAXT) connection.
In this thesis, we investigated the regulation and functions of the RNA exosome from three aspects: the role of the RNA exosome in shaping the transcriptome derived from protein-coding (pc) genes, the contribution of molecular features to RNA exosome degradation pathways, and the function of the RNA exosome in embryonic stem cell (ESC) development.
Employing several complementary genome-wide techniques, we identified and characterized a number of exosome-sensitive transcripts produced within pc genes. Among genes that utilize a single annotated transcription start site (TSS), we identified two types: the first produces mainly exosome-sensitive full-length transcripts, and the second produces mainly prematurely terminated transcripts. Many genes of the former type are immediate early genes and encode transcription factors, whereas many genes of the latter type are in head-to-head configurations with other pc genes and are likely transcribed as a consequence of strong transcription initiation of the upstream genes on the opposite strand. For genes with multiple active TSSs, the TSSs producing exosome-sensitive transcripts make only minor contributions to overall gene expression, and the transcripts they produce are often prematurely terminated. Our results reveal a complex sense transcription landscape within pc genes that is shaped by the RNA exosome, and suggest that the nuclear RNA exosome plays a role in regulating the expression of some important pc genes.
Using machine learning approaches, we delineated molecular determinants of nuclear exosome degradation pathways. We found the molecular features of the transcript end site (TES) to be the most predictive for distinguishing the NEXT and PAXT pathways. In contrast, TSS-related features were distinct only for NEXT targets and could not distinguish PAXT targets from non-exosome targets.
Upon knockout of the PAXT component ZFC3H1, we observed impaired differentiation of mouse ESCs. In Zfc3h1-/- cells, in addition to known PAXT substrates, many polycomb repressive complex 2 (PRC2)-repressed genes were upregulated, accompanied by decreased PRC2 binding and reduced H3K27me3 levels. The integrity of the PRC2 complex was reduced, concomitant with increased levels of nonspecific RNA bound to PRC2. These results underscore the importance of controlling nuclear RNA levels during ESC development and suggest that bulk RNA may provide a means of regulating transcription.