CReSIL: accurate identification of extrachromosomal circular DNA from long-read sequences

Research output: Contribution to journal, Journal article, Research, peer-review

Extrachromosomal circular DNA (eccDNA) of chromosomal origin is found in many eukaryotic species and cell types, including cancer cells, where eccDNAs carrying oncogenes drive tumorigenesis. Most eccDNA studies rely on short-read sequencing for identification. However, short-read sequencing cannot resolve the complexity of genomic repeats, so some eccDNA products can be missed. Long-read sequencing technologies offer an alternative route to constructing complete eccDNA maps. We present a software suite, Construction-based Rolling-circle-amplification for eccDNA Sequence Identification and Location (CReSIL), to identify and characterize eccDNA from long-read sequences. On simulated data, CReSIL identifies eccDNA with a minimum F1 score of 0.98, outperforming the other bioinformatic tools it was compared against. CReSIL also provides genomic annotation features, which can be used to infer eccDNA function, and Circos visualization for investigating eccDNA architecture. We demonstrated CReSIL's capability on several long-read sequencing datasets, including datasets enriched for eccDNA and whole-genome datasets from cells containing large eccDNA products. In conclusion, the CReSIL software suite is a versatile tool for investigating complex and simple eccDNA in eukaryotic cells.
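For readers unfamiliar with the F1 score quoted in the abstract, the following is a minimal sketch of how the metric is computed from true positives, false positives and false negatives. The function name `f1_score` and the counts used are hypothetical illustrations; they are not taken from the paper and do not represent CReSIL's own evaluation code.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1 score, the harmonic mean of precision and recall."""
    if tp == 0:
        # With no true positives, precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)  # fraction of reported eccDNAs that are real
    recall = tp / (tp + fn)     # fraction of real eccDNAs that were reported
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not results from the paper).
score = f1_score(tp=98, fp=2, fn=2)
print(f"F1 = {score:.2f}")
```

An F1 score near 1 requires both few false calls and few missed eccDNAs, which is why it is a common single-number summary when comparing detection tools.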

Original language: English
Article number: bbac422
Journal: Briefings in Bioinformatics
Volume: 23
Issue number: 6
Number of pages: 11
ISSN: 1467-5463
Publication status: Published - 2022

    Research areas

  • CRESIL, eccDNA, long-read sequence, bioinformatic tool, AMPLIFICATION, MICRODNAS, ELEMENT

ID: 322876025