Computational and RNA Biology

Our thirteen strong research groups work in the fields of RNA biology, bioinformatics, computational biology, machine learning and population and statistical genetics, and our researchers consistently publish their studies in highly recognized international journals. The Section houses the Bioinformatics Centre and has state-of-the-art computational infrastructure and laboratory facilities.


We offer a 2-year Bioinformatics Master program to students with a background either in computer science or in molecular biology, biochemistry or biomedicine. Our students acquire thorough theoretical knowledge and hands-on experience in bioinformatics, including sequence analysis, protein and RNA structural analysis, genomics, phylogenetics, analysis of high-throughput big data and machine learning methods.


We strive towards a friendly, collaborative and professional work environment that promotes excellent research and personal development.



Computational Biology

We are interested in the mechanisms of gene regulation and the prediction of RNA and protein structure. Our research encompasses a combination of experimental genomics methods and computational biology including promoter and enhancer analysis, and development of complex probabilistic models to predict, design and validate structure based on machine learning methods.

Thomas Hamelryck, Associate Professor
We work on predicting, designing and determining the 3D structure of RNA and proteins by developing probabilistic models that describe aspects of their structure. These models are mainly based on machine learning methods (including dynamic Bayesian networks) and on directional statistics, the statistics of angles, directions and orientations.

Albin Sandelin, Professor
The Sandelin lab is a computational/experimental group with scientists from many fields. We focus on gene regulation, transcriptomics, epigenetics and technological and informatics aspects. With the help of computers, we probe large biological datasets that are generated using novel genomics techniques. One of our strengths is our many collaborations with high-profile experimental laboratories, which supply data to be analyzed.

Robin Andersson, Associate Professor
The Andersson lab aims to characterize and better understand the architectures of transcriptional regulation and the fundamental properties of enhancers and promoters. In particular, we focus on enhancer transcription and its association with regulatory activity. We take a genomics approach and use computational and statistical learning techniques to model transcriptional regulation based on large-scale sequencing data.

Ole Winther, Professor MSO
Our research has two focuses: (1) developing machine learning and AI methods and applying them to genomic data in a clinical setting, and (2) biological sequence analysis and medical informatics. The machine learning research is done with the jointly affiliated group at DTU Compute. Clinical genomic research is carried out in collaboration with Genomic Medicine, Rigshospitalet. An example of a current project is deep generative models for analysis of single-cell RNA-seq data.

Amelie Stein, Assistant Professor
Our lab studies the consequences of sequence variants on proteins, focusing on their cellular stability and function. We perform high-throughput assays on protein variants and build on this data to develop and improve methods for prediction of variant consequences. We then apply these methods to determine whether genomic variants are likely to be pathogenic. Further we aim to integrate effects of multiple mutations for applications in protein engineering.

Shilpa Garg, Assistant Professor
My group develops new computational algorithms for solving fundamental biological problems of genome assembly, haplotyping and structural variants calling. Our methods and tools have applications in precision medicine and biodiversity and may also be relevant to large-scale international genome efforts such as the Human Pangenome project, Personal Genome Project and Darwin Tree of Life.


Population and Statistical Genetics

Our group develops and applies statistical and computational methods for analysis of genomic data in diverse organisms ranging from Greenlandic populations to ruminants and African mammals. We contribute to multiple fields including human and animal disease and treatment, livestock production, migration and speciation processes and complex population analyses.

Hans Siegismund, Associate Professor
We work on population genetics, phylogeography and speciation processes of large African mammals, mainly bovids and great apes. Another research area is the evolutionary genetics of foot-and-mouth disease (FMD) virus in East Africa.

Anders Albrechtsen, Professor
Our group develops statistical and computational methods for analysis of genomic data including methods for performing multi-loci association studies, methods for detecting and correcting for population stratification, methods for detecting natural selection, loci dependent methods for modeling identity-by-descent and various methods for analysis of second generation sequencing data.

Ida Moltke, Associate Professor
We develop and apply statistical methods to genomic data with the purpose of gaining insights into human disease, history and evolution. For instance, by studying DNA from the Greenlandic population we recently identified a genetic variant that explains 10-15% of all cases of type 2 diabetes in Greenland. We have also looked into the migration history of the Arctic and are currently investigating how the Greenlanders have genetically adapted to the Arctic cold and their very fat-rich diet consisting mainly of seal and fish.

Rasmus Heller, Assistant Professor
We study evolutionary and population genetics in wild mammals, focusing on ruminants. Our research interests include elucidating how climate, ecosystems and humans have influenced wild mammal populations in Africa, studying adaptive introgression in bovids and exploring how genomic data can be used to aid conservation. We are also involved in a ruminant genome project which aims to understand how ruminants evolved new anatomical structures, and how these structures have helped them become one of the most successful mammal lineages.


RNA Biology

We employ and develop molecular, genetic and biochemical approaches to study how RNA-based mechanisms regulate gene expression and cellular development. We focus on RNA structure, RNA modifications and RNA-protein interactions as well as miRNA regulation in plants.

Peter Brodersen, Professor
Our group studies the mechanisms by which small RNAs regulate gene expression. We use the flowering plant Arabidopsis thaliana as a model system, and make particular use of molecular genetic and biochemical approaches in our work.

Jan Christiansen, Associate Professor
We mainly focus on post-transcriptional events such as RNA localization, RNA stability and translational control, with an emphasis on the role of cytoplasmic RNA-binding proteins expressed during fetal life and oncogenesis.

Jeppe Vinther, Associate Professor
In our group we aim to determine how RNA structure and RNA-protein interactions affect basal cellular processes. This knowledge is important to understand ways of improving the efficiency and specificity of RNA based drugs.


The Bioinformatics Centre offers a range of free on-line services, open-access databases and open source software packages.

ANGSD is a software package for analyzing next-generation sequencing data. The software can handle a number of different input types, from mapped reads to imputed genotype probabilities. Most methods take genotype uncertainty into account instead of basing the analysis on called genotypes. This is especially useful for low and medium depth data.
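
To illustrate what "taking genotype uncertainty into account" can mean, here is a minimal Python sketch of genotype likelihoods at a single biallelic site. It is not ANGSD's implementation: the function name, the simple per-base error model and the choice of alleles are assumptions made only for this example.

```python
import math

def genotype_likelihoods(bases, error_rates, ref="A", alt="G"):
    """Normalised likelihoods for genotypes ref/ref, ref/alt and alt/alt.

    Hypothetical helper: assumes independent reads, a known per-base
    error rate, and errors spread evenly over the three other bases.
    """
    log_liks = []
    for n_alt in (0, 1, 2):  # number of alt alleles carried by the genotype
        log_lik = 0.0
        for base, err in zip(bases, error_rates):
            # Probability of the observed base given the true allele
            p_ref = 1.0 - err if base == ref else err / 3.0
            p_alt = 1.0 - err if base == alt else err / 3.0
            # A read comes from either chromosome copy with probability 1/2
            p = (n_alt / 2.0) * p_alt + (1.0 - n_alt / 2.0) * p_ref
            log_lik += math.log(p)
        log_liks.append(log_lik)
    # Normalise in a numerically stable way so the three values sum to 1
    top = max(log_liks)
    weights = [math.exp(x - top) for x in log_liks]
    total = sum(weights)
    return [w / total for w in weights]

# Three reads at low depth: two support the reference, one the alternative
likelihoods = genotype_likelihoods(["A", "A", "G"], [0.01, 0.01, 0.01])
print(likelihoods)
```

With only three reads the heterozygote is the most likely genotype, but the other two keep non-zero weight. Downstream analyses can carry these weights forward rather than committing to a single called genotype, which is the idea described above for low and medium depth data.
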
A companion resource to the paper "MicroRNA transfection and AGO-bound CLIP-seq datasets reveal distinct determinants of miRNA action". It provides predictions of miRNA targets for human and mouse using the two predictive models described in the paper: a model trained on microarray studies following transfection, and a model trained on PAR-CLIP datasets.
Asap
A framework for promoter analysis based on a fast search engine using enhanced suffix arrays. The framework includes several statistics for calculating over-representation of motifs in a set of promoters from co-regulated genes compared to a background set.
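
As an illustration of an over-representation statistic of this kind, the sketch below computes a one-sided hypergeometric p-value for a motif that occurs more often in a co-regulated promoter set than in a background set. Whether Asap offers this exact statistic is not stated here; the function name and the counts are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def hypergeom_enrichment_p(hits_fg, n_fg, hits_bg, n_bg):
    """P(X >= hits_fg) under the hypergeometric null.

    The null model pools foreground and background promoters and asks how
    likely it is to see at least `hits_fg` motif-containing promoters when
    `n_fg` promoters are drawn at random without replacement.
    """
    total = n_fg + n_bg              # all promoters
    total_hits = hits_fg + hits_bg   # promoters containing the motif
    upper = min(n_fg, total_hits)    # largest possible foreground count
    tail = sum(
        comb(total_hits, k) * comb(total - total_hits, n_fg - k)
        for k in range(hits_fg, upper + 1)
    )
    return tail / comb(total, n_fg)

# Hypothetical counts: motif in 8 of 10 co-regulated promoters
# and in 5 of 90 background promoters
p_value = hypergeom_enrichment_p(8, 10, 5, 90)
print(f"p = {p_value:.3g}")
```

A small p-value suggests the motif is enriched in the co-regulated set relative to the background. In practice many motifs are tested at once, so a multiple-testing correction would normally follow.
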
BARNACLE is a Python library for RNA 3D structure prediction. It can be used for probabilistic sampling of RNA structures that are compatible with a given nucleotide sequence and that are RNA like on a local length scale.
BASILISK is a probabilistic model of the conformational space of amino acid side chains in proteins. Unlike rotamer libraries, BASILISK models the chi angles in continuous space, including the influence of the protein's backbone.
The Bayesembler is a Bayesian method for doing transcriptome assembly from RNA-seq data.
BayesMD is a flexible, fully Bayesian model for motif discovery consisting of motif, background and alignment modules. BayesMD can be customized to different kinds of biological applications, e.g. microarray, ChIP-chip, ditag and CAGE data analysis, by integrating appropriately chosen features and functionalities.
BloodSpot
BloodSpot is a database that provides gene expression profiles of genes and gene signatures in healthy and malignant hematopoiesis and includes data from both humans and mice.
BWA-PSSM is a probabilistic short read aligner based on the use of position specific scoring matrices (PSSM). Like many of the existing aligners, it is fast and sensitive. Unlike most other aligners, however, it is also adaptable in the sense that one can direct the alignment based on known biases within the data set. It is coded as a modification of the original BWA alignment program and shares the genome index structure as well as many of the command line options.
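
The sketch below shows the general idea of scoring a read against a reference with a PSSM. It is only an illustration: BWA-PSSM searches a BWA genome index rather than scanning linearly, and the way the matrix is built here (from hypothetical per-base error rates and a uniform background) is an assumption, as are all function names.

```python
import math

BASES = "ACGT"

def build_pssm(read, error_rates, background=0.25):
    """One column per read position holding log2-odds scores per genome base."""
    pssm = []
    for base, err in zip(read, error_rates):
        column = {}
        for b in BASES:
            # Probability of reading `base` if the genome base is `b`
            p = 1.0 - err if b == base else err / 3.0
            column[b] = math.log2(p / background)
        pssm.append(column)
    return pssm

def score(pssm, segment):
    """Sum the column scores along a reference segment of equal length."""
    return sum(column[b] for column, b in zip(pssm, segment))

def best_hit(pssm, reference):
    """Linear scan returning (score, offset) of the best-scoring window."""
    n = len(pssm)
    return max(
        (score(pssm, reference[i:i + n]), i)
        for i in range(len(reference) - n + 1)
    )

# The last read base is less reliable, so a mismatch there is penalised less
pssm = build_pssm("ACGT", [0.01, 0.01, 0.01, 0.2])
print(best_hit(pssm, "TTACGTTT"))
```

Because each position has its own column, known biases such as position-dependent error rates can be encoded directly in the matrix, which is the sense in which a PSSM-based aligner can be directed by properties of the data set.
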
cWords is a method used for identification of over-represented words in a set of ranked sequences. By default it is aimed at human miRNA analysis; you give a ranked list of differential expression values of genes across two conditions and it finds words in 3' UTRs of lengths 6 and 7 over-represented in the most regulated genes.

Exploration of isoform switches in cancer
For easy and fast exploration of isoform switches in TCGA cancer data we generated three interactive online web services, which can produce the isoform switch analysis plots for all genes with an isoform switch. To facilitate target exploration, the gene can be selected and explored from three different angles.

HemaExplorer is an easy-to-use web tool for visualization of gene expression in the hematopoietic system. The webserver takes one or two genes as query and provides plots of the expression of the gene in cells involved in hematopoiesis. Currently the database contains options for the human normal myeloid system, human AML and the mouse hematopoietic system.
JASPAR is the leading open-access database of matrix profiles describing the DNA-binding patterns of transcription factors and other proteins interacting with DNA in a sequence-specific manner.
Kaiju
Kaiju is a program for the taxonomic assignment of high-throughput sequencing reads from whole-genome sequencing of metagenomic DNA.
MASTR performs multiple alignment and secondary structure prediction on a set of structural RNA sequences.
MoAn - Motif Annealer
MoAn is a discriminative pattern finder capable of using a very large negative set to greatly improve its predictive power. It is capable of handling sequences that are not repeat masked given that the negative set is a representative sample of promoters from the organism examined.
Mocapy++
Mocapy++ is a Dynamic Bayesian Network toolkit, implemented in C++. It supports discrete, multinomial, Gaussian, Kent, von Mises and Poisson nodes. Inference and learning are done by Gibbs sampling/Stochastic-EM.
Phaistos
Phaistos is a collection of tools for protein structure prediction. It currently features the FB5DBN and TorusDBN models, which make it possible to sample protein structures compatible with a given amino acid and/or secondary structure sequence.
Phobius
Phobius is a server for prediction of transmembrane topology and signal peptides from the amino acid sequence of a protein.
Saqqaq Genome Project
The primary data from the Saqqaq Genome Project: the sequencing of an ancient human genome, obtained from a permafrost-preserved hair, about 4,000 years old, of a male palaeo-Eskimo of the Saqqaq culture, the earliest known settlers in Greenland.
Webserver for Aligning non-coding RNAs. WAR is an easy-to-use webserver that makes it possible to simultaneously use the best methods for aligning and predicting the consensus secondary structure for a set of non-coding RNA sequences.

Section for Computational and RNA Biology

Ole Maaløes Vej 5
DK-2200 Copenhagen N

Associate Professor Jeppe Vinther