Research highlights


Ruminant Genomes
Ruminants are a diverse group of mammals that includes families such as deer, cows, and goats. A large number of ruminant genomes were sequenced, and analysis of the data shows large population reductions coinciding with the migration of humans out of Africa. The authors also found evidence for selection on cancer-related genes that may function in antler development in deer, and identified the genetic basis of adaptations that allow reindeer to survive in the harsh conditions of the Arctic. Adapted from abstract.
Large-scale ruminant genome sequencing provides insights into their evolution and distinct traits. Science 364 (6446), eaav6202 (2019)


Improving Accuracy in Variant Genotyping
Genotype estimates from short-read sequencing data are typically based on the alignment of reads to a linear reference, but reads originating from more complex variants often align poorly, resulting in biased genotype estimates. Here, we present a new method to efficiently perform unbiased, probabilistic genotyping across the variation spectrum. We also demonstrate that including a ‘variation-prior’ database containing already known variants significantly improves sensitivity. Adapted from abstract.
Accurate genotyping across variant classes and lengths using variant graphs. Nature Genetics 50, 1054–1059 (2018)


Greenlandic Genetic Variation
We show that a genetic mutation substantially increases the risk of type 2 diabetes for individuals who carry two copies of it. The mutation is common in native Greenlanders; hence, our finding has great potential to lead to better treatment of diabetes in this population.
Loss-of-function variants in ADCY3 increase risk of obesity and type 2 diabetes. Nature Genetics 50, 172–174 (2018)


Genome Denmark
We describe the construction of a reference genome based on high-coverage sequencing and de novo assemblies of 150 individuals with mate-pair libraries extending up to 20 kilobases. The reference genome is expected to strongly benefit precision medicine initiatives.
Sequencing and de novo assembly of 150 genomes from Denmark as a population reference. Nature 548, 87–91 (2017)


Promoter/Enhancer Atlas
We present a collection of active enhancers that is instrumental to understanding the regulation of differentiation and homeostasis, as these enhancers control temporal and cell-type-specific activation of gene expression in multicellular eukaryotes. We also present a comprehensive map of mammalian transcription start sites and their usage in human and mouse primary cells, cell lines and tissues. The functional annotation of mammalian cell-type-specific transcriptomes has wide applications in biomedical research.
An atlas of active enhancers across human cell types and tissues. Nature 507, 455–461 (2014)
A promoter-level mammalian expression atlas. Nature 507, 462–470 (2014)


Bayesian Methods in Structural Bioinformatics
This is the first field-defining book on Bayesian methods in structural bioinformatics. The book provides an introduction to Bayesian statistics and concepts in machine learning and statistical physics. Chapters include descriptions of state-of-the-art statistical methods in structural bioinformatics, with a particular focus on methods that have a clear interpretation in the framework of statistical physics. Adapted from abstract.
Bayesian Methods in Structural Bioinformatics, Statistics for Biology and Health. Springer-Verlag Berlin Heidelberg (2012)


Probabilistic Protein Structure Prediction
Protein structure prediction requires efficient probabilistic exploration of the structural space that correctly reflects the relative conformational stabilities. We have developed a fully probabilistic, continuous model of local protein structure in atomic detail. The model represents a significant theoretical and practical improvement over the widely used fragment assembly technique. Adapted from abstract.
A generative, probabilistic model of local protein structure. PNAS 105, 8932–8937 (2008)


Probabilistic Models of Proteins and Nucleic Acids
In this landmark textbook, the authors describe probabilistic models used in large-scale sequence analysis. Examples are hidden Markov models used for analysing biological sequences, linguistic-grammar-based probabilistic models used for identifying RNA secondary structure, and probabilistic evolutionary models used for inferring phylogenies of sequences from different organisms. Adapted from abstract.
Biological sequence analysis. Probabilistic models of proteins and nucleic acids. Cambridge University Press (1998)