Automatic generation of gene finders for eukaryotic species

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Automatic generation of gene finders for eukaryotic species. / Terkelsen, Kasper Munch; Krogh, A.

In: BMC Bioinformatics, Vol. 7, No. 263, 2006.

Harvard

Terkelsen, KM & Krogh, A 2006, 'Automatic generation of gene finders for eukaryotic species', BMC Bioinformatics, vol. 7, no. 263. https://doi.org/10.1186/1471-2105-7-263

APA

Terkelsen, K. M., & Krogh, A. (2006). Automatic generation of gene finders for eukaryotic species. BMC Bioinformatics, 7(263). https://doi.org/10.1186/1471-2105-7-263

Vancouver

Terkelsen KM, Krogh A. Automatic generation of gene finders for eukaryotic species. BMC Bioinformatics. 2006;7(263). https://doi.org/10.1186/1471-2105-7-263

Author

Terkelsen, Kasper Munch ; Krogh, A. / Automatic generation of gene finders for eukaryotic species. In: BMC Bioinformatics. 2006 ; Vol. 7, No. 263.

Bibtex

@article{f4e822c06c3611dcbee902004c4f4f50,
title = "Automatic generation of gene finders for eukaryotic species",
abstract = "Background: The number of sequenced eukaryotic genomes is rapidly increasing. This means that over time it will be hard to keep supplying customised gene finders for each genome. This calls for procedures to automatically generate species-specific gene finders and to re-train them as the quantity and quality of reliable gene annotation grows. Results: We present a procedure, Agene, that automatically generates a species-specific gene predictor from a set of reliable mRNA sequences and a genome. We apply a Hidden Markov model (HMM) that implements explicit length distribution modelling for all gene structure blocks using acyclic discrete phase type distributions. The state structure of each HMM is generated dynamically from an array of sub-models to include only gene features represented in the training set. Conclusion: Acyclic discrete phase type distributions are well suited to model sequence length distributions. The performance of each individual gene predictor on each individual genome is comparable to the best of the manually optimised species-specific gene finders. It is shown that species-specific gene finders are superior to gene finders trained on other species.",
author = "Terkelsen, {Kasper Munch} and Krogh, A.",
year = "2006",
doi = "10.1186/1471-2105-7-263",
language = "English",
volume = "7",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central Ltd.",
number = "263",

}

RIS

TY - JOUR

T1 - Automatic generation of gene finders for eukaryotic species

AU - Terkelsen, Kasper Munch

AU - Krogh, A.

PY - 2006

Y1 - 2006

N2 - Background: The number of sequenced eukaryotic genomes is rapidly increasing. This means that over time it will be hard to keep supplying customised gene finders for each genome. This calls for procedures to automatically generate species-specific gene finders and to re-train them as the quantity and quality of reliable gene annotation grows. Results: We present a procedure, Agene, that automatically generates a species-specific gene predictor from a set of reliable mRNA sequences and a genome. We apply a Hidden Markov model (HMM) that implements explicit length distribution modelling for all gene structure blocks using acyclic discrete phase type distributions. The state structure of each HMM is generated dynamically from an array of sub-models to include only gene features represented in the training set. Conclusion: Acyclic discrete phase type distributions are well suited to model sequence length distributions. The performance of each individual gene predictor on each individual genome is comparable to the best of the manually optimised species-specific gene finders. It is shown that species-specific gene finders are superior to gene finders trained on other species.

AB - Background: The number of sequenced eukaryotic genomes is rapidly increasing. This means that over time it will be hard to keep supplying customised gene finders for each genome. This calls for procedures to automatically generate species-specific gene finders and to re-train them as the quantity and quality of reliable gene annotation grows. Results: We present a procedure, Agene, that automatically generates a species-specific gene predictor from a set of reliable mRNA sequences and a genome. We apply a Hidden Markov model (HMM) that implements explicit length distribution modelling for all gene structure blocks using acyclic discrete phase type distributions. The state structure of each HMM is generated dynamically from an array of sub-models to include only gene features represented in the training set. Conclusion: Acyclic discrete phase type distributions are well suited to model sequence length distributions. The performance of each individual gene predictor on each individual genome is comparable to the best of the manually optimised species-specific gene finders. It is shown that species-specific gene finders are superior to gene finders trained on other species.

U2 - 10.1186/1471-2105-7-263

DO - 10.1186/1471-2105-7-263

M3 - Journal article

C2 - 16712739

VL - 7

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

IS - 263

ER -

ID: 1092459