Chapter category: Bioinformatics
Literature and Genome Data Mining for Prioritizing Disease-associated Genes
Chapter authors:
Carolina Perez-Iratxeta, Peer Bork and Miguel A. Andrade
The first step in understanding the molecular biology of an inherited disease is to identify
which gene or genes carry the causative variants. This process starts by mapping the
mutations to a chromosomal band that is as narrow as possible, and continues with the manual
analysis of all the genes mapping to this region. This is usually not an easy task, but it can be
facilitated by complementary computational approaches that evaluate all genes in a region of
interest. We present here a method that combines literature mining, gene annotations, and
sequence homology searches to prioritize candidate genes involved in a given genetic disorder.
The method proceeds in two steps. First, we compute associations between molecular and phenotypic
features as taken from MEDLINE. Second, for a disease with a given phenotype that is
linked to a chromosomal region, sequence-homology-based searches are carried out on that
region to identify potential candidates, which are scored using the precomputed associations.
The scoring of associations between biological concepts using links across databases can
be extended to other databases in Molecular Biology and to nondisease phenotypes.
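
The two-step idea described above can be illustrated with a small sketch. This is not the chapter's implementation: the Jaccard co-occurrence score, the function names, the toy records, and the gene identifiers are all assumptions made for illustration. The sketch also takes the functional terms of each candidate gene as given, whereas the method described here derives them through sequence homology searches over the chromosomal region.

```python
from collections import Counter
from itertools import product


def association_scores(records):
    """Step 1 (assumed scoring): Jaccard co-occurrence of term pairs.

    Each record is a (phenotype_terms, function_terms) pair, standing in
    for the terms attached to one literature record.
    """
    pheno_count, func_count, pair_count = Counter(), Counter(), Counter()
    for phenotypes, functions in records:
        phenotypes, functions = set(phenotypes), set(functions)
        pheno_count.update(phenotypes)
        func_count.update(functions)
        pair_count.update(product(phenotypes, functions))
    return {
        (p, f): n / (pheno_count[p] + func_count[f] - n)
        for (p, f), n in pair_count.items()
    }


def rank_candidates(disease_phenotypes, candidates, scores):
    """Step 2: rank genes by their mean best association per phenotype term."""
    ranked = []
    for gene, functions in candidates.items():
        best = [
            max((scores.get((p, f), 0.0) for f in functions), default=0.0)
            for p in disease_phenotypes
        ]
        ranked.append((gene, sum(best) / len(best) if best else 0.0))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: literature records and candidate-gene annotations.
records = [
    ({"anemia"}, {"iron transport"}),
    ({"anemia"}, {"iron transport", "heme binding"}),
    ({"deafness"}, {"ion channel"}),
    ({"anemia", "deafness"}, {"ion channel"}),
]
candidates = {
    "GENE_A": {"iron transport"},
    "GENE_B": {"ion channel"},
    "GENE_C": {"kinase activity"},
}

scores = association_scores(records)
ranking = rank_candidates(["anemia"], candidates, scores)
for gene, score in ranking:
    print(f"{gene}\t{score:.3f}")
```

In this toy setting, a gene whose functional terms co-occur often with the disease phenotype in the records ranks above genes whose terms co-occur rarely or never. Any other association measure could be substituted in `association_scores` without changing the ranking step.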