On the application of estimation of distribution algorithms to multi-marker tagging SNP selection
Date
2009
Author
Mendiburu Alberro, Alexander
Zaitlen, Noah
Eskin, Eleazar
Lozano Alonso, José Antonio
Abstract
This paper presents an algorithm for the automatic selection of a minimal subset of tagging single nucleotide polymorphisms (SNPs) using an estimation of distribution algorithm (EDA). The EDA stochastically searches the constrained space of feasible solutions and takes advantage of the underlying topological structure defined by the SNP correlations to model the problem interactions. The algorithm is evaluated across the HapMap reference panel data sets. The introduced algorithm is effective for the identification of minimal multi-marker SNP sets, which considerably reduce the size of the tagging SNP set in comparison with single-marker sets. New reduced tagging sets are obtained for all the HapMap SNP regions considered. We also show that the information extracted from the interaction graph representing the correlations between the SNPs can help to improve the efficiency of the optimization algorithm.
Keywords: SNPs, tagging SNP selection, multi-marker selection, estimation of distribution algorithms, HapMap.
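
To make the general idea concrete, the following is a minimal sketch of an EDA applied to tag SNP selection. It is not the algorithm of the paper: it assumes a simple univariate model (UMDA-style marginal probabilities) rather than a model built from the SNP interaction graph, it handles only single-marker (pairwise) tagging, and it uses a hypothetical toy r² matrix with an assumed threshold of 0.8. The function names (`build_covers`, `repair`, `umda_tag_selection`) and all parameter values are illustrative choices only.

```python
import random


def build_covers(r2, threshold=0.8):
    """For each SNP i, the set of SNPs it tags (itself plus high-r2 partners)."""
    n = len(r2)
    return [{j for j in range(n) if i == j or r2[i][j] >= threshold}
            for i in range(n)]


def repair(selected, covers, rng):
    """Make a candidate feasible, then prune redundant tags (assumed heuristic)."""
    n = len(covers)
    selected = set(selected)
    missing = set(range(n)) - set().union(*(covers[s] for s in selected))
    while missing:
        target = rng.choice(sorted(missing))
        # Any SNP whose cover contains the target can tag it (at least itself).
        candidates = [s for s in range(n) if target in covers[s]]
        pick = rng.choice(candidates)
        selected.add(pick)
        missing -= covers[pick]
    # Visit tags in random order and drop those the rest already cover.
    for s in rng.sample(sorted(selected), len(selected)):
        rest = selected - {s}
        if len(set().union(*(covers[t] for t in rest))) == n:
            selected = rest
    return frozenset(selected)


def umda_tag_selection(covers, pop_size=60, n_best=20, generations=40, seed=0):
    """Univariate EDA: sample, repair, select the smallest sets, re-estimate."""
    rng = random.Random(seed)
    n = len(covers)
    probs = [0.5] * n  # initial marginal probability of selecting each SNP
    best = None
    for _ in range(generations):
        population = []
        for _ in range(pop_size):
            sample = {i for i in range(n) if rng.random() < probs[i]}
            population.append(repair(sample, covers, rng))
        population.sort(key=len)  # fewer tags is better
        if best is None or len(population[0]) < len(best):
            best = population[0]
        elite = population[:n_best]
        # Re-estimate the marginals from the elite, with Laplace smoothing.
        probs = [(sum(i in ind for ind in elite) + 1) / (n_best + 2)
                 for i in range(n)]
    return best


# Hypothetical toy data: six SNPs in two high-LD blocks.
r2 = [
    [1.0, 0.9, 0.3, 0.0, 0.0, 0.0],
    [0.9, 1.0, 0.9, 0.0, 0.0, 0.0],
    [0.3, 0.9, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.9, 0.2],
    [0.0, 0.0, 0.0, 0.9, 1.0, 0.9],
    [0.0, 0.0, 0.0, 0.2, 0.9, 1.0],
]
covers = build_covers(r2)
best = umda_tag_selection(covers)
print(sorted(best))
```

The repair step is what keeps the search inside the feasible region: every sampled candidate is extended until all SNPs are tagged, so the fitness function only needs to count the selected tags. A model that exploits the correlation graph, as the abstract describes, would replace the independent marginals in `probs` with a structured distribution.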