MCDCalc: Markov Chain Molecular Descriptors Calculator for Medicinal Chemistry
Date
2020
Authors
Carracedo Reboredo, Paula
Corona, Ramiro
Fernández Lozano, Carlos
Tsiliki, Georgia
Sarimveis, Haralambos
Aranzamendi Uruburu, Eider
Arrasate Gil, Sonia
Sotomayor Anduiza, María Nuria
Lete Expósito, María Esther
Munteanu, Cristian R.
González Díaz, Humberto
Current Topics in Medicinal Chemistry 20(4) : 305-317 (2020)
Abstract
Cheminformatics models can predict different outputs (activity, property, chemical reactivity) for single molecules or complex molecular systems (catalyzed organic synthesis, metabolic reactions, nanoparticles, etc.). In particular, Cheminformatics prediction of complex catalytic enantioselective reactions is a major goal in organic synthesis research and the chemical industry. Markov Chain Molecular Descriptors (MCDs) have been widely used to solve Cheminformatics problems. There are different types of Markov chain descriptors, such as Markov-Shannon entropies (Shk), Markov Means (Mk), Markov Moments (πk), etc. However, there are other possible MCDs that have not been used before. In addition, MCDs are very often calculated with specialized software that is not always available to general users, and there is no publicly available R library for the calculation of MCDs. This limits the availability of MCD-based Cheminformatics procedures. In this work, we developed the first R library for the calculation of MCDs. We also report the first public web server for calculating MCDs online, and we compiled a desktop version of the software for offline use. These tools, called MCDCalc, include the calculation of a new class of MCDs called Markov Singular values (SVmax). We also report the first Cheminformatics study of a set of enantioselective organic reactions using the new class of indices. In addition to enantioselectivity, we also studied biological activity. First, we studied the enantiomeric excess ee(%)[Rcat] for 324 α-amidoalkylation reactions. These reactions have a complex mechanism that depends on various factors. The model includes MCDs of the substrate, solvent, chiral catalyst, and product, along with the reaction time, temperature, catalyst load, etc. We tested several Machine Learning regression algorithms. The Random Forest regression model has R² > 0.90 in training and test.
Second, we studied the biological activity of 5644 compounds against colorectal cancer. We developed a model able to predict the outcomes of preclinical assays with Specificity and Sensitivity of 70-82% in both training and validation series. The work shows the potential of the new tool for computational studies in organic and medicinal chemistry.
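
The abstract names several descriptor families (Markov-Shannon entropies, Markov means) without giving formulas. The following is a minimal Python sketch of how such quantities might be computed from a molecular graph. It is not MCDCalc's API or the authors' exact definitions. The function names, the self-loop convention, the uniform initial distribution, and the use of Pauling electronegativities as atom weights are all illustrative assumptions.

```python
import math

def transition_matrix(adjacency, weights):
    """Build a row-stochastic matrix from a molecular graph.

    Assumption: the probability of moving from atom i to atom j is
    proportional to the weight of atom j when i and j are bonded
    (or i == j), and zero otherwise.
    """
    n = len(weights)
    matrix = []
    for i in range(n):
        row = [weights[j] if (i == j or adjacency[i][j]) else 0.0
               for j in range(n)]
        total = sum(row)
        matrix.append([value / total for value in row])
    return matrix

def step(p, matrix):
    """Advance the probability vector by one Markov step (p times matrix)."""
    n = len(p)
    return [sum(p[i] * matrix[i][j] for i in range(n)) for j in range(n)]

def markov_descriptors(adjacency, weights, k_max=5):
    """Return Shannon entropies and weighted means for k = 0..k_max."""
    n = len(weights)
    matrix = transition_matrix(adjacency, weights)
    p = [1.0 / n] * n  # assumed uniform initial distribution
    entropies, means = [], []
    for _ in range(k_max + 1):
        # Shannon entropy of the current probability vector
        entropies.append(-sum(x * math.log(x) for x in p if x > 0))
        # Probability-weighted mean of the atomic property
        means.append(sum(x * w for x, w in zip(p, weights)))
        p = step(p, matrix)
    return entropies, means

# Hypothetical example: heavy-atom graph of ethanol (C-C-O),
# weighted by Pauling electronegativity (C = 2.55, O = 3.44).
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
weights = [2.55, 2.55, 3.44]
entropies, means = markov_descriptors(adjacency, weights, k_max=3)
```

Each list holds one value per order k, so a molecule is described by a short numeric vector that could be fed to a regression or classification model such as the Random Forest mentioned above.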