TY - JFULL AU - Mária Stachová and Lukáš Sobíšek PY - 2012/9/ TI - Data Mining Classification Methods Applied in Drug Design T2 - International Journal of Pharmacological and Pharmaceutical Sciences SP - 357 EP - 361 VL - 6 SN - 1307-6892 UR - https://publications.waset.org/pdf/963 PU - World Academy of Science, Engineering and Technology NX - Open Science Index 68, 2012 N2 - Data mining comprises a group of statistical methods used to analyze a set of information, or a data set. It operates with models and algorithms, which are powerful tools with great potential. They help people understand the patterns in a given body of information, so data mining tools have a wide range of applications. For example, in theoretical chemistry, data mining tools can be used to predict molecule properties or to improve computer-assisted drug design. Classification analysis is one of the major data mining methodologies. The aim of this contribution is to create a classification model that can handle a huge data set with high accuracy. For this purpose, logistic regression, Bayesian logistic regression and random forest models were built using R software. A Bayesian logistic regression model was also created in Latent GOLD software. These classification methods belong to supervised learning methods. It was necessary to reduce the dimension of the data matrix before constructing the models, and thus factor analysis (FA) was used. These models were applied to predict the biological activity of molecules that are potential new drug candidates. ER -