Search results for: interpretable descriptors
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 177

147 In Silico Modeling of Drugs Milk/Plasma Ratio in Human Breast Milk Using Structures Descriptors

Authors: Navid Kaboudi, Ali Shayanfar

Abstract:

Introduction: Feeding infants safe milk from the beginning of life is an important issue. Drugs used by mothers can alter the composition of milk in ways that are not only unsuitable but also toxic for infants. Maternal use of permeable drugs during this sensitive period can lead to serious side effects in the infant. Because of the ethical restrictions on drug testing in humans, especially in women during lactation, computational approaches based on structural parameters can be useful. The aim of this study is to develop mechanistic models that predict the milk/plasma (M/P) ratio of drugs during the breastfeeding period from their structural descriptors. Methods: Two hundred and nine chemicals with known M/P ratios were used in this study. All drugs were categorized into two groups based on their M/P value, following the Malone classification: (1) drugs with M/P>1, which are considered high risk, and (2) drugs with M/P<1, which are considered low risk. Thirty-eight chemical descriptors were calculated with ACD/Labs 6.00 and DataWarrior software in order to assess penetration during the breastfeeding period. Four specific models, based on the number of hydrogen bond acceptors, polar surface area, total surface area, and number of acidic oxygens, were then established for the prediction. These descriptors can predict penetration with acceptable accuracy. For the remaining compounds of each model (N = 147, 158, 160, and 174 for models 1 to 4, respectively), binary regression was performed with SPSS 21 to obtain a model predicting the penetration ratio of compounds. Only structural descriptors with p-value<0.1 remained in the final model.
Results and discussion: Four different models based on the number of hydrogen bond acceptors, polar surface area, and total surface area were obtained in order to predict the penetration of drugs into human milk during the breastfeeding period. About 3-4% of milk consists of lipids, and the lipid content increases after parturition. Lipid-soluble drugs diffuse together with fats from plasma to the mammary glands. Lipophilicity plays a vital role in predicting the penetration class of drugs during the lactation period. The logistic regression models showed that compounds with a number of hydrogen bond acceptors, PSA, and TSA above 5, 90, and 25, respectively, are less permeable to milk because they are less soluble in the fats present in milk. The pH of milk is acidic; as a result, basic compounds tend to be more concentrated in milk than in plasma, while acidic compounds may have lower concentrations in milk than in plasma. Conclusion: In this study, we developed four regression-based models to predict the penetration class of drugs during the lactation period. The obtained models can speed up the drug development process, saving energy and costs. Milk/plasma ratio assessment of drugs requires multiple steps of animal testing, which has its own ethical issues. QSAR modeling could help scientists to reduce the amount of animal testing, and our models are also suitable for that purpose.
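
The descriptor cut-offs reported in this abstract can be illustrated with a small sketch. It is not the authors' fitted logistic regression model (the abstract gives no coefficients); it only flags which of the three reported thresholds (hydrogen bond acceptors above 5, PSA above 90, TSA above 25) a compound exceeds. The function names, dictionary keys, example values, and the rule for combining the flags are all hypothetical.

```python
# Illustrative screen only: the thresholds come from the abstract, while the
# names below and the way the flags are combined are assumptions.
THRESHOLDS = {"hba": 5, "psa": 90, "tsa": 25}


def exceeded_thresholds(descriptors):
    """Return the sorted names of descriptors whose value is above its cut-off."""
    return sorted(
        name for name, limit in THRESHOLDS.items()
        if descriptors[name] > limit
    )


def likely_less_permeable(descriptors):
    """Flag a compound when every reported cut-off is exceeded (an assumed rule)."""
    return len(exceeded_thresholds(descriptors)) == len(THRESHOLDS)


# Hypothetical descriptor values, not taken from the study's dataset.
compound = {"hba": 7, "psa": 110.0, "tsa": 30.0}
print(exceeded_thresholds(compound))
print(likely_less_permeable(compound))
```

A fitted model would replace the all-or-nothing rule with a probability from the regression coefficients, which the abstract does not report.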

Keywords: logistic regression, breastfeeding, descriptors, penetration

Procedia PDF Downloads 41
146 A Comparative and Mixed Methods Study of Possible Selves of Adolescent Boys in an Observation Home and a Children's Home in India

Authors: Apurva Sapra

Abstract:

The aim of this research was to study and compare the nature of expected, feared and hoped-for selves in institutionalized adolescent boys in two residential settings – an observation home with children in conflict with the law, and a children’s home with children in need of care and protection. The study uses a concurrent mixed methods design, in which eight adolescent boys from each group, aged 13-17, were asked to respond to a questionnaire, followed by an in-depth interview. The questionnaire looked into the total scores on current, probable and hoped-for/feared positive and negative self-descriptors. Possible selves of both groups were found to be influenced by their unique histories, such as their experience of violence, interaction with the police and the emphasis placed on education. Expected selves and hoped-for selves were similar within the two groups. However, they were more concrete and attainable in the observation home and more ambitious in the children’s home. Quantitative results showed that on the positive self-descriptors, the participants in the observation home had a slightly lower total score on the current parameter than on the probable and hoped-for parameters. The participants in the children’s home showed similar results on current and probable positive self-descriptors, with higher scores on the hoped-for parameter. For most of the negative self-descriptors, the current score for the observation home group was lower than the expected score, and for the children’s home group, they were feared slightly more than they were expected. Along with the nature of possible selves, the study also looked into threats and support to desired and feared possible selves, as well as strategies to attain the desired possible selves and avoid feared possible selves. While threats to possible selves were identified as external and internal in both groups, the participants in the children’s home tended to identify threats as external.
The categories of support were similar across the two groups, although the nature of support provided differed. Strategies adopted by participants in the observation home could be clearly divided into past, present and future strategies, while those adopted by participants in the children’s home had an overlap with past and future strategies. The institution was perceived as having a negative influence for the future in the observation home group, but positive in the children’s home group. Limitations of the study and recommendations for future research, policy setting and the counselling profession are discussed.

Keywords: adolescents, expected self, feared self, hoped-for self, institutions, possible selves

Procedia PDF Downloads 206
145 Using Machine Learning to Classify Human Fetal Health and Analyze Feature Importance

Authors: Yash Bingi, Yiqiao Yin

Abstract:

Reduction of child mortality is an ongoing struggle and a commonly used factor in determining progress in the medical field. The under-5 mortality number is around 5 million around the world, with many of the deaths being preventable. In light of this issue, cardiotocograms (CTGs) have emerged as a leading tool to determine fetal health. By using ultrasound pulses and reading the responses, CTGs help healthcare professionals assess the overall health of the fetus to determine the risk of child mortality. However, interpreting the results of the CTGs is time-consuming and inefficient, especially in underdeveloped areas where an expert obstetrician is hard to come by. Using a support vector machine (SVM) and oversampling, this paper proposes a model that classifies fetal health with an accuracy of 99.59%. To further explain the CTG measurements, an algorithm based on Randomized Input Sampling for Explanation (RISE) of Black-box Models was created, called Feature Alteration for explanation of Black Box Models (FAB), and its findings were compared to Shapley Additive Explanations (SHAP) and Local Interpretable Model Agnostic Explanations (LIME). This allows doctors and medical professionals to classify fetal health with high accuracy and determine which features were most influential in the process.
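
The abstract mentions oversampling without naming a specific variant. As a minimal sketch, assuming plain random oversampling (duplicating minority-class records until every class matches the largest one), the step could look like the following. The function name, the toy feature values, and the class labels are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until classes are balanced."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, count in counts.items():
        # All original records belonging to this class.
        pool = [s for s, l in zip(samples, labels) if l == cls]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# Hypothetical toy data: two features per record, three imbalanced classes.
X = [[0.1, 1.0], [0.2, 0.9], [0.3, 1.1], [0.4, 1.2], [2.0, 0.1], [3.0, 0.5]]
y = ["normal", "normal", "normal", "normal", "suspect", "pathological"]
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Oversampling is normally applied to the training split only, so that duplicated records do not leak into the evaluation data.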

Keywords: machine learning, fetal health, gradient boosting, support vector machine, Shapley values, local interpretable model agnostic explanations

Procedia PDF Downloads 119
144 Images Selection and Best Descriptor Combination for Multi-Shot Person Re-Identification

Authors: Yousra Hadj Hassen, Walid Ayedi, Tarek Ouni, Mohamed Jallouli

Abstract:

To re-identify a person is to check whether he/she has already been seen over a camera network. Recently, re-identifying people over large public camera networks has become a crucial task for ensuring public security. The vision community has deeply investigated this area of research. Most existing studies rely only on the spatial appearance information from either one or multiple person images. In practice, the real person re-id framework is a multi-shot scenario. However, efficiently modeling a person’s appearance and choosing the best samples remain challenging problems. In this work, an extensive comparison of state-of-the-art descriptors associated with the proposed frame selection method is presented. Specifically, we evaluate the sample selection approach using multiple proposed descriptors. We show the effectiveness and advantages of the proposed method by extensive comparisons with related state-of-the-art approaches using two standard datasets, PRID2011 and iLIDS-VID.

Keywords: camera network, descriptor, model, multi-shot, person re-identification, selection

Procedia PDF Downloads 256
143 A Machine Learning-Based Model to Screen Antituberculosis Compound Targeted against LprG Lipoprotein of Mycobacterium tuberculosis

Authors: Syed Asif Hassan, Syed Atif Hassan

Abstract:

Multidrug-resistant tuberculosis (MDR-TB) is an infection caused by resistant strains of Mycobacterium tuberculosis (MTB) that do not respond to either isoniazid or rifampicin, the most important anti-TB drugs. The increasing occurrence of drug-resistant strains of MTB calls for an intensive search for novel target-based therapeutics. In this context, LprG (Rv1411c), a lipoprotein from MTB, plays a pivotal role in the immune evasion of MTB, leading to survival and propagation of the bacterium within the host cell. Therefore, a machine learning method will be developed to generate a computational model that can predict the potential anti-LprG activity of novel antituberculosis compounds. The present study will utilize a dataset from the PubChem database maintained by the National Center for Biotechnology Information (NCBI). The dataset comprises compounds screened against MTB, categorized as active or inactive based on the PubChem activity score. PowerMV, a molecular descriptor generator and visualization tool, will be used to generate the 2D molecular descriptors for the active and inactive compounds in the dataset. The 2D molecular descriptors generated by PowerMV will be used as features. These features will be fed into three different classifiers, namely a random forest, a deep neural network, and a recurrent neural network, to build separate predictive models, and the best-performing model will be chosen based on its accuracy in predicting novel antituberculosis compounds with anti-LprG activity. Additionally, the predicted active compounds will be screened using a SMARTS filter to select molecules with drug-like features.

Keywords: antituberculosis drug, classifier, machine learning, molecular descriptors, prediction

Procedia PDF Downloads 359
142 Improved Performance in Content-Based Image Retrieval Using Machine Learning Approach

Authors: B. Ramesh Naik, T. Venugopal

Abstract:

This paper presents a novel machine learning approach that improves the high-level semantics of images. Contemporary approaches to image retrieval and object recognition include Fourier transforms, wavelets, SIFT and HoG. Though these descriptors are helpful in a wide range of applications, they exploit zero-order statistics and therefore lack high descriptiveness of image features. These descriptors usually rely on primitive visual features such as shape, color, texture and spatial location to describe images. These features are not adequate to describe the high-level semantics of images. This leads to a gap in semantic content, resulting in unacceptable performance of the image retrieval system. A novel method, referred to as discriminative learning, is proposed; it is derived from a machine learning approach and efficiently discriminates image features. The analysis and results of the proposed approach were validated thoroughly on the WANG and Caltech-101 databases. The results show that this approach is very competitive in content-based image retrieval.

Keywords: CBIR, discriminative learning, region weight learning, scale invariant feature transforms

Procedia PDF Downloads 148
141 Object Detection Based on Plane Segmentation and Features Matching for a Service Robot

Authors: António J. R. Neves, Rui Garcia, Paulo Dias, Alina Trifan

Abstract:

With the aging of the world population and the continuous growth in technology, service robots are increasingly explored as alternatives to healthcare givers or personal assistants for elderly or disabled people. Any service robot should be capable of interacting with its human companion, receiving commands, navigating through an environment, either known or unknown, and recognizing objects. This paper proposes an approach for object recognition based on the use of depth information and color images for a service robot. We present a study on two of the most used methods for object detection, where 3D data is used to detect the position of the objects to be classified, which are found on horizontal surfaces. Since most of the objects of interest accessible to service robots are on these surfaces, the proposed 3D segmentation reduces the processing time and simplifies the scene for object recognition. The first approach for object recognition is based on color histograms, while the second is based on the use of the SIFT and SURF feature descriptors. We present comparative experimental results obtained with a real service robot.

Keywords: object detection, feature, descriptors, SIFT, SURF, depth images, service robots

Procedia PDF Downloads 512
140 Using Electronic Portfolio to Promote English Speaking Ability of EFL Undergraduate Students

Authors: Jiraporn Lao-Un, Dararat Khampusaen

Abstract:

Lack of exposure to the English language in an authentic English setting naturally leads to a lack of fluency in the language. As a result, Thai EFL learners struggle to meet the communication 'can do' descriptors of the Common European Framework of Reference (CEFR) required by the Ministry of Education. This initial phase of the ongoing study, which employs the e-portfolio to promote English speaking ability, probed the effects of e-portfolio use on Thai EFL nursing students' speaking ability. Their opinions on the use of the e-portfolio to enhance their speaking ability were also investigated. The participants were 44 undergraduate nursing students at a Thai College of Nursing. The participants undertook four lessons to promote their communication skills according to the CEFR criteria. Throughout the semester, the participants videotaped themselves while completing the four speaking tasks. The videos were then uploaded onto the e-portfolio website, where the researcher provided them with feedback. The video records were analyzed with a speaking rubric designed according to the CEFR 'can do' descriptors. Students were also required to record self-reflections in video format and upload them to the same URL. Students' oral self-reflections were coded to identify their perceptions of the use of the e-portfolio in promoting their speaking ability. The results from the two research instruments suggested the effectiveness of the tool in improving speaking ability, learner autonomy and media literacy skills. In addition, the oral reflection videos revealed positive opinions of the tool. The discussion presents the current status of English speaking ability among Thai EFL students. This reveals the gaps between EFL speaking ability and the CEFR ‘can do’ descriptors. In addition, the author sheds light on the integration of 21st century IT tools to enhance these students’ speaking ability.
Lastly, the theoretical implications and recommendations for further study on integrating electronic tools to promote language skills in the EFL context are offered.

Keywords: EFL communication, EFL speaking, English communication, E-learning, E-portfolio, speaking ability, Thai EFL learners

Procedia PDF Downloads 136
139 Human Action Retrieval System Using Features Weight Updating Based Relevance Feedback Approach

Authors: Munaf Rashid

Abstract:

For content-based human action retrieval systems, search accuracy is often poor for two reasons: 1) global information pertaining to videos is totally ignored, and only low-level motion descriptors are considered as a significant feature to match the similarity between query and database videos, and 2) there is a semantic gap between the high-level user concept and low-level visual features. Hence, in this paper, we propose a method that addresses these two issues, and in doing so, this paper contributes in two ways. Firstly, we introduce a method that uses both global and local information in one framework for an action retrieval task. Secondly, to minimize the semantic gap, the user concept is involved by incorporating a features weight updating (FWU) Relevance Feedback (RF) approach. We use statistical characteristics to dynamically update the weights of the feature descriptors so that after every RF iteration the feature space is modified accordingly. For testing and validation purposes, two human action recognition datasets have been utilized, namely Weizmann and UCF. Results show that even with a number of visual challenges the proposed approach performs well.
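
The abstract does not state which statistics drive the weight update. One common relevance-feedback heuristic, assumed here purely for illustration, weights each feature dimension by the inverse of its standard deviation over the items the user marked as relevant, so that dimensions on which relevant items agree count more. The function names and the toy vectors below are hypothetical.

```python
import statistics


def update_weights(relevant_features, eps=1e-6):
    """Weight each feature dimension by the inverse of its spread over relevant items."""
    # Transpose: one tuple of values per feature dimension.
    dims = list(zip(*relevant_features))
    inverse_spread = [1.0 / (statistics.pstdev(d) + eps) for d in dims]
    total = sum(inverse_spread)
    # Normalise so that the weights sum to 1.
    return [w / total for w in inverse_spread]


def weighted_distance(query, candidate, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return sum(w * (q - c) ** 2 for w, q, c in zip(weights, query, candidate)) ** 0.5


# Hypothetical feature vectors of videos marked relevant in one feedback round.
relevant = [[0.9, 0.2], [1.0, 0.8], [1.1, 0.5]]
weights = update_weights(relevant)
print(weights)
print(weighted_distance([1.0, 0.5], [1.2, 0.1], weights))
```

In this toy round the first dimension varies little across the relevant items, so it receives the larger weight and dominates the distance in the next retrieval pass.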

Keywords: relevance feedback (RF), action retrieval, semantic gap, feature descriptor, codebook

Procedia PDF Downloads 439
138 An Automated System for the Detection of Citrus Greening Disease Based on Visual Descriptors

Authors: Sidra Naeem, Ayesha Naeem, Sahar Rahim, Nadia Nawaz Qadri

Abstract:

Citrus greening is a bacterial disease that causes considerable damage to citrus fruits worldwide. Efficient methods for detecting this disease are needed to minimize production loss. This paper presents a pattern recognition system that comprises three stages for the detection of citrus greening from orange leaves: segmentation, feature extraction and classification. Image segmentation is accomplished by adaptive thresholding. The feature extraction stage comprises three visual descriptors, i.e. shape, color and texture. For the shape feature we used the asymmetry index, for the color feature the histogram of the Cb component from the YCbCr domain, and for the texture feature the local binary pattern. Classification was done using support vector machines and k nearest neighbors. The best performance of the system, accuracy = 88.02% and AUROC = 90.1%, was achieved with automatically segmented images. Our experiments validate that (1) segmentation is an imperative preprocessing step for computer-assisted diagnosis of citrus greening, and (2) the combination of shape, color and texture features forms a complementary set for the identification of citrus greening disease.
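
The texture descriptor named in this abstract, the local binary pattern (LBP), can be sketched in its basic 8-neighbour form. The abstract does not say which LBP variant, radius, or bit ordering was used, so the choices below (3x3 neighbourhood, clockwise bit order starting at the top-left pixel, a "greater than or equal" comparison) are assumptions, and the sample patch is hypothetical.

```python
def lbp_code(image, row, col):
    """8-neighbour local binary pattern code for one interior pixel."""
    center = image[row][col]
    # Neighbours visited clockwise starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # Set the bit when the neighbour is at least as bright as the centre.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


# Hypothetical 3x3 grayscale patch with a single interior pixel.
patch = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
print(lbp_code(patch, 1, 1))
print(sum(lbp_histogram(patch)))
```

The histogram, rather than the individual codes, is what would typically be passed to a classifier such as an SVM or k nearest neighbors.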

Keywords: citrus greening, pattern recognition, feature extraction, classification

Procedia PDF Downloads 145
137 Predicting Potential Protein Therapeutic Candidates from the Gut Microbiome

Authors: Prasanna Ramachandran, Kareem Graham, Helena Kiefel, Sunit Jain, Todd DeSantis

Abstract:

Microbes that reside inside the mammalian GI tract, commonly referred to as the gut microbiome, have been shown to have therapeutic effects in animal models of disease. We hypothesize that specific proteins produced by these microbes are responsible for this activity and may be used directly as therapeutics. To speed up the discovery of these key proteins from the big-data metagenomics, we have applied machine learning techniques. Using amino acid sequences of known epitopes and their corresponding binding partners, protein interaction descriptors (PID) were calculated, making a positive interaction set. A negative interaction dataset was calculated using sequences of proteins known not to interact with these same binding partners. Using Random Forest and positive and negative PID, a machine learning model was trained and used to predict interacting versus non-interacting proteins. Furthermore, the continuous variable, cosine similarity in the interaction descriptors was used to rank bacterial therapeutic candidates. Laboratory binding assays were conducted to test the candidates for their potential as therapeutics. Results from binding assays reveal the accuracy of the machine learning prediction and are subsequently used to further improve the model.
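
The ranking step can be sketched as follows, under the assumption that each candidate protein is represented by a numeric interaction-descriptor vector and is compared against a reference vector from a known positive interaction. The abstract does not specify what the cosine similarity is computed against, so the reference vector, the candidate names, and all values here are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_candidates(reference, candidates):
    """Sort candidate names by descending cosine similarity to the reference."""
    scored = [
        (name, cosine_similarity(reference, vector))
        for name, vector in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical descriptor vectors; real PID vectors would be much longer.
reference = [1.0, 0.0, 0.5]
candidates = {
    "protein_a": [0.9, 0.1, 0.4],
    "protein_b": [0.0, 1.0, 0.0],
    "protein_c": [0.5, 0.5, 0.5],
}
for name, score in rank_candidates(reference, candidates):
    print(name, round(score, 3))
```

Because cosine similarity ignores vector magnitude, the ranking reflects the pattern of descriptor values rather than their overall scale.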

Keywords: protein-interactions, machine-learning, metagenomics, microbiome

Procedia PDF Downloads 343
136 Molecular Modeling of 17-Picolyl and 17-Picolinylidene Androstane Derivatives with Anticancer Activity

Authors: Sanja Podunavac-Kuzmanović, Strahinja Kovačević, Lidija Jevrić, Evgenija Djurendić, Jovana Ajduković

Abstract:

In the present study, the molecular modeling of a series of 24 17-picolyl and 17-picolinylidene androstane derivatives with significant anticancer activity was carried out. Modelling of the studied compounds was performed with the CS ChemBioDraw Ultra v12.0 program for drawing 2D molecular structures and CS ChemBio3D Ultra v12.0 for 3D molecular modelling. The obtained 3D structures were subjected to energy minimization using the molecular mechanics force field method (MM2). The cutoff for structure optimization was set at a gradient of 0.1 kcal/Åmol. Full geometry optimization was done by the Austin Model 1 (AM1) until the root mean square (RMS) gradient reached a value smaller than 0.0001 kcal/Åmol using the Molecular Orbital Package (MOPAC) program. The obtained physicochemical, lipophilicity and topological descriptors were used for analysis of molecular similarities and dissimilarities by applying suitable chemometric methods (principal component analysis and cluster analysis). These results are part of the project No. 114-451-347/2015-02, financially supported by the Provincial Secretariat for Science and Technological Development of Vojvodina and CMST COST Action CM1306.

Keywords: androstane derivatives, anticancer activity, chemometrics, molecular descriptors

Procedia PDF Downloads 329
135 QSAR Studies of Certain Novel Heterocycles Derived From bis-1, 2, 4 Triazoles as Anti-Tumor Agents

Authors: Madhusudan Purohit, Stephen Philip, Bharathkumar Inturi

Abstract:

In this paper we report the quantitative structure-activity relationship of novel bis-triazole derivatives for predicting the activity profile. The full model encompassed a dataset of 46 bis-triazoles. The Tripos Sybyl X 2.0 program was used to conduct CoMSIA QSAR modeling. The Partial Least-Squares (PLS) analysis method was used to conduct statistical analysis and to derive a QSAR model based on the field values of the CoMSIA descriptors. The compounds were divided into test and training sets. The compounds were evaluated by various CoMSIA parameters to predict the best QSAR model. The optimum number of components was first determined separately by cross-validation regression for the CoMSIA model and then applied in the final analysis. A series of parameters were used for the study, and the best fit model was obtained using donor, partition coefficient and steric parameters. The CoMSIA models demonstrated good statistical results with regression coefficient (r2) and cross-validated coefficient (q2) of 0.575 and 0.830 respectively. The standard error for the predicted model was 0.16322. In the CoMSIA model, the steric descriptors make a marginally larger contribution than the electrostatic descriptors. The finding that the steric descriptor is the largest contributor for the CoMSIA QSAR models is consistent with the observation that more than half of the binding site area is occupied by steric regions.
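
The two statistics quoted in this abstract, the regression coefficient r2 and the cross-validated coefficient q2, can be illustrated on a toy problem. The sketch below swaps the study's PLS model on CoMSIA fields for an ordinary least-squares line on a single descriptor, and it assumes leave-one-out cross-validation with q2 = 1 - PRESS / SS_total; the abstract does not state which cross-validation scheme was used. All names and data values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys):
    """Coefficient of determination of the fit on all data."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def q_squared_loo(xs, ys):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_total."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        # Refit without compound i, then predict compound i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - press / ss_tot


# Hypothetical descriptor values and activities.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [1.1, 1.9, 3.2, 3.8, 5.1]
print(round(r_squared(descriptor, activity), 4))
print(round(q_squared_loo(descriptor, activity), 4))
```

For an ordinary least-squares fit like this one, the leave-one-out q2 cannot exceed r2, because each held-out prediction error is at least as large as the corresponding fitted residual.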

Keywords: 3D QSAR, CoMSIA, triazoles, novel heterocycles

Procedia PDF Downloads 419
134 Universiti Sains Malaysia

Authors: Eisa A. Alsafran, Francis T. Edum-Fotwe, Wayne E. Lord

Abstract:

The degree to which a public client actively participates in Public Private Partnership (PPP) schemes is seen as a determinant of the success of the arrangement and, in particular, of efficiency in the delivery of the assets of any infrastructure development. The asset delivery is often an early barometer for judging the overall performance of the PPP. Currently, there are no defined descriptors for the degree of such participation. The lack of defined descriptors makes the association between the degree of participation and the efficiency of asset delivery difficult to establish. This is particularly so if an optimum effect is desired. In addition, such an association is important for the strategic decision to embark on any PPP initiative. This paper presents a conceptual model of different levels of participation that characterise PPP schemes. The modelling was achieved by a systematic review of reported sources that address essential aspects and structures of PPP schemes, published from 2001 to 2015. As a precursor to the modelling, the common areas of Public Client Participation (PCP) were investigated. Equity and risk emerged as two dominant factors in the common areas of PCP, and were therefore adopted to form the foundation of the modelling. The resultant conceptual model defines the different states of combined PCP. The defined states provide a more rational basis for establishing how the degree of PCP affects the efficiency of asset delivery in PPP schemes.

Keywords: asset delivery, infrastructure development, public private partnership, public client participation

Procedia PDF Downloads 244
133 Normalizing Flow to Augmented Posterior: Conditional Density Estimation with Interpretable Dimension Reduction for High Dimensional Data

Authors: Cheng Zeng, George Michailidis, Hitoshi Iyatomi, Leo L. Duan

Abstract:

The conditional density characterizes the distribution of a response variable y given a predictor x and plays a key role in many statistical tasks, including classification and outlier detection. Although there has been abundant work on the problem of Conditional Density Estimation (CDE) for a low-dimensional response in the presence of a high-dimensional predictor, little work has been done for a high-dimensional response such as images. The promising performance of normalizing flow (NF) neural networks in unconditional density estimation acts as a motivating starting point. In this work, the authors extend NF neural networks to the case where an external x is present. Specifically, they use the NF to parameterize a one-to-one transform between a high-dimensional y and a latent z that comprises two components [zₚ, zₙ]. The zₚ component is a low-dimensional subvector obtained from the posterior distribution of an elementary predictive model for x, such as logistic/linear regression. The zₙ component is a high-dimensional independent Gaussian vector, which explains the variations in y that are unrelated or only weakly related to x. Unlike existing CDE methods, the proposed approach, coined Augmented Posterior CDE (AP-CDE), only requires a simple modification of the common normalizing flow framework while significantly improving the interpretation of the latent component, since zₚ represents a supervised dimension reduction. In image analytics applications, AP-CDE shows good separation of x-related variations due to factors such as lighting condition and subject id from the other random variations. Further, the experiments show that an unconditional NF neural network based on an unsupervised model of z, such as a Gaussian mixture, fails to generate interpretable results.
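
The change-of-variables identity behind normalizing flows makes the role of the two latent blocks explicit. The sketch below assumes an invertible map f with z = f(y) = [z_p, z_n], writes \pi(z_p | x) for the posterior-derived density of z_p, and takes z_n to be standard Gaussian of dimension d_n. The factorization and the symbols f, \pi, and d_n are notational assumptions for illustration, not taken from the paper.

```latex
\log p(y \mid x)
  = \log \pi\bigl(z_p \mid x\bigr)
  + \log \mathcal{N}\bigl(z_n;\, 0,\, I_{d_n}\bigr)
  + \log \left| \det \frac{\partial f(y)}{\partial y} \right|,
\qquad [z_p,\, z_n] = f(y).
```

Under this reading, only the first term depends on x, which is why z_p can be interpreted as a supervised low-dimensional summary of y.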

Keywords: conditional density estimation, image generation, normalizing flow, supervised dimension reduction

Procedia PDF Downloads 68
132 Theoretical Study of Structural Parameters, Chemical Reactivity and Spectral and Thermodynamical Properties of Organometallic Complexes Containing Zinc, Nickel and Cadmium with Nitrilotriacetic Acid and Tea Ligands: Density Functional Theory Investigation

Authors: Nour El Houda Bensiradj, Nafila Zouaghi, Taha Bensiradj

Abstract:

The pollution of water resources is characterized by the presence of microorganisms, chemicals, or industrial waste. Generally, this waste generates effluents containing large quantities of heavy metals, making the water unsuitable for consumption and causing the death of aquatic life and associated biodiversity. Currently, it is very important to assess the impact of heavy metals in water pollution as well as the processes for treating and reducing them. Among the methods of water treatment and disinfection, we mention the complexation of metal ions using ligands which serve to precipitate and subsequently eliminate these ions. In this context, we are interested in the study of complexes containing heavy metals such as zinc, nickel, and cadmium, which are present in several industrial discharges and are discharged into water sources. We will use the ligands of triethanolamine (TEA) and nitrilotriacetic acid (NTA). The theoretical study is based on molecular modeling, using the density functional theory (DFT) implemented in the Gaussian 09 program. The geometric and energetic properties of the above complexes will be calculated. Spectral properties such as infrared, as well as reactivity descriptors, and thermodynamic properties such as enthalpy and free enthalpy will also be determined.

Keywords: heavy metals, NTA, TEA, DFT, IR, reactivity descriptors

Procedia PDF Downloads 70
131 QSRR Analysis of 17-Picolyl and 17-Picolinylidene Androstane Derivatives Based on Partial Least Squares and Principal Component Regression

Authors: Sanja Podunavac-Kuzmanović, Strahinja Kovačević, Lidija Jevrić, Evgenija Djurendić, Jovana Ajduković

Abstract:

There are several methods for determining the lipophilicity of biologically active compounds; however, chromatography has been shown to be a very suitable method for this purpose. Chromatographic (C18-RP-HPLC) analysis of a series of 24 17-picolyl and 17-picolinylidene androstane derivatives was carried out. The obtained retention indices (logk, methanol (90%) / water (10%)) were correlated with calculated physicochemical and lipophilicity descriptors. The QSRR analysis was carried out by applying principal component regression (PCR) and partial least squares regression (PLS). The PCR and PLS models were selected on the basis of the highest variance and the lowest root mean square error of cross-validation. The obtained PCR and PLS models successfully correlate the calculated molecular descriptors with the logk parameter, indicating the significance of the lipophilicity of compounds in the chromatographic process. On the basis of the obtained results, it can be concluded that the obtained logk parameters of the analyzed androstane derivatives can be considered as their chromatographic lipophilicity. These results are part of the project No. 114-451-347/2015-02, financially supported by the Provincial Secretariat for Science and Technological Development of Vojvodina and CMST COST Action CM1105.

Keywords: androstane derivatives, chromatography, molecular structure, principal component regression, partial least squares regression

Procedia PDF Downloads 242
130 Computer-Aided Classification of Liver Lesions Using Contrasting Features Difference

Authors: Hussein Alahmer, Amr Ahmed

Abstract:

Liver cancer is one of the common diseases that cause death. Early detection is important for diagnosis and for reducing the incidence of death. Improvements in medical imaging and image processing techniques have significantly enhanced the interpretation of medical images. Computer-Aided Diagnosis (CAD) systems based on these techniques play a vital role in the early detection of liver disease and hence reduce the liver cancer death rate. This paper presents an automated CAD system consisting of three stages: first, automatic liver segmentation and lesion detection; second, feature extraction; and finally, classification of liver lesions into benign and malignant using the novel contrasting feature-difference approach. Several types of intensity and texture features are extracted from both the lesion area and its surrounding normal liver tissue. The difference between the features of the two areas is then used as the new lesion descriptor. Machine learning classifiers are then trained on the new descriptors to automatically classify liver lesions as benign or malignant. The experimental results show promising improvements. Moreover, the proposed approach can overcome the problems of varying ranges of intensity and textures between patients, demographics, and imaging devices and settings.
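
The contrasting feature-difference idea can be sketched with two simple intensity features. The abstract does not list the exact features used, so the mean and standard deviation below stand in for the paper's intensity and texture features, and the function names and pixel values are hypothetical.

```python
import statistics


def region_features(pixels):
    """Simple intensity features for a flat list of pixel values."""
    return {"mean": statistics.fmean(pixels), "std": statistics.pstdev(pixels)}


def contrast_descriptor(lesion_pixels, surrounding_pixels):
    """Feature-wise difference between the lesion and its surrounding tissue."""
    lesion = region_features(lesion_pixels)
    normal = region_features(surrounding_pixels)
    return {name: lesion[name] - normal[name] for name in lesion}


# Hypothetical pixel intensities for a lesion and the normal tissue around it.
lesion = [80, 82, 78, 85]
surround = [120, 118, 122, 121]
print(contrast_descriptor(lesion, surround))

# The same scene acquired with a uniform intensity offset of +30.
shifted = contrast_descriptor([p + 30 for p in lesion], [p + 30 for p in surround])
print(shifted)
```

Both calls give the same descriptor, which illustrates why a difference-based representation is less sensitive to global intensity shifts between patients or scanners.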

Keywords: CAD system, difference of feature, fuzzy c means, lesion detection, liver segmentation

Procedia PDF Downloads 293
129 Binding Studies of Complexes of Anticancer Drugs with DNA and Enzymes Involved in DNA Replication Using Molecular Docking and Cell Culture Techniques

Authors: Fouzia Perveen, Rumana Qureshi

Abstract:

The presently studied twelve anticancer drugs are the cytotoxic agents which inhibit the replication of DNA and activity of enzymes involved in DNA replication namely topoisomerase-II, polymerase and helicase and have shown remarkable anticancer activity in clinical trials. In this study, we performed molecular docking studies of twelve antitumor drugs against DNA and DNA enzymes in the presence and absence of ascorbic acid (AA) and developed the quantitative structure-activity relationship (QSAR) model for anticancer activity screening. A number of electronic and steric descriptors were calculated using MOE software package. QSAR was established showing a correlation of binding strength with various physicochemical descriptors. Out of these twelve, eight cytotoxic drugs were tested on Non-Small Cell Lung Cancer cell lines (H-157 and H-1299) in the absence and presence of ascorbic acid and experimental IC50 values were calculated. From the docking studies, binding constants were calculated indicating the strength of drug-DNA and drug-enzyme complex formation and it was correlated to the IC50 values (both experimental and theoretical). These results can offer useful references for directing the molecular design of DNA enzyme inhibitor with improved anticancer activity.

Keywords: ascorbic acid, binding constant, cytotoxic agents, cell culture, DNA, DNA enzymes, molecular docking

Procedia PDF Downloads 401
128 Sensory Gap Analysis on Port Wine Promotion and Perceptions

Authors: José Manue Carvalho Vieira, Mariana Magalhães, Elizabeth Serra

Abstract:

The Port Wine industry is essential to Portugal because it carries a tangible cultural heritage and for social and economic reasons. Positioned as a luxury product, brands need to pay more attention to the new generation's habits, preferences, languages, and sensory perceptions. Healthy lifestyles, anti-alcohol campaigns, and digitalisation of their buying decision process need to be better understood to understand the wine market in the future. The purpose of this study is to clarify the sensory perception gap between Port Wine descriptors promotion and the new generation's perceptions to help wineries to align their strategies. Based on the interpretivist approach - multiple methods and techniques (mixed-methods), different world views and different assumptions, and different data collection methods and analysis, this research integrated qualitative semi-structured interviews, Port Wine promotion contents, and social media perceptions mined by Sentiment Analysis Enginius algorithm. Findings confirm that Port Wine CEOs' strategies, brands' promotional content, and social perceptions are not sufficiently aligned. The central insight for Port Wine brands' managers is that there is a long and continuous work of understanding and associating their descriptors with the most relevant perceptual values and criteria of their targets to reposition (when necessary) and sustainably revitalise their brands. Finally, this study hypothesised a sensory gap that leads to a decrease in consumption, trying to find recommendations on how to transform it into an advantage for a better attraction towards the young age group (18-25).

Keywords: port wine, consumer habits, sensory gap analysis, wine marketing

Procedia PDF Downloads 206
127 Non-Linear Assessment of Chromatographic Lipophilicity and Model Ranking of Newly Synthesized Steroid Derivatives

Authors: Milica Karadzic, Lidija Jevric, Sanja Podunavac-Kuzmanovic, Strahinja Kovacevic, Anamarija Mandic, Katarina Penov Gasi, Marija Sakac, Aleksandar Okljesa, Andrea Nikolic

Abstract:

The present paper deals with chromatographic lipophilicity prediction of newly synthesized steroid derivatives. The prediction was achieved using in silico generated molecular descriptors and quantitative structure-retention relationship (QSRR) methodology with the artificial neural networks (ANN) approach. Chromatographic lipophilicity of the investigated compounds was expressed as the retention factor value logk. For QSRR modeling, a feedforward back-propagation ANN with a gradient descent learning algorithm was applied. The generated ANN models were ranked using the novel sum of ranking differences (SRD) method. The aim was to identify the most consistent QSRR model and to reveal similarities or dissimilarities between the models. In this study, SRD was performed with the average logk values as reference values. An excellent correlation between the experimentally observed logk values and the values predicted by the ANN was obtained, with a correlation coefficient higher than 0.9890. Statistical results show that the established ANN models can be applied for the required purpose. This article is based upon work from COST Action (TD1305), supported by COST (European Cooperation in Science and Technology).
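The ranking step can be sketched as follows. SRD ranks the compounds by the reference values and by each model's predictions, then sums the absolute rank differences; smaller sums mean closer agreement with the reference. As in the abstract, the reference here is the per-compound average across models. The model names and logk values are hypothetical, and the sketch assumes no tied values and omits the validation steps that usually accompany SRD.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def srd(model_values, reference_values):
    """Sum of absolute differences between model ranks and reference ranks."""
    model_ranks = ranks(model_values)
    reference_ranks = ranks(reference_values)
    return sum(abs(m - r) for m, r in zip(model_ranks, reference_ranks))


# Hypothetical logk predictions from three ANN models for four compounds.
models = {
    "ann_a": [0.52, 0.91, 0.33, 0.70],
    "ann_b": [0.50, 0.95, 0.40, 0.61],
    "ann_c": [0.71, 0.60, 0.35, 0.88],
}
# Reference column: average of all models for each compound.
reference = [sum(column) / len(column) for column in zip(*models.values())]
scores = {name: srd(predicted, reference) for name, predicted in models.items()}
print(sorted(scores.items(), key=lambda item: item[1]))
```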

Keywords: artificial neural networks, liquid chromatography, molecular descriptors, steroids, sum of ranking differences

Procedia PDF Downloads 291
126 Artificial Intelligence in Bioscience: The Next Frontier

Authors: Parthiban Srinivasan

Abstract:

With recent advances in computational power and access to enough data in biosciences, artificial intelligence methods are increasingly being used in drug discovery research. These methods are essentially a series of advanced statistics based exercises that review the past to indicate the likely future. Our goal is to develop a model that accurately predicts biological activity and toxicity parameters for novel compounds. We have compiled a robust library of over 150,000 chemical compounds with different pharmacological properties from literature and public domain databases. The compounds are stored in simplified molecular-input line-entry system (SMILES), a commonly used text encoding for organic molecules. We utilize an automated process to generate an array of numerical descriptors (features) for each molecule. Redundant and irrelevant descriptors are eliminated iteratively. Our prediction engine is based on a portfolio of machine learning algorithms. We found Random Forest algorithm to be a better choice for this analysis. We captured non-linear relationship in the data and formed a prediction model with reasonable accuracy by averaging across a large number of randomized decision trees. Our next step is to apply deep neural network (DNN) algorithm to predict the biological activity and toxicity properties. We expect the DNN algorithm to give better results and improve the accuracy of the prediction. This presentation will review all these prominent machine learning and deep learning methods, our implementation protocols and discuss these techniques for their usefulness in biomedical and health informatics.
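One common way to eliminate redundant and irrelevant descriptors is a correlation filter, sketched below. The abstract does not say which criterion the authors use, so the greedy rule, the 0.95 threshold, and the descriptor names and values are all assumptions for illustration.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant columns."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def drop_redundant(descriptors, threshold=0.95):
    """Greedy filter: skip constant columns, then keep a descriptor only if
    its |r| with every already-kept descriptor is below the threshold."""
    kept = {}
    for name, column in descriptors.items():
        if len(set(column)) < 2:
            continue  # constant column carries no information
        if all(abs(pearson(column, other)) < threshold for other in kept.values()):
            kept[name] = column
    return list(kept)


# Hypothetical descriptor table: one column per descriptor, one row per molecule.
table = {
    "mol_weight": [100, 150, 200, 250],
    "heavy_atoms": [7, 11, 14, 18],   # nearly proportional to mol_weight
    "logp": [1.2, 0.4, 2.9, 1.1],
    "flag": [1, 1, 1, 1],             # constant
}
print(drop_redundant(table))
```

The filter depends on column order: the first of two correlated descriptors is the one that survives.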

Keywords: deep learning, drug discovery, health informatics, machine learning, toxicity prediction

Procedia PDF Downloads 333
125 Exploring 1,2,4-Triazine-3(2H)-One Derivatives as Anticancer Agents for Breast Cancer: A QSAR, Molecular Docking, ADMET, and Molecular Dynamics

Authors: Said Belaaouad

Abstract:

This study aimed to explore the quantitative structure-activity relationship (QSAR) of 1,2,4-Triazine-3(2H)-one derivatives as potential anticancer agents against breast cancer. The electronic descriptors were obtained using the Density Functional Theory (DFT) method, and a multiple linear regression technique was employed to construct the QSAR model. The model exhibited favorable statistical parameters, including R²=0.849, R²adj=0.656, MSE=0.056, R²test=0.710, and Q²cv=0.542, indicating its reliability. Among the descriptors analyzed, absolute electronegativity (χ), total energy (TE), number of hydrogen bond donors (NHD), water solubility (LogS), and shape coefficient (I) were identified as influential factors. Furthermore, leveraging the validated QSAR model, new derivatives of 1,2,4-Triazine-3(2H)-one were designed, and their activity and pharmacokinetic properties were estimated. Subsequently, molecular docking (MD) and molecular dynamics (MD) simulations were employed to assess the binding affinity of the designed molecules. The Tubulin colchicine binding site, which plays a crucial role in cancer treatment, was chosen as the target protein. Over a simulation trajectory spanning 100 ns, the binding affinity was calculated using the MMPBSA script. As a result, fourteen novel Tubulin-colchicine inhibitors with promising pharmacokinetic characteristics were identified. Overall, this study provides valuable insights into the QSAR of 1,2,4-Triazine-3(2H)-one derivatives as potential anticancer agents, along with the design of new compounds and their assessment through molecular docking and dynamics simulations targeting the Tubulin-colchicine binding site.
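The gap between R² and adjusted R² reported above depends on how many descriptors the model uses relative to the number of compounds. The standard adjustment can be sketched as below. The abstract does not state the size of the training set, so `n_samples=30` is a hypothetical value chosen only to show the direction of the correction.

```python
def adjusted_r2(r2, n_samples, n_descriptors):
    """Adjusted R²: penalizes R² for the number of fitted descriptors."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_descriptors - 1)


# Five descriptors as listed in the abstract; the sample size is hypothetical.
print(adjusted_r2(0.849, n_samples=30, n_descriptors=5))
```

With few compounds and several descriptors the penalty grows quickly, so a large drop from R² to adjusted R² is a hint that the model is close to its data limits.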

Keywords: QSAR, molecular docking, ADMET, 1, 2, 4-triazin-3(2H)-ones, breast cancer, anticancer, molecular dynamic simulations, MMPBSA calculation

Procedia PDF Downloads 64
124 Contribution to the Study of Automatic Epileptiform Pattern Recognition in Long Term EEG Signals

Authors: Christine F. Boos, Fernando M. Azevedo

Abstract:

Electroencephalogram (EEG) is a record of the electrical activity of the brain that has many applications, such as monitoring alertness, coma and brain death; locating damaged areas of the brain after head injury, stroke and tumor; monitoring anesthesia depth; researching physiology and sleep disorders; researching epilepsy and localizing the seizure focus. Epilepsy is a chronic condition, or a group of diseases of high prevalence, still poorly explained by science and whose diagnosis is still predominantly clinical. The EEG recording is considered an important test for epilepsy investigation, and its visual analysis is very often applied for clinical confirmation of an epilepsy diagnosis. Moreover, this EEG analysis can also be used to help define the types of epileptic syndrome, determine the epileptiform zone, assist in the planning of drug treatment and provide additional information about the feasibility of surgical intervention. In the context of diagnosis confirmation, the analysis is made using long term EEG recordings that are at least 24 hours long and acquired with a minimum of 24 electrodes, in which the neurophysiologists perform a thorough visual evaluation of EEG screens in search of specific electrographic patterns called epileptiform discharges. Considering that the EEG screens usually display 10 seconds of the recording, the neurophysiologist has to evaluate 360 screens per hour of EEG or a minimum of 8,640 screens per long term EEG recording. Analyzing thousands of EEG screens in search of patterns that have a maximum duration of 200 ms is a very time-consuming, complex and exhausting task. Because of this, over the years several studies have proposed automated methodologies that could facilitate the neurophysiologists’ task of identifying epileptiform discharges, and a large number of these methodologies used neural networks for the pattern classification.
One of the differences between all of these methodologies is the type of input stimuli presented to the networks, i.e., how the EEG signal is introduced in the network. Five types of input stimuli have been commonly found in literature: raw EEG signal, morphological descriptors (i.e. parameters related to the signal’s morphology), Fast Fourier Transform (FFT) spectrum, Short-Time Fourier Transform (STFT) spectrograms and Wavelet Transform features. This study evaluates the application of these five types of input stimuli and compares the classification results of neural networks that were implemented using each of these inputs. The performance of using raw signal varied between 43 and 84% efficiency. The results of FFT spectrum and STFT spectrograms were quite similar with average efficiency being 73 and 77%, respectively. The efficiency of Wavelet Transform features varied between 57 and 81% while the descriptors presented efficiency values between 62 and 93%. After simulations we could observe that the best results were achieved when either morphological descriptors or Wavelet features were used as input stimuli.
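The screen counts quoted in the abstract follow directly from the 10-second display window, as the short check below shows. The function name is illustrative.

```python
def screens_to_review(recording_hours, seconds_per_screen=10):
    """Number of EEG screens a reader must inspect for a recording."""
    return recording_hours * 3600 // seconds_per_screen


per_hour = screens_to_review(1)       # 360 screens per hour
per_recording = screens_to_review(24)  # 8,640 screens for a 24-hour recording
# A 200 ms discharge occupies this fraction of one 10-second screen.
pattern_fraction = 0.2 / 10
print(per_hour, per_recording, pattern_fraction)
```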

Keywords: Artificial neural network, electroencephalogram signal, pattern recognition, signal processing

Procedia PDF Downloads 503
123 Interpretable Deep Learning Models for Medical Condition Identification

Authors: Dongping Fang, Lian Duan, Xiaojing Yuan, Mike Xu, Allyn Klunder, Kevin Tan, Suiting Cao, Yeqing Ji

Abstract:

Accurate prediction of a medical condition with straight clinical evidence is a long-sought topic in the medical management and health insurance field. Although great progress has been made with machine learning algorithms, the medical community is still, to a certain degree, suspicious about the model's accuracy and interpretability. This paper presents an innovative hierarchical attention deep learning model to achieve good prediction and clear interpretability that can be easily understood by medical professionals. This deep learning model uses a hierarchical attention structure that matches naturally with the medical history data structure and reflects the member’s encounter (date of service) sequence. The model attention structure consists of 3 levels: (1) attention on the medical code types (diagnosis codes, procedure codes, lab test results, and prescription drugs), (2) attention on the sequential medical encounters within a type, (3) attention on the medical codes within an encounter and type. This model is applied to predict the occurrence of stage 3 chronic kidney disease (CKD3), using three years’ medical history of Medicare Advantage (MA) members from a top health insurance company. The model takes members’ medical events, both claims and electronic medical record (EMR) data, as input, makes a prediction of CKD3 and calculates the contribution from individual events to the predicted outcome. The model outcome can be easily explained with the clinical evidence identified by the model algorithm. Here are examples: Member A had 36 medical encounters in the past three years: multiple office visits, lab tests and medications. The model predicts member A has a high risk of CKD3 with the following well-contributed clinical events - multiple high ‘Creatinine in Serum or Plasma’ tests and multiple low kidneys functioning ‘Glomerular filtration rate’ tests. Among the abnormal lab tests, more recent results contributed more to the prediction. 
The model also indicates regular office visits, no abnormal findings of medical examinations, and taking proper medications decreased the CKD3 risk. Member B had 104 medical encounters in the past 3 years and was predicted to have a low risk of CKD3, because the model didn’t identify diagnoses, procedures, or medications related to kidney disease, and many lab test results, including ‘Glomerular filtration rate’ were within the normal range. The model accurately predicts members A and B and provides interpretable clinical evidence that is validated by clinicians. Without extra effort, the interpretation is generated directly from the model and presented together with the occurrence date. Our model uses the medical data in its most raw format without any further data aggregation, transformation, or mapping. This greatly simplifies the data preparation process, mitigates the chance for error and eliminates post-modeling work needed for traditional model explanation. To our knowledge, this is the first paper on an interpretable deep-learning model using a 3-level attention structure, sourcing both EMR and claim data, including all 4 types of medical data, on the entire Medicare population of a big insurance company, and more importantly, directly generating model interpretation to support user decision. In the future, we plan to enrich the model input by adding patients’ demographics and information from free-texted physician notes.
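The way attention weights can double as an explanation may be easier to see in a small sketch. This shows a single attention level only (the paper describes three), and the event names and relevance scores are hypothetical stand-ins for values a trained network would produce.

```python
from math import exp


def softmax(scores):
    """Turn raw relevance scores into weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_contributions(events, scores):
    """Pair each event with its attention weight, largest contribution first."""
    weights = softmax(scores)
    return sorted(zip(events, weights), key=lambda pair: pair[1], reverse=True)


# Hypothetical medical events and learned scores for one member.
events = ["high serum creatinine", "low GFR", "routine office visit"]
scores = [2.0, 2.5, -1.0]
for event, weight in rank_contributions(events, scores):
    print(f"{event}: {weight:.2f}")
```

Because the weights are produced inside the forward pass, the ranking comes with the prediction itself, which matches the abstract's point that no separate post-modeling explanation step is needed.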

Keywords: deep learning, interpretability, attention, big data, medical conditions

Procedia PDF Downloads 67
122 River Habitat Modeling for the Entire Macroinvertebrate Community

Authors: Pinna Beatrice, Laini Alex, Negro Giovanni, Burgazzi Gemma, Viaroli Pierluigi, Vezza Paolo

Abstract:

Habitat models rarely consider macroinvertebrates as ecological targets in rivers. Available approaches mainly focus on single macroinvertebrate species, not addressing the ecological needs and functionality of the entire community. This research aimed to provide an approach to model the habitat of the macroinvertebrate community. The approach is based on the recently developed Flow-T index, together with a Random Forest (RF) regression, which is employed to apply the Flow-T index at the meso-habitat scale. Using different datasets gathered from both field data collection and 2D hydrodynamic simulations, the model has been calibrated in the Trebbia river (2019 campaign), and then validated in the Trebbia, Taro, and Enza rivers (2020 campaign). The three rivers are characterized by a braiding morphology, gravel riverbeds, and summer low flows. The RF model selected 12 mesohabitat descriptors as important for the macroinvertebrate community. These descriptors belong to different frequency classes of water depth, flow velocity, substrate grain size, and connectivity to the main river channel. The cross-validation R² coefficient (R²𝒸ᵥ) of the training dataset is 0.71 for the Trebbia River (2019), whereas the R² coefficient for the validation datasets (Trebbia, Taro, and Enza Rivers 2020) is 0.63. The agreement between the simulated results and the experimental data shows sufficient accuracy and reliability. The outcomes of the study reveal that the model can identify the ecological response of the macroinvertebrate community to possible flow regime alterations and to possible river morphological modifications. Lastly, the proposed approach allows extending the MesoHABSIM methodology, widely used for the fish habitat assessment, to a different ecological target community. Further applications of the approach can be related to flow design in both perennial and non-perennial rivers, including river reaches in which fish fauna is absent.

Keywords: ecological flows, macroinvertebrate community, mesohabitat, river habitat modeling

Procedia PDF Downloads 62
121 New Approach for Constructing a Secure Biometric Database

Authors: A. Kebbeb, M. Mostefai, F. Benmerzoug, Y. Chahir

Abstract:

The multimodal biometric identification is the combination of several biometric systems. The challenge of this combination is to reduce some limitations of systems based on a single modality while significantly improving performance. In this paper, we propose a new approach to the construction and the protection of a multimodal biometric database dedicated to an identification system. We use a topological watermarking to hide the relation between face image and the registered descriptors extracted from other modalities of the same person for more secure user identification.

Keywords: biometric databases, multimodal biometrics, security authentication, digital watermarking

Procedia PDF Downloads 348
120 Evaluating Classification with Efficacy Metrics

Authors: Guofan Shao, Lina Tang, Hao Zhang

Abstract:

The values of image classification accuracy are affected by class size distributions and classification schemes, making it difficult to compare the performance of classification algorithms across different remote sensing data sources and classification systems. Based on the term efficacy from medicine and pharmacology, we have developed the metrics of image classification efficacy at the map and class levels. The novelty of this approach is that a baseline classification is involved in computing image classification efficacies so that the effects of class statistics are reduced. Furthermore, the image classification efficacies are interpretable and comparable, and thus, strengthen the assessment of image data classification methods. We use real-world and hypothetical examples to explain the use of image classification efficacies. The metrics of image classification efficacy meet the critical need to rectify the strategy for the assessment of image classification performance as image classification methods are becoming more diversified.

Keywords: accuracy assessment, efficacy, image classification, machine learning, uncertainty

Procedia PDF Downloads 179
119 Computational Approach to Cyclin-Dependent Kinase 2 Inhibitors Design and Analysis: Merging Quantitative Structure-Activity Relationship, Absorption, Distribution, Metabolism, Excretion, and Toxicity, Molecular Docking, and Molecular Dynamics Simulations

Authors: Mohamed Moussaoui, Mouna Baassi, Soukayna Baammi, Hatim Soufi, Mohammed Salah, Rachid Daoud, Achraf EL Allali, Mohammed Elalaoui Belghiti, Said Belaaouad

Abstract:

The present study aims to investigate the quantitative structure-activity relationship (QSAR) of a series of Thiazole derivatives reported as anticancer agents (hepatocellular carcinoma), using principally the electronic descriptors calculated by the density functional theory (DFT) method and by applying the multiple linear regression method. The developed model showed good statistical parameters (R²= 0.725, R²ₐ𝒹ⱼ= 0.653, MSE = 0.060, R²ₜₑₛₜ= 0.827, Q²𝒸ᵥ = 0.536). The energy of the highest occupied molecular orbital (EHOMO), electronic energy (TE), shape coefficient (I), number of rotatable bonds (NROT), and index of refraction (n) were revealed to be the main descriptors influencing the anti-cancer activity. Additional Thiazole derivatives were then designed, and their activities and pharmacokinetic properties were predicted using the validated QSAR model. These designed molecules underwent evaluation through molecular docking (MD) and molecular dynamics (MD) simulations, with binding affinity calculated using the MMPBSA script according to a 100 ns simulation trajectory. This process aimed to study both their affinity and stability towards Cyclin-Dependent Kinase 2 (CDK2), a target protein for cancer treatment. The research concluded by identifying four CDK2 inhibitors - A1, A3, A5, and A6 - displaying satisfactory pharmacokinetic properties. The molecular dynamics results indicated that the designed compound A5 remained stable in the active center of the CDK2 protein, suggesting its potential as an effective inhibitor for the treatment of hepatocellular carcinoma. The findings of this study could contribute significantly to the development of effective CDK2 inhibitors.

Keywords: QSAR, ADMET, Thiazole, anticancer, molecular docking, molecular dynamic simulations, MMPBSA calculation

Procedia PDF Downloads 71
118 Adapted Intersection over Union: A Generalized Metric for Evaluating Unsupervised Classification Models

Authors: Prajwal Prakash Vasisht, Sharath Rajamurthy, Nishanth Dara

Abstract:

In a supervised machine learning approach, metrics such as precision, accuracy, and coverage can be calculated using ground truth labels to help in model tuning, evaluation, and selection. In an unsupervised setting, however, where the data has no ground truth, there are few interpretable metrics that can guide us to do the same. Our approach creates a framework to adapt the Intersection over Union metric, referred to as Adapted IoU, usually used to evaluate supervised learning models, into the unsupervised domain, which solves the problem by factoring in subject matter expertise and intuition about the ideal output from the model. This metric essentially provides a scale that allows us to compare the performance across numerous unsupervised models or tune hyper-parameters and compare different versions of the same model.
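For orientation, the standard Intersection over Union that the metric builds on can be sketched as below. The abstract does not define the adaptation itself, so this shows only plain IoU between a model's cluster and the grouping a subject matter expert would expect; the item identifiers are hypothetical.

```python
def iou(predicted, expected):
    """Intersection over Union of two collections of item identifiers."""
    predicted, expected = set(predicted), set(expected)
    union = predicted | expected
    if not union:
        return 1.0  # two empty groups are treated as identical
    return len(predicted & expected) / len(union)


# Hypothetical cluster from a model versus the grouping an expert expects.
cluster = {"doc1", "doc2", "doc3"}
expert_group = {"doc2", "doc3", "doc4"}
print(iou(cluster, expert_group))
```

A score of 1 means the cluster matches the expected grouping exactly and 0 means no overlap, which gives a single scale for comparing models or hyper-parameter settings.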

Keywords: general metric, unsupervised learning, classification, intersection over union

Procedia PDF Downloads 17