Search results for: clustering ensemble
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 791

Search results for: clustering ensemble

311 Static vs. Stream Mining Trajectories Similarity Measures

Authors: Musaab Riyadh, Norwati Mustapha, Dina Riyadh

Abstract:

Trajectory similarity can be defined as the cost of transforming one trajectory into another based on certain similarity method. It is the core of numerous mining tasks such as clustering, classification, and indexing. Various approaches have been suggested to measure similarity based on the geometric and dynamic properties of trajectory, the overlapping between trajectory segments, and the confined area between entire trajectories. In this article, an evaluation of these approaches has been done based on computational cost, usage memory, accuracy, and the amount of data which is needed in advance to determine its suitability to stream mining applications. The evaluation results show that the stream mining applications support similarity methods which have low computational cost and memory, single scan on data, and free of mathematical complexity due to the high-speed generation of data.

Keywords: global distance measure, local distance measure, semantic trajectory, spatial dimension, stream data mining

Procedia PDF Downloads 396
310 Cr Induced Magnetization in Zinc-Blende ZnO-Based Diluted Magnetic Semiconductors

Authors: Bakhtiar Ul Haq, R. Ahmed, A. Shaari, Mazmira Binti Mohamed, Nisar Ali

Abstract:

The capability of exploiting the electronic charge and spin properties simultaneously in a single material has made diluted magnetic semiconductors (DMS) remarkable in the field of spintronics. We report the designing of DMS based on zinc-blend ZnO doped with Cr impurity. The full potential linearized augmented plane wave plus local orbital FP-L(APW+lo) method in density functional theory (DFT) has been adapted to carry out these investigations. For treatment of exchange and correlation energy, generalized gradient approximations have been used. Introducing Cr atoms in the matrix of ZnO has induced strong magnetic moment with ferromagnetic ordering at stable ground state. Cr:ZnO was found to favor the short range magnetic interaction that reflect the tendency of Cr clustering. The electronic structure of ZnO is strongly influenced in the presence of Cr impurity atoms where impurity bands appear in the band gap.

Keywords: ZnO, density functional theory, diluted agnetic semiconductors, ferromagnetic materials, FP-L(APW+lo)

Procedia PDF Downloads 426
309 Interpretation and Prediction of Geotechnical Soil Parameters Using Ensemble Machine Learning

Authors: Goudjil kamel, Boukhatem Ghania, Jlailia Djihene

Abstract:

This paper delves into the development of a sophisticated desktop application designed to calculate soil bearing capacity and predict limit pressure. Drawing from an extensive review of existing methodologies, the study meticulously examines various approaches employed in soil bearing capacity calculations, elucidating their theoretical foundations and practical applications. Furthermore, the study explores the burgeoning intersection of artificial intelligence (AI) and geotechnical engineering, underscoring the transformative potential of AI- driven solutions in enhancing predictive accuracy and efficiency.Central to the research is the utilization of cutting-edge machine learning techniques, including Artificial Neural Networks (ANN), XGBoost, and Random Forest, for predictive modeling. Through comprehensive experimentation and rigorous analysis, the efficacy and performance of each method are rigorously evaluated, with XGBoost emerging as the preeminent algorithm, showcasing superior predictive capabilities compared to its counterparts. The study culminates in a nuanced understanding of the intricate dynamics at play in geotechnical analysis, offering valuable insights into optimizing soil bearing capacity calculations and limit pressure predictions. By harnessing the power of advanced computational techniques and AI-driven algorithms, the paper presents a paradigm shift in the realm of geotechnical engineering, promising enhanced precision and reliability in civil engineering projects.

Keywords: limit pressure of soil, xgboost, random forest, bearing capacity

Procedia PDF Downloads 22
308 An Improved C-Means Model for MRI Segmentation

Authors: Ying Shen, Weihua Zhu

Abstract:

Medical images are important to help identifying different diseases, for example, Magnetic resonance imaging (MRI) can be used to investigate the brain, spinal cord, bones, joints, breasts, blood vessels, and heart. Image segmentation, in medical image analysis, is usually the first step to find out some characteristics with similar color, intensity or texture so that the diagnosis could be further carried out based on these features. This paper introduces an improved C-means model to segment the MRI images. The model is based on information entropy to evaluate the segmentation results by achieving global optimization. Several contributions are significant. Firstly, Genetic Algorithm (GA) is used for achieving global optimization in this model where fuzzy C-means clustering algorithm (FCMA) is not capable of doing that. Secondly, the information entropy after segmentation is used for measuring the effectiveness of MRI image processing. Experimental results show the outperformance of the proposed model by comparing with traditional approaches.

Keywords: magnetic resonance image (MRI), c-means model, image segmentation, information entropy

Procedia PDF Downloads 226
307 Self-Organizing Maps for Exploration of Partially Observed Data and Imputation of Missing Values in the Context of the Manufacture of Aircraft Engines

Authors: Sara Rejeb, Catherine Duveau, Tabea Rebafka

Abstract:

To monitor the production process of turbofan aircraft engines, multiple measurements of various geometrical parameters are systematically recorded on manufactured parts. Engine parts are subject to extremely high standards as they can impact the performance of the engine. Therefore, it is essential to analyze these databases to better understand the influence of the different parameters on the engine's performance. Self-organizing maps are unsupervised neural networks which achieve two tasks simultaneously: they visualize high-dimensional data by projection onto a 2-dimensional map and provide clustering of the data. This technique has become very popular for data exploration since it provides easily interpretable results and a meaningful global view of the data. As such, self-organizing maps are usually applied to aircraft engine condition monitoring. As databases in this field are huge and complex, they naturally contain multiple missing entries for various reasons. The classical Kohonen algorithm to compute self-organizing maps is conceived for complete data only. A naive approach to deal with partially observed data consists in deleting items or variables with missing entries. However, this requires a sufficient number of complete individuals to be fairly representative of the population; otherwise, deletion leads to a considerable loss of information. Moreover, deletion can also induce bias in the analysis results. Alternatively, one can first apply a common imputation method to create a complete dataset and then apply the Kohonen algorithm. However, the choice of the imputation method may have a strong impact on the resulting self-organizing map. Our approach is to address simultaneously the two problems of computing a self-organizing map and imputing missing values, as these tasks are not independent. In this work, we propose an extension of self-organizing maps for partially observed data, referred to as missSOM. First, we introduce a criterion to be optimized, that aims at defining simultaneously the best self-organizing map and the best imputations for the missing entries. As such, missSOM is also an imputation method for missing values. To minimize the criterion, we propose an iterative algorithm that alternates the learning of a self-organizing map and the imputation of missing values. Moreover, we develop an accelerated version of the algorithm by entwining the iterations of the Kohonen algorithm with the updates of the imputed values. This method is efficiently implemented in R and will soon be released on CRAN. Compared to the standard Kohonen algorithm, it does not come with any additional cost in terms of computing time. Numerical experiments illustrate that missSOM performs well in terms of both clustering and imputation compared to the state of the art. In particular, it turns out that missSOM is robust to the missingness mechanism, which is in contrast to many imputation methods that are appropriate for only a single mechanism. This is an important property of missSOM as, in practice, the missingness mechanism is often unknown. An application to measurements on one type of part is also provided and shows the practical interest of missSOM.

Keywords: imputation method of missing data, partially observed data, robustness to missingness mechanism, self-organizing maps

Procedia PDF Downloads 151
306 Machine Learning Analysis of Eating Disorders Risk, Physical Activity and Psychological Factors in Adolescents: A Community Sample Study

Authors: Marc Toutain, Pascale Leconte, Antoine Gauthier

Abstract:

Introduction: Eating Disorders (ED), such as anorexia, bulimia, and binge eating, are psychiatric illnesses that mostly affect young people. The main symptoms concern eating (restriction, excessive food intake) and weight control behaviors (laxatives, vomiting). Psychological comorbidities (depression, executive function disorders, etc.) and problematic behaviors toward physical activity (PA) are commonly associated with ED. Acquaintances on ED risk factors are still lacking, and more community sample studies are needed to improve prevention and early detection. To our knowledge, studies are needed to specifically investigate the link between ED risk level, PA, and psychological risk factors in a community sample of adolescents. The aim of this study is to assess the relation between ED risk level, exercise (type, frequency, and motivations for engaging in exercise), and psychological factors based on the Jacobi risk factors model. We suppose that a high risk of ED will be associated with the practice of high caloric cost PA, motivations oriented to weight and shape control, and psychological disturbances. Method: An online survey destined for students has been sent to several middle schools and colleges in northwest France. This survey combined several questionnaires, the Eating Attitude Test-26 assessing ED risk; the Exercise Motivation Inventory–2 assessing motivations toward PA; the Hospital Anxiety and Depression Scale assessing anxiety and depression, the Contour Drawing Rating Scale; and the Body Esteem Scale assessing body dissatisfaction, Rosenberg Self-esteem Scale assessing self-esteem, the Exercise Dependence Scale-Revised assessing PA dependence, the Multidimensional Assessment of Interoceptive Awareness assessing interoceptive awareness and the Frost Multidimensional Perfectionism Scale assessing perfectionism. Machine learning analysis will be performed in order to constitute groups with a tree-based model clustering method, extract risk profile(s) with a bootstrap method comparison, and predict ED risk with a prediction method based on a decision tree-based model. Expected results: 1044 complete records have already been collected, and the survey will be closed at the end of May 2022. Records will be analyzed with a clustering method and a bootstrap method in order to reveal risk profile(s). Furthermore, a predictive tree decision method will be done to extract an accurate predictive model of ED risk. This analysis will confirm typical main risk factors and will give more data on presumed strong risk factors such as exercise motivations and interoceptive deficit. Furthermore, it will enlighten particular risk profiles with a strong level of proof and greatly contribute to improving the early detection of ED and contribute to a better understanding of ED risk factors.

Keywords: eating disorders, risk factors, physical activity, machine learning

Procedia PDF Downloads 83
305 The Efficacy of Open Educational Resources in Students’ Performance and Engagement

Authors: Huda Al-Shuaily, E. M. Lacap

Abstract:

Higher Education is one of the most essential fundamentals for the advancement and progress of a country. It demands to be as accessible as possible and as comprehensive as it can be reached. In this paper, we succeeded to expand the accessibility and delivery of higher education using an Open Educational Resources (OER), a freely accessible, openly licensed documents, and media for teaching and learning. This study creates a comparative design of student’s academic performance on the course Introduction to Database and student engagement to the virtual learning environment (VLE). The study was done in two successive semesters - one without using the OER and the other is using OER. In the study, we established that there is a significant increase in student’s engagement in VLE in the latter semester compared to the former. By using the latter semester’s data, we manage to show that the student’s engagement has a positive impact on students’ academic performance. Moreso, after clustering their academic performance, the impact is seen higher for students who are low performing. The results show that these engagements can be used to potentially predict the learning styles of the student with a high degree of precision.

Keywords: EDM, learning analytics, moodle, OER, student-engagement

Procedia PDF Downloads 339
304 Microbial Biogeography of Greek Olive Varieties Assessed by Amplicon-Based Metagenomics Analysis

Authors: Lena Payati, Maria Kazou, Effie Tsakalidou

Abstract:

Table olives are one of the most popular fermented vegetables worldwide, which along with olive oil, have a crucial role in the world economy. They are highly appreciated by the consumers for their characteristic taste and pleasant aromas, while several health and nutritional benefits have been reported as well. Until recently, microbial biogeography, i.e., the study of microbial diversity over time and space, has been mainly associated with wine. However, nowadays, the term 'terroir' has been extended to other crops and food products so as to link the geographical origin and environmental conditions to quality aspects of fermented foods. Taking the above into consideration, the present study focuses on the microbial fingerprinting of the most important olive varieties of Greece with the state-of-the-art amplicon-based metagenomics analysis. Towards this, in 2019, 61 samples from 38 different olive varieties were collected at the final stage of ripening from 13 well spread geographical regions in Greece. For the metagenomics analysis, total DNA was extracted from the olive samples, and the 16S rRNA gene and ITS DNA region were sequenced and analyzed using bioinformatics tools for the identification of bacterial and yeasts/fungal diversity, respectively. Furthermore, principal component analysis (PCA) was also performed for data clustering based on the average microbial composition of all samples from each region of origin. According to the composition, results obtained, when samples were analyzed separately, the majority of both bacteria (such as Pantoea, Enterobacter, Roserbergiella, and Pseudomonas) and yeasts/fungi (such as Aureobasidium, Debaromyces, Candida, and Cladosporium) genera identified were found in all 61 samples. Even though interesting differences were observed at the relative abundance level of the identified genera, the bacterial genus Pantoea and the yeast/fungi genus Aureobasidium were the dominant ones in 35 and 40 samples, respectively. Of note, olive samples collected from the same region had similar fingerprint (genera identified and relative abundance level) regardless of the variety, indicating a potential association between the relative abundance of certain taxa and the geographical region. When samples were grouped by region of origin, distinct bacterial profiles per region were observed, which was also evident from the PCA analysis. This was not the case for the yeast/fungi profiles since 10 out of the 13 regions were grouped together mainly due to the dominance of the genus Aureobasidium. A second cluster was formed for the islands Crete and Rhodes, both of which are located in the Southeast Aegean Sea. These two regions clustered together mainly due to the identification of the genus Toxicocladosporium in relatively high abundances. Finally, the Agrinio region was separated from the others as it showed a completely different microbial fingerprinting. However, due to the limited number of olive samples from some regions, a subsequent PCA analysis with more samples from these regions is expected to yield in a more clear clustering. The present study is part of a bigger project, the first of its kind in Greece, with the ultimate goal to analyze a larger set of olive samples of different varieties and from different regions in Greece in order to have a reliable olives’ microbial biogeography.

Keywords: amplicon-based metagenomics analysis, bacteria, microbial biogeography, olive microbiota, yeasts/fungi

Procedia PDF Downloads 115
303 The System Dynamics Research of China-Africa Trade, Investment and Economic Growth

Authors: Emma Serwaa Obobisaa, Haibo Chen

Abstract:

International trade and outward foreign direct investment are important factors which are generally recognized in the economic growth and development. Though several scholars have struggled to reveal the influence of trade and outward foreign direct investment (FDI) on economic growth, most studies utilized common econometric models such as vector autoregression and aggregated the variables, which for the most part prompts, however, contradictory and mixed results. Thus, there is an exigent need for the precise study of the trade and FDI effect of economic growth while applying strong econometric models and disaggregating the variables into its separate individual variables to explicate their respective effects on economic growth. This will guarantee the provision of policies and strategies that are geared towards individual variables to ensure sustainable development and growth. This study, therefore, seeks to examine the causal effect of China-Africa trade and Outward Foreign Direct Investment on the economic growth of Africa using a robust and recent econometric approach such as system dynamics model. Our study impanels and tests an ensemble of a group of vital variables predominant in recent studies on trade-FDI-economic growth causality: Foreign direct ınvestment, international trade and economic growth. Our results showed that the system dynamics method provides accurate statistical inference regarding the direction of the causality among the variables than the conventional method such as OLS and Granger Causality predominantly used in the literature as it is more robust and provides accurate, critical values.

Keywords: economic growth, outward foreign direct investment, system dynamics model, international trade

Procedia PDF Downloads 107
302 Automatic Facial Skin Segmentation Using Possibilistic C-Means Algorithm for Evaluation of Facial Surgeries

Authors: Elham Alaee, Mousa Shamsi, Hossein Ahmadi, Soroosh Nazem, Mohammad Hossein Sedaaghi

Abstract:

Human face has a fundamental role in the appearance of individuals. So the importance of facial surgeries is undeniable. Thus, there is a need for the appropriate and accurate facial skin segmentation in order to extract different features. Since Fuzzy C-Means (FCM) clustering algorithm doesn’t work appropriately for noisy images and outliers, in this paper we exploit Possibilistic C-Means (PCM) algorithm in order to segment the facial skin. For this purpose, first, we convert facial images from RGB to YCbCr color space. To evaluate performance of the proposed algorithm, the database of Sahand University of Technology, Tabriz, Iran was used. In order to have a better understanding from the proposed algorithm; FCM and Expectation-Maximization (EM) algorithms are also used for facial skin segmentation. The proposed method shows better results than the other segmentation methods. Results include misclassification error (0.032) and the region’s area error (0.045) for the proposed algorithm.

Keywords: facial image, segmentation, PCM, FCM, skin error, facial surgery

Procedia PDF Downloads 586
301 Genomic Adaptation to Local Climate Conditions in Native Cattle Using Whole Genome Sequencing Data

Authors: Rugang Tian

Abstract:

In this study, we generated whole-genome sequence (WGS) data from110 native cattle. Together with whole-genome sequences from world-wide cattle populations, we estimated the genetic diversity and population genetic structure of different cattle populations. Our findings revealed clustering of cattle groups in line with their geographic locations. We identified noticeable genetic diversity between indigenous cattle breeds and commercial populations. Among all studied cattle groups, lower genetic diversity measures were found in commercial populations, however, high genetic diversity were detected in some local cattle, particularly in Rashoki and Mongolian breeds. Our search for potential genomic regions under selection in native cattle revealed several candidate genes related with immune response and cold shock protein on multiple chromosomes such as TRPM8, NMUR1, PRKAA2, SMTNL2 and OXR1 that are involved in energy metabolism and metabolic homeostasis.

Keywords: cattle, whole-genome, population structure, adaptation

Procedia PDF Downloads 74
300 Bioinformatics Analysis of DGAT1 Gene in Domestic Ruminnants

Authors: Sirous Eydivandi

Abstract:

Diacylglycerol-O-acyltransferase (DGAT1) gene encodes diacylglycerol transferase enzyme that plays an important role in glycerol lipid metabolism. DGAT1 is considered to be the key enzyme in controlling the synthesis of triglycerides in adipocytes. This enzyme catalyzes the final step of triglyceride synthesis (transform triacylglycerol (DAG) into triacylglycerol (TAG). A total of 20 DGAT1 gene sequences and corresponding amino acids belonging to 4 species include cattle, goats, sheep and yaks were analyzed, and the differentiation within and among the species was also studied. The length of the DGAT1 gene varies greatly, from 1527 to 1785 bp, due to deletion, insertion, and stop codon mutation resulting in elongation. Observed genetic diversity was higher among species than within species, and Goat had more polymorphisms than any other species. Novel amino acid variation sites were detected within several species which might be used to illustrate the functional variation. Differentiation of the DGAT1 gene was obvious among species, and the clustering result was consistent with the taxonomy in the National Center for Biotechnology Information.

Keywords: DGAT1gene, bioinformatic, ruminnants, biotechnology information

Procedia PDF Downloads 491
299 Organotin (IV) Based Complexes as Promiscuous Antibacterials: Synthesis in vitro, in Silico Pharmacokinetic, and Docking Studies

Authors: Wajid Rehman, Sirajul Haq, Bakhtiar Muhammad, Syed Fahad Hassan, Amin Badshah, Muhammad Waseem, Fazal Rahim, Obaid-Ur-Rahman Abid, Farzana Latif Ansari, Umer Rashid

Abstract:

Five novel triorganotin (IV) compounds have been synthesized and characterized. The tin atom is penta-coordinated to assume trigonal-bipyramidal geometry. Using in silico derived parameters; the objective of our study is to design and synthesize promiscuous antibacterials potent enough to combat resistance. Among various synthesized organotin (IV) complexes, compound 5 was found as potent antibacterial agent against various bacterial strains. Further lead optimization of drug-like properties was evaluated through in silico predictions. Data mining and computational analysis were utilized to derive compound promiscuity phenomenon to avoid drug attrition rate in designing antibacterials. Xanthine oxidase and human glucose- 6-phosphatase were found as only true positive off-target hits by ChEMBL database and others utilizing similarity ensemble approach. Propensity towards a-3 receptor, human macrophage migration factor and thiazolidinedione were found as false positive off targets with E-value 1/4> 10^-4 for compound 1, 3, and 4. Further, displaying positive drug-drug interaction of compound 1 as uricosuric was validated by all databases and docked protein targets with sequence similarity and compositional matrix alignment via BLAST software. Promiscuity of the compound 5 was further confirmed by in silico binding to different antibacterial targets.

Keywords: antibacterial activity, drug promiscuity, ADMET prediction, metallo-pharmaceutical, antimicrobial resistance

Procedia PDF Downloads 503
298 Sentiment Classification of Documents

Authors: Swarnadip Ghosh

Abstract:

Sentiment Analysis is the process of detecting the contextual polarity of text. In other words, it determines whether a piece of writing is positive, negative or neutral.Sentiment analysis of documents holds great importance in today's world, when numerous information is stored in databases and in the world wide web. An efficient algorithm to illicit such information, would be beneficial for social, economic as well as medical purposes. In this project, we have developed an algorithm to classify a document into positive or negative. Using our algorithm, we obtained a feature set from the data, and classified the documents based on this feature set. It is important to note that, in the classification, we have not used the independence assumption, which is considered by many procedures like the Naive Bayes. This makes the algorithm more general in scope. Moreover, because of the sparsity and high dimensionality of such data, we did not use empirical distribution for estimation, but developed a method by finding degree of close clustering of the data points. We have applied our algorithm on a movie review data set obtained from IMDb and obtained satisfactory results.

Keywords: sentiment, Run's Test, cross validation, higher dimensional pmf estimation

Procedia PDF Downloads 402
297 Verification & Validation of Map Reduce Program Model for Parallel K-Mediod Algorithm on Hadoop Cluster

Authors: Trapti Sharma, Devesh Kumar Srivastava

Abstract:

This paper is basically a analysis study of above MapReduce implementation and also to verify and validate the MapReduce solution model for Parallel K-Mediod algorithm on Hadoop Cluster. MapReduce is a programming model which authorize the managing of huge amounts of data in parallel, on a large number of devices. It is specially well suited to constant or moderate changing set of data since the implementation point of a position is usually high. MapReduce has slowly become the framework of choice for “big data”. The MapReduce model authorizes for systematic and instant organizing of large scale data with a cluster of evaluate nodes. One of the primary affect in Hadoop is how to minimize the completion length (i.e. makespan) of a set of MapReduce duty. In this paper, we have verified and validated various MapReduce applications like wordcount, grep, terasort and parallel K-Mediod clustering algorithm. We have found that as the amount of nodes increases the completion time decreases.

Keywords: hadoop, mapreduce, k-mediod, validation, verification

Procedia PDF Downloads 369
296 A Study of the Performance Parameter for Recommendation Algorithm Evaluation

Authors: C. Rana, S. K. Jain

Abstract:

The enormous amount of Web data has challenged its usage in efficient manner in the past few years. As such, a range of techniques are applied to tackle this problem; prominent among them is personalization and recommender system. In fact, these are the tools that assist user in finding relevant information of web. Most of the e-commerce websites are applying such tools in one way or the other. In the past decade, a large number of recommendation algorithms have been proposed to tackle such problems. However, there have not been much research in the evaluation criteria for these algorithms. As such, the traditional accuracy and classification metrics are still used for the evaluation purpose that provides a static view. This paper studies how the evolution of user preference over a period of time can be mapped in a recommender system using a new evaluation methodology that explicitly using time dimension. We have also presented different types of experimental set up that are generally used for recommender system evaluation. Furthermore, an overview of major accuracy metrics and metrics that go beyond the scope of accuracy as researched in the past few years is also discussed in detail.

Keywords: collaborative filtering, data mining, evolutionary, clustering, algorithm, recommender systems

Procedia PDF Downloads 413
295 IT-Aided Business Process Enabling Real-Time Analysis of Candidates for Clinical Trials

Authors: Matthieu-P. Schapranow

Abstract:

Recruitment of participants for clinical trials requires the screening of a big number of potential candidates, i.e. the testing for trial-specific inclusion and exclusion criteria, which is a time-consuming and complex task. Today, a significant amount of time is spent on identification of adequate trial participants as their selection may affect the overall study results. We introduce a unique patient eligibility metric, which allows systematic ranking and classification of candidates based on trial-specific filter criteria. Our web application enables real-time analysis of patient data and assessment of candidates using freely definable inclusion and exclusion criteria. As a result, the overall time required for identifying eligible candidates is tremendously reduced whilst additional degrees of freedom for evaluating the relevance of individual candidates are introduced by our contribution.

Keywords: in-memory technology, clinical trials, screening, eligibility metric, data analysis, clustering

Procedia PDF Downloads 493
294 HPTLC Metabolite Fingerprinting of Artocarpus champeden Stembark from Several Different Locations in Indonesia and Correlation with Antimalarial Activity

Authors: Imam Taufik, Hilkatul Ilmi, Puryani, Mochammad Yuwono, Aty Widyawaruyanti

Abstract:

Artocarpus champeden Spreng stembark (Moraceae) in Indonesia well known as ‘cempedak’ had been traditionally used for malarial remedies. The difference of growth locations could cause the difference of metabolite profiling. As a consequence, there were difference antimalarial activities in spite of the same plants. The aim of this research was to obtain the profile of metabolites that contained in A. champeden stembark from different locations in Indonesia for authentication and quality control purpose of this extract. The profiling had been performed by HPTLC-Densitometry technique and antimalarial activity had been also determined by HRP2-ELISA technique. The correlation between metabolite fingerprinting and antimalarial activity had been analyzed by Principle Component Analysis, Hierarchical Clustering Analysis and Partial Least Square. As a result, there is correlation between the difference metabolite fingerprinting and antimalarial activity from several different growth locations.

Keywords: antimalarial, artocarpus champeden spreng, metabolite fingerprinting, multivariate analysis

Procedia PDF Downloads 311
293 Performance and Emission Prediction in a Biodiesel Engine Fuelled with Honge Methyl Ester Using RBF Neural Networks

Authors: Shiva Kumar, G. S. Vijay, Srinivas Pai P., Shrinivasa Rao B. R.

Abstract:

In the present study RBF neural networks were used for predicting the performance and emission parameters of a biodiesel engine. Engine experiments were carried out in a 4 stroke diesel engine using blends of diesel and Honge methyl ester as the fuel. Performance parameters like BTE, BSEC, Tech and emissions from the engine were measured. These experimental results were used for ANN modeling. RBF center initialization was done by random selection and by using Clustered techniques. Network was trained by using fixed and varying widths for the RBF units. It was observed that RBF results were having a good agreement with the experimental results. Networks trained by using clustering technique gave better results than using random selection of centers in terms of reduced MRE and increased prediction accuracy. The average MRE for the performance parameters was 3.25% with the prediction accuracy of 98% and for emissions it was 10.4% with a prediction accuracy of 80%.

Keywords: radial basis function networks, emissions, performance parameters, fuzzy c means

Procedia PDF Downloads 558
292 Detection Method of Federated Learning Backdoor Based on Weighted K-Medoids

Authors: Xun Li, Haojie Wang

Abstract:

Federated learning is a kind of distributed training and centralized training mode, which is of great value in the protection of user privacy. In order to solve the problem that the model is vulnerable to backdoor attacks in federated learning, a backdoor attack detection method based on a weighted k-medoids algorithm is proposed. First of all, this paper collates the update parameters of the client to construct a vector group, then uses the principal components analysis (PCA) algorithm to extract the corresponding feature information from the vector group, and finally uses the improved k-medoids clustering algorithm to identify the normal and backdoor update parameters. In this paper, the backdoor is implanted in the federation learning model through the model replacement attack method in the simulation experiment, and the update parameters from the attacker are effectively detected and removed by the defense method proposed in this paper.

Keywords: federated learning, backdoor attack, PCA, k-medoids, backdoor defense

Procedia PDF Downloads 114
291 Cirrhosis Mortality Prediction as Classification using Frequent Subgraph Mining

Authors: Abdolghani Ebrahimi, Diego Klabjan, Chenxi Ge, Daniela Ladner, Parker Stride

Abstract:

In this work, we use machine learning and novel data analysis techniques to predict the one-year mortality of cirrhotic patients. Data from 2,322 patients with liver cirrhosis are collected at a single medical center. Different machine learning models are applied to predict one-year mortality. A comprehensive feature space including demographic information, comorbidity, clinical procedure and laboratory tests is being analyzed. A temporal pattern mining technic called Frequent Subgraph Mining (FSM) is being used. Model for End-stage liver disease (MELD) prediction of mortality is used as a comparator. All of our models statistically significantly outperform the MELD-score model and show an average 10% improvement of the area under the curve (AUC). The FSM technic itself does not improve the model significantly, but FSM, together with a machine learning technique called an ensemble, further improves the model performance. With the abundance of data available in healthcare through electronic health records (EHR), existing predictive models can be refined to identify and treat patients at risk for higher mortality. However, due to the sparsity of the temporal information needed by FSM, the FSM model does not yield significant improvements. To the best of our knowledge, this is the first work to apply modern machine learning algorithms and data analysis methods on predicting one-year mortality of cirrhotic patients and builds a model that predicts one-year mortality significantly more accurate than the MELD score. We have also tested the potential of FSM and provided a new perspective of the importance of clinical features.

Keywords: machine learning, liver cirrhosis, subgraph mining, supervised learning

Procedia PDF Downloads 134
290 Combining a Continuum of Hidden Regimes and a Heteroskedastic Three-Factor Model in Option Pricing

Authors: Rachid Belhachemi, Pierre Rostan, Alexandra Rostan

Abstract:

This paper develops a discrete-time option pricing model for index options. The model consists of two key ingredients. First, daily stock return innovations are driven by a continuous hidden threshold mixed skew-normal (HTSN) distribution which generates conditional non-normality that is needed to fit daily index return. The most important feature of the HTSN is the inclusion of a latent state variable with a continuum of states, unlike the traditional mixture distributions where the state variable is discrete with little number of states. The HTSN distribution belongs to the class of univariate probability distributions where parameters of the distribution capture the dependence between the variable of interest and the continuous latent state variable (the regime). The distribution has an interpretation in terms of a mixture distribution with time-varying mixing probabilities. It has been shown empirically that this distribution outperforms its main competitor, the mixed normal (MN) distribution, in terms of capturing the stylized facts known for stock returns, namely, volatility clustering, leverage effect, skewness, kurtosis and regime dependence. Second, heteroscedasticity in the model is captured by a threeexogenous-factor GARCH model (GARCHX), where the factors are taken from the principal components analysis of various world indices and presents an application to option pricing. The factors of the GARCHX model are extracted from a matrix of world indices applying principal component analysis (PCA). The empirically determined factors are uncorrelated and represent truly different common components driving the returns. Both factors and the eight parameters inherent to the HTSN distribution aim at capturing the impact of the state of the economy on price levels since distribution parameters have economic interpretations in terms of conditional volatilities and correlations of the returns with the hidden continuous state. The PCA identifies statistically independent factors affecting the random evolution of a given pool of assets -in our paper a pool of international stock indices- and sorting them by order of relative importance. The PCA computes a historical cross asset covariance matrix and identifies principal components representing independent factors. In our paper, factors are used to calibrate the HTSN-GARCHX model and are ultimately responsible for the nature of the distribution of random variables being generated. We benchmark our model to the MN-GARCHX model following the same PCA methodology and the standard Black-Scholes model. We show that our model outperforms the benchmark in terms of RMSE in dollar losses for put and call options, which in turn outperforms the analytical Black-Scholes by capturing the stylized facts known for index returns, namely, volatility clustering, leverage effect, skewness, kurtosis and regime dependence.

Keywords: continuous hidden threshold, factor models, GARCHX models, option pricing, risk-premium

Procedia PDF Downloads 297
289 Cotton Crops Vegetative Indices Based Assessment Using Multispectral Images

Authors: Muhammad Shahzad Shifa, Amna Shifa, Muhammad Omar, Aamir Shahzad, Rahmat Ali Khan

Abstract:

Many applications of remote sensing to vegetation and crop response depend on spectral properties of individual leaves and plants. Vegetation indices are usually determined to estimate crop biophysical parameters like crop canopies and crop leaf area indices with the help of remote sensing. Cotton crops assessment is performed with the help of vegetative indices. Remotely sensed images from an optical multispectral radiometer MSR5 are used in this study. The interpretation is based on the fact that different materials reflect and absorb light differently at different wavelengths. Non-normalized and normalized forms of these datasets are analyzed using two complementary data mining algorithms; K-means and K-nearest neighbor (KNN). Our analysis shows that the use of normalized reflectance data and vegetative indices are suitable for an automated assessment and decision making.

Keywords: cotton, condition assessment, KNN algorithm, clustering, MSR5, vegetation indices

Procedia PDF Downloads 333
288 Evaluation of Robust Feature Descriptors for Texture Classification

Authors: Jia-Hong Lee, Mei-Yi Wu, Hsien-Tsung Kuo

Abstract:

Texture is an important characteristic in real and synthetic scenes. Texture analysis plays a critical role in inspecting surfaces and provides important techniques in a variety of applications. Although several descriptors have been presented to extract texture features, the development of object recognition is still a difficult task due to the complex aspects of texture. Recently, many robust and scaling-invariant image features such as SIFT, SURF and ORB have been successfully used in image retrieval and object recognition. In this paper, we have tried to compare the performance for texture classification using these feature descriptors with k-means clustering. Different classifiers including K-NN, Naive Bayes, Back Propagation Neural Network , Decision Tree and Kstar were applied in three texture image sets - UIUCTex, KTH-TIPS and Brodatz, respectively. Experimental results reveal SIFTS as the best average accuracy rate holder in UIUCTex, KTH-TIPS and SURF is advantaged in Brodatz texture set. BP neuro network works best in the test set classification among all used classifiers.

Keywords: texture classification, texture descriptor, SIFT, SURF, ORB

Procedia PDF Downloads 369
287 Performance Evaluation of Various Segmentation Techniques on MRI of Brain Tissue

Authors: U.V. Suryawanshi, S.S. Chowhan, U.V Kulkarni

Abstract:

Accuracy of segmentation methods is of great importance in brain image analysis. Tissue classification in Magnetic Resonance brain images (MRI) is an important issue in the analysis of several brain dementias. This paper portraits performance of segmentation techniques that are used on Brain MRI. A large variety of algorithms for segmentation of Brain MRI has been developed. The objective of this paper is to perform a segmentation process on MR images of the human brain, using Fuzzy c-means (FCM), Kernel based Fuzzy c-means clustering (KFCM), Spatial Fuzzy c-means (SFCM) and Improved Fuzzy c-means (IFCM). The review covers imaging modalities, MRI and methods for noise reduction and segmentation approaches. All methods are applied on MRI brain images which are degraded by salt-pepper noise demonstrate that the IFCM algorithm performs more robust to noise than the standard FCM algorithm. We conclude with a discussion on the trend of future research in brain segmentation and changing norms in IFCM for better results.

Keywords: image segmentation, preprocessing, MRI, FCM, KFCM, SFCM, IFCM

Procedia PDF Downloads 331
286 A Memetic Algorithm Approach to Clustering in Mobile Wireless Sensor Networks

Authors: Masood Ahmad, Ataul Aziz Ikram, Ishtiaq Wahid

Abstract:

Wireless sensor network (WSN) is the interconnection of mobile wireless nodes with limited energy and memory. These networks can be deployed formany critical applications like military operations, rescue management, fire detection and so on. In flat routing structure, every node plays an equal role of sensor and router. The topology may change very frequently due to the mobile nature of nodes in WSNs. The topology maintenance may produce more overhead messages. To avoid topology maintenance overhead messages, an optimized cluster based mobile wireless sensor network using memetic algorithm is proposed in this paper. The nodes in this network are first divided into clusters. The cluster leaders then transmit data to that base station. The network is validated through extensive simulation study. The results show that the proposed technique has superior results compared to existing techniques.

Keywords: WSN, routing, cluster based, meme, memetic algorithm

Procedia PDF Downloads 481
285 Intrusion Detection Using Dual Artificial Techniques

Authors: Rana I. Abdulghani, Amera I. Melhum

Abstract:

With the abnormal growth of the usage of computers over networks and under the consideration or agreement of most of the computer security experts who said that the goal of building a secure system is never achieved effectively, all these points led to the design of the intrusion detection systems(IDS). This research adopts a comparison between two techniques for network intrusion detection, The first one used the (Particles Swarm Optimization) that fall within the field (Swarm Intelligence). In this Act, the algorithm Enhanced for the purpose of obtaining the minimum error rate by amending the cluster centers when better fitness function is found through the training stages. Results show that this modification gives more efficient exploration of the original algorithm. The second algorithm used a (Back propagation NN) algorithm. Finally a comparison between the results of two methods used were based on (NSL_KDD) data sets for the construction and evaluation of intrusion detection systems. This research is only interested in clustering the two categories (Normal and Abnormal) for the given connection records. Practices experiments result in intrude detection rate (99.183818%) for EPSO and intrude detection rate (69.446416%) for BP neural network.

Keywords: IDS, SI, BP, NSL_KDD, PSO

Procedia PDF Downloads 382
284 Comparison of Different Machine Learning Algorithms for Solubility Prediction

Authors: Muhammet Baldan, Emel Timuçin

Abstract:

Molecular solubility prediction plays a crucial role in various fields, such as drug discovery, environmental science, and material science. In this study, we compare the performance of five machine learning algorithms—linear regression, support vector machines (SVM), random forests, gradient boosting machines (GBM), and neural networks—for predicting molecular solubility using the AqSolDB dataset. The dataset consists of 9981 data points with their corresponding solubility values. MACCS keys (166 bits), RDKit properties (20 properties), and structural properties(3) features are extracted for every smile representation in the dataset. A total of 189 features were used for training and testing for every molecule. Each algorithm is trained on a subset of the dataset and evaluated using metrics accuracy scores. Additionally, computational time for training and testing is recorded to assess the efficiency of each algorithm. Our results demonstrate that random forest model outperformed other algorithms in terms of predictive accuracy, achieving an 0.93 accuracy score. Gradient boosting machines and neural networks also exhibit strong performance, closely followed by support vector machines. Linear regression, while simpler in nature, demonstrates competitive performance but with slightly higher errors compared to ensemble methods. Overall, this study provides valuable insights into the performance of machine learning algorithms for molecular solubility prediction, highlighting the importance of algorithm selection in achieving accurate and efficient predictions in practical applications.

Keywords: random forest, machine learning, comparison, feature extraction

Procedia PDF Downloads 40
283 Long Wavelength Coherent Pulse of Sound Propagating in Granular Media

Authors: Rohit Kumar Shrivastava, Amalia Thomas, Nathalie Vriend, Stefan Luding

Abstract:

A mechanical wave or vibration propagating through granular media exhibits a specific signature in time. A coherent pulse or wavefront arrives first with multiply scattered waves (coda) arriving later. The coherent pulse is micro-structure independent i.e. it depends only on the bulk properties of the disordered granular sample, the sound wave velocity of the granular sample and hence bulk and shear moduli. The coherent wavefront attenuates (decreases in amplitude) and broadens with distance from its source. The pulse attenuation and broadening effects are affected by disorder (polydispersity; contrast in size of the granules) and have often been attributed to dispersion and scattering. To study the effect of disorder and initial amplitude (non-linearity) of the pulse imparted to the system on the coherent wavefront, numerical simulations have been carried out on one-dimensional sets of particles (granular chains). The interaction force between the particles is given by a Hertzian contact model. The sizes of particles have been selected randomly from a Gaussian distribution, where the standard deviation of this distribution is the relevant parameter that quantifies the effect of disorder on the coherent wavefront. Since, the coherent wavefront is system configuration independent, ensemble averaging has been used for improving the signal quality of the coherent pulse and removing the multiply scattered waves. The results concerning the width of the coherent wavefront have been formulated in terms of scaling laws. An experimental set-up of photoelastic particles constituting a granular chain is proposed to validate the numerical results.

Keywords: discrete elements, Hertzian contact, polydispersity, weakly nonlinear, wave propagation

Procedia PDF Downloads 204
282 Leverage Effect for Volatility with Generalized Laplace Error

Authors: Farrukh Javed, Krzysztof Podgórski

Abstract:

We propose a new model that accounts for the asymmetric response of volatility to positive ('good news') and negative ('bad news') shocks in economic time series the so-called leverage effect. In the past, asymmetric powers of errors in the conditionally heteroskedastic models have been used to capture this effect. Our model is using the gamma difference representation of the generalized Laplace distributions that efficiently models the asymmetry. It has one additional natural parameter, the shape, that is used instead of power in the asymmetric power models to capture the strength of a long-lasting effect of shocks. Some fundamental properties of the model are provided including the formula for covariances and an explicit form for the conditional distribution of 'bad' and 'good' news processes given the past the property that is important for the statistical fitting of the model. Relevant features of volatility models are illustrated using S&P 500 historical data.

Keywords: heavy tails, volatility clustering, generalized asymmetric laplace distribution, leverage effect, conditional heteroskedasticity, asymmetric power volatility, GARCH models

Procedia PDF Downloads 385