Search results for: combined cluster and discriminant analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 28843

Search results for: combined cluster and discriminant analysis

28843 Application of Combined Cluster and Discriminant Analysis to Make the Operation of Monitoring Networks More Economical

Authors: Norbert Magyar, Jozsef Kovacs, Peter Tanos, Balazs Trasy, Tamas Garamhegyi, Istvan Gabor Hatvani

Abstract:

Water is one of the most important common resources, and as a result of urbanization, agriculture, and industry it is becoming more and more exposed to potential pollutants. The prevention of the deterioration of water quality is a crucial role for environmental scientist. To achieve this aim, the operation of monitoring networks is necessary. In general, these networks have to meet many important requirements, such as representativeness and cost efficiency. However, existing monitoring networks often include sampling sites which are unnecessary. With the elimination of these sites the monitoring network can be optimized, and it can operate more economically. The aim of this study is to illustrate the applicability of the CCDA (Combined Cluster and Discriminant Analysis) to the field of water quality monitoring and optimize the monitoring networks of a river (the Danube), a wetland-lake system (Kis-Balaton & Lake Balaton), and two surface-subsurface water systems on the watershed of Lake Neusiedl/Lake Fertő and on the Szigetköz area over a period of approximately two decades. CCDA combines two multivariate data analysis methods: hierarchical cluster analysis and linear discriminant analysis. Its goal is to determine homogeneous groups of observations, in our case sampling sites, by comparing the goodness of preconceived classifications obtained from hierarchical cluster analysis with random classifications. The main idea behind CCDA is that if the ratio of correctly classified cases for a grouping is higher than at least 95% of the ratios for the random classifications, then at the level of significance (α=0.05) the given sampling sites don’t form a homogeneous group. Due to the fact that the sampling on the Lake Neusiedl/Lake Fertő was conducted at the same time at all sampling sites, it was possible to visualize the differences between the sampling sites belonging to the same or different groups on scatterplots. Based on the results, the monitoring network of the Danube yields redundant information over certain sections, so that of 12 sampling sites, 3 could be eliminated without loss of information. In the case of the wetland (Kis-Balaton) one pair of sampling sites out of 12, and in the case of Lake Balaton, 5 out of 10 could be discarded. For the groundwater system of the catchment area of Lake Neusiedl/Lake Fertő all 50 monitoring wells are necessary, there is no redundant information in the system. The number of the sampling sites on the Lake Neusiedl/Lake Fertő can decrease to approximately the half of the original number of the sites. Furthermore, neighbouring sampling sites were compared pairwise using CCDA and the results were plotted on diagrams or isoline maps showing the location of the greatest differences. These results can help researchers decide where to place new sampling sites. The application of CCDA proved to be a useful tool in the optimization of the monitoring networks regarding different types of water bodies. Based on the results obtained, the monitoring networks can be operated more economically.

Keywords: combined cluster and discriminant analysis, cost efficiency, monitoring network optimization, water quality

Procedia PDF Downloads 323
28842 Analysis Customer Loyalty Characteristic and Segmentation Analysis in Mobile Phone Category in Indonesia

Authors: A. B. Robert, Adam Pramadia, Calvin Andika

Abstract:

The main purpose of this study is to explore consumer loyalty characteristic of mobile phone category in Indonesia. Second, this research attempts to identify consumer segment and to explore their profile in each segment as the basis of marketing strategy formulation. This study used some tools of multivariate analysis such as discriminant analysis and cluster analysis. Discriminate analysis used to discriminate consumer loyal and not loyal by using particular variables. Cluster analysis used to reveal various segment in mobile phone category. In addition to having better customer understanding in each segment, this study used descriptive analysis and cross tab analysis in each segment defined by cluster analysis. This study expected several findings. First, consumer can be divided into two large group of loyal versus not loyal by set of variables. Second, this study identifies customer segment in mobile phone category. Third, exploring customer profile in each segment that has been identified. This study answer a call for additional empirical research into different product categories. Therefore, a replication research is advisable. By knowing the customer loyalty characteristic, and deep analysis of their consumption behavior and profile for each segment, this study is very advisable for high impact marketing strategy development. This study contributes body of knowledge by adding empirical study of consumer loyalty, segmentation analysis in mobile phone category by multiple brand analysis.

Keywords: customer loyalty, segmentation, marketing strategy, discriminant analysis, cluster analysis, mobile phone

Procedia PDF Downloads 566
28841 Discriminant Analysis as a Function of Predictive Learning to Select Evolutionary Algorithms in Intelligent Transportation System

Authors: Jorge A. Ruiz-Vanoye, Ocotlán Díaz-Parra, Alejandro Fuentes-Penna, Daniel Vélez-Díaz, Edith Olaco García

Abstract:

In this paper, we present the use of the discriminant analysis to select evolutionary algorithms that better solve instances of the vehicle routing problem with time windows. We use indicators as independent variables to obtain the classification criteria, and the best algorithm from the generic genetic algorithm (GA), random search (RS), steady-state genetic algorithm (SSGA), and sexual genetic algorithm (SXGA) as the dependent variable for the classification. The discriminant classification was trained with classic instances of the vehicle routing problem with time windows obtained from the Solomon benchmark. We obtained a classification of the discriminant analysis of 66.7%.

Keywords: Intelligent Transportation Systems, data-mining techniques, evolutionary algorithms, discriminant analysis, machine learning

Procedia PDF Downloads 437
28840 Electricity Generation from Renewables and Targets: An Application of Multivariate Statistical Techniques

Authors: Filiz Ersoz, Taner Ersoz, Tugrul Bayraktar

Abstract:

Renewable energy is referred to as "clean energy" and common popular support for the use of renewable energy (RE) is to provide electricity with zero carbon dioxide emissions. This study provides useful insight into the European Union (EU) RE, especially, into electricity generation obtained from renewables, and their targets. The objective of this study is to identify groups of European countries, using multivariate statistical analysis and selected indicators. The hierarchical clustering method is used to decide the number of clusters for EU countries. The conducted statistical hierarchical cluster analysis is based on the Ward’s clustering method and squared Euclidean distances. Hierarchical cluster analysis identified eight distinct clusters of European countries. Then, non-hierarchical clustering (k-means) method was applied. Discriminant analysis was used to determine the validity of the results with data normalized by Z score transformation. To explore the relationship between the selected indicators, correlation coefficients were computed. The results of the study reveal the current situation of RE in European Union Member States.

Keywords: share of electricity generation, k-means clustering, discriminant, CO2 emission

Procedia PDF Downloads 394
28839 Impact Evaluation of Discriminant Analysis on Epidemic Protocol in Warships’s Scenarios

Authors: Davi Marinho de Araujo Falcão, Ronaldo Moreira Salles, Paulo Henrique Maranhão

Abstract:

Disruption Tolerant Networks (DTN) are an evolution of Mobile Adhoc Networks (MANET) and work good in scenarioswhere nodes are sparsely distributed, with low density, intermittent connections and an end-to-end infrastructure is not possible to guarantee. Therefore, DTNs are recommended for high latency applications that can last from hours to days. The maritime scenario has mobility characteristics that contribute to a DTN network approach, but the concern with data security is also a relevant aspect in such scenarios. Continuing the previous work, which evaluated the performance of some DTN protocols (Epidemic, Spray and Wait, and Direct Delivery) in three warship scenarios and proposed the application of discriminant analysis, as a classification technique for secure connections, in the Epidemic protocol, thus, the current article proposes a new analysis of the directional discriminant function with opening angles smaller than 90 degrees, demonstrating that the increase in directivity influences the selection of a greater number of secure connections by the directional discriminant Epidemic protocol.

Keywords: DTN, discriminant function, epidemic protocol, security, tactical messages, warship scenario

Procedia PDF Downloads 163
28838 A Study on the Performance of 2-PC-D Classification Model

Authors: Nurul Aini Abdul Wahab, Nor Syamim Halidin, Sayidatina Aisah Masnan, Nur Izzati Romli

Abstract:

There are many applications of principle component method for reducing the large set of variables in various fields. Fisher’s Discriminant function is also a popular tool for classification. In this research, the researcher focuses on studying the performance of Principle Component-Fisher’s Discriminant function in helping to classify rice kernels to their defined classes. The data were collected on the smells or odour of the rice kernel using odour-detection sensor, Cyranose. 32 variables were captured by this electronic nose (e-nose). The objective of this research is to measure how well a combination model, between principle component and linear discriminant, to be as a classification model. Principle component method was used to reduce all 32 variables to a smaller and manageable set of components. Then, the reduced components were used to develop the Fisher’s Discriminant function. In this research, there are 4 defined classes of rice kernel which are Aromatic, Brown, Ordinary and Others. Based on the output from principle component method, the 32 variables were reduced to only 2 components. Based on the output of classification table from the discriminant analysis, 40.76% from the total observations were correctly classified into their classes by the PC-Discriminant function. Indirectly, it gives an idea that the classification model developed has committed to more than 50% of misclassifying the observations. As a conclusion, the Fisher’s Discriminant function that was built on a 2-component from PCA (2-PC-D) is not satisfying to classify the rice kernels into its defined classes.

Keywords: classification model, discriminant function, principle component analysis, variable reduction

Procedia PDF Downloads 307
28837 Fetal Ilium as a Tool for Sex Determination: Discriminant Functional Analysis

Authors: Luv Sharma

Abstract:

Sex determination has been the most intriguing puzzle for forensic pathologists and anthropologists, for which efforts have been made for a long. Sexual dimorphism is well established in the adult pelvis, and it is known to provide the highest level of information about sexual dimorphism. This study was conducted to know whether this dimorphism exists in fetal bones or not. A total of 34 pairs of fetal pelvis bones (22 males and 12 Females), ages ranging from 4 months to full term, were collected from unidentified dead fetuses brought to the Department of Forensic Medicine for routine medicolegal autopsies to study for sexual dimorphism in the Department of Anatomy, Pt. BD Sharma PGIMS, Rohtak. Samples were divided into 2 age groups, and various metric parameters were recorded with the help of a digital vernier caliper. Data obtained was subjected to descriptive and discriminant functional analysis. Results of Descriptive and Discriminant Functional Analysis showed that sex determination can be done with 100% accuracy by using different combinations of parameters of fetal ilium. This study illustrates that sexual dimorphism exists from early fetal life after mid-pregnancy; it can be clearly established by discriminant functional analysis.

Keywords: Ilium, fetus, sex determination, morphometric

Procedia PDF Downloads 29
28836 Evalutaion of the Surface Water Quality Using the Water Quality Index and Discriminant Analysis Method

Authors: Lazhar Belkhiri, Ammar Tiri, Lotfi Mouni

Abstract:

Water resources present to the public order of the world a very important problem for the protection and management of water quality given the complexity of water quality data sets. In this study, the water quality index (WQI) and irrigation water quality index (IWQI) were calculated in order to evaluate the surface water quality for drinking and irrigation purposes based on nine hydrochemical parameters. In order to separate the variables that are the most responsible for the spatial differentiation, the discriminant analysis (DA) was applied. The results show that the surface water quality for drinking is poor quality and very poor quality based on WQI values, however, the values of IWQI reflect that this water is acceptable for irrigation with a restriction for sensitive plants. Consequently, the discriminant analysis DA method has shown that the following parameters pH, potassium, chloride, sulfate, and bicarbonate are significant discrimination between the different stations with the spatial variation of the surface water quality, therefore, the results obtained in this study provide very useful information to decision-makers

Keywords: surface water quality, drinking and irrigation purposes, water quality index, discriminant analysis

Procedia PDF Downloads 49
28835 Alcohol and Tobacco Influencing Prevalence of Hypertension among 15-54 Old Indian Men: An Application of Discriminant Analysis Using National Family Health Survey, 2015-16

Authors: Chander Shekhar, Jeetendra Yadav, Shaziya Allarakha

Abstract:

Hypertension has been described as an 'iceberg disease' as those who suffered are ignored and hence usually seek healthcare services at a very late stage. It is estimated that more than 2 million Indians are suffering from hypertensive heart disease that contributed to above 0.13 million deaths in 2016. The paper study aims to know the prevalence of Hypertension in India and its variation by socioeconomic backgrounds and to find out risk factors discriminating hypertension with special emphasis on consumption of tobacco and alcohol among men aged 15-54 years in India. The paper uses NFHS (2015-16) data. The paper used binary logistic regression and discriminant analysis to find significant predictors and discriminants of interest. The prevalence of hypertension was 16.5% in the study population. The results suggest that consumption of alcohol and tobacco are significant discriminant characteristics in carrying hypertension irrespective of what socioeconomic background characteristic he possesses.

Keywords: hypertention, alcohol, tobacco, discriminant

Procedia PDF Downloads 122
28834 Measuring Multi-Class Linear Classifier for Image Classification

Authors: Fatma Susilawati Mohamad, Azizah Abdul Manaf, Fadhillah Ahmad, Zarina Mohamad, Wan Suryani Wan Awang

Abstract:

A simple and robust multi-class linear classifier is proposed and implemented. For a pair of classes of the linear boundary, a collection of segments of hyper planes created as perpendicular bisectors of line segments linking centroids of the classes or part of classes. Nearest Neighbor and Linear Discriminant Analysis are compared in the experiments to see the performances of each classifier in discriminating ripeness of oil palm. This paper proposes a multi-class linear classifier using Linear Discriminant Analysis (LDA) for image identification. Result proves that LDA is well capable in separating multi-class features for ripeness identification.

Keywords: multi-class, linear classifier, nearest neighbor, linear discriminant analysis

Procedia PDF Downloads 503
28833 Using Discriminant Analysis to Forecast Crime Rate in Nigeria

Authors: O. P. Popoola, O. A. Alawode, M. O. Olayiwola, A. M. Oladele

Abstract:

This research work is based on using discriminant analysis to forecast crime rate in Nigeria between 1996 and 2008. The work is interested in how gender (male and female) relates to offences committed against the government, against other properties, disturbance in public places, murder/robbery offences and other offences. The data used was collected from the National Bureau of Statistics (NBS). SPSS, the statistical package was used to analyse the data. Time plot was plotted on all the 29 offences gotten from the raw data. Eigenvalues and Multivariate tests, Wilks’ Lambda, standardized canonical discriminant function coefficients and the predicted classifications were estimated. The research shows that the distribution of the scores from each function is standardized to have a mean O and a standard deviation of 1. The magnitudes of the coefficients indicate how strongly the discriminating variable affects the score. In the predicted group membership, 172 cases that were predicted to commit crime against Government group, 66 were correctly predicted and 106 were incorrectly predicted. After going through the predicted classifications, we found out that most groups numbers that were correctly predicted were less than those that were incorrectly predicted.

Keywords: discriminant analysis, DA, multivariate analysis of variance, MANOVA, canonical correlation, and Wilks’ Lambda

Procedia PDF Downloads 440
28832 The Classification Performance in Parametric and Nonparametric Discriminant Analysis for a Class- Unbalanced Data of Diabetes Risk Groups

Authors: Lily Ingsrisawang, Tasanee Nacharoen

Abstract:

Introduction: The problems of unbalanced data sets generally appear in real world applications. Due to unequal class distribution, many research papers found that the performance of existing classifier tends to be biased towards the majority class. The k -nearest neighbors’ nonparametric discriminant analysis is one method that was proposed for classifying unbalanced classes with good performance. Hence, the methods of discriminant analysis are of interest to us in investigating misclassification error rates for class-imbalanced data of three diabetes risk groups. Objective: The purpose of this study was to compare the classification performance between parametric discriminant analysis and nonparametric discriminant analysis in a three-class classification application of class-imbalanced data of diabetes risk groups. Methods: Data from a healthy project for 599 staffs in a government hospital in Bangkok were obtained for the classification problem. The staffs were diagnosed into one of three diabetes risk groups: non-risk (90%), risk (5%), and diabetic (5%). The original data along with the variables; diabetes risk group, age, gender, cholesterol, and BMI was analyzed and bootstrapped up to 50 and 100 samples, 599 observations per sample, for additional estimation of misclassification error rate. Each data set was explored for the departure of multivariate normality and the equality of covariance matrices of the three risk groups. Both the original data and the bootstrap samples show non-normality and unequal covariance matrices. The parametric linear discriminant function, quadratic discriminant function, and the nonparametric k-nearest neighbors’ discriminant function were performed over 50 and 100 bootstrap samples and applied to the original data. In finding the optimal classification rule, the choices of prior probabilities were set up for both equal proportions (0.33: 0.33: 0.33) and unequal proportions with three choices of (0.90:0.05:0.05), (0.80: 0.10: 0.10) or (0.70, 0.15, 0.15). Results: The results from 50 and 100 bootstrap samples indicated that the k-nearest neighbors approach when k = 3 or k = 4 and the prior probabilities of {non-risk:risk:diabetic} as {0.90:0.05:0.05} or {0.80:0.10:0.10} gave the smallest error rate of misclassification. Conclusion: The k-nearest neighbors approach would be suggested for classifying a three-class-imbalanced data of diabetes risk groups.

Keywords: error rate, bootstrap, diabetes risk groups, k-nearest neighbors

Procedia PDF Downloads 409
28831 Comparison of Linear Discriminant Analysis and Support Vector Machine Classifications for Electromyography Signals Acquired at Five Positions of Elbow Joint

Authors: Amna Khan, Zareena Kausar, Saad Malik

Abstract:

Bio Mechatronics has extended applications in the field of rehabilitation. It has been contributing since World War II in improving the applicability of prosthesis and assistive devices in real life scenarios. In this paper, classification accuracies have been compared for two classifiers against five positions of elbow. Electromyography (EMG) signals analysis have been acquired directly from skeletal muscles of human forearm for each of the three defined positions and at modified extreme positions of elbow flexion and extension using 8 electrode Myo armband sensor. Features were extracted from filtered EMG signals for each position. Performance of two classifiers, support vector machine (SVM) and linear discriminant analysis (LDA) has been compared by analyzing the classification accuracies. SVM illustrated classification accuracies between 90-96%, in contrast to 84-87% depicted by LDA for five defined positions of elbow keeping the number of samples and selected feature the same for both SVM and LDA.

Keywords: classification accuracies, electromyography, linear discriminant analysis (LDA), Myo armband sensor, support vector machine (SVM)

Procedia PDF Downloads 330
28830 Discriminant Analysis of Pacing Behavior on Mass Start Speed Skating

Authors: Feng Li, Qian Peng

Abstract:

The mass start speed skating (MSSS) is a new event for the 2018 PyeongChang Winter Olympics and will be an official race for the 2022 Beijing Winter Olympics. Considering that the event rankings were based on points gained on laps, it is worthwhile to investigate the pacing behavior on each lap that directly influences the ranking of the race. The aim of this study was to detect the pacing behavior and performance on MSSS regarding skaters’ level (SL), competition stage (semi-final/final) (CS) and gender (G). All the men's and women's races in the World Cup and World Championships were analyzed in the 2018-2019 and 2019-2020 seasons. As a result, a total of 601 skaters from 36 games were observed. ANOVA for repeated measures was applied to compare the pacing behavior on each lap, and the three-way ANOVA for repeated measures was used to identify the influence of SL, CS, and G on pacing behavior and total time spent. In general, the results showed that the pacing behavior from fast to slow were cluster 1—laps 4, 8, 12, 15, 16, cluster 2—laps 5, 9, 13, 14, cluster 3—laps 3, 6, 7, 10, 11, and cluster 4—laps 1 and 2 (p=0.000). For CS, the total time spent in the final was less than the semi-final (p=0.000). For SL, top-level skaters spent less total time than the middle-level and low-level (p≤0.002), while there was no significant difference between the middle-level and low-level (p=0.214). For G, the men’s skaters spent less total time than women on all laps (p≤0.048). This study could help to coach staff better understand the pacing behavior regarding SL, CS, and G, further providing references concerning promoting the pacing strategy and decision making before and during the race.

Keywords: performance analysis, pacing strategy, winning strategy, winter Olympics

Procedia PDF Downloads 165
28829 Cluster Analysis of Students’ Learning Satisfaction

Authors: Purevdolgor Luvsantseren, Ajnai Luvsan-Ish, Oyuntsetseg Sandag, Javzmaa Tsend, Akhit Tileubai, Baasandorj Chilhaasuren, Jargalbat Puntsagdash, Galbadrakh Chuluunbaatar

Abstract:

One of the indicators of the quality of university services is student satisfaction. Aim: We aimed to study the level of satisfaction of students in the first year of premedical courses in the course of Medical Physics using the cluster method. Materials and Methods: In the framework of this goal, a questionnaire was collected from a total of 324 students who studied the medical physics course of the 1st course of the premedical course at the Mongolian National University of Medical Sciences. When determining the level of satisfaction, the answers were obtained on five levels of satisfaction: "excellent", "good", "medium", "bad" and "very bad". A total of 39 questionnaires were collected from students: 8 for course evaluation, 19 for teacher evaluation, and 12 for student evaluation. From the research, a database with 39 fields and 324 records was created. Results: In this database, cluster analysis was performed in MATLAB and R programs using the k-means method of data mining. Calculated the Hopkins statistic in the created database, the values are 0.88, 0.87, and 0.97. This shows that cluster analysis methods can be used. The course evaluation sub-fund is divided into three clusters. Among them, cluster I has 150 objects with a "good" rating of 46.2%, cluster II has 119 objects with a "medium" rating of 36.7%, and Cluster III has 54 objects with a "good" rating of 16.6%. The teacher evaluation sub-base into three clusters, there are 179 objects with a "good" rating of 55.2% in cluster II, 108 objects with an "average" rating of 33.3% in cluster III, and 36 objects with an "excellent" rating in cluster I of 11.1%. The sub-base of student evaluations is divided into two clusters: cluster II has 215 objects with an "excellent" rating of 66.3%, and cluster I has 108 objects with an "excellent" rating of 33.3%. Evaluating the resulting clusters with the Silhouette coefficient, 0.32 for the course evaluation cluster, 0.31 for the teacher evaluation cluster, and 0.30 for student evaluation show statistical significance. Conclusion: Finally, to conclude, cluster analysis in the model of the medical physics lesson “good” - 46.2%, “middle” - 36.7%, “bad” - 16.6%; 55.2% - “good”, 33.3% - “middle”, 11.1% - “bad” in the teacher evaluation model; 66.3% - “good” and 33.3% of “bad” in the student evaluation model.

Keywords: questionnaire, data mining, k-means method, silhouette coefficient

Procedia PDF Downloads 17
28828 Cluster Analysis of Customer Churn in Telecom Industry

Authors: Abbas Al-Refaie

Abstract:

The research examines the factors that affect customer churn (CC) in the Jordanian telecom industry. A total of 700 surveys were distributed. Cluster analysis revealed three main clusters. Results showed that CC and customer satisfaction (CS) were the key determinants in forming the three clusters. In two clusters, the center values of CC were high, indicating that the customers were loyal and SC was expensive and time- and energy-consuming. Still, the mobile service provider (MSP) should enhance its communication (COM), and value added services (VASs), as well as customer complaint management systems (CCMS). Finally, for the third cluster the center of the CC indicates a poor level of loyalty, which facilitates customers churn to another MSP. The results of this study provide valuable feedback for MSP decision makers regarding approaches to improving their performance and reducing CC.

Keywords: cluster analysis, telecom industry, switching cost, customer churn

Procedia PDF Downloads 303
28827 A Feature Clustering-Based Sequential Selection Approach for Color Texture Classification

Authors: Mohamed Alimoussa, Alice Porebski, Nicolas Vandenbroucke, Rachid Oulad Haj Thami, Sana El Fkihi

Abstract:

Color and texture are highly discriminant visual cues that provide an essential information in many types of images. Color texture representation and classification is therefore one of the most challenging problems in computer vision and image processing applications. Color textures can be represented in different color spaces by using multiple image descriptors which generate a high dimensional set of texture features. In order to reduce the dimensionality of the feature set, feature selection techniques can be used. The goal of feature selection is to find a relevant subset from an original feature space that can improve the accuracy and efficiency of a classification algorithm. Traditionally, feature selection is focused on removing irrelevant features, neglecting the possible redundancy between relevant ones. This is why some feature selection approaches prefer to use feature clustering analysis to aid and guide the search. These techniques can be divided into two categories. i) Feature clustering-based ranking algorithm uses feature clustering as an analysis that comes before feature ranking. Indeed, after dividing the feature set into groups, these approaches perform a feature ranking in order to select the most discriminant feature of each group. ii) Feature clustering-based subset search algorithms can use feature clustering following one of three strategies; as an initial step that comes before the search, binded and combined with the search or as the search alternative and replacement. In this paper, we propose a new feature clustering-based sequential selection approach for the purpose of color texture representation and classification. Our approach is a three step algorithm. First, irrelevant features are removed from the feature set thanks to a class-correlation measure. Then, introducing a new automatic feature clustering algorithm, the feature set is divided into several feature clusters. Finally, a sequential search algorithm, based on a filter model and a separability measure, builds a relevant and non redundant feature subset: at each step, a feature is selected and features of the same cluster are removed and thus not considered thereafter. This allows to significantly speed up the selection process since large number of redundant features are eliminated at each step. The proposed algorithm uses the clustering algorithm binded and combined with the search. Experiments using a combination of two well known texture descriptors, namely Haralick features extracted from Reduced Size Chromatic Co-occurence Matrices (RSCCMs) and features extracted from Local Binary patterns (LBP) image histograms, on five color texture data sets, Outex, NewBarktex, Parquet, Stex and USPtex demonstrate the efficiency of our method compared to seven of the state of the art methods in terms of accuracy and computation time.

Keywords: feature selection, color texture classification, feature clustering, color LBP, chromatic cooccurrence matrix

Procedia PDF Downloads 102
28826 Hybrid Approach for Face Recognition Combining Gabor Wavelet and Linear Discriminant Analysis

Authors: A: Annis Fathima, V. Vaidehi, S. Ajitha

Abstract:

Face recognition system finds many applications in surveillance and human computer interaction systems. As the applications using face recognition systems are of much importance and demand more accuracy, more robustness in the face recognition system is expected with less computation time. In this paper, a hybrid approach for face recognition combining Gabor Wavelet and Linear Discriminant Analysis (HGWLDA) is proposed. The normalized input grayscale image is approximated and reduced in dimension to lower the processing overhead for Gabor filters. This image is convolved with bank of Gabor filters with varying scales and orientations. LDA, a subspace analysis techniques are used to reduce the intra-class space and maximize the inter-class space. The techniques used are 2-dimensional Linear Discriminant Analysis (2D-LDA), 2-dimensional bidirectional LDA ((2D)2LDA), Weighted 2-dimensional bidirectional Linear Discriminant Analysis (Wt (2D)2 LDA). LDA reduces the feature dimension by extracting the features with greater variance. k-Nearest Neighbour (k-NN) classifier is used to classify and recognize the test image by comparing its feature with each of the training set features. The HGWLDA approach is robust against illumination conditions as the Gabor features are illumination invariant. This approach also aims at a better recognition rate using less number of features for varying expressions. The performance of the proposed HGWLDA approaches is evaluated using AT&T database, MIT-India face database and faces94 database. It is found that the proposed HGWLDA approach provides better results than the existing Gabor approach.

Keywords: face recognition, Gabor wavelet, LDA, k-NN classifier

Procedia PDF Downloads 447
28825 The Use of Multivariate Statistical and GIS for Characterization Groundwater Quality in Laghouat Region, Algeria

Authors: Rouighi Mustapha, Bouzid Laghaa Souad, Rouighi Tahar

Abstract:

Due to rain Shortage and the increase of population in the last years, wells excavation and groundwater use for different purposes had been increased without any planning. This is a great challenge for our country. Moreover, this scarcity of water resources in this region is unfortunately combined with rapid fresh water resources quality deterioration, due to salinity and contamination processes. Therefore, it is necessary to conduct the studies about groundwater quality in Algeria. In this work consists in the identification of the factors which influence the water quality parameters in Laghouat region by using statistical analysis Principal Component Analysis (PCA), Hierarchical Cluster Analysis (HCA) and geographic information system (GIS) in an attempt to discriminate the sources of the variation of water quality variations. The results of PCA technique indicate that variables responsible for water quality composition are mainly related to soluble salts variables; natural processes and the nature of the rock which modifies significantly the water chemistry. Inferred from the positive correlation between K+ and NO3-, NO3- is believed to be human induced rather than naturally originated. In this study, the multivariate statistical analysis and GIS allows the hydrogeologist to have supplementary tools in the characterization and evaluating of aquifers.

Keywords: cluster, analysis, GIS, groundwater, laghouat, quality

Procedia PDF Downloads 295
28824 The Use of Ward Linkage in Cluster Integration with a Path Analysis Approach

Authors: Adji Achmad Rinaldo Fernandes

Abstract:

Path analysis is an analytical technique to study the causal relationship between independent and dependent variables. In this study, the integration of Clusters in the Ward Linkage method was used in a variety of clusters with path analysis. The variables used are character (x₁), capacity (x₂), capital (x₃), collateral (x₄), and condition of economy (x₄) to on time pay (y₂) through the variable willingness to pay (y₁). The purpose of this study was to compare the Ward Linkage method cluster integration in various clusters with path analysis to classify willingness to pay (y₁). The data used are primary data from questionnaires filled out by customers of Bank X, using purposive sampling. The measurement method used is the average score method. The results showed that the Ward linkage method cluster integration with path analysis on 2 clusters is the best method, by comparing the coefficient of determination. Variable character (x₁), capacity (x₂), capital (x₃), collateral (x₄), and condition of economy (x₅) to on time pay (y₂) through willingness to pay (y₁) can be explained by 58.3%, while the remaining 41.7% is explained by variables outside the model.

Keywords: cluster integration, linkage, path analysis, compliant paying behavior

Procedia PDF Downloads 147
28823 Artificial Intelligence: Obstacles Patterns and Implications

Authors: Placide Poba-Nzaou, Anicet Tchibozo, Malatsi Galani, Ali Etkkali, Erwin Halim

Abstract:

Artificial intelligence (AI) is a general-purpose technology that is transforming many industries, working life and society by stimulating economic growth and innovation. Despite the huge potential of benefits to be generated, the adoption of AI varies from one organization to another, from one region to another, and from one industry to another, due in part to obstacles that can inhibit an organization or organizations located in a specific geographic region or operating in a specific industry from adopting AI technology. In this context, these obstacles and their implications for AI adoption from the perspective of configurational theory is important for at least three reasons: (1) understanding these obstacles is the first step in enabling policymakers and providers to make an informed decision in stimulating AI adoption (2) most studies have investigating obstacles or challenges of AI adoption in isolation with linear assumptions while configurational theory offers a holistic and multifaceted way of investigating the intricate interactions between perceived obstacles and barriers helping to assess their synergetic combination while holding assumptions of non-linearity leading to insights that would otherwise be out of the scope of studies investigating these obstacles in isolation. This study aims to pursue two objectives: (1) characterize organizations by uncovering the typical profiles of combinations of 15 internal and external obstacles that may prevent organizations from adopting AI technology, (2) assess the variation in terms of intensity of AI adoption associated with each configuration. We used data from a survey of AI adoption by organizations conducted throughout the EU27, Norway, Iceland and the UK (N=7549). Cluster analysis and discriminant analysis help uncover configurations of organizations based on the 15 obstacles, including eight external and seven internal. Second, we compared the clusters according to AI adoption intensity using an analysis of variance (ANOVA) and a Tamhane T2 post hoc test. The study uncovers three strongly separated clusters of organizations based on perceived obstacles to AI adoption. The clusters are labeled according to their magnitude of perceived obstacles to AI adoption: (1) Cluster I – High Level of perceived obstacles (N = 2449, 32.4%)(2) Cluster II – Low Level of perceived obstacles (N =1879, 24.9%) (3) Cluster III – Moderate Level of perceived obstacles (N =3221, 42.7%). The proposed taxonomy goes beyond the normative understanding of perceived obstacles to AI adoption and associated implications: it provides a well-structured and parsimonious lens that is useful for policymakers, AI technology providers, and researchers. Surprisingly, the ANOVAs revealed a “high level of perceived obstacles” cluster associated with a significantly high intensity of AI adoption.

Keywords: Artificial intelligence (AI), obstacles, adoption, taxonomy.

Procedia PDF Downloads 68
28822 Comparative Analysis of Spectral Estimation Methods for Brain-Computer Interfaces

Authors: Rafik Djemili, Hocine Bourouba, M. C. Amara Korba

Abstract:

In this paper, we present a method in order to classify EEG signals for Brain-Computer Interfaces (BCI). EEG signals are first processed by means of spectral estimation methods to derive reliable features before classification step. Spectral estimation methods used are standard periodogram and the periodogram calculated by the Welch method; both methods are compared with Logarithm of Band Power (logBP) features. In the method proposed, we apply Linear Discriminant Analysis (LDA) followed by Support Vector Machine (SVM). Classification accuracy reached could be as high as 85%, which proves the effectiveness of classification of EEG signals based BCI using spectral methods.

Keywords: brain-computer interface, motor imagery, electroencephalogram, linear discriminant analysis, support vector machine

Procedia PDF Downloads 473
28821 Clustering Locations of Textile and Garment Industries to Compare with the Future Industrial Cluster in Thailand

Authors: Kanogkan Leerojanaprapa

Abstract:

Textile and garment industry is used to a major exporting industry of Thailand. According to lacking of the nation's price-competitiveness by stopping the EU's GSP (Generalised Scheme of Preferences) and ‘Nationwide Minimum Wage Policy’ that Thailand’s employers must pay all employees at least 300 baht (about $10) a day, the supply chains of the Thai textile and garment industry is affected and need to be reformed. Therefore, either Thai textile or garment industry will be existed or not would be concerned. This is also challenged for the government to decide which industries should be promoted the future industries of Thailand. Recently Thai government launch The Cluster-based Special Economic Development Zones Policy for promoting business cluster (effect on September 16, 2015). They define a cluster as the concentration of interconnected businesses and related institutions that operate within the same geographic areas and textiles and garment is one of target industrial clusters and 9 provinces are targeted (Bangkok, Kanchanaburi, Nakhon Pathom, Ratchaburi, Samut Sakhon, Chonburi, Chachoengsao, Prachinburi, and Sa Kaeo). The cluster zone are defined to link west-east corridor connected to manufacturing source in Cambodia and Mynmar to Bangkok where are promoted to be design, sourcing, and trading hub. The Thai government will provide tax and non-tax incentives for targeted industries within the clusters and expects these businesses are scattered to where they can get the most benefit which will identify future industrial cluster. This research will show the difference between the current cluster and future cluster following the target provinces of the textile and garment. The current cluster is analysed from secondary data. The four characteristics of the numbers of plants in Spinning, weaving and finishing of textiles, Manufacture of made-up textile articles, except apparel, Manufacture of knitted and crocheted fabrics, and Manufacture of other textiles, not elsewhere classified in particular 77 provinces (in total) are clustered by K-means cluster analysis and Hierarchical Cluster Analysis. In addition, the cluster can be confirmed and showed which variables contribute the most to defined cluster solution with ANOVA test. The results of analysis can identify 22 provinces (which the textile or garment plants are located) into 3 clusters. Plants in cluster 1 tend to be large numbers of plants which is only Bangkok, Next plants in cluster 2 tend to be moderate numbers of plants which are Samut Prakan, Samut Sakhon and Nakhon Pathom. Finally plants in cluster 3 tend to be little numbers of plants which are other 18 provinces. The same methodology can be implemented in other industries for future study.

Keywords: ANOVA, hierarchical cluster analysis, industrial clusters, K -means cluster analysis, textile and garment industry

Procedia PDF Downloads 193
28820 Neighborhood Graph-Optimized Preserving Discriminant Analysis for Image Feature Extraction

Authors: Xiaoheng Tan, Xianfang Li, Tan Guo, Yuchuan Liu, Zhijun Yang, Hongye Li, Kai Fu, Yufang Wu, Heling Gong

Abstract:

The image data collected in reality often have high dimensions, and it contains noise and redundant information. Therefore, it is necessary to extract the compact feature expression of the original perceived image. In this process, effective use of prior knowledge such as data structure distribution and sample label is the key to enhance image feature discrimination and robustness. Based on the above considerations, this paper proposes a local preserving discriminant feature learning model based on graph optimization. The model has the following characteristics: (1) Locality preserving constraint can effectively excavate and preserve the local structural relationship between data. (2) The flexibility of graph learning can be improved by constructing a new local geometric structure graph using label information and the nearest neighbor threshold. (3) The L₂,₁ norm is used to redefine LDA, and the diagonal matrix is introduced as the scale factor of LDA, and the samples are selected, which improves the robustness of feature learning. The validity and robustness of the proposed algorithm are verified by experiments in two public image datasets.

Keywords: feature extraction, graph optimization local preserving projection, linear discriminant analysis, L₂, ₁ norm

Procedia PDF Downloads 124
28819 A Comparative and Critical Analysis of Some Routing Protocols in Wireless Sensor Networks

Authors: Ishtiaq Wahid, Masood Ahmad, Nighat Ayub, Sajad Ali

Abstract:

Lifetime of a wireless sensor network (WSN) is directly proportional to the energy consumption of its constituent nodes. Routing in wireless sensor network is very challenging due its inherit characteristics. In hierarchal routing the sensor filed is divided into clusters. The cluster-heads are selected from each cluster, which forms a hierarchy of nodes. The cluster-heads are used to transmit the data to the base station while other nodes perform the sensing task. In this way the lifetime of the network is increased. In this paper a comparative study of hierarchal routing protocols are conducted. The simulation is done in NS-2 for validation.

Keywords: WSN, cluster, routing, sensor networks

Procedia PDF Downloads 448
28818 Analysis of Entrepreneurship in Industrial Cluster

Authors: Wen-Hsiang Lai

Abstract:

Except for the internal aspects of entrepreneurship (i.e. motivation, opportunity perspective and alertness), there are external aspects that affecting entrepreneurship (i.e. the industrial cluster). By comparing the machinery companies located inside and outside the industrial district, this study aims to explore the cluster effects on the entrepreneurship of companies in Taiwan machinery clusters (TMC). In this study, three factors affecting the entrepreneurship in TMC are conducted as “competition”, “embedded-ness” and “specialized knowledge”. The “competition” in the industrial cluster is defined as the competitive advantages that companies gain in form of demand effects and diversified strategies; the “embedded-ness” refers to the quality of company relations (relational embedded-ness) and ranges (structural embedded-ness) with the industry components (universities, customers and complementary) that affecting knowledge transfer and knowledge generations; the “specialized knowledge” shares the internal knowledge within industrial clusters. This study finds that when comparing to the companies which are outside the cluster, the industrial cluster has positive influence on the entrepreneurship. Additionally, the factor of “relational embedded-ness” has significant impact on the entrepreneurship and affects the adaptation ability of companies in TMC. Finally, the factor of “competition” reveals partial influence on the entrepreneurship.

Keywords: entrepreneurship, industrial cluster, industrial district, economies of agglomerations, Taiwan Machinery Cluster (TMC)

Procedia PDF Downloads 358
28817 Using Group Concept Mapping to Identify a Pharmacy-Based Trigger Tool to Detect Adverse Drug Events

Authors: Rodchares Hanrinth, Theerapong Srisil, Peeraya Sriphong, Pawich Paktipat

Abstract:

The trigger tool is the low-cost, low-tech method to detect adverse events through clues called triggers. The Institute for Healthcare Improvement (IHI) has developed the Global Trigger Tool for measuring and preventing adverse events. However, this tool is not specific for detecting adverse drug events. The pharmacy-based trigger tool is needed to detect adverse drug events (ADEs). Group concept mapping is an effective method for conceptualizing various ideas from diverse stakeholders. This technique was used to identify a pharmacy-based trigger to detect adverse drug events (ADEs). The aim of this study was to involve the pharmacists in conceptualizing, developing, and prioritizing a feasible trigger tool to detect adverse drug events in a provincial hospital, the northeastern part of Thailand. The study was conducted during the 6-month period between April 1 and September 30, 2017. Study participants involved 20 pharmacists (17 hospital pharmacists and 3 pharmacy lecturers) engaging in three concept mapping workshops. In this meeting, the concept mapping technique created by Trochim, a highly constructed qualitative group technic for idea generating and sharing, was used to produce and construct participants' views on what triggers were potential to detect ADEs. During the workshops, participants (n = 20) were asked to individually rate the feasibility and potentiality of each trigger and to group them into relevant categories to enable multidimensional scaling and hierarchical cluster analysis. The outputs of analysis included the trigger list, cluster list, point map, point rating map, cluster map, and cluster rating map. The three workshops together resulted in 21 different triggers that were structured in a framework forming 5 clusters: drug allergy, drugs induced diseases, dosage adjustment in renal diseases, potassium concerning, and drug overdose. The first cluster is drug allergy such as the doctor’s orders for dexamethasone injection combined with chlorpheniramine injection. Later, the diagnosis of drug-induced hepatitis in a patient taking anti-tuberculosis drugs is one trigger in the ‘drugs induced diseases’ cluster. Then, for the third cluster, the doctor’s orders for enalapril combined with ibuprofen in a patient with chronic kidney disease is the example of a trigger. The doctor’s orders for digoxin in a patient with hypokalemia is a trigger in a cluster. Finally, the doctor’s orders for naloxone with narcotic overdose was classified as a trigger in a cluster. This study generated triggers that are similar to some of IHI Global trigger tool, especially in the medication module such as drug allergy and drug overdose. However, there are some specific aspects of this tool, including drug-induced diseases, dosage adjustment in renal diseases, and potassium concerning which do not contain in any trigger tools. The pharmacy-based trigger tool is suitable for pharmacists in hospitals to detect potential adverse drug events using clues of triggers.

Keywords: adverse drug events, concept mapping, hospital, pharmacy-based trigger tool

Procedia PDF Downloads 119
28816 Analysis of Expression Data Using Unsupervised Techniques

Authors: M. A. I Perera, C. R. Wijesinghe, A. R. Weerasinghe

Abstract:

his study was conducted to review and identify the unsupervised techniques that can be employed to analyze gene expression data in order to identify better subtypes of tumors. Identifying subtypes of cancer help in improving the efficacy and reducing the toxicity of the treatments by identifying clues to find target therapeutics. Process of gene expression data analysis described under three steps as preprocessing, clustering, and cluster validation. Feature selection is important since the genomic data are high dimensional with a large number of features compared to samples. Hierarchical clustering and K Means are often used in the analysis of gene expression data. There are several cluster validation techniques used in validating the clusters. Heatmaps are an effective external validation method that allows comparing the identified classes with clinical variables and visual analysis of the classes.

Keywords: cancer subtypes, gene expression data analysis, clustering, cluster validation

Procedia PDF Downloads 116
28815 Scientific Linux Cluster for BIG-DATA Analysis (SLBD): A Case of Fayoum University

Authors: Hassan S. Hussein, Rania A. Abul Seoud, Amr M. Refaat

Abstract:

Scientific researchers face in the analysis of very large data sets that is increasing noticeable rate in today’s and tomorrow’s technologies. Hadoop and Spark are types of software that developed frameworks. Hadoop framework is suitable for many Different hardware platforms. In this research, a scientific Linux cluster for Big Data analysis (SLBD) is presented. SLBD runs open source software with large computational capacity and high performance cluster infrastructure. SLBD composed of one cluster contains identical, commodity-grade computers interconnected via a small LAN. SLBD consists of a fast switch and Gigabit-Ethernet card which connect four (nodes). Cloudera Manager is used to configure and manage an Apache Hadoop stack. Hadoop is a framework allows storing and processing big data across the cluster by using MapReduce algorithm. MapReduce algorithm divides the task into smaller tasks which to be assigned to the network nodes. Algorithm then collects the results and form the final result dataset. SLBD clustering system allows fast and efficient processing of large amount of data resulting from different applications. SLBD also provides high performance, high throughput, high availability, expandability and cluster scalability.

Keywords: big data platforms, cloudera manager, Hadoop, MapReduce

Procedia PDF Downloads 332
28814 Improvement of the Melon (Cucumis melo L.) through Genetic Gain and Discriminant Function

Authors: M. R. Naroui Rad, H. Fanaei, A. Ghalandarzehi

Abstract:

To find out the yield of melon, the traits are vital. This research was performed with the objective to assess the impact of nine different morphological traits on the production of 20 melon landraces in the sistan weather region. For all the traits genetic variation was noted. Minimum genetical variance (9.66) along with high genetic interaction with the environment led to low heritability (0.24) of the yield. The broad sense heritability of the traits that were included into the differentiating model was more than it was in the production. In this study, the five selected traits, number of fruit, fruit weight, fruit width, flesh diameter and plant yield can differentiate the genotypes with high or low production. This demonstrated the significance of these 5 traits in plant breeding programs. Discriminant function of these 5 traits, particularly, the weight of the fruit, in case of the current outputs was employed as an all-inclusive parameter for pointing out landraces with the highest yield. 75% of variation in yield can be explained with this index, and the weight of fruit also has substantial relation with the total production (r=0.72**). This factor can be highly beneficial in case of future breeding program selections.

Keywords: melon, discriminant analysis, genetic components, yield, selection

Procedia PDF Downloads 299