Search results for: correlation clustering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4412

Search results for: correlation clustering

4202 Event Driven Dynamic Clustering and Data Aggregation in Wireless Sensor Network

Authors: Ashok V. Sutagundar, Sunilkumar S. Manvi

Abstract:

Energy, delay and bandwidth are the prime issues of wireless sensor network (WSN). Energy usage optimization and efficient bandwidth utilization are important issues in WSN. Event triggered data aggregation facilitates such optimal tasks for event affected area in WSN. Reliable delivery of the critical information to sink node is also a major challenge of WSN. To tackle these issues, we propose an event driven dynamic clustering and data aggregation scheme for WSN that enhances the life time of the network by minimizing redundant data transmission. The proposed scheme operates as follows: (1) Whenever the event is triggered, event triggered node selects the cluster head. (2) Cluster head gathers data from sensor nodes within the cluster. (3) Cluster head node identifies and classifies the events out of the collected data using Bayesian classifier. (4) Aggregation of data is done using statistical method. (5) Cluster head discovers the paths to the sink node using residual energy, path distance and bandwidth. (6) If the aggregated data is critical, cluster head sends the aggregated data over the multipath for reliable data communication. (7) Otherwise aggregated data is transmitted towards sink node over the single path which is having the more bandwidth and residual energy. The performance of the scheme is validated for various WSN scenarios to evaluate the effectiveness of the proposed approach in terms of aggregation time, cluster formation time and energy consumed for aggregation.

Keywords: wireless sensor network, dynamic clustering, data aggregation, wireless communication

Procedia PDF Downloads 431
4201 Cr Induced Magnetization in Zinc-Blende ZnO-Based Diluted Magnetic Semiconductors

Authors: Bakhtiar Ul Haq, R. Ahmed, A. Shaari, Mazmira Binti Mohamed, Nisar Ali

Abstract:

The capability of exploiting the electronic charge and spin properties simultaneously in a single material has made diluted magnetic semiconductors (DMS) remarkable in the field of spintronics. We report the designing of DMS based on zinc-blend ZnO doped with Cr impurity. The full potential linearized augmented plane wave plus local orbital FP-L(APW+lo) method in density functional theory (DFT) has been adapted to carry out these investigations. For treatment of exchange and correlation energy, generalized gradient approximations have been used. Introducing Cr atoms in the matrix of ZnO has induced strong magnetic moment with ferromagnetic ordering at stable ground state. Cr:ZnO was found to favor the short range magnetic interaction that reflect the tendency of Cr clustering. The electronic structure of ZnO is strongly influenced in the presence of Cr impurity atoms where impurity bands appear in the band gap.

Keywords: ZnO, density functional theory, diluted agnetic semiconductors, ferromagnetic materials, FP-L(APW+lo)

Procedia PDF Downloads 409
4200 Unsupervised Part-of-Speech Tagging for Amharic Using K-Means Clustering

Authors: Zelalem Fantahun

Abstract:

Part-of-speech tagging is the process of assigning a part-of-speech or other lexical class marker to each word into naturally occurring text. Part-of-speech tagging is the most fundamental and basic task almost in all natural language processing. In natural language processing, the problem of providing large amount of manually annotated data is a knowledge acquisition bottleneck. Since, Amharic is one of under-resourced language, the availability of tagged corpus is the bottleneck problem for natural language processing especially for POS tagging. A promising direction to tackle this problem is to provide a system that does not require manually tagged data. In unsupervised learning, the learner is not provided with classifications. Unsupervised algorithms seek out similarity between pieces of data in order to determine whether they can be characterized as forming a group. This paper explicates the development of unsupervised part-of-speech tagger using K-Means clustering for Amharic language since large amount of data is produced in day-to-day activities. In the development of the tagger, the following procedures are followed. First, the unlabeled data (raw text) is divided into 10 folds and tokenization phase takes place; at this level, the raw text is chunked at sentence level and then into words. The second phase is feature extraction which includes word frequency, syntactic and morphological features of a word. The third phase is clustering. Among different clustering algorithms, K-means is selected and implemented in this study that brings group of similar words together. The fourth phase is mapping, which deals with looking at each cluster carefully and the most common tag is assigned to a group. This study finds out two features that are capable of distinguishing one part-of-speech from others these are morphological feature and positional information and show that it is possible to use unsupervised learning for Amharic POS tagging. In order to increase performance of the unsupervised part-of-speech tagger, there is a need to incorporate other features that are not included in this study, such as semantic related information. Finally, based on experimental result, the performance of the system achieves a maximum of 81% accuracy.

Keywords: POS tagging, Amharic, unsupervised learning, k-means

Procedia PDF Downloads 426
4199 Privacy Preserving Data Publishing Based on Sensitivity in Context of Big Data Using Hive

Authors: P. Srinivasa Rao, K. Venkatesh Sharma, G. Sadhya Devi, V. Nagesh

Abstract:

Privacy Preserving Data Publication is the main concern in present days because the data being published through the internet has been increasing day by day. This huge amount of data was named as Big Data by its size. This project deals the privacy preservation in the context of Big Data using a data warehousing solution called hive. We implemented Nearest Similarity Based Clustering (NSB) with Bottom-up generalization to achieve (v,l)-anonymity. (v,l)-Anonymity deals with the sensitivity vulnerabilities and ensures the individual privacy. We also calculate the sensitivity levels by simple comparison method using the index values, by classifying the different levels of sensitivity. The experiments were carried out on the hive environment to verify the efficiency of algorithms with Big Data. This framework also supports the execution of existing algorithms without any changes. The model in the paper outperforms than existing models.

Keywords: sensitivity, sensitive level, clustering, Privacy Preserving Data Publication (PPDP), bottom-up generalization, Big Data

Procedia PDF Downloads 275
4198 Identification of Nonlinear Systems Using Radial Basis Function Neural Network

Authors: C. Pislaru, A. Shebani

Abstract:

This paper uses the radial basis function neural network (RBFNN) for system identification of nonlinear systems. Five nonlinear systems are used to examine the activity of RBFNN in system modeling of nonlinear systems; the five nonlinear systems are dual tank system, single tank system, DC motor system, and two academic models. The feed forward method is considered in this work for modelling the non-linear dynamic models, where the K-Means clustering algorithm used in this paper to select the centers of radial basis function network, because it is reliable, offers fast convergence and can handle large data sets. The least mean square method is used to adjust the weights to the output layer, and Euclidean distance method used to measure the width of the Gaussian function.

Keywords: system identification, nonlinear systems, neural networks, radial basis function, K-means clustering algorithm

Procedia PDF Downloads 454
4197 Parallel Genetic Algorithms Clustering for Handling Recruitment Problem

Authors: Walid Moudani, Ahmad Shahin

Abstract:

This research presents a study to handle the recruitment services system. It aims to enhance a business intelligence system by embedding data mining in its core engine and to facilitate the link between job searchers and recruiters companies. The purpose of this study is to present an intelligent management system for supporting recruitment services based on data mining methods. It consists to apply segmentation on the extracted job postings offered by the different recruiters. The details of the job postings are associated to a set of relevant features that are extracted from the web and which are based on critical criterion in order to define consistent clusters. Thereafter, we assign the job searchers to the best cluster while providing a ranking according to the job postings of the selected cluster. The performance of the proposed model used is analyzed, based on a real case study, with the clustered job postings dataset and classified job searchers dataset by using some metrics.

Keywords: job postings, job searchers, clustering, genetic algorithms, business intelligence

Procedia PDF Downloads 313
4196 Simulating Lean and Green Correlation in Supply Chain Context

Authors: Rachid Benmoussa, Fatima Ezzahra Essaber, Roland De Guio, Fatima Zahra Ben Moussa

Abstract:

Implementing green practices in supply chain management is a complex task mainly because ecological, economical and operational goals are usually in conflict. Green practices might thus face companies’ reluctance because managers can consider its implementation obviously as a performance lean degradation. To implement lean and green practices successfully, companies need relevant decision-making tools to highlight the correlation between them. To contribute to this issue, this work tries to answer the following research question: How to use simulation to assess correlation (antagonism or convergence) between lean and green goals? To answer this question, we propose in this paper a based simulation process that measures correlation generally between two variables. So as to prove its relevance, a logistics academic case study is used to illustrate all its stages. It shows, as for example, that Lean goal 'Stock' and Green goal 'CO₂ emission' are not conceptually correlated (linearly).

Keywords: simulation, lean, green, supply chain

Procedia PDF Downloads 469
4195 A Model Based Metaheuristic for Hybrid Hierarchical Community Structure in Social Networks

Authors: Radhia Toujani, Jalel Akaichi

Abstract:

In recent years, the study of community detection in social networks has received great attention. The hierarchical structure of the network leads to the emergence of the convergence to a locally optimal community structure. In this paper, we aim to avoid this local optimum in the introduced hybrid hierarchical method. To achieve this purpose, we present an objective function where we incorporate the value of structural and semantic similarity based modularity and a metaheuristic namely bees colonies algorithm to optimize our objective function on both hierarchical level divisive and agglomerative. In order to assess the efficiency and the accuracy of the introduced hybrid bee colony model, we perform an extensive experimental evaluation on both synthetic and real networks.

Keywords: social network, community detection, agglomerative hierarchical clustering, divisive hierarchical clustering, similarity, modularity, metaheuristic, bee colony

Procedia PDF Downloads 361
4194 Evaluation of Affecting Factors on Effectiveness of Animal Artificial Insemination Training Courses in Zanjan Province

Authors: Ali Ashraf Hamedi Oghul Beyk

Abstract:

This research is aimed in order to demonstrate the factors affecting on effectiveness of animal artificial insemination training courses in Zanjan province. The research method is descriptive and correlation. Research tools a questionnaire and research sample are 104 persons who participated in animal artificial insemination training courses. The data resulted from this procedure was analysed by using SPSS software under windows system.independent variables include :individual, sociological, technical, and organizational, dependent variable is: affecting factors on effectiveness of animal artificial insemination training courses the finding of this study indicates that there is a significant correlation(99/0) between individual variables such as motivation and interest and experiment and effectiveness of animal artificial insemination training courses. There is significant correlation (95/0) between sociological variables such as job and education and effectiveness of animal artificial insemination training course. There is significant correlation (99/0) between techn ical variables such as training quality media and instructional materials. Moreover, effectiveness of animal artificial insemination training course there is significant correlation(0/95) between organizational variables such as trainers combination,place conditions.

Keywords: animal artificial insemination, effect, effectiveness, training courses, Zanjan

Procedia PDF Downloads 364
4193 Correlation of SPT N-Value and Equipment Drilling Parameters in Deep Soil Mixing

Authors: John Eric C. Bargas, Ma. Cecilia M. Marcos

Abstract:

One of the most common ground improvement techniques is Deep Soil Mixing (DSM). As the technique progresses, there is still lack in the development when it comes to depth control. This was the issue experienced during the installation of DSM in one of the National projects in the Philippines. This study assesses the feasibility of using equipment drilling parameters such as hydraulic pressure, drilling speed and rotational speed in determining the Standard Penetration Test N-value of a specific soil. Hydraulic pressure and drilling speed with a constant rotational speed of 30 rpm have a positive correlation with SPT N-value for cohesive soil and sand. A linear trend was observed for cohesive soil. The correlation of SPT N-value and hydraulic pressure yielded a R²=0.5377 while the correlation of SPT N-value and drilling speed has a R²=0.6355. While the best fitted model for sand is polynomial trend. The correlation of SPT N-value and hydraulic pressure yielded a R²=0.7088 while the correlation of SPT N-value and drilling speed has a R²=0.4354. The low correlation may be attributed to the behavior of sand when the auger penetrates. Sand tends to follow the rotation of the auger rather than resisting which was observed for very loose to medium dense sand. Specific Energy and the product of hydraulic pressure and drilling speed yielded same R² with a positive correlation. Linear trend was observed for cohesive soil while polynomial trend for sand. Cohesive soil yielded a R²=0.7320 which has a strong relationship. Sand also yielded a strong relationship having a coefficient of determination, R²=0.7203. It is feasible to use hydraulic pressure and drilling speed to estimate the SPT N-value of the soil. Also, the product of hydraulic pressure and drilling speed can be a substitute to specific energy when estimating the SPT N-value of a soil. However, additional considerations are necessary to account for other influencing factors like ground water and physical and mechanical properties of soil.

Keywords: ground improvement, equipment drilling parameters, standard penetration test, deep soil mixing

Procedia PDF Downloads 13
4192 CoP-Networks: Virtual Spaces for New Faculty’s Professional Development in the 21st Higher Education

Authors: Eman AbuKhousa, Marwan Z. Bataineh

Abstract:

The 21st century higher education and globalization challenge new faculty members to build effective professional networks and partnership with industry in order to accelerate their growth and success. This creates the need for community of practice (CoP)-oriented development approaches that focus on cognitive apprenticeship while considering individual predisposition and future career needs. This work adopts data mining, clustering analysis, and social networking technologies to present the CoP-Network as a virtual space that connects together similar career-aspiration individuals who are socially influenced to join and engage in a process for domain-related knowledge and practice acquisitions. The CoP-Network model can be integrated into higher education to extend traditional graduate and professional development programs.

Keywords: clustering analysis, community of practice, data mining, higher education, new faculty challenges, social network, social influence, professional development

Procedia PDF Downloads 165
4191 Customer Satisfaction and Effective HRM Policies: Customer and Employee Satisfaction

Authors: S. Anastasiou, C. Nathanailides

Abstract:

The purpose of this study is to examine the possible link between employee and customer satisfaction. The service provided by employees, help to build a good relationship with customers and can help at increasing their loyalty. Published data for job satisfaction and indicators of customer services were gathered from relevant published works which included data from five different countries. The reviewed data indicate a significant correlation between indicators of customer and employee satisfaction in the Banking sector. There was a significant correlation between the two parameters (Pearson correlation R2=0.52 P<0.05) The reviewed data provide evidence that there is some practical evidence which links these two parameters.

Keywords: job satisfaction, job performance, customer’ service, banks, human resources management

Procedia PDF Downloads 304
4190 A Minimum Spanning Tree-Based Method for Initializing the K-Means Clustering Algorithm

Authors: J. Yang, Y. Ma, X. Zhang, S. Li, Y. Zhang

Abstract:

The traditional k-means algorithm has been widely used as a simple and efficient clustering method. However, the algorithm often converges to local minima for the reason that it is sensitive to the initial cluster centers. In this paper, an algorithm for selecting initial cluster centers on the basis of minimum spanning tree (MST) is presented. The set of vertices in MST with same degree are regarded as a whole which is used to find the skeleton data points. Furthermore, a distance measure between the skeleton data points with consideration of degree and Euclidean distance is presented. Finally, MST-based initialization method for the k-means algorithm is presented, and the corresponding time complexity is analyzed as well. The presented algorithm is tested on five data sets from the UCI Machine Learning Repository. The experimental results illustrate the effectiveness of the presented algorithm compared to three existing initialization methods.

Keywords: degree, initial cluster center, k-means, minimum spanning tree

Procedia PDF Downloads 390
4189 Proposing a Boundary Coverage Algorithm ‎for Underwater Sensor Network

Authors: Seyed Mohsen Jameii

Abstract:

Wireless underwater sensor networks are a type of sensor networks that are located in underwater environments and linked together by acoustic waves. The application of these kinds of network includes monitoring of pollutants (chemical, biological, and nuclear), oil fields detection, prediction of the likelihood of a tsunami in coastal areas, the use of wireless sensor nodes to monitor the passing submarines, and determination of appropriate locations for anchoring ships. This paper proposes a boundary coverage algorithm for intrusion detection in underwater sensor networks. In the first phase of the proposed algorithm, optimal deployment of nodes is done in the water. In the second phase, after the employment of nodes at the proper depth, clustering is executed to reduce the exchanges of messages between the sensors. In the third phase, the algorithm of "divide and conquer" is used to save energy and increase network efficiency. The simulation results demonstrate the efficiency of the proposed algorithm.

Keywords: boundary coverage, clustering, divide and ‎conquer, underwater sensor nodes

Procedia PDF Downloads 322
4188 Power Aware Modified I-LEACH Protocol Using Fuzzy IF Then Rules

Authors: Gagandeep Singh, Navdeep Singh

Abstract:

Due to limited battery of sensor nodes, so energy efficiency found to be main constraint in WSN. Therefore the main focus of the present work is to find the ways to minimize the energy consumption problem and will results; enhancement in the network stability period and life time. Many researchers have proposed different kind of the protocols to enhance the network lifetime further. This paper has evaluated the issues which have been neglected in the field of the WSNs. WSNs are composed of multiple unattended ultra-small, limited-power sensor nodes. Sensor nodes are deployed randomly in the area of interest. Sensor nodes have limited processing, wireless communication and power resource capabilities Sensor nodes send sensed data to sink or Base Station (BS). I-LEACH gives adaptive clustering mechanism which very efficiently deals with energy conservations. This paper ends up with the shortcomings of various adaptive clustering based WSNs protocols.

Keywords: WSN, I-Leach, MATLAB, sensor

Procedia PDF Downloads 259
4187 Investigation of the Speckle Pattern Effect for Displacement Assessments by Digital Image Correlation

Authors: Salim Çalışkan, Hakan Akyüz

Abstract:

Digital image correlation has been accustomed as a versatile and efficient method for measuring displacements on the article surfaces by comparing reference subsets in undeformed images with the define target subset in the distorted image. The theoretical model points out that the accuracy of the digital image correlation displacement data can be exactly anticipated based on the divergence of the image noise and the sum of the squares of the subset intensity gradients. The digital image correlation procedure locates each subset of the original image in the distorted image. The software then determines the displacement values of the centers of the subassemblies, providing the complete displacement measures. In this paper, the effect of the speckle distribution and its effect on displacements measured out plane displacement data as a function of the size of the subset was investigated. Nine groups of speckle patterns were used in this study: samples are sprayed randomly by pre-manufactured patterns of three different hole diameters, each with three coverage ratios, on a computer numerical control punch press. The resulting displacement values, referenced at the center of the subset, are evaluated based on the average of the displacements of the pixel’s interior the subset.

Keywords: digital image correlation, speckle pattern, experimental mechanics, tensile test, aluminum alloy

Procedia PDF Downloads 54
4186 Correlation of Spirometry with Six Minute Walk Test and Grading of Dyspnoea in COPD Patients

Authors: Anand K. Patel

Abstract:

Background: Patients with COPD have decreased pulmonary functions, which in turn reflect on their day-to-day activities. Objectives: To assess the correlation between functional vital capacity (FVC) and forced expiratory volume in one second (FEV1) with 6 minutes walk test (6MWT). To correlate the Borg rating for perceived exertion scale (Borg scale) and Modified medical research council (MMRC) dyspnea scale with the 6MWT, FVC and FEV1. Method: In this prospective study total 72 patients with COPD diagnosed by the GOLD guidelines were enrolled after taking written consent. They were first asked to rate physical exertion on the Borg scale as well as the modified medical research council dyspnea scale and then were subjected to perform pre and post bronchodilator spirometry followed by 6 minute walk test. The findings were correlated by calculating the Pearson coefficient for each set and obtaining the p-values, with a p < 0.05 being clinically significant. Result: There was a significant correlation between spirometry and 6MWT suggesting that patients with lower measurements were unable to walk for longer distances. However, FVC had the stronger correlation than FEV1. MMRC scale had a stronger correlation with 6MWT as compared to the Borg scale. Conclusion: The study suggests that 6MWT is a better test for monitoring the patients of COPD. In spirometry, FVC should be used in monitoring patients with COPD, instead of FEV1. MMRC scale shows a stronger correlation than the Borg scale, and we should use it more often.

Keywords: spirometry, 6 minute walk test, MMRC, Borg scale

Procedia PDF Downloads 176
4185 LiDAR Based Real Time Multiple Vehicle Detection and Tracking

Authors: Zhongzhen Luo, Saeid Habibi, Martin v. Mohrenschildt

Abstract:

Self-driving vehicle require a high level of situational awareness in order to maneuver safely when driving in real world condition. This paper presents a LiDAR based real time perception system that is able to process sensor raw data for multiple target detection and tracking in dynamic environment. The proposed algorithm is nonparametric and deterministic that is no assumptions and priori knowledge are needed from the input data and no initializations are required. Additionally, the proposed method is working on the three-dimensional data directly generated by LiDAR while not scarifying the rich information contained in the domain of 3D. Moreover, a fast and efficient for real time clustering algorithm is applied based on a radially bounded nearest neighbor (RBNN). Hungarian algorithm procedure and adaptive Kalman filtering are used for data association and tracking algorithm. The proposed algorithm is able to run in real time with average run time of 70ms per frame.

Keywords: lidar, segmentation, clustering, tracking

Procedia PDF Downloads 397
4184 Research on the Risks of Railroad Receiving and Dispatching Trains Operators: Natural Language Processing Risk Text Mining

Authors: Yangze Lan, Ruihua Xv, Feng Zhou, Yijia Shan, Longhao Zhang, Qinghui Xv

Abstract:

Receiving and dispatching trains is an important part of railroad organization, and the risky evaluation of operating personnel is still reflected by scores, lacking further excavation of wrong answers and operating accidents. With natural language processing (NLP) technology, this study extracts the keywords and key phrases of 40 relevant risk events about receiving and dispatching trains and reclassifies the risk events into 8 categories, such as train approach and signal risks, dispatching command risks, and so on. Based on the historical risk data of personnel, the K-Means clustering method is used to classify the risk level of personnel. The result indicates that the high-risk operating personnel need to strengthen the training of train receiving and dispatching operations towards essential trains and abnormal situations.

Keywords: receiving and dispatching trains, natural language processing, risk evaluation, K-means clustering

Procedia PDF Downloads 60
4183 The Use of Appeals in Green Printed Advertisements: A Case of Product Orientation and Organizational Image Orientation Ads

Authors: Chutima Ruanguttamanun

Abstract:

Despite the relatively large number of studies that have examined the use of appeals in advertisements, research on the use of appeals in green advertisements is still underdeveloped and needs to be investigated further, as it is definitely a tool for marketers to create illustrious ads. In this study, content analysis was employed to examine the nature of green advertising appeals and to match the appeals with the green advertisements. Two different types of green print advertisings, product orientation and organizational image orientation were used. Thirty highly educated participants with different backgrounds were asked individually to ascertain three appeals out of thirty-four given appeals found among forty real green advertisements. To analyze participant responses and to group them based on common appeals, two-step K-mean clustering is used. The clustering solution indicates that eye-catching graphics and imaginative appeals are highly notable in both types of green ads. Depressed, meaningful and sad appeals are found to be highly used in organizational image orientation ads, whereas, corporate image, informative and natural appeals are found to be essential for product orientation ads.

Keywords: advertising appeals, green marketing, green advertisement, printed advertisement

Procedia PDF Downloads 253
4182 Exploring the Role of Data Mining in Crime Classification: A Systematic Literature Review

Authors: Faisal Muhibuddin, Ani Dijah Rahajoe

Abstract:

This in-depth exploration, through a systematic literature review, scrutinizes the nuanced role of data mining in the classification of criminal activities. The research focuses on investigating various methodological aspects and recent developments in leveraging data mining techniques to enhance the effectiveness and precision of crime categorization. Commencing with an exposition of the foundational concepts of crime classification and its evolutionary dynamics, this study details the paradigm shift from conventional methods towards approaches supported by data mining, addressing the challenges and complexities inherent in the modern crime landscape. Specifically, the research delves into various data mining techniques, including K-means clustering, Naïve Bayes, K-nearest neighbour, and clustering methods. A comprehensive review of the strengths and limitations of each technique provides insights into their respective contributions to improving crime classification models. The integration of diverse data sources takes centre stage in this research. A detailed analysis explores how the amalgamation of structured data (such as criminal records) and unstructured data (such as social media) can offer a holistic understanding of crime, enriching classification models with more profound insights. Furthermore, the study explores the temporal implications in crime classification, emphasizing the significance of considering temporal factors to comprehend long-term trends and seasonality. The availability of real-time data is also elucidated as a crucial element in enhancing responsiveness and accuracy in crime classification.

Keywords: data mining, classification algorithm, naïve bayes, k-means clustering, k-nearest neigbhor, crime, data analysis, sistematic literature review

Procedia PDF Downloads 45
4181 Bioinformatic Approaches in Population Genetics and Phylogenetic Studies

Authors: Masoud Sheidai

Abstract:

Biologists with a special field of population genetics and phylogeny have different research tasks such as populations’ genetic variability and divergence, species relatedness, the evolution of genetic and morphological characters, and identification of DNA SNPs with adaptive potential. To tackle these problems and reach a concise conclusion, they must use the proper and efficient statistical and bioinformatic methods as well as suitable genetic and morphological characteristics. In recent years application of different bioinformatic and statistical methods, which are based on various well-documented assumptions, are the proper analytical tools in the hands of researchers. The species delineation is usually carried out with the use of different clustering methods like K-means clustering based on proper distance measures according to the studied features of organisms. A well-defined species are assumed to be separated from the other taxa by molecular barcodes. The species relationships are studied by using molecular markers, which are analyzed by different analytical methods like multidimensional scaling (MDS) and principal coordinate analysis (PCoA). The species population structuring and genetic divergence are usually investigated by PCoA and PCA methods and a network diagram. These are based on bootstrapping of data. The Association of different genes and DNA sequences to ecological and geographical variables is determined by LFMM (Latent factor mixed model) and redundancy analysis (RDA), which are based on Bayesian and distance methods. Molecular and morphological differentiating characters in the studied species may be identified by linear discriminant analysis (DA) and discriminant analysis of principal components (DAPC). We shall illustrate these methods and related conclusions by giving examples from different edible and medicinal plant species.

Keywords: GWAS analysis, K-Means clustering, LFMM, multidimensional scaling, redundancy analysis

Procedia PDF Downloads 105
4180 A Clustering-Based Approach for Weblog Data Cleaning

Authors: Amine Ganibardi, Cherif Arab Ali

Abstract:

This paper addresses the data cleaning issue as a part of web usage data preprocessing within the scope of Web Usage Mining. Weblog data recorded by web servers within log files reflect usage activity, i.e., End-users’ clicks and underlying user-agents’ hits. As Web Usage Mining is interested in End-users’ behavior, user-agents’ hits are referred to as noise to be cleaned-off before mining. Filtering hits from clicks is not trivial for two reasons, i.e., a server records requests interlaced in sequential order regardless of their source or type, website resources may be set up as requestable interchangeably by end-users and user-agents. The current methods are content-centric based on filtering heuristics of relevant/irrelevant items in terms of some cleaning attributes, i.e., website’s resources filetype extensions, website’s resources pointed by hyperlinks/URIs, http methods, user-agents, etc. These methods need exhaustive extra-weblog data and prior knowledge on the relevant and/or irrelevant items to be assumed as clicks or hits within the filtering heuristics. Such methods are not appropriate for dynamic/responsive Web for three reasons, i.e., resources may be set up to as clickable by end-users regardless of their type, website’s resources are indexed by frame names without filetype extensions, web contents are generated and cancelled differently from an end-user to another. In order to overcome these constraints, a clustering-based cleaning method centered on the logging structure is proposed. This method focuses on the statistical properties of the logging structure at the requested and referring resources attributes levels. It is insensitive to logging content and does not need extra-weblog data. The used statistical property takes on the structure of the generated logging feature by webpage requests in terms of clicks and hits. Since a webpage consists of its single URI and several components, these feature results in a single click to multiple hits ratio in terms of the requested and referring resources. Thus, the clustering-based method is meant to identify two clusters based on the application of the appropriate distance to the frequency matrix of the requested and referring resources levels. As the ratio clicks to hits is single to multiple, the clicks’ cluster is the smallest one in requests number. Hierarchical Agglomerative Clustering based on a pairwise distance (Gower) and average linkage has been applied to four logfiles of dynamic/responsive websites whose click to hits ratio range from 1/2 to 1/15. The optimal clustering set on the basis of average linkage and maximum inter-cluster inertia results always in two clusters. The evaluation of the smallest cluster referred to as clicks cluster under the terms of confusion matrix indicators results in 97% of true positive rate. The content-centric cleaning methods, i.e., conventional and advanced cleaning, resulted in a lower rate 91%. Thus, the proposed clustering-based cleaning outperforms the content-centric methods within dynamic and responsive web design without the need of any extra-weblog. Such an improvement in cleaning quality is likely to refine dependent analysis.

Keywords: clustering approach, data cleaning, data preprocessing, weblog data, web usage data

Procedia PDF Downloads 161
4179 AM/E/c Queuing Hub Maximal Covering Location Model with Fuzzy Parameter

Authors: M. H. Fazel Zarandi, N. Moshahedi

Abstract:

The hub location problem appears in a variety of applications such as medical centers, firefighting facilities, cargo delivery systems and telecommunication network design. The location of service centers has a strong influence on the congestion at each of them, and, consequently, on the quality of service. This paper presents a fuzzy maximal hub covering location problem (FMCHLP) in which travel costs between any pair of nodes is considered as a fuzzy variable. In order to consider the quality of service, we model each hub as a queue. Arrival rate follows Poisson distribution and service rate follows Erlang distribution. In this paper, at first, a nonlinear mathematical programming model is presented. Then, we convert it to the linear one. We solved the linear model using GAMS software up to 25 nodes and for large sizes due to the complexity of hub covering location problems, and simulated annealing algorithm is developed to solve and test the model. Also, we used possibilistic c-means clustering method in order to find an initial solution.

Keywords: fuzzy modeling, location, possibilistic clustering, queuing

Procedia PDF Downloads 378
4178 Correlation between Cephalometric Measurements and Visual Perception of Facial Profile in Skeletal Type II Patients

Authors: Choki, Supatchai Boonpratham, Suwannee Luppanapornlarp

Abstract:

The objective of this study was to find a correlation between cephalometric measurements and visual perception of facial profile in skeletal type II patients. In this study, 250 lateral cephalograms of female patients from age, 20 to 22 years were analyzed. The profile outlines of all the samples were hand traced and transformed into silhouettes by the principal investigator. Profile ratings were done by 9 orthodontists on Visual Analogue Scale from score one to ten (increasing level of convexity). 37 hard issue and soft tissue cephalometric measurements were analyzed by the principal investigator. All the measurements were repeated after 2 weeks interval for error assessment. At last, the rankings of visual perceptions were correlated with cephalometric measurements using Spearman correlation coefficient (P < 0.05). The results show that the increase in facial convexity was correlated with higher values of ANB (A point, nasion and B point), AF-BF (distance from A point to B point in mm), L1-NB (distance from lower incisor to NB line in mm), anterior maxillary alveolar height, posterior maxillary alveolar height, overjet, H angle hard tissue, H angle soft tissue and lower lip to E plane (absolute correlation values from 0.277 to 0.711). In contrast, the increase in facial convexity was correlated with lower values of Pg. to N perpendicular and Pg. to NB (mm) (absolute correlation value -0.302 and -0.294 respectively). From the soft tissue measurements, H angles had a higher correlation with visual perception than facial contour angle, nasolabial angle, and lower lip to E plane. In conclusion, the findings of this study indicated that the correlation of cephalometric measurements with visual perception was less than expected. Only 29% of cephalometric measurements had a significant correlation with visual perception. Therefore, diagnosis based solely on cephalometric analysis can result in failure to meet the patient’s esthetic expectation.

Keywords: cephalometric measurements, facial profile, skeletal type II, visual perception

Procedia PDF Downloads 120
4177 Summarizing Data Sets for Data Mining by Using Statistical Methods in Coastal Engineering

Authors: Yunus Doğan, Ahmet Durap

Abstract:

Coastal regions are the one of the most commonly used places by the natural balance and the growing population. In coastal engineering, the most valuable data is wave behaviors. The amount of this data becomes very big because of observations that take place for periods of hours, days and months. In this study, some statistical methods such as the wave spectrum analysis methods and the standard statistical methods have been used. The goal of this study is the discovery profiles of the different coast areas by using these statistical methods, and thus, obtaining an instance based data set from the big data to analysis by using data mining algorithms. In the experimental studies, the six sample data sets about the wave behaviors obtained by 20 minutes of observations from Mersin Bay in Turkey and converted to an instance based form, while different clustering techniques in data mining algorithms were used to discover similar coastal places. Moreover, this study discusses that this summarization approach can be used in other branches collecting big data such as medicine.

Keywords: clustering algorithms, coastal engineering, data mining, data summarization, statistical methods

Procedia PDF Downloads 344
4176 Antioxidant Activity and Correlation of Free Phenolic Content with the DPPH Radical Scavenging and Reducing Power Activity of Date Palm (Phoenix dactylifera L.) from Algeria

Authors: Cheyma Bensaci, Mokhtar Saidi, Zineb Ghiaba

Abstract:

The first objective of this study is to determine the phenolic contents and antioxidant capacities of three different varieties of date palm (Phoenix dactylifera L.) fruit (DPF) from Algeria were using three different solvents. As for the second objective is to find the correlation of phenolic contents with the both DPPH radical scavenging and reducing power activity. These results showed that date had strongly scavenging activity on DPPH .The IC50 value for DPPH radical scavenging activity was 0.15 mg/ml in acetone/H2O extract from Gh. And also, acetone/H2O extract from Gh showed the best AEAC value for reducing power was 8,48 mM. The results also showed that there are a positive correlation, so confined values between 0.153 and 0.972.

Keywords: phoenix dactylifera, antioxidant activity, correlation, reducing power

Procedia PDF Downloads 363
4175 Exploring the Correlation between Students' Performance in Educational Statistics and Research Methods in Education: The Influence of Undergraduate Programs

Authors: Justice Dadzie, Stacy H. Surman, Ruth K. Annan-Brew, Ifesinachi J. Ezugwu, Evans Addison

Abstract:

This study aimed to explore the correlation between students' performance in educational statistics and research methods in education, as well as investigate potential differences in performance based on their undergraduate programs. A cross-sectional design was employed, and data was collected from 170 students enrolled in master of philosophy programs in the department of education and psychology. The correlation analysis revealed a strong positive correlation between students' performance in intermediate statistics in education and research methods in education. This indicates a close relationship between the two domains. The MANOVA analysis showed no significant differences in the linear combination of intermediate statistics in education and research methods in education scores across the different undergraduate programs. The tests of between-subjects effects further confirmed that the student's performance in intermediate statistics in education and research methods in education did not differ significantly across the different undergraduate programs. These findings contribute to the existing literature by providing insights into the correlation between educational statistics and research methods, and the influence of undergraduate program backgrounds on students' performance in these domains. The strong positive correlation between intermediate statistics and research methods highlights the importance of a solid foundation in statistics for understanding and applying research methods. Moreover, the consistent relationship across different academic backgrounds emphasizes the need for targeted interventions and support systems to enhance graduate students' competencies in these critical areas.

Keywords: educational statistics, research methods, undergraduate programs, students performance

Procedia PDF Downloads 28
4174 Characteristics of Butterfly Communities according to Habitat Types of Jeongmaek in Korea

Authors: Ji-Suk Kim, Dong-Pil Kim, Kee-Rae Gang, Yoon Ho Choi

Abstract:

This study was conducted to investigate the characteristics of butterfly communities according to the habitat characteristics of Korean veins. The survey sites were 12 mountains located in the vein, and 12~30 quadrats (200 in total) were set. The species richness and biodiversity were different according to land use type. Two types of land use (forest and graveyard) showed lower species diversity index values ​​than other land use types. The species abundance was low in the forest and graveyards, and grasslands, forest tops, cultivated areas and urban areas showed relatively high species richness. The altitude was not statistically significant with the number of species of butterflies and biodiversity index. The degree of canopy closure showed a negative correlation. As a result of interspecific correlation analysis, it was confirmed that there was a very high correlation (R2=0.746) between Lycaena phlaeas and Pseudozizeeria maha argia, Choaspes benjaminii japonica and Argyronome ruslana.

Keywords: land use type, species diversity index, correlation, canopy closure

Procedia PDF Downloads 140
4173 The Correlation between Body Composition and Spinal Alignment in Healthy Young Adults

Authors: Ferruh Taspinar, Ismail Saracoglu, Emrah Afsar, Eda O. Okur, Gulce K. Seyyar, Gamze Kurt, Betul Taspinar

Abstract:

Although it is thought that abdominal adiposity is one of the risk factor for postural deviation, such as increased lumbar lordosis, the body mass index is not sufficient to indicate effects of abdominal adiposity on spinal alignment and postural changes. The aim of this study was to investigate the correlation with detailed body composition and spine alignment in healthy young adults. This cross-sectional study was conducted with sixty seven healthy volunteers (37 men and 30 women) whose ages ranged between 19 and 27 years. All participants’ sagittal spinal curvatures of lumbar and thoracic region were measured via Spinal mouse® (Idiag, Fehraltorf, Switzerland). Also, body composition analysis (whole body fat ratio, whole body muscle ratio, abdominal fat ratio, and trunk muscle ratio) estimation by means of bioelectrical impedance was evaluated via Tanita Bc 418 Ma Segmental Body Composition Analyser (Tanita, Japan). Pearson’s correlation was used to analysis among the variables. The mean lumbar lordosis and thoracic kyphosis angles were 21.02°±9.39, 41.50°±7.97, respectively. Statistically analysis showed a significant positive correlation between whole body fat ratio and lumbar lordosis angle (r=0.28, p=0.02). Similarly, there was a positive correlation between abdominal fat ratio and lumbar lordosis angle (r=0.27, p=0.03). The thoracic kyphosis angle showed also positive correlation with whole body fat ratio (r=0.33, p=0.00) and abdominal fat ratio (r=0.40, p=0.01). The whole body muscle ratio showed negative correlation between lumbar lordosis (r=-0.28, p=0.02) and thoracic kyphosis angles (r=-0.33, p=0.00), although there was no statistically correlation between trunk muscle ratio, lumbar and thoracic curvatures (p>0.05). The study demonstrated that an increase of fat ratio and decrease of muscle ratio in abdominal region or whole body shifts the spinal alignment which may adversely affect the spinal loading. Therefore, whole body composition should be taken into account in spine rehabilitation.

Keywords: body composition, lumbar lordosis, spinal alignment, thoracic kyphosis

Procedia PDF Downloads 369