Search results for: clustering algorithm
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3906

Search results for: clustering algorithm

3666 A Hybrid Fuzzy Clustering Approach for Fertile and Unfertile Analysis

Authors: Shima Soltanzadeh, Mohammad Hosain Fazel Zarandi, Mojtaba Barzegar Astanjin

Abstract:

Diagnosis of male infertility by the laboratory tests is expensive and, sometimes it is intolerable for patients. Filling out the questionnaire and then using classification method can be the first step in decision-making process, so only in the cases with a high probability of infertility we can use the laboratory tests. In this paper, we evaluated the performance of four classification methods including naive Bayesian, neural network, logistic regression and fuzzy c-means clustering as a classification, in the diagnosis of male infertility due to environmental factors. Since the data are unbalanced, the ROC curves are most suitable method for the comparison. In this paper, we also have selected the more important features using a filtering method and examined the impact of this feature reduction on the performance of each methods; generally, most of the methods had better performance after applying the filter. We have showed that using fuzzy c-means clustering as a classification has a good performance according to the ROC curves and its performance is comparable to other classification methods like logistic regression.

Keywords: classification, fuzzy c-means, logistic regression, Naive Bayesian, neural network, ROC curve

Procedia PDF Downloads 304
3665 Spatial-Temporal Clustering Characteristics of Dengue in the Northern Region of Sri Lanka, 2010-2013

Authors: Sumiko Anno, Keiji Imaoka, Takeo Tadono, Tamotsu Igarashi, Subramaniam Sivaganesh, Selvam Kannathasan, Vaithehi Kumaran, Sinnathamby Noble Surendran

Abstract:

Dengue outbreaks are affected by biological, ecological, socio-economic and demographic factors that vary over time and space. These factors have been examined separately and still require systematic clarification. The present study aimed to investigate the spatial-temporal clustering relationships between these factors and dengue outbreaks in the northern region of Sri Lanka. Remote sensing (RS) data gathered from a plurality of satellites were used to develop an index comprising rainfall, humidity and temperature data. RS data gathered by ALOS/AVNIR-2 were used to detect urbanization, and a digital land cover map was used to extract land cover information. Other data on relevant factors and dengue outbreaks were collected through institutions and extant databases. The analyzed RS data and databases were integrated into geographic information systems, enabling temporal analysis, spatial statistical analysis and space-time clustering analysis. Our present results showed that increases in the number of the combination of ecological factor and socio-economic and demographic factors with above the average or the presence contribute to significantly high rates of space-time dengue clusters.

Keywords: ALOS/AVNIR-2, dengue, space-time clustering analysis, Sri Lanka

Procedia PDF Downloads 446
3664 Multi-Subpopulation Genetic Algorithm with Estimation of Distribution Algorithm for Textile Batch Dyeing Scheduling Problem

Authors: Nhat-To Huynh, Chen-Fu Chien

Abstract:

Textile batch dyeing scheduling problem is complicated which includes batch formation, batch assignment on machines, batch sequencing with sequence-dependent setup time. Most manufacturers schedule their orders manually that are time consuming and inefficient. More power methods are needed to improve the solution. Motivated by the real needs, this study aims to propose approaches in which genetic algorithm is developed with multi-subpopulation and hybridised with estimation of distribution algorithm to solve the constructed problem for minimising the makespan. A heuristic algorithm is designed and embedded into the proposed algorithms to improve the ability to get out of the local optima. In addition, an empirical study is conducted in a textile company in Taiwan to validate the proposed approaches. The results have showed that proposed approaches are more efficient than simulated annealing algorithm.

Keywords: estimation of distribution algorithm, genetic algorithm, multi-subpopulation, scheduling, textile dyeing

Procedia PDF Downloads 273
3663 A Rapid Code Acquisition Scheme in OOC-Based CDMA Systems

Authors: Keunhong Chae, Seokho Yoon

Abstract:

We propose a code acquisition scheme called improved multiple-shift (IMS) for optical code division multiple access systems, where the optical orthogonal code is used instead of the pseudo noise code. Although the IMS algorithm has a similar process to that of the conventional MS algorithm, it has a better code acquisition performance than the conventional MS algorithm. We analyze the code acquisition performance of the IMS algorithm and compare the code acquisition performances of the MS and the IMS algorithms in single-user and multi-user environments.

Keywords: code acquisition, optical CDMA, optical orthogonal code, serial algorithm

Procedia PDF Downloads 502
3662 Algorithms for Run-Time Task Mapping in NoC-Based Heterogeneous MPSoCs

Authors: M. K. Benhaoua, A. K. Singh, A. E. Benyamina, P. Boulet

Abstract:

Mapping parallelized tasks of applications onto these MPSoCs can be done either at design time (static) or at run-time (dynamic). Static mapping strategies find the best placement of tasks at design-time, and hence, these are not suitable for dynamic workload and seem incapable of runtime resource management. The number of tasks or applications executing in MPSoC platform can exceed the available resources, requiring efficient run-time mapping strategies to meet these constraints. This paper describes a new Spiral Dynamic Task Mapping heuristic for mapping applications onto NoC-based Heterogeneous MPSoC. This heuristic is based on packing strategy and routing Algorithm proposed also in this paper. Heuristic try to map the tasks of an application in a clustering region to reduce the communication overhead between the communicating tasks. The heuristic proposed in this paper attempts to map the tasks of an application that are most related to each other in a spiral manner and to find the best possible path load that minimizes the communication overhead. In this context, we have realized a simulation environment for experimental evaluations to map applications with varying number of tasks onto an 8x8 NoC-based Heterogeneous MPSoCs platform, we demonstrate that the new mapping heuristics with the new modified dijkstra routing algorithm proposed are capable of reducing the total execution time and energy consumption of applications when compared to state-of-the-art run-time mapping heuristics reported in the literature.

Keywords: multiprocessor system on chip, MPSoC, network on chip, NoC, heterogeneous architectures, run-time mapping heuristics, routing algorithm

Procedia PDF Downloads 451
3661 Efficient Heuristic Algorithm to Speed Up Graphcut in Gpu for Image Stitching

Authors: Tai Nguyen, Minh Bui, Huong Ninh, Tu Nguyen, Hai Tran

Abstract:

GraphCut algorithm has been widely utilized to solve various types of computer vision problems. Its expensive computational cost encouraged many researchers to improve the speed of the algorithm. Recent works proposed schemes that work on parallel computing platforms such as CUDA. However, the problem of low convergence speed prevents the usage of GraphCut for real time applications. In this paper, we propose global suppression heuristic to boost the conver-gence process of the algorithm. A parallel implementation of GraphCut algorithm on CUDA designed for the image stitching problem is introduced. Our method achieves up to 3× time boost on the graph of size 80 × 480 compared to the best sequential GraphCut algorithm while achieving satisfactory stitched images, suitable for panorama applications. Our source code will be soon available for further research.

Keywords: CUDA, graph cut, image stitching, texture synthesis, maxflow/mincut algorithm

Procedia PDF Downloads 91
3660 Uncertainty Quantification of Corrosion Anomaly Length of Oil and Gas Steel Pipelines Based on Inline Inspection and Field Data

Authors: Tammeen Siraj, Wenxing Zhou, Terry Huang, Mohammad Al-Amin

Abstract:

The high resolution inline inspection (ILI) tool is used extensively in the pipeline industry to identify, locate, and measure metal-loss corrosion anomalies on buried oil and gas steel pipelines. Corrosion anomalies may occur singly (i.e. individual anomalies) or as clusters (i.e. a colony of corrosion anomalies). Although the ILI technology has advanced immensely, there are measurement errors associated with the sizes of corrosion anomalies reported by ILI tools due limitations of the tools and associated sizing algorithms, and detection threshold of the tools (i.e. the minimum detectable feature dimension). Quantifying the measurement error in the ILI data is crucial for corrosion management and developing maintenance strategies that satisfy the safety and economic constraints. Studies on the measurement error associated with the length of the corrosion anomalies (in the longitudinal direction of the pipeline) has been scarcely reported in the literature and will be investigated in the present study. Limitations in the ILI tool and clustering process can sometimes cause clustering error, which is defined as the error introduced during the clustering process by including or excluding a single or group of anomalies in or from a cluster. Clustering error has been found to be one of the biggest contributory factors for relatively high uncertainties associated with ILI reported anomaly length. As such, this study focuses on developing a consistent and comprehensive framework to quantify the measurement errors in the ILI-reported anomaly length by comparing the ILI data and corresponding field measurements for individual and clustered corrosion anomalies. The analysis carried out in this study is based on the ILI and field measurement data for a set of anomalies collected from two segments of a buried natural gas pipeline currently in service in Alberta, Canada. Data analyses showed that the measurement error associated with the ILI-reported length of the anomalies without clustering error, denoted as Type I anomalies is markedly less than that for anomalies with clustering error, denoted as Type II anomalies. A methodology employing data mining techniques is further proposed to classify the Type I and Type II anomalies based on the ILI-reported corrosion anomaly information.

Keywords: clustered corrosion anomaly, corrosion anomaly assessment, corrosion anomaly length, individual corrosion anomaly, metal-loss corrosion, oil and gas steel pipeline

Procedia PDF Downloads 282
3659 Agglomerative Hierarchical Clustering Based on Morphmetric Parameters of the Populations of Labeo rohita

Authors: Fayyaz Rasool, Naureen Aziz Qureshi, Shakeela Parveen

Abstract:

Labeo rohita populations from five geographical locations from the hatchery and riverine system of Punjab-Pakistan were studied for the clustering on the basis of similarities and differences based on morphometric parameters within the species. Agglomerative Hierarchical Clustering (AHC) was done by using Pearson Correlation Coefficient and Unweighted Pair Group Method with Arithmetic Mean (UPGMA) as Agglomeration method by XLSTAT 2012 version 1.02. A dendrogram with the data on the morphometrics of the representative samples of each site divided the populations of Labeo rohita in to five major clusters or classes. The variance decomposition for the optimal classification values remained as 19.24% for within class variation, while 80.76% for the between class differences. The representative central objects of the each class, the distances between the class centroids and also the distance between the central objects of the classes were generated by the analysis. A measurable distinction between the classes of the populations of the Labeo rohita was indicated in this study which determined the impacts of changing environment and other possible factors influencing the variation level among the populations of the same species.

Keywords: AHC, Labeo rohita, hatchery, riverine, morphometric

Procedia PDF Downloads 418
3658 Travel Planning in Public Transport Networks Applying the Algorithm A* for Metropolitan District of Quito

Authors: M. Fernanda Salgado, Alfonso Tierra, Wilbert Aguilar

Abstract:

The present project consists in applying the informed search algorithm A star (A*) to solve traveler problems, applying it by urban public transportation routes. The digitization of the information allowed to identify 26% of the total of routes that are registered within the Metropolitan District of Quito. For the validation of this information, data were taken in field on the travel times and the difference with respect to the times estimated by the program, resulting in that the difference between them was not greater than 2:20 minutes. We validate A* algorithm with the Dijkstra algorithm, comparing nodes vectors based on the public transport stops, the validation was established through the student t-test hypothesis. Then we verified that the times estimated by the program using the A* algorithm are similar to those registered on field. Furthermore, we review the performance of the algorithm generating iterations in both algorithms. Finally, with these iterations, a hypothesis test was carried out again with student t-test where it was concluded that the iterations of the base algorithm Dijsktra are greater than those generated by the algorithm A*.

Keywords: algorithm A*, graph, mobility, public transport, travel planning, routes

Procedia PDF Downloads 209
3657 Hyperspectral Data Classification Algorithm Based on the Deep Belief and Self-Organizing Neural Network

Authors: Li Qingjian, Li Ke, He Chun, Huang Yong

Abstract:

In this paper, the method of combining the Pohl Seidman's deep belief network with the self-organizing neural network is proposed to classify the target. This method is mainly aimed at the high nonlinearity of the hyperspectral image, the high sample dimension and the difficulty in designing the classifier. The main feature of original data is extracted by deep belief network. In the process of extracting features, adding known labels samples to fine tune the network, enriching the main characteristics. Then, the extracted feature vectors are classified into the self-organizing neural network. This method can effectively reduce the dimensions of data in the spectrum dimension in the preservation of large amounts of raw data information, to solve the traditional clustering and the long training time when labeled samples less deep learning algorithm for training problems, improve the classification accuracy and robustness. Through the data simulation, the results show that the proposed network structure can get a higher classification precision in the case of a small number of known label samples.

Keywords: DBN, SOM, pattern classification, hyperspectral, data compression

Procedia PDF Downloads 311
3656 Finding the Longest Common Subsequence in Normal DNA and Disease Affected Human DNA Using Self Organizing Map

Authors: G. Tamilpavai, C. Vishnuppriya

Abstract:

Bioinformatics is an active research area which combines biological matter as well as computer science research. The longest common subsequence (LCSS) is one of the major challenges in various bioinformatics applications. The computation of the LCSS plays a vital role in biomedicine and also it is an essential task in DNA sequence analysis in genetics. It includes wide range of disease diagnosing steps. The objective of this proposed system is to find the longest common subsequence which presents in a normal and various disease affected human DNA sequence using Self Organizing Map (SOM) and LCSS. The human DNA sequence is collected from National Center for Biotechnology Information (NCBI) database. Initially, the human DNA sequence is separated as k-mer using k-mer separation rule. Mean and median values are calculated from each separated k-mer. These calculated values are fed as input to the Self Organizing Map for the purpose of clustering. Then obtained clusters are given to the Longest Common Sub Sequence (LCSS) algorithm for finding common subsequence which presents in every clusters. It returns nx(n-1)/2 subsequence for each cluster where n is number of k-mer in a specific cluster. Experimental outcomes of this proposed system produce the possible number of longest common subsequence of normal and disease affected DNA data. Thus the proposed system will be a good initiative aid for finding disease causing sequence. Finally, performance analysis is carried out for different DNA sequences. The obtained values show that the retrieval of LCSS is done in a shorter time than the existing system.

Keywords: clustering, k-mers, longest common subsequence, SOM

Procedia PDF Downloads 231
3655 Enhanced Cluster Based Connectivity Maintenance in Vehicular Ad Hoc Network

Authors: Manverpreet Kaur, Amarpreet Singh

Abstract:

The demand of Vehicular ad hoc networks is increasing day by day, due to offering the various applications and marvelous benefits to VANET users. Clustering in VANETs is most important to overcome the connectivity problems of VANETs. In this paper, we proposed a new clustering technique Enhanced cluster based connectivity maintenance in vehicular ad hoc network. Our objective is to form long living clusters. The proposed approach is grouping the vehicles, on the basis of the longest list of neighbors to form clusters. The cluster formation and cluster head selection process done by the RSU that may results it reduces the chances of overhead on to the network. The cluster head selection procedure is the vehicle which has closest speed to average speed will elect as a cluster Head by the RSU and if two vehicles have same speed which is closest to average speed then they will be calculate by one of the new parameter i.e. distance to their respective destination. The vehicle which has largest distance to their destination will be choosing as a cluster Head by the RSU. Our simulation outcomes show that our technique performs better than the existing technique.

Keywords: VANETs, clustering, connectivity, cluster head, intelligent transportation system (ITS)

Procedia PDF Downloads 209
3654 Assessment of Memetic and Genetic Algorithm for a Flexible Integrated Logistics Network

Authors: E. Behmanesh, J. Pannek

Abstract:

The distribution-allocation problem is known as one of the most comprehensive strategic decision. In real-world cases, it is impossible to solve a distribution-allocation problem in traditional ways with acceptable time. Hence researchers develop efficient non-traditional techniques for the large-term operation of the whole supply chain. These techniques provide near-optimal solutions particularly for large scales test problems. This paper, presents an integrated supply chain model which is flexible in the delivery path. As the solution methodology, we apply a memetic algorithm with a novelty in population presentation. To illustrate the performance of the proposed memetic algorithm, LINGO optimization software serves as a comparison basis for small size problems. In large size cases that we are dealing with in the real world, the Genetic algorithm as the second metaheuristic algorithm is considered to compare the results and show the efficiency of the memetic algorithm.

Keywords: integrated logistics network, flexible path, memetic algorithm, genetic algorithm

Procedia PDF Downloads 346
3653 A Speeded up Robust Scale-Invariant Feature Transform Currency Recognition Algorithm

Authors: Daliyah S. Aljutaili, Redna A. Almutlaq, Suha A. Alharbi, Dina M. Ibrahim

Abstract:

All currencies around the world look very different from each other. For instance, the size, color, and pattern of the paper are different. With the development of modern banking services, automatic methods for paper currency recognition become important in many applications like vending machines. One of the currency recognition architecture’s phases is Feature detection and description. There are many algorithms that are used for this phase, but they still have some disadvantages. This paper proposes a feature detection algorithm, which merges the advantages given in the current SIFT and SURF algorithms, which we call, Speeded up Robust Scale-Invariant Feature Transform (SR-SIFT) algorithm. Our proposed SR-SIFT algorithm overcomes the problems of both the SIFT and SURF algorithms. The proposed algorithm aims to speed up the SIFT feature detection algorithm and keep it robust. Simulation results demonstrate that the proposed SR-SIFT algorithm decreases the average response time, especially in small and minimum number of best key points, increases the distribution of the number of best key points on the surface of the currency. Furthermore, the proposed algorithm increases the accuracy of the true best point distribution inside the currency edge than the other two algorithms.

Keywords: currency recognition, feature detection and description, SIFT algorithm, SURF algorithm, speeded up and robust features

Procedia PDF Downloads 208
3652 Features Reduction Using Bat Algorithm for Identification and Recognition of Parkinson Disease

Authors: P. Shrivastava, A. Shukla, K. Verma, S. Rungta

Abstract:

Parkinson's disease is a chronic neurological disorder that directly affects human gait. It leads to slowness of movement, causes muscle rigidity and tremors. Gait serve as a primary outcome measure for studies aiming at early recognition of disease. Using gait techniques, this paper implements efficient binary bat algorithm for an early detection of Parkinson's disease by selecting optimal features required for classification of affected patients from others. The data of 166 people, both fit and affected is collected and optimal feature selection is done using PSO and Bat algorithm. The reduced dataset is then classified using neural network. The experiments indicate that binary bat algorithm outperforms traditional PSO and genetic algorithm and gives a fairly good recognition rate even with the reduced dataset.

Keywords: parkinson, gait, feature selection, bat algorithm

Procedia PDF Downloads 512
3651 Performance Analysis of Deterministic Stable Election Protocol Using Fuzzy Logic in Wireless Sensor Network

Authors: Sumanpreet Kaur, Harjit Pal Singh, Vikas Khullar

Abstract:

In Wireless Sensor Network (WSN), the sensor containing motes (nodes) incorporate batteries that can lament at some extent. To upgrade the energy utilization, clustering is one of the prototypical approaches for split sensor motes into a number of clusters where one mote (also called as node) proceeds as a Cluster Head (CH). CH selection is one of the optimization techniques for enlarging stability and network lifespan. Deterministic Stable Election Protocol (DSEP) is an effectual clustering protocol that makes use of three kinds of nodes with dissimilar residual energy for CH election. Fuzzy Logic technology is used to expand energy level of DSEP protocol by using fuzzy inference system. This paper presents protocol DSEP using Fuzzy Logic (DSEP-FL) CH by taking into account four linguistic variables such as energy, concentration, centrality and distance to base station. Simulation results show that our proposed method gives more effective results in term of a lifespan of network and stability as compared to the performance of other clustering protocols.

Keywords: DSEP, fuzzy logic, energy model, WSN

Procedia PDF Downloads 170
3650 Selecting the Best RBF Neural Network Using PSO Algorithm for ECG Signal Prediction

Authors: Najmeh Mohsenifar, Narjes Mohsenifar, Abbas Kargar

Abstract:

In this paper, has been presented a stable method for predicting the ECG signals through the RBF neural networks, by the PSO algorithm. In spite of quasi-periodic ECG signal from a healthy person, there are distortions in electro cardiographic data for a patient. Therefore, there is no precise mathematical model for prediction. Here, we have exploited neural networks that are capable of complicated nonlinear mapping. Although the architecture and spread of RBF networks are usually selected through trial and error, the PSO algorithm has been used for choosing the best neural network. In this way, 2 second of a recorded ECG signal is employed to predict duration of 20 second in advance. Our simulations show that PSO algorithm can find the RBF neural network with minimum MSE and the accuracy of the predicted ECG signal is 97 %.

Keywords: electrocardiogram, RBF artificial neural network, PSO algorithm, predict, accuracy

Procedia PDF Downloads 591
3649 Application of a New Efficient Normal Parameter Reduction Algorithm of Soft Sets in Online Shopping

Authors: Xiuqin Ma, Hongwu Qin

Abstract:

A new efficient normal parameter reduction algorithm of soft set in decision making was proposed. However, up to the present, few documents have focused on real-life applications of this algorithm. Accordingly, we apply a New Efficient Normal Parameter Reduction algorithm into real-life datasets of online shopping, such as Blackberry Mobile Phone Dataset. Experimental results show that this algorithm is not only suitable but feasible for dealing with the online shopping.

Keywords: soft sets, parameter reduction, normal parameter reduction, online shopping

Procedia PDF Downloads 485
3648 Web Proxy Detection via Bipartite Graphs and One-Mode Projections

Authors: Zhipeng Chen, Peng Zhang, Qingyun Liu, Li Guo

Abstract:

With the Internet becoming the dominant channel for business and life, many IPs are increasingly masked using web proxies for illegal purposes such as propagating malware, impersonate phishing pages to steal sensitive data or redirect victims to other malicious targets. Moreover, as Internet traffic continues to grow in size and complexity, it has become an increasingly challenging task to detect the proxy service due to their dynamic update and high anonymity. In this paper, we present an approach based on behavioral graph analysis to study the behavior similarity of web proxy users. Specifically, we use bipartite graphs to model host communications from network traffic and build one-mode projections of bipartite graphs for discovering social-behavior similarity of web proxy users. Based on the similarity matrices of end-users from the derived one-mode projection graphs, we apply a simple yet effective spectral clustering algorithm to discover the inherent web proxy users behavior clusters. The web proxy URL may vary from time to time. Still, the inherent interest would not. So, based on the intuition, by dint of our private tools implemented by WebDriver, we examine whether the top URLs visited by the web proxy users are web proxies. Our experiment results based on real datasets show that the behavior clusters not only reduce the number of URLs analysis but also provide an effective way to detect the web proxies, especially for the unknown web proxies.

Keywords: bipartite graph, one-mode projection, clustering, web proxy detection

Procedia PDF Downloads 220
3647 A Mixture Vine Copula Structures Model for Dependence Wind Speed among Wind Farms and Its Application in Reactive Power Optimization

Authors: Yibin Qiu, Yubo Ouyang, Shihan Li, Guorui Zhang, Qi Li, Weirong Chen

Abstract:

This paper aims at exploring the impacts of high dimensional dependencies of wind speed among wind farms on probabilistic optimal power flow. To obtain the reactive power optimization faster and more accurately, a mixture vine Copula structure model combining the K-means clustering, C vine copula and D vine copula is proposed in this paper, through which a more accurate correlation model can be obtained. Moreover, a Modified Backtracking Search Algorithm (MBSA), the three-point estimate method is applied to probabilistic optimal power flow. The validity of the mixture vine copula structure model and the MBSA are respectively tested in IEEE30 node system with measured data of 3 adjacent wind farms in a certain area, and the results indicate effectiveness of these methods.

Keywords: mixture vine copula structure model, three-point estimate method, the probability integral transform, modified backtracking search algorithm, reactive power optimization

Procedia PDF Downloads 226
3646 Discretization of Cuckoo Optimization Algorithm for Solving Quadratic Assignment Problems

Authors: Elham Kazemi

Abstract:

Quadratic Assignment Problem (QAP) is one the combinatorial optimization problems about which research has been done in many companies for allocating some facilities to some locations. The issue of particular importance in this process is the costs of this allocation and the attempt in this problem is to minimize this group of costs. Since the QAP’s are from NP-hard problem, they cannot be solved by exact solution methods. Cuckoo Optimization Algorithm is a Meta-heuristicmethod which has higher capability to find the global optimal points. It is an algorithm which is basically raised to search a continuous space. The Quadratic Assignment Problem is the issue which can be solved in the discrete space, thus the standard arithmetic operators of Cuckoo Optimization Algorithm need to be redefined on the discrete space in order to apply the Cuckoo Optimization Algorithm on the discrete searching space. This paper represents the way of discretizing the Cuckoo optimization algorithm for solving the quadratic assignment problem.

Keywords: Quadratic Assignment Problem (QAP), Discrete Cuckoo Optimization Algorithm (DCOA), meta-heuristic algorithms, optimization algorithms

Procedia PDF Downloads 482
3645 A Concept of Data Mining with XML Document

Authors: Akshay Agrawal, Anand K. Srivastava

Abstract:

The increasing amount of XML datasets available to casual users increases the necessity of investigating techniques to extract knowledge from these data. Data mining is widely applied in the database research area in order to extract frequent correlations of values from both structured and semi-structured datasets. The increasing availability of heterogeneous XML sources has raised a number of issues concerning how to represent and manage these semi structured data. In recent years due to the importance of managing these resources and extracting knowledge from them, lots of methods have been proposed in order to represent and cluster them in different ways.

Keywords: XML, similarity measure, clustering, cluster quality, semantic clustering

Procedia PDF Downloads 345
3644 Analysis of Production Forecasting in Unconventional Gas Resources Development Using Machine Learning and Data-Driven Approach

Authors: Dongkwon Han, Sangho Kim, Sunil Kwon

Abstract:

Unconventional gas resources have dramatically changed the future energy landscape. Unlike conventional gas resources, the key challenges in unconventional gas have been the requirement that applies to advanced approaches for production forecasting due to uncertainty and complexity of fluid flow. In this study, artificial neural network (ANN) model which integrates machine learning and data-driven approach was developed to predict productivity in shale gas. The database of 129 wells of Eagle Ford shale basin used for testing and training of the ANN model. The Input data related to hydraulic fracturing, well completion and productivity of shale gas were selected and the output data is a cumulative production. The performance of the ANN using all data sets, clustering and variables importance (VI) models were compared in the mean absolute percentage error (MAPE). ANN model using all data sets, clustering, and VI were obtained as 44.22%, 10.08% (cluster 1), 5.26% (cluster 2), 6.35%(cluster 3), and 32.23% (ANN VI), 23.19% (SVM VI), respectively. The results showed that the pre-trained ANN model provides more accurate results than the ANN model using all data sets.

Keywords: unconventional gas, artificial neural network, machine learning, clustering, variables importance

Procedia PDF Downloads 172
3643 Constant Factor Approximation Algorithm for p-Median Network Design Problem with Multiple Cable Types

Authors: Chaghoub Soraya, Zhang Xiaoyan

Abstract:

This research presents the first constant approximation algorithm to the p-median network design problem with multiple cable types. This problem was addressed with a single cable type and there is a bifactor approximation algorithm for the problem. To the best of our knowledge, the algorithm proposed in this paper is the first constant approximation algorithm for the p-median network design with multiple cable types. The addressed problem is a combination of two well studied problems which are p-median problem and network design problem. The introduced algorithm is a random sampling approximation algorithm of constant factor which is conceived by using some random sampling techniques form the literature. It is based on a redistribution Lemma from the literature and a steiner tree problem as a subproblem. This algorithm is simple, and it relies on the notions of random sampling and probability. The proposed approach gives an approximation solution with one constant ratio without violating any of the constraints, in contrast to the one proposed in the literature. This paper provides a (21 + 2)-approximation algorithm for the p-median network design problem with multiple cable types using random sampling techniques.

Keywords: approximation algorithms, buy-at-bulk, combinatorial optimization, network design, p-median

Procedia PDF Downloads 165
3642 Design of a Fuzzy Luenberger Observer for Fault Nonlinear System

Authors: Mounir Bekaik, Messaoud Ramdani

Abstract:

We present in this work a new technique of stabilization for fault nonlinear systems. The approach we adopt focus on a fuzzy Luenverger observer. The T-S approximation of the nonlinear observer is based on fuzzy C-Means clustering algorithm to find local linear subsystems. The MOESP identification approach was applied to design an empirical model describing the subsystems state variables. The gain of the observer is given by the minimization of the estimation error through Lyapunov-krasovskii functional and LMI approach. We consider a three tank hydraulic system for an illustrative example.

Keywords: nonlinear system, fuzzy, faults, TS, Lyapunov-Krasovskii, observer

Procedia PDF Downloads 301
3641 An Optimized Association Rule Mining Algorithm

Authors: Archana Singh, Jyoti Agarwal, Ajay Rana

Abstract:

Data Mining is an efficient technology to discover patterns in large databases. Association Rule Mining techniques are used to find the correlation between the various item sets in a database, and this co-relation between various item sets are used in decision making and pattern analysis. In recent years, the problem of finding association rules from large datasets has been proposed by many researchers. Various research papers on association rule mining (ARM) are studied and analyzed first to understand the existing algorithms. Apriori algorithm is the basic ARM algorithm, but it requires so many database scans. In DIC algorithm, less amount of database scan is needed but complex data structure lattice is used. The main focus of this paper is to propose a new optimized algorithm (Friendly Algorithm) and compare its performance with the existing algorithms A data set is used to find out frequent itemsets and association rules with the help of existing and proposed (Friendly Algorithm) and it has been observed that the proposed algorithm also finds all the frequent itemsets and essential association rules from databases as compared to existing algorithms in less amount of database scan. In the proposed algorithm, an optimized data structure is used i.e. Graph and Adjacency Matrix.

Keywords: association rules, data mining, dynamic item set counting, FP-growth, friendly algorithm, graph

Procedia PDF Downloads 389
3640 An Empirical Study to Predict Myocardial Infarction Using K-Means and Hierarchical Clustering

Authors: Md. Minhazul Islam, Shah Ashisul Abed Nipun, Majharul Islam, Md. Abdur Rakib Rahat, Jonayet Miah, Salsavil Kayyum, Anwar Shadaab, Faiz Al Faisal

Abstract:

The target of this research is to predict Myocardial Infarction using unsupervised Machine Learning algorithms. Myocardial Infarction Prediction related to heart disease is a challenging factor faced by doctors & hospitals. In this prediction, accuracy of the heart disease plays a vital role. From this concern, the authors have analyzed on a myocardial dataset to predict myocardial infarction using some popular Machine Learning algorithms K-Means and Hierarchical Clustering. This research includes a collection of data and the classification of data using Machine Learning Algorithms. The authors collected 345 instances along with 26 attributes from different hospitals in Bangladesh. This data have been collected from patients suffering from myocardial infarction along with other symptoms. This model would be able to find and mine hidden facts from historical Myocardial Infarction cases. The aim of this study is to analyze the accuracy level to predict Myocardial Infarction by using Machine Learning techniques.

Keywords: Machine Learning, K-means, Hierarchical Clustering, Myocardial Infarction, Heart Disease

Procedia PDF Downloads 174
3639 A Genetic Algorithm to Schedule the Flow Shop Problem under Preventive Maintenance Activities

Authors: J. Kaabi, Y. Harrath

Abstract:

This paper studied the flow shop scheduling problem under machine availability constraints. The machines are subject to flexible preventive maintenance activities. The nonresumable scenario for the jobs was considered. That is, when a job is interrupted by an unavailability period of a machine it should be restarted from the beginning. The objective is to minimize the total tardiness time for the jobs and the advance/tardiness for the maintenance activities. To solve the problem, a genetic algorithm was developed and successfully tested and validated on many problem instances. The computational results showed that the new genetic algorithm outperforms another earlier proposed algorithm.

Keywords: flow shop scheduling, genetic algorithm, maintenance, priority rules

Procedia PDF Downloads 447
3638 Memetic Algorithm for Solving the One-To-One Shortest Path Problem

Authors: Omar Dib, Alexandre Caminada, Marie-Ange Manier

Abstract:

The purpose of this study is to introduce a novel approach to solve the one-to-one shortest path problem. A directed connected graph is assumed in which all edges’ weights are positive. Our method is based on a memetic algorithm in which we combine a genetic algorithm (GA) and a variable neighborhood search method (VNS). We compare our approximate method with two exact algorithms Dijkstra and Integer Programming (IP). We made experimentations using random generated, complete and real graph instances. In most case studies, numerical results show that our method outperforms exact methods with 5% average gap to the optimality. Our algorithm’s average speed is 20-times faster than Dijkstra and more than 1000-times compared to IP. The details of the experimental results are also discussed and presented in the paper.

Keywords: shortest path problem, Dijkstra’s algorithm, integer programming, memetic algorithm

Procedia PDF Downloads 436
3637 Constructing the Density of States from the Parallel Wang Landau Algorithm Overlapping Data

Authors: Arman S. Kussainov, Altynbek K. Beisekov

Abstract:

This work focuses on building an efficient universal procedure to construct a single density of states from the multiple pieces of data provided by the parallel implementation of the Wang Landau Monte Carlo based algorithm. The Ising and Pott models were used as the examples of the two-dimensional spin lattices to construct their densities of states. Sampled energy space was distributed between the individual walkers with certain overlaps. This was made to include the latest development of the algorithm as the density of states replica exchange technique. Several factors of immediate importance for the seamless stitching process have being considered. These include but not limited to the speed and universality of the initial parallel algorithm implementation as well as the data post-processing to produce the expected smooth density of states.

Keywords: density of states, Monte Carlo, parallel algorithm, Wang Landau algorithm

Procedia PDF Downloads 367