Search results for: Constraint Based Mining

11161 Classification of Political Affiliations by Reduced Number of Features

Abstract:

By the evolvement in technology, the way of expressing opinions switched direction to the digital world. The domain of politics, as one of the hottest topics of opinion mining research, merged together with the behavior analysis for affiliation determination in texts, which constitutes the subject of this paper. This study aims to classify the text in news/blogs either as Republican or Democrat with the minimum number of features. As an initial set, 68 features which 64 were constituted by Linguistic Inquiry and Word Count (LIWC) features were tested against 14 benchmark classification algorithms. In the later experiments, the dimensions of the feature vector reduced based on the 7 feature selection algorithms. The results show that the “Decision Tree”, “Rule Induction” and “M5 Rule” classifiers when used with “SVM” and “IGR” feature selection algorithms performed the best up to 82.5% accuracy on a given dataset. Further tests on a single feature and the linguistic based feature sets showed the similar results. The feature “Function”, as an aggregate feature of the linguistic category, was found as the most differentiating feature among the 68 features with the accuracy of 81% in classifying articles either as Republican or Democrat.

Keywords: Politics, machine learning, feature selection, LIWC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2338

11160 A New Evolutionary Algorithm for Cluster Analysis

Authors: B.Bahmani Firouzi, T. Niknam, M. Nayeripour

Abstract:

Clustering is a very well known technique in data mining. One of the most widely used clustering techniques is the kmeans algorithm. Solutions obtained from this technique depend on the initialization of cluster centers and the final solution converges to local minima. In order to overcome K-means algorithm shortcomings, this paper proposes a hybrid evolutionary algorithm based on the combination of PSO, SA and K-means algorithms, called PSO-SA-K, which can find better cluster partition. The performance is evaluated through several benchmark data sets. The simulation results show that the proposed algorithm outperforms previous approaches, such as PSO, SA and K-means for partitional clustering problem.

Keywords: Data clustering, Hybrid evolutionary optimization algorithm, K-means algorithm, Simulated Annealing (SA), Particle Swarm Optimization (PSO).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2254

11159 An Alternative Proof for the Topological Entropy of the Motzkin Shift

Authors: Fahad Alsharari, Mohd Salmi Md Noorani

Abstract:

A Motzkin shift is a mathematical model for constraints on genetic sequences. In terms of the theory of symbolic dynamics, the Motzkin shift is nonsofic, and therefore, we cannot use the Perron- Frobenius theory to calculate its topological entropy. The Motzkin shift M(M,N) which comes from language theory, is defined to be the shift system over an alphabet A that consists of N negative symbols, N positive symbols and M neutral symbols. For an x in the full shift, x will be in the Motzkin subshift M(M,N) if and only if every finite block appearing in x has a non-zero reduced form. Therefore, the constraint for x cannot be bounded in length. K. Inoue has shown that the entropy of the Motzkin shift M(M,N) is log(M + N + 1). In this paper, a new direct method of calculating the topological entropy of the Motzkin shift is given without any measure theoretical discussion.

Keywords: Motzkin shift, topological entropy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1987

11158 The Development of the Multi-Agent Classification System (MACS) in Compliance with FIPA Specifications

Authors: Mohamed R. Mhereeg

Abstract:

The paper investigates the feasibility of constructing a software multi-agent based monitoring and classification system and utilizing it to provide an automated and accurate classification of end users developing applications in the spreadsheet domain. The agents function autonomously to provide continuous and periodic monitoring of excels spreadsheet workbooks. Resulting in, the development of the MultiAgent classification System (MACS) that is in compliance with the specifications of the Foundation for Intelligent Physical Agents (FIPA). However, different technologies have been brought together to build MACS. The strength of the system is the integration of the agent technology with the FIPA specifications together with other technologies that are Windows Communication Foundation (WCF) services, Service Oriented Architecture (SOA), and Oracle Data Mining (ODM). The Microsoft's .NET widows service based agents were utilized to develop the monitoring agents of MACS, the .NET WCF services together with SOA approach allowed the distribution and communication between agents over the WWW that is in order to satisfy the monitoring and classification of the multiple developer aspect. ODM was used to automate the classification phase of MACS.

Keywords: Autonomous, Classification, MACS, Multi-Agent, SOA, WCF.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1560

11157 Robust Batch Process Scheduling in Pharmaceutical Industries: A Case Study

Authors: Tommaso Adamo, Gianpaolo Ghiani, Antonio D. Grieco, Emanuela Guerriero

Abstract:

Batch production plants provide a wide range of scheduling problems. In pharmaceutical industries a batch process is usually described by a recipe, consisting of an ordering of tasks to produce the desired product. In this research work we focused on pharmaceutical production processes requiring the culture of a microorganism population (i.e. bacteria, yeasts or antibiotics). Several sources of uncertainty may influence the yield of the culture processes, including (i) low performance and quality of the cultured microorganism population or (ii) microbial contamination. For these reasons, robustness is a valuable property for the considered application context. In particular, a robust schedule will not collapse immediately when a cell of microorganisms has to be thrown away due to a microbial contamination. Indeed, a robust schedule should change locally in small proportions and the overall performance measure (i.e. makespan, lateness) should change a little if at all. In this research work we formulated a constraint programming optimization (COP) model for the robust planning of antibiotics production. We developed a discrete-time model with a multi-criteria objective, ordering the different criteria and performing a lexicographic optimization. A feasible solution of the proposed COP model is a schedule of a given set of tasks onto available resources. The schedule has to satisfy tasks precedence constraints, resource capacity constraints and time constraints. In particular time constraints model tasks duedates and resource availability time windows constraints. To improve the schedule robustness, we modeled the concept of (a, b) super-solutions, where (a, b) are input parameters of the COP model. An (a, b) super-solution is one in which if a variables (i.e. the completion times of a culture tasks) lose their values (i.e. cultures are contaminated), the solution can be repaired by assigning these variables values with a new values (i.e. the completion times of a backup culture tasks) and at most b other variables (i.e. delaying the completion of at most b other tasks). The efficiency and applicability of the proposed model is demonstrated by solving instances taken from a real-life pharmaceutical company. Computational results showed that the determined super-solutions are near-optimal.

Keywords: Constraint programming, super-solutions, robust scheduling, batch process, pharmaceutical industries.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1950

11156 Annual Power Load Forecasting Using Support Vector Regression Machines: A Study on Guangdong Province of China 1985-2008

Authors: Zhiyong Li, Zhigang Chen, Chao Fu, Shipeng Zhang

Abstract:

Load forecasting has always been the essential part of an efficient power system operation and planning. A novel approach based on support vector machines is proposed in this paper for annual power load forecasting. Different kernel functions are selected to construct a combinatorial algorithm. The performance of the new model is evaluated with a real-world dataset, and compared with two neural networks and some traditional forecasting techniques. The results show that the proposed method exhibits superior performance.

Keywords: combinatorial algorithm, data mining, load forecasting, support vector machines

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1618

11155 Educational Data Mining: The Case of Department of Mathematics and Computing in the Period 2009-2018

Authors: M. Sitoe, O. Zacarias

Abstract:

University education is influenced by several factors that range from the adoption of strategies to strengthen the whole process to the academic performance improvement of the students themselves. This work uses data mining techniques to develop a predictive model to identify students with a tendency to evasion and retention. To this end, a database of real students’ data from the Department of University Admission (DAU) and the Department of Mathematics and Informatics (DMI) was used. The data comprised 388 undergraduate students admitted in the years 2009 to 2014. The Weka tool was used for model building, using three different techniques, namely: K-nearest neighbor, random forest, and logistic regression. To allow for training on multiple train-test splits, a cross-validation approach was employed with a varying number of folds. To reduce bias variance and improve the performance of the models, ensemble methods of Bagging and Stacking were used. After comparing the results obtained by the three classifiers, Logistic Regression using Bagging with seven folds obtained the best performance, showing results above 90% in all evaluated metrics: accuracy, rate of true positives, and precision. Retention is the most common tendency.

Keywords: Evasion and retention, cross validation, bagging, stacking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 73

11154 Iterative Clustering Algorithm for Analyzing Temporal Patterns of Gene Expression

Authors: Seo Young Kim, Jae Won Lee, Jong Sung Bae

Abstract:

Microarray experiments are information rich; however, extensive data mining is required to identify the patterns that characterize the underlying mechanisms of action. For biologists, a key aim when analyzing microarray data is to group genes based on the temporal patterns of their expression levels. In this paper, we used an iterative clustering method to find temporal patterns of gene expression. We evaluated the performance of this method by applying it to real sporulation data and simulated data. The patterns obtained using the iterative clustering were found to be superior to those obtained using existing clustering algorithms.

Keywords: Clustering, microarray experiment, temporal pattern of gene expression data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1334

11153 Open Problems on Zeros of Analytic Functions in Finite Quantum Systems

Authors: Muna Tabuni

Abstract:

The paper contains an investigation on basic problems about the zeros of analytic theta functions. A brief introduction to analytic representation of finite quantum systems is given. The zeros of this function and there evolution time are discussed. Two open problems are introduced. The first problem discusses the cases when the zeros follow the same path. As the basis change the quantum state |f transforms into different quantum state. The second problem is to define a map between two toruses where the domain and the range of this map are the analytic functions on toruses.

Keywords: open problems, constraint, change of basis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1522

11152 Periodic Topology and Size Optimization Design of Tower Crane Boom

Authors: Wu Qinglong, Zhou Qicai, Xiong Xiaolei, Zhang Richeng

Abstract:

In order to achieve the layout and size optimization of the web members of tower crane boom, a truss topology and cross section size optimization method based on continuum is proposed considering three typical working conditions. Firstly, the optimization model is established by replacing web members with web plates. And the web plates are divided into several sub-domains so that periodic soft kill option (SKO) method can be carried out for topology optimization of the slender boom. After getting the optimized topology of web plates, the optimized layout of web members is formed through extracting the principal stress distribution. Finally, using the web member radius as design variable, the boom compliance as objective and the material volume of the boom as constraint, the cross section size optimization mathematical model is established. The size optimization criterion is deduced from the mathematical model by Lagrange multiplier method and Kuhn-Tucker condition. By comparing the original boom with the optimal boom, it is identified that this optimization method can effectively lighten the boom and improve its performance.

Keywords: Tower crane boom, topology optimization, size optimization, periodic, soft kill option, optimization criterion.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1321

11151 A Text Clustering System based on k-means Type Subspace Clustering and Ontology

Authors: Liping Jing, Michael K. Ng, Xinhua Yang, Joshua Zhexue Huang

Abstract:

This paper presents a text clustering system developed based on a k-means type subspace clustering algorithm to cluster large, high dimensional and sparse text data. In this algorithm, a new step is added in the k-means clustering process to automatically calculate the weights of keywords in each cluster so that the important words of a cluster can be identified by the weight values. For understanding and interpretation of clustering results, a few keywords that can best represent the semantic topic are extracted from each cluster. Two methods are used to extract the representative words. The candidate words are first selected according to their weights calculated by our new algorithm. Then, the candidates are fed to the WordNet to identify the set of noun words and consolidate the synonymy and hyponymy words. Experimental results have shown that the clustering algorithm is superior to the other subspace clustering algorithms, such as PROCLUS and HARP and kmeans type algorithm, e.g., Bisecting-KMeans. Furthermore, the word extraction method is effective in selection of the words to represent the topics of the clusters.

Keywords: Subspace Clustering, Text Mining, Feature Weighting, Cluster Interpretation, Ontology

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2428

11150 The Effect of Symmetry on the Perception of Happiness and Boredom in Design Products

Authors: Michele Sinico

Abstract:

The present research investigates the effect of symmetry on the perception of happiness and boredom in design products. Three experiments were carried out in order to verify the degree of the visual expressive value on different models of bookcases, wall clocks, and chairs. 60 participants directly indicated the degree of happiness and boredom using 7-point rating scales. The findings show that the participants acknowledged a different value of expressive quality in the different product models. Results show also that symmetry is not a significant constraint for an emotional design project.

Keywords: Product experience, emotional design, symmetry, expressive qualities.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 782

11149 Fuzzy Relatives of the CLARANS Algorithm With Application to Text Clustering

Authors: Mohamed A. Mahfouz, M. A. Ismail

Abstract:

This paper introduces new algorithms (Fuzzy relative of the CLARANS algorithm FCLARANS and Fuzzy c Medoids based on randomized search FCMRANS) for fuzzy clustering of relational data. Unlike existing fuzzy c-medoids algorithm (FCMdd) in which the within cluster dissimilarity of each cluster is minimized in each iteration by recomputing new medoids given current memberships, FCLARANS minimizes the same objective function minimized by FCMdd by changing current medoids in such away that that the sum of the within cluster dissimilarities is minimized. Computing new medoids may be effected by noise because outliers may join the computation of medoids while the choice of medoids in FCLARANS is dictated by the location of a predominant fraction of points inside a cluster and, therefore, it is less sensitive to the presence of outliers. In FCMRANS the step of computing new medoids in FCMdd is modified to be based on randomized search. Furthermore, a new initialization procedure is developed that add randomness to the initialization procedure used with FCMdd. Both FCLARANS and FCMRANS are compared with the robust and linearized version of fuzzy c-medoids (RFCMdd). Experimental results with different samples of the Reuter-21578, Newsgroups (20NG) and generated datasets with noise show that FCLARANS is more robust than both RFCMdd and FCMRANS. Finally, both FCMRANS and FCLARANS are more efficient and their outputs are almost the same as that of RFCMdd in terms of classification rate.

Keywords: Data Mining, Fuzzy Clustering, Relational Clustering, Medoid-Based Clustering, Cluster Analysis, Unsupervised Learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2377

11148 An Optimal Algorithm for Finding (r, Q) Policy in a Price-Dependent Order Quantity Inventory System with Soft Budget Constraint

Authors: S. Hamid Mirmohammadi, Shahrazad Tamjidzad

Abstract:

This paper is concerned with the single-item continuous review inventory system in which demand is stochastic and discrete. The budget consumed for purchasing the ordered items is not restricted but it incurs extra cost when exceeding specific value. The unit purchasing price depends on the quantity ordered under the all-units discounts cost structure. In many actual systems, the budget as a resource which is occupied by the purchased items is limited and the system is able to confront the resource shortage by charging more costs. Thus, considering the resource shortage costs as a part of system costs, especially when the amount of resource occupied by the purchased item is influenced by quantity discounts, is well motivated by practical concerns. In this paper, an optimization problem is formulated for finding the optimal (r, Q) policy, when the system is influenced by the budget limitation and a discount pricing simultaneously. Properties of the cost function are investigated and then an algorithm based on a one-dimensional search procedure is proposed for finding an optimal (r, Q) policy which minimizes the expected system costs.

Keywords: (r, Q) policy, Stochastic demand, backorders, limited resource, quantity discounts.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1840

11147 Intelligent Recognition of Diabetes Disease via FCM Based Attribute Weighting

Authors: Kemal Polat

Abstract:

In this paper, an attribute weighting method called fuzzy C-means clustering based attribute weighting (FCMAW) for classification of Diabetes disease dataset has been used. The aims of this study are to reduce the variance within attributes of diabetes dataset and to improve the classification accuracy of classifier algorithm transforming from non-linear separable datasets to linearly separable datasets. Pima Indians Diabetes dataset has two classes including normal subjects (500 instances) and diabetes subjects (268 instances). Fuzzy C-means clustering is an improved version of K-means clustering method and is one of most used clustering methods in data mining and machine learning applications. In this study, as the first stage, fuzzy C-means clustering process has been used for finding the centers of attributes in Pima Indians diabetes dataset and then weighted the dataset according to the ratios of the means of attributes to centers of theirs. Secondly, after weighting process, the classifier algorithms including support vector machine (SVM) and k-NN (k- nearest neighbor) classifiers have been used for classifying weighted Pima Indians diabetes dataset. Experimental results show that the proposed attribute weighting method (FCMAW) has obtained very promising results in the classification of Pima Indians diabetes dataset.

Keywords: Fuzzy C-means clustering, Fuzzy C-means clustering based attribute weighting, Pima Indians diabetes dataset, SVM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1740

11146 Numerical Study on Parametrical Design of Long Shrouded Contra-Rotating Propulsion System in Hovering

Authors: Chao. Huo, Roger. Barènes, Jérémie. Gressier, Gilles.Grondin

Abstract:

The parametrical study of Shrouded Contra-rotating Rotor was done in this paper based on 2D axisymmetric simulations. The calculations were made with an actuator disk as double rotor model. It objects to explore and quantify the effects of different shroud geometry parameters mainly using the performance of power loading (PL), which could evaluate the whole propulsion system capability as 5 Newtontotal thrust generationfor hover demand. The numerical results show that:The increase of nozzle radius is desired but limited by the flow separation, its optimal design is around 1.15 times rotor radius, the viscosity effects greatly constraint the influence of nozzle shape, the divergent angle around 10.5° performs best for chosen nozzle length;The parameters of inlet such as leading edge curvature, radius and internal shape do not affect thrust great but play an important role in pressure distribution which could produce most part of shroud thrust, they should be chosen according to the reduction of adverse pressure gradients to reduce the risk of boundary separation.

Keywords: Axisymmetric simulation, parametrical design, power loading, Shrouded Contra-Rotating Rotor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1844

11145 A Review: Comparative Analysis of Different Categorical Data Clustering Ensemble Methods

Authors: S. Sarumathi, N. Shanthi, M. Sharmila

Abstract:

Over the past epoch a rampant amount of work has been done in the data clustering research under the unsupervised learning technique in Data mining. Furthermore several algorithms and methods have been proposed focusing on clustering different data types, representation of cluster models, and accuracy rates of the clusters. However no single clustering algorithm proves to be the most efficient in providing best results. Accordingly in order to find the solution to this issue a new technique, called Cluster ensemble method was bloomed. This cluster ensemble is a good alternative approach for facing the cluster analysis problem. The main hope of the cluster ensemble is to merge different clustering solutions in such a way to achieve accuracy and to improve the quality of individual data clustering. Due to the substantial and unremitting development of new methods in the sphere of data mining and also the incessant interest in inventing new algorithms, makes obligatory to scrutinize a critical analysis of the existing techniques and the future novelty. This paper exposes the comparative study of different cluster ensemble methods along with their features, systematic working process and the average accuracy and error rates of each ensemble methods. Consequently this speculative and comprehensive analysis will be very useful for the community of clustering practitioners and also helps in deciding the most suitable one to rectify the problem in hand.

Keywords: Clustering, Cluster Ensemble methods, Co-association matrix, Consensus function, Median partition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2577

11144 Object Identification with Color, Texture, and Object-Correlation in CBIR System

Authors: Awais Adnan, Muhammad Nawaz, Sajid Anwar, Tamleek Ali, Muhammad Ali

Abstract:

Needs of an efficient information retrieval in recent years in increased more then ever because of the frequent use of digital information in our life. We see a lot of work in the area of textual information but in multimedia information, we cannot find much progress. In text based information, new technology of data mining and data marts are now in working that were started from the basic concept of database some where in 1960. In image search and especially in image identification, computerized system at very initial stages. Even in the area of image search we cannot see much progress as in the case of text based search techniques. One main reason for this is the wide spread roots of image search where many area like artificial intelligence, statistics, image processing, pattern recognition play their role. Even human psychology and perception and cultural diversity also have their share for the design of a good and efficient image recognition and retrieval system. A new object based search technique is presented in this paper where object in the image are identified on the basis of their geometrical shapes and other features like color and texture where object-co-relation augments this search process. To be more focused on objects identification, simple images are selected for the work to reduce the role of segmentation in overall process however same technique can also be applied for other images.

Keywords: Object correlation, Geometrical shape, Color, texture, features, contents.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2006

11143 Cost Sensitive Feature Selection in Decision-Theoretic Rough Set Models for Customer Churn Prediction: The Case of Telecommunication Sector Customers

Authors: Emel Kızılkaya Aydogan, Mihrimah Ozmen, Yılmaz Delice

Abstract:

In recent days, there is a change and the ongoing development of the telecommunications sector in the global market. In this sector, churn analysis techniques are commonly used for analysing why some customers terminate their service subscriptions prematurely. In addition, customer churn is utmost significant in this sector since it causes to important business loss. Many companies make various researches in order to prevent losses while increasing customer loyalty. Although a large quantity of accumulated data is available in this sector, their usefulness is limited by data quality and relevance. In this paper, a cost-sensitive feature selection framework is developed aiming to obtain the feature reducts to predict customer churn. The framework is a cost based optional pre-processing stage to remove redundant features for churn management. In addition, this cost-based feature selection algorithm is applied in a telecommunication company in Turkey and the results obtained with this algorithm.

Keywords: Churn prediction, data mining, decision-theoretic rough set, feature selection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1743

11142 Analyzing The Effect of Variable Round Time for Clustering Approach in Wireless Sensor Networks

Authors: Vipin Pal, Girdhari Singh, R P Yadav

Abstract:

As wireless sensor networks are energy constraint networks so energy efficiency of sensor nodes is the main design issue. Clustering of nodes is an energy efficient approach. It prolongs the lifetime of wireless sensor networks by avoiding long distance communication. Clustering algorithms operate in rounds. Performance of clustering algorithm depends upon the round time. A large round time consumes more energy of cluster heads while a small round time causes frequent re-clustering. So existing clustering algorithms apply a trade off to round time and calculate it from the initial parameters of networks. But it is not appropriate to use initial parameters based round time value throughout the network lifetime because wireless sensor networks are dynamic in nature (nodes can be added to the network or some nodes go out of energy). In this paper a variable round time approach is proposed that calculates round time depending upon the number of active nodes remaining in the field. The proposed approach makes the clustering algorithm adaptive to network dynamics. For simulation the approach is implemented with LEACH in NS-2 and the results show that there is 6% increase in network lifetime, 7% increase in 50% node death time and 5% improvement over the data units gathered at the base station.

Keywords: Wireless Sensor Network, Clustering, Energy Efficiency, Round Time.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1763

11141 A Fast Block-based Evolutional Algorithm for Combinatorial Problems

Authors: Huang, Wei-Hsiu Chang, Pei-Chann, Wang, Lien-Chun

Abstract:

The problems with high complexity had been the challenge in combinatorial problems. Due to the none-determined and polynomial characteristics, these problems usually face to unreasonable searching budget. Hence combinatorial optimizations attracted numerous researchers to develop better algorithms. In recent academic researches, most focus on developing to enhance the conventional evolutional algorithms and facilitate the local heuristics, such as VNS, 2-opt and 3-opt. Despite the performances of the introduction of the local strategies are significant, however, these improvement cannot improve the performance for solving the different problems. Therefore, this research proposes a meta-heuristic evolutional algorithm which can be applied to solve several types of problems. The performance validates BBEA has the ability to solve the problems even without the design of local strategies.

Keywords: Combinatorial problems, Artificial Chromosomes, Blocks Mining, Block Recombination

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1395

11140 An Engineering Approach to Forecast Volatility of Financial Indices

Authors: Irwin Ma, Tony Wong, Thiagas Sankar

Abstract:

By systematically applying different engineering methods, difficult financial problems become approachable. Using a combination of theory and techniques such as wavelet transform, time series data mining, Markov chain based discrete stochastic optimization, and evolutionary algorithms, this work formulated a strategy to characterize and forecast non-linear time series. It attempted to extract typical features from the volatility data sets of S&P100 and S&P500 indices that include abrupt drops, jumps and other non-linearity. As a result, accuracy of forecasting has reached an average of over 75% surpassing any other publicly available results on the forecast of any financial index.

Keywords: Discrete stochastic optimization, genetic algorithms, genetic programming, volatility forecast

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1611

11139 Unsupervised Outlier Detection in Streaming Data Using Weighted Clustering

Authors: Yogita, Durga Toshniwal

Abstract:

Outlier detection in streaming data is very challenging because streaming data cannot be scanned multiple times and also new concepts may keep evolving. Irrelevant attributes can be termed as noisy attributes and such attributes further magnify the challenge of working with data streams. In this paper, we propose an unsupervised outlier detection scheme for streaming data. This scheme is based on clustering as clustering is an unsupervised data mining task and it does not require labeled data, both density based and partitioning clustering are combined for outlier detection. In this scheme partitioning clustering is also used to assign weights to attributes depending upon their respective relevance and weights are adaptive. Weighted attributes are helpful to reduce or remove the effect of noisy attributes. Keeping in view the challenges of streaming data, the proposed scheme is incremental and adaptive to concept evolution. Experimental results on synthetic and real world data sets show that our proposed approach outperforms other existing approach (CORM) in terms of outlier detection rate, false alarm rate, and increasing percentages of outliers.

Keywords: Concept Evolution, Irrelevant Attributes, Streaming Data, Unsupervised Outlier Detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2609

11138 The Implementation of the Multi-Agent Classification System (MACS) in Compliance with FIPA Specifications

Authors: Mohamed R. Mhereeg

Abstract:

The paper discusses the implementation of the MultiAgent classification System (MACS) and utilizing it to provide an automated and accurate classification of end users developing applications in the spreadsheet domain. However, different technologies have been brought together to build MACS. The strength of the system is the integration of the agent technology with the FIPA specifications together with other technologies, which are the .NET widows service based agents, the Windows Communication Foundation (WCF) services, the Service Oriented Architecture (SOA), and Oracle Data Mining (ODM). The Microsoft's .NET widows service based agents were utilized to develop the monitoring agents of MACS, the .NET WCF services together with SOA approach allowed the distribution and communication between agents over the WWW. The Monitoring Agents (MAs) were configured to execute automatically to monitor excel spreadsheets development activities by content. Data gathered by the Monitoring Agents from various resources over a period of time was collected and filtered by a Database Updater Agent (DUA) residing in the .NET client application of the system. This agent then transfers and stores the data in Oracle server database via Oracle stored procedures for further processing that leads to the classification of the end user developers.

Keywords: MACS, Implementation, Multi-Agent, SOA, Autonomous, WCF.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1688

11137 Learning to Order Terms: Supervised Interestingness Measures in Terminology Extraction

Authors: Jérôme Azé, Mathieu Roche, Yves Kodratoff, Michèle Sebag

Abstract:

Term Extraction, a key data preparation step in Text Mining, extracts the terms, i.e. relevant collocation of words, attached to specific concepts (e.g. genetic-algorithms and decisiontrees are terms associated to the concept “Machine Learning" ). In this paper, the task of extracting interesting collocations is achieved through a supervised learning algorithm, exploiting a few collocations manually labelled as interesting/not interesting. From these examples, the ROGER algorithm learns a numerical function, inducing some ranking on the collocations. This ranking is optimized using genetic algorithms, maximizing the trade-off between the false positive and true positive rates (Area Under the ROC curve). This approach uses a particular representation for the word collocations, namely the vector of values corresponding to the standard statistical interestingness measures attached to this collocation. As this representation is general (over corpora and natural languages), generality tests were performed by experimenting the ranking function learned from an English corpus in Biology, onto a French corpus of Curriculum Vitae, and vice versa, showing a good robustness of the approaches compared to the state-of-the-art Support Vector Machine (SVM).

Keywords: Text-mining, Terminology Extraction, Evolutionary algorithm, ROC Curve.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1641

11136 An Improved GA to Address Integrated Formulation of Project Scheduling and Material Ordering with Discount Options

Authors: Babak H. Tabrizi, Seyed Farid Ghaderi

Abstract:

Concurrent planning of the resource constraint project scheduling and material ordering problems have received significant attention within the last decades. Hence, the issue has been investigated here with the aim to minimize total project costs. Furthermore, the presented model considers different discount options in order to approach the real world conditions. The incorporated alternatives consist of all-unit and incremental discount strategies. On the other hand, a modified version of the genetic algorithm is applied in order to solve the model for larger sizes, in particular. Finally, the applicability and efficiency of the given model is tested by different numerical instances.

Keywords: Genetic algorithm, material ordering, project management, project scheduling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1279

11135 Increase of Organization in Complex Systems

Authors: Georgi Yordanov Georgiev, Michael Daly, Erin Gombos, Amrit Vinod, Gajinder Hoonjan

Abstract:

Measures of complexity and entropy have not converged to a single quantitative description of levels of organization of complex systems. The need for such a measure is increasingly necessary in all disciplines studying complex systems. To address this problem, starting from the most fundamental principle in Physics, here a new measure for quantity of organization and rate of self-organization in complex systems based on the principle of least (stationary) action is applied to a model system - the central processing unit (CPU) of computers. The quantity of organization for several generations of CPUs shows a double exponential rate of change of organization with time. The exact functional dependence has a fine, S-shaped structure, revealing some of the mechanisms of self-organization. The principle of least action helps to explain the mechanism of increase of organization through quantity accumulation and constraint and curvature minimization with an attractor, the least average sum of actions of all elements and for all motions. This approach can help describe, quantify, measure, manage, design and predict future behavior of complex systems to achieve the highest rates of self organization to improve their quality. It can be applied to other complex systems from Physics, Chemistry, Biology, Ecology, Economics, Cities, network theory and others where complex systems are present.

Keywords: Organization, self-organization, complex system, complexification, quantitative measure, principle of least action, principle of stationary action, attractor, progressive development, acceleration, stochastic.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1619

11134 Socio-Technical Systems: Transforming Theory into Practice

Authors: L. Ngowi, N. H. Mvungi

Abstract:

This paper critically examines the evolution of socio-technical systems theory, its practices, and challenges in system design and development. It examines concepts put forward by researchers focusing on the application of the theory in software engineering. There are various methods developed that use socio-technical concepts based on systems engineering without remarkable success. The main constraint is the large amount of data and inefficient techniques used in the application of the concepts in system engineering for developing time-bound systems and within a limited/controlled budget. This paper critically examines each of the methods, highlight bottlenecks and suggest the way forward. Since socio-technical systems theory only explains what to do, but not how doing it, hence engineers are not using the concept to save time, costs and reduce risks associated with new frameworks. Hence, a new framework, which can be considered as a practical approach is proposed that borrows concepts from soft systems method, agile systems development and object-oriented analysis and design to bridge the gap between theory and practice. The approach will enable the development of systems using socio-technical systems theory to attract/enable the system engineers/software developers to use socio-technical systems theory in building worthwhile information systems to avoid fragilities and hostilities in the work environment.

Keywords: Socio-technical systems, human centered design, software engineering, cognitive engineering, soft systems, systems engineering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2786

11133 A Modified Fuzzy C-Means Algorithm for Natural Data Exploration

Authors: Binu Thomas, Raju G., Sonam Wangmo

Abstract:

In Data mining, Fuzzy clustering algorithms have demonstrated advantage over crisp clustering algorithms in dealing with the challenges posed by large collections of vague and uncertain natural data. This paper reviews concept of fuzzy logic and fuzzy clustering. The classical fuzzy c-means algorithm is presented and its limitations are highlighted. Based on the study of the fuzzy c-means algorithm and its extensions, we propose a modification to the cmeans algorithm to overcome the limitations of it in calculating the new cluster centers and in finding the membership values with natural data. The efficiency of the new modified method is demonstrated on real data collected for Bhutan-s Gross National Happiness (GNH) program.

Keywords: Adaptive fuzzy clustering, clustering, fuzzy logic, fuzzy clustering, c-means.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1955

11132 Improved Performance Scheme for Joint Transmission in Downlink Coordinated Multi-Point Transmission

Authors: Young-Su Ryu, Su-Hyun Jung, Myoung-Jin Kim, Hyoung-Kyu Song

Abstract:

In this paper, improved performance scheme for joint transmission (JT) is proposed in downlink (DL) coordinated multi-point (CoMP) in case of the constraint transmission power. This scheme is that a serving transmission point (TP) requests the JT to an inter-TP and it selects a precoding technique according to the channel state information (CSI) from user equipment (UE). The simulation results show that the bit error rate (BER) and the throughput performances of the proposed scheme provide the high spectral efficiency and the reliable data at the cell edge.

Keywords: CoMP, joint transmission, minimum mean square error, zero-forcing, zero-forcing dirty paper coding.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1751