Search results for: Constraint Based Mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 11604

Search results for: Constraint Based Mining

11604 Discovery of Sequential Patterns Based On Constraint Patterns

Authors: Shigeaki Sakurai, Youichi Kitahata, Ryohei Orihara

Abstract:

This paper proposes a method that discovers sequential patterns corresponding to user-s interests from sequential data. This method expresses the interests as constraint patterns. The constraint patterns can define relationships among attributes of the items composing the data. The method recursively decomposes the constraint patterns into constraint subpatterns. The method evaluates the constraint subpatterns in order to efficiently discover sequential patterns satisfying the constraint patterns. Also, this paper applies the method to the sequential data composed of stock price indexes and verifies its effectiveness through comparing it with a method without using the constraint patterns.

Keywords: Sequential pattern mining, Constraint pattern, Attribute constraint, Stock price indexes

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1381
11603 Constraint Based Frequent Pattern Mining Technique for Solving GCS Problem

Authors: First G.M. Karthik, Second Ramachandra.V.Pujeri, Dr.

Abstract:

Generalized Center String (GCS) problem are generalized from Common Approximate Substring problem and Common substring problems. GCS are known to be NP-hard allowing the problems lies in the explosion of potential candidates. Finding longest center string without concerning the sequence that may not contain any motifs is not known in advance in any particular biological gene process. GCS solved by frequent pattern-mining techniques and known to be fixed parameter tractable based on the fixed input sequence length and symbol set size. Efficient method known as Bpriori algorithms can solve GCS with reasonable time/space complexities. Bpriori 2 and Bpriori 3-2 algorithm are been proposed of any length and any positions of all their instances in input sequences. In this paper, we reduced the time/space complexity of Bpriori algorithm by Constrained Based Frequent Pattern mining (CBFP) technique which integrates the idea of Constraint Based Mining and FP-tree mining. CBFP mining technique solves the GCS problem works for all center string of any length, but also for the positions of all their mutated copies of input sequence. CBFP mining technique construct TRIE like with FP tree to represent the mutated copies of center string of any length, along with constraints to restraint growth of the consensus tree. The complexity analysis for Constrained Based FP mining technique and Bpriori algorithm is done based on the worst case and average case approach. Algorithm's correctness compared with the Bpriori algorithm using artificial data is shown.

Keywords: Constraint Based Mining, FP tree, Data mining, GCS problem, CBFP mining technique.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1642
11602 A Comparative Analysis of Different Web Content Mining Tools

Authors: T. Suresh Kumar, M. Arthanari, N. Shanthi

Abstract:

Nowadays, the Web has become one of the most pervasive platforms for information change and retrieval. It collects the suitable and perfectly fitting information from websites that one requires. Data mining is the form of extracting data’s available in the internet. Web mining is one of the elements of data mining Technique, which relates to various research communities such as information recovery, folder managing system and simulated intellects. In this Paper we have discussed the concepts of Web mining. We contain generally focused on one of the categories of Web mining, specifically the Web Content Mining and its various farm duties. The mining tools are imperative to scanning the many images, text, and HTML documents and then, the result is used by the various search engines. We conclude by presenting a comparative table of these tools based on some pertinent criteria.

Keywords: Data Mining, Web Mining, Web Content Mining, Mining Tools, Information retrieval.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3505
11601 An Efficient Approach to Mining Frequent Itemsets on Data Streams

Authors: Sara Ansari, Mohammad Hadi Sadreddini

Abstract:

The increasing importance of data stream arising in a wide range of advanced applications has led to the extensive study of mining frequent patterns. Mining data streams poses many new challenges amongst which are the one-scan nature, the unbounded memory requirement and the high arrival rate of data streams. In this paper, we propose a new approach for mining itemsets on data stream. Our approach SFIDS has been developed based on FIDS algorithm. The main attempts were to keep some advantages of the previous approach and resolve some of its drawbacks, and consequently to improve run time and memory consumption. Our approach has the following advantages: using a data structure similar to lattice for keeping frequent itemsets, separating regions from each other with deleting common nodes that results in a decrease in search space, memory consumption and run time; and Finally, considering CPU constraint, with increasing arrival rate of data that result in overloading system, SFIDS automatically detect this situation and discard some of unprocessing data. We guarantee that error of results is bounded to user pre-specified threshold, based on a probability technique. Final results show that SFIDS algorithm could attain about 50% run time improvement than FIDS approach.

Keywords: Data stream, frequent itemset, stream mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1374
11600 New Data Reuse Adaptive Filters with Noise Constraint

Authors: Young-Seok Choi

Abstract:

We present a new framework of the data-reusing (DR) adaptive algorithms by incorporating a constraint on noise, referred to as a noise constraint. The motivation behind this work is that the use of the statistical knowledge of the channel noise can contribute toward improving the convergence performance of an adaptive filter in identifying a noisy linear finite impulse response (FIR) channel. By incorporating the noise constraint into the cost function of the DR adaptive algorithms, the noise constrained DR (NC-DR) adaptive algorithms are derived. Experimental results clearly indicate their superior performance over the conventional DR ones.

Keywords: Adaptive filter, data-reusing, least-mean square (LMS), affine projection (AP), noise constraint.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1578
11599 Bi-Directional Evolutionary Topology Optimization Based on Critical Fatigue Constraint

Authors: Khodamorad Nabaki, Jianhu Shen, Xiaodong Huang

Abstract:

This paper develops a method for considering the critical fatigue stress as a constraint in the Bi-directional Evolutionary Structural Optimization (BESO) method. Our aim is to reach an optimal design in which high cycle fatigue failure does not occur for a specific life time. The critical fatigue stress is calculated based on modified Goodman criteria and used as a stress constraint in our topology optimization problem. Since fatigue generally does not occur for compressive stresses, we use the p-norm approach of the stress measurement that considers the highest tensile principal stress in each point as stress measure to calculate the sensitivity numbers. The BESO method has been extended to minimize volume an object subjected to the critical fatigue stress constraint. The optimization results are compared with the results from the compliance minimization problem which shows clearly the merits of our newly developed approach.

Keywords: Topology optimization, BESO method, p-norm, fatigue constraint.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1000
11598 Data-Reusing Adaptive Filtering Algorithms with Adaptive Error Constraint

Authors: Young-Seok Choi

Abstract:

We present a family of data-reusing and affine projection algorithms. For identification of a noisy linear finite impulse response channel, a partial knowledge of a channel, especially noise, can be used to improve the performance of the adaptive filter. Motivated by this fact, the proposed scheme incorporates an estimate of a knowledge of noise. A constraint, called the adaptive noise constraint, estimates an unknown information of noise. By imposing this constraint on a cost function of data-reusing and affine projection algorithms, a cost function based on the adaptive noise constraint and Lagrange multiplier is defined. Minimizing the new cost function leads to the adaptive noise constrained (ANC) data-reusing and affine projection algorithms. Experimental results comparing the proposed schemes to standard data-reusing and affine projection algorithms clearly indicate their superior performance.

Keywords: Data-reusing, affine projection algorithm, error constraint, system identification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1552
11597 Extended Constraint Mask Based One-Bit Transform for Low-Complexity Fast Motion Estimation

Authors: Oğuzhan Urhan

Abstract:

In this paper, an improved motion estimation (ME) approach based on weighted constrained one-bit transform is proposed for block-based ME employed in video encoders. Binary ME approaches utilize low bit-depth representation of the original image frames with a Boolean exclusive-OR based hardware efficient matching criterion to decrease computational burden of the ME stage. Weighted constrained one-bit transform (WC‑1BT) based approach improves the performance of conventional C-1BT based ME employing 2-bit depth constraint mask instead of a 1-bit depth mask. In this work, the range of constraint mask is further extended to increase ME performance of WC-1BT approach. Experiments reveal that the proposed method provides better ME accuracy compared existing similar ME methods in the literature.

Keywords: Fast motion estimation, low-complexity motion estimation, video coding.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 804
11596 Combining Variable Ordering Heuristics for Improving Search Algorithms Performance

Authors: Abdolreza Hatamlou, Yusef Farhang, Mohammad Reza Meybodi

Abstract:

Variable ordering heuristics are used in constraint satisfaction algorithms. Different characteristics of various variable ordering heuristics are complementary. Therefore we have tried to get the advantages of all heuristics to improve search algorithms performance for solving constraint satisfaction problems. This paper considers combinations based on products and quotients, and then a newer form of combination based on weighted sums of ratings from a set of base heuristics, some of which result in definite improvements in performance.

Keywords: Constraint Satisfaction Problems, Variable Ordering Heuristics, Combination, Search Algorithms

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1316
11595 A Web Text Mining Flexible Architecture

Authors: M. Castellano, G. Mastronardi, A. Aprile, G. Tarricone

Abstract:

Text Mining is an important step of Knowledge Discovery process. It is used to extract hidden information from notstructured o semi-structured data. This aspect is fundamental because much of the Web information is semi-structured due to the nested structure of HTML code, much of the Web information is linked, much of the Web information is redundant. Web Text Mining helps whole knowledge mining process to mining, extraction and integration of useful data, information and knowledge from Web page contents. In this paper, we present a Web Text Mining process able to discover knowledge in a distributed and heterogeneous multiorganization environment. The Web Text Mining process is based on flexible architecture and is implemented by four steps able to examine web content and to extract useful hidden information through mining techniques. Our Web Text Mining prototype starts from the recovery of Web job offers in which, through a Text Mining process, useful information for fast classification of the same are drawn out, these information are, essentially, job offer place and skills.

Keywords: Web text mining, flexible architecture, knowledgediscovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2613
11594 ATM Service Analysis Using Predictive Data Mining

Authors: S. Madhavi, S. Abirami, C. Bharathi, B. Ekambaram, T. Krishna Sankar, A. Nattudurai, N. Vijayarangan

Abstract:

The high utilization rate of Automated Teller Machine (ATM) has inevitably caused the phenomena of waiting for a long time in the queue. This in turn has increased the out of stock situations. The ATM utilization helps to determine the usage level and states the necessity of the ATM based on the utilization of the ATM system. The time in which the ATM used more frequently (peak time) and based on the predicted solution the necessary actions are taken by the bank management. The analysis can be done by using the concept of Data Mining and the major part are analyzed based on the predictive data mining. The results are predicted from the historical data (past data) and track the relevant solution which is required. Weka tool is used for the analysis of data based on predictive data mining.

Keywords: ATM, Bank Management, Data Mining, Historical data, Predictive Data Mining, Weka tool.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5565
11593 Concurrency in Web Access Patterns Mining

Authors: Jing Lu, Malcolm Keech, Weiru Chen

Abstract:

Web usage mining is an interesting application of data mining which provides insight into customer behaviour on the Internet. An important technique to discover user access and navigation trails is based on sequential patterns mining. One of the key challenges for web access patterns mining is tackling the problem of mining richly structured patterns. This paper proposes a novel model called Web Access Patterns Graph (WAP-Graph) to represent all of the access patterns from web mining graphically. WAP-Graph also motivates the search for new structural relation patterns, i.e. Concurrent Access Patterns (CAP), to identify and predict more complex web page requests. Corresponding CAP mining and modelling methods are proposed and shown to be effective in the search for and representation of concurrency between access patterns on the web. From experiments conducted on large-scale synthetic sequence data as well as real web access data, it is demonstrated that CAP mining provides a powerful method for structural knowledge discovery, which can be visualised through the CAP-Graph model.

Keywords: concurrent access patterns (CAP), CAP mining and modelling, CAP-Graph, web access patterns (WAP), WAP-Graph, Web usage mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1674
11592 A Kernel Based Rejection Method for Supervised Classification

Authors: Abdenour Bounsiar, Edith Grall, Pierre Beauseroy

Abstract:

In this paper we are interested in classification problems with a performance constraint on error probability. In such problems if the constraint cannot be satisfied, then a rejection option is introduced. For binary labelled classification, a number of SVM based methods with rejection option have been proposed over the past few years. All of these methods use two thresholds on the SVM output. However, in previous works, we have shown on synthetic data that using thresholds on the output of the optimal SVM may lead to poor results for classification tasks with performance constraint. In this paper a new method for supervised classification with rejection option is proposed. It consists in two different classifiers jointly optimized to minimize the rejection probability subject to a given constraint on error rate. This method uses a new kernel based linear learning machine that we have recently presented. This learning machine is characterized by its simplicity and high training speed which makes the simultaneous optimization of the two classifiers computationally reasonable. The proposed classification method with rejection option is compared to a SVM based rejection method proposed in recent literature. Experiments show the superiority of the proposed method.

Keywords: rejection, Chow's rule, error-reject tradeoff, SupportVector Machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1401
11591 A Distributed Approach to Extract High Utility Itemsets from XML Data

Authors: S. Kannimuthu, K. Premalatha

Abstract:

This paper investigates a new data mining capability that entails mining of High Utility Itemsets (HUI) in a distributed environment. Existing research in data mining deals with only presence or absence of an items and do not consider the semantic measures like weight or cost of the items. Thus, HUI mining algorithm has evolved. HUI mining is the one kind of utility mining concept, aims to identify itemsets whose utility satisfies a given threshold. Although, the approach of mining HUIs in a distributed environment and mining of the same from XML data have not explored yet. In this work, a novel approach is proposed to mine HUIs from the XML based data in a distributed environment. This work utilizes Service Oriented Computing (SOC) paradigm which provides Knowledge as a Service (KaaS). The interesting patterns are provided via the web services with the help of knowledge server to answer the queries of the consumers. The performance of the approach is evaluated on various databases using execution time and memory consumption.

Keywords: Data mining, Knowledge as a Service, service oriented computing, utility mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2405
11590 Subband Adaptive Filter Exploiting Sparsity of System

Authors: Young-Seok Choi

Abstract:

This paper presents a normalized subband adaptive filtering (NSAF) algorithm to cope with the sparsity condition of an underlying system in the context of compressive sensing. By regularizing a weighted l1-norm of the filter taps estimate onto the cost function of the NSAF and utilizing a subgradient analysis, the update recursion of the l1-norm constraint NSAF is derived. Considering two distinct weighted l1-norm regularization cases, two versions of the l1-norm constraint NSAF are presented. Simulation results clearly indicate the superior performance of the proposed l1-norm constraint NSAFs comparing with the classical NSAF.

Keywords: Subband adaptive filtering, sparsity constraint, weighted l1-norm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1483
11589 Weka Based Desktop Data Mining as Web Service

Authors: Sujala.D.Shetty, S.Vadivel, Sakshi Vaghella

Abstract:

Data mining is the process of sifting through large volumes of data, analyzing data from different perspectives and summarizing it into useful information. One of the widely used desktop applications for data mining is the Weka tool which is nothing but a collection of machine learning algorithms implemented in Java and open sourced under the General Public License (GPL). A web service is a software system designed to support interoperable machine to machine interaction over a network using SOAP messages. Unlike a desktop application, a web service is easy to upgrade, deliver and access and does not occupy any memory on the system. Keeping in mind the advantages of a web service over a desktop application, in this paper we are demonstrating how this Java based desktop data mining application can be implemented as a web service to support data mining across the internet.

Keywords: desktop application, Weka mining, web service

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4025
11588 Bottom Up Text Mining through Hierarchical Document Representation

Authors: Y. Djouadi., F. Souam.

Abstract:

Most of the existing text mining approaches are proposed, keeping in mind, transaction databases model. Thus, the mined dataset is structured using just one concept: the “transaction", whereas the whole dataset is modeled using the “set" abstract type. In such cases, the structure of the whole dataset and the relationships among the transactions themselves are not modeled and consequently, not considered in the mining process. We believe that taking into account structure properties of hierarchically structured information (e.g. textual document, etc ...) in the mining process, can leads to best results. For this purpose, an hierarchical associations rule mining approach for textual documents is proposed in this paper and the classical set-oriented mining approach is reconsidered profits to a Direct Acyclic Graph (DAG) oriented approach. Natural languages processing techniques are used in order to obtain the DAG structure. Based on this graph model, an hierarchical bottom up algorithm is proposed. The main idea is that each node is mined with its parent node.

Keywords: Graph based association rules mining, Hierarchical document structure, Text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2008
11587 Predictions Using Data Mining and Case-based Reasoning: A Case Study for Retinopathy

Authors: Vimala Balakrishnan, Mohammad R. Shakouri, Hooman Hoodeh, Loo, Huck-Soo

Abstract:

Diabetes is one of the high prevalence diseases worldwide with increased number of complications, with retinopathy as one of the most common one. This paper describes how data mining and case-based reasoning were integrated to predict retinopathy prevalence among diabetes patients in Malaysia. The knowledge base required was built after literature reviews and interviews with medical experts. A total of 140 diabetes patients- data were used to train the prediction system. A voting mechanism selects the best prediction results from the two techniques used. It has been successfully proven that both data mining and case-based reasoning can be used for retinopathy prediction with an improved accuracy of 85%.

Keywords: Case-Based Reasoning, Data Mining, Prediction, Retinopathy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2966
11586 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many types of research are concerned with mining massive amounts of data and big data streams. Mining big data faces a lot of challenges including scalability, speed, heterogeneity, accuracy, provenance and privacy. In telecommunication industry, mining big data is like a mining for gold; it represents a big opportunity and maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication and the benefits and opportunities gained from them.

Keywords: Mining Big Data, Big Data, Machine learning, Data Streams, Telecommunication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2418
11585 Efficient Web Usage Mining Based on K-Medoids Clustering Technique

Authors: P. Sengottuvelan, T. Gopalakrishnan

Abstract:

Web Usage Mining is the application of data mining techniques to find usage patterns from web log data, so as to grasp required patterns and serve the requirements of Web-based applications. User’s expertise on the internet may be improved by minimizing user’s web access latency. This may be done by predicting the future search page earlier and the same may be prefetched and cached. Therefore, to enhance the standard of web services, it is needed topic to research the user web navigation behavior. Analysis of user’s web navigation behavior is achieved through modeling web navigation history. We propose this technique which cluster’s the user sessions, based on the K-medoids technique.

Keywords: Clustering, K-medoids, Recommendation, User Session, Web Usage Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1353
11584 Powerful Tool to Expand Business Intelligence: Text Mining

Authors: Li Gao, Elizabeth Chang, Song Han

Abstract:

With the extensive inclusion of document, especially text, in the business systems, data mining does not cover the full scope of Business Intelligence. Data mining cannot deliver its impact on extracting useful details from the large collection of unstructured and semi-structured written materials based on natural languages. The most pressing issue is to draw the potential business intelligence from text. In order to gain competitive advantages for the business, it is necessary to develop the new powerful tool, text mining, to expand the scope of business intelligence. In this paper, we will work out the strong points of text mining in extracting business intelligence from huge amount of textual information sources within business systems. We will apply text mining to each stage of Business Intelligence systems to prove that text mining is the powerful tool to expand the scope of BI. After reviewing basic definitions and some related technologies, we will discuss the relationship and the benefits of these to text mining. Some examples and applications of text mining will also be given. The motivation behind is to develop new approach to effective and efficient textual information analysis. Thus we can expand the scope of Business Intelligence using the powerful tool, text mining.

Keywords: Business intelligence, document warehouse, text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2608
11583 Web Application to Profiling Scientific Institutions through Citation Mining

Authors: Hector D. Cortes, Jesus A. del Rio, Esther O. Garcia, Miguel Robles

Abstract:

Recently the use of data mining to scientific bibliographic data bases has been implemented to analyze the pathways of the knowledge or the core scientific relevances of a laureated novel or a country. This specific case of data mining has been named citation mining, and it is the integration of citation bibliometrics and text mining. In this paper we present an improved WEB implementation of statistical physics algorithms to perform the text mining component of citation mining. In particular we use an entropic like distance between the compression of text as an indicator of the similarity between them. Finally, we have included the recently proposed index h to characterize the scientific production. We have used this web implementation to identify users, applications and impact of the Mexican scientific institutions located in the State of Morelos.

Keywords: Citation Mining, Text Mining, Science Impact

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1690
11582 Navigation Patterns Mining Approach based on Expectation Maximization Algorithm

Authors: Norwati Mustapha, Manijeh Jalali, Abolghasem Bozorgniya, Mehrdad Jalali

Abstract:

Web usage mining algorithms have been widely utilized for modeling user web navigation behavior. In this study we advance a model for mining of user-s navigation pattern. The model makes user model based on expectation-maximization (EM) algorithm.An EM algorithm is used in statistics for finding maximum likelihood estimates of parameters in probabilistic models, where the model depends on unobserved latent variables. The experimental results represent that by decreasing the number of clusters, the log likelihood converges toward lower values and probability of the largest cluster will be decreased while the number of the clusters increases in each treatment.

Keywords: Web Usage Mining, Expectation maximization, navigation pattern mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1532
11581 Survey on Image Mining Using Genetic Algorithm

Authors: Jyoti Dua

Abstract:

One image is worth more than thousand words. Images if analyzed can reveal useful information. Low level image processing deals with the extraction of specific feature from a single image. Now the question arises: What technique should be used to extract patterns of very large and detailed image database? The answer of the question is: “Image Mining”. Image Mining deals with the extraction of image data relationship, implicit knowledge, and another pattern from the collection of images or image database. It is nothing but the extension of Data Mining. In the following paper, not only we are going to scrutinize the current techniques of image mining but also present a new technique for mining images using Genetic Algorithm.

Keywords: Image Mining, Data Mining, Genetic Algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2399
11580 A Framework for Data Mining Based Multi-Agent: An Application to Spatial Data

Authors: H. Baazaoui Zghal, S. Faiz, H. Ben Ghezala

Abstract:

Data mining is an extraordinarily demanding field referring to extraction of implicit knowledge and relationships, which are not explicitly stored in databases. A wide variety of methods of data mining have been introduced (classification, characterization, generalization...). Each one of these methods includes more than algorithm. A system of data mining implies different user categories,, which mean that the user-s behavior must be a component of the system. The problem at this level is to know which algorithm of which method to employ for an exploratory end, which one for a decisional end, and how can they collaborate and communicate. Agent paradigm presents a new way of conception and realizing of data mining system. The purpose is to combine different algorithms of data mining to prepare elements for decision-makers, benefiting from the possibilities offered by the multi-agent systems. In this paper the agent framework for data mining is introduced, and its overall architecture and functionality are presented. The validation is made on spatial data. Principal results will be presented.

Keywords: Databases, data mining, multi-agent, spatial datamart.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2000
11579 Mining Educational Data to Analyze the Student Motivation Behavior

Authors: Kunyanuth Kularbphettong, Cholticha Tongsiri

Abstract:

The purpose of this research aims to discover the knowledge for analysis student motivation behavior on e-Learning based on Data Mining Techniques, in case of the Information Technology for Communication and Learning Course at Suan Sunandha Rajabhat University. The data mining techniques was applied in this research including association rules, classification techniques. The results showed that using data mining technique can indicate the important variables that influence the student motivation behavior on e-Learning.

Keywords: association rule mining, classification techniques, e- Learning, Moodle log Motivation Behavior

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3038
11578 Introducing Sequence-Order Constraint into Prediction of Protein Binding Sites with Automatically Extracted Templates

Authors: Yi-Zhong Weng, Chien-Kang Huang, Yu-Feng Huang, Chi-Yuan Yu, Darby Tien-Hao Chang

Abstract:

Search for a tertiary substructure that geometrically matches the 3D pattern of the binding site of a well-studied protein provides a solution to predict protein functions. In our previous work, a web server has been built to predict protein-ligand binding sites based on automatically extracted templates. However, a drawback of such templates is that the web server was prone to resulting in many false positive matches. In this study, we present a sequence-order constraint to reduce the false positive matches of using automatically extracted templates to predict protein-ligand binding sites. The binding site predictor comprises i) an automatically constructed template library and ii) a local structure alignment algorithm for querying the library. The sequence-order constraint is employed to identify the inconsistency between the local regions of the query protein and the templates. Experimental results reveal that the sequence-order constraint can largely reduce the false positive matches and is effective for template-based binding site prediction.

Keywords: Protein structure, binding site, functional prediction

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1410
11577 A New Model for Discovering XML Association Rules from XML Documents

Authors: R. AliMohammadzadeh, M. Rahgozar, A. Zarnani

Abstract:

The inherent flexibilities of XML in both structure and semantics makes mining from XML data a complex task with more challenges compared to traditional association rule mining in relational databases. In this paper, we propose a new model for the effective extraction of generalized association rules form a XML document collection. We directly use frequent subtree mining techniques in the discovery process and do not ignore the tree structure of data in the final rules. The frequent subtrees based on the user provided support are split to complement subtrees to form the rules. We explain our model within multi-steps from data preparation to rule generation.

Keywords: XML, Data Mining, Association Rule Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1586
11576 Analysis of Student Motivation Behavior on e-Learning Based on Association Rule Mining

Authors: Kunyanuth Kularbphettong, Phanu Waraporn, Cholticha Tongsiri

Abstract:

This research aims to create a model for analysis of student motivation behavior on e-Learning based on association rule mining techniques in case of the Information Technology for Communication and Learning Course at Suan Sunandha Rajabhat University. The model was created under association rules, one of the data mining techniques with minimum confidence. The results showed that the student motivation behavior model by using association rule technique can indicate the important variables that influence the student motivation behavior on e-Learning.

Keywords: Motivation behavior, e-learning, moodle log, association rule mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1830
11575 Semantically Enriched Web Usage Mining for Personalization

Authors: Suresh Shirgave, Prakash Kulkarni, José Borges

Abstract:

The continuous growth in the size of the World Wide Web has resulted in intricate Web sites, demanding enhanced user skills and more sophisticated tools to help the Web user to find the desired information. In order to make Web more user friendly, it is necessary to provide personalized services and recommendations to the Web user. For discovering interesting and frequent navigation patterns from Web server logs many Web usage mining techniques have been applied. The recommendation accuracy of usage based techniques can be improved by integrating Web site content and site structure in the personalization process.

Herein, we propose semantically enriched Web Usage Mining method for Personalization (SWUMP), an extension to solely usage based technique. This approach is a combination of the fields of Web Usage Mining and Semantic Web. In the proposed method, we envisage enriching the undirected graph derived from usage data with rich semantic information extracted from the Web pages and the Web site structure. The experimental results show that the SWUMP generates accurate recommendations and is able to achieve 10-20% better accuracy than the solely usage based model. The SWUMP addresses the new item problem inherent to solely usage based techniques.

Keywords: Prediction, Recommendation, Semantic Web Usage Mining, Web Usage Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2977