Search results for: Sequential Pattern mining.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1594

Search results for: Sequential Pattern mining.

1504 Video Mining for Creative Rendering

Authors: Mei Chen

Abstract:

More and more home videos are being generated with the ever growing popularity of digital cameras and camcorders. For many home videos, a photo rendering, whether capturing a moment or a scene within the video, provides a complementary representation to the video. In this paper, a video motion mining framework for creative rendering is presented. The user-s capture intent is derived by analyzing video motions, and respective metadata is generated for each capture type. The metadata can be used in a number of applications, such as creating video thumbnail, generating panorama posters, and producing slideshows of video.

Keywords: Motion mining, semantic abstraction, video mining, video representation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1605
1503 Application of Neural Networks in Financial Data Mining

Authors: Defu Zhang, Qingshan Jiang, Xin Li

Abstract:

This paper deals with the application of a well-known neural network technique, multilayer back-propagation (BP) neural network, in financial data mining. A modified neural network forecasting model is presented, and an intelligent mining system is developed. The system can forecast the buying and selling signs according to the prediction of future trends to stock market, and provide decision-making for stock investors. The simulation result of seven years to Shanghai Composite Index shows that the return achieved by this mining system is about three times as large as that achieved by the buy and hold strategy, so it is advantageous to apply neural networks to forecast financial time series, the different investors could benefit from it.

Keywords: Data mining, neural network, stock forecasting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3544
1502 Prospects, Problems of Marketing Research and Data Mining in Turkey

Authors: Sema Kurtuluş, Kemal Kurtuluş

Abstract:

The objective of this paper is to review and assess the methodological issues and problems in marketing research, data and knowledge mining in Turkey. As a summary, academic marketing research publications in Turkey have significant problems. The most vital problem seems to be related with modeling. Most of the publications had major weaknesses in modeling. There were also, serious problems regarding measurement and scaling, sampling and analyses. Analyses myopia seems to be the most important problem for young academia in Turkey. Another very important finding is the lack of publications on data and knowledge mining in the academic world.

Keywords: Marketing research, data mining, knowledge mining, research modeling, analyses.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1925
1501 A New Brazilian Friction-Resistant Low Alloy High Strength Steel – A Life Testing Approach

Authors: D. I. De Souza, G. P. Azevedo, R. Rocha

Abstract:

In this paper we will develop a sequential life test approach applied to a modified low alloy-high strength steel part used in highway overpasses in Brazil.We will consider two possible underlying sampling distributions: the Normal and theInverse Weibull models. The minimum life will be considered equal to zero. We will use the two underlying models to analyze a fatigue life test situation, comparing the results obtained from both.Since a major chemical component of this low alloy-high strength steel part has been changed, there is little information available about the possible values that the parameters of the corresponding Normal and Inverse Weibull underlying sampling distributions could have. To estimate the shape and the scale parameters of these two sampling models we will use a maximum likelihood approach for censored failure data. We will also develop a truncation mechanism for the Inverse Weibull and Normal models. We will provide rules to truncate a sequential life testing situation making one of the two possible decisions at the moment of truncation; that is, accept or reject the null hypothesis H0. An example will develop the proposed truncated sequential life testing approach for the Inverse Weibull and Normal models.

Keywords: Sequential life testing, normal and inverse Weibull models, maximum likelihood approach, truncation mechanism.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1390
1500 Knowledge Discovery and Data Mining Techniques in Textile Industry

Authors: Filiz Ersoz, Taner Ersoz, Erkin Guler

Abstract:

This paper addresses the issues and technique for textile industry using data mining techniques. Data mining has been applied to the stitching of garments products that were obtained from a textile company. Data mining techniques were applied to the data obtained from the CHAID algorithm, CART algorithm, Regression Analysis and, Artificial Neural Networks. Classification technique based analyses were used while data mining and decision model about the production per person and variables affecting about production were found by this method. In the study, the results show that as the daily working time increases, the production per person also decreases. In addition, the relationship between total daily working and production per person shows a negative result and the production per person show the highest and negative relationship.

Keywords: Data mining, textile production, decision trees, classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1491
1499 Classification of the Latin Alphabet as Pattern on ARToolkit Markers for Augmented Reality Applications

Authors: Mohamed Badeche, Mohamed Benmohammed

Abstract:

augmented reality is a technique used to insert virtual objects in real scenes. One of the most used libraries in the area is the ARToolkit library. It is based on the recognition of the markers that are in the form of squares with a pattern inside. This pattern which is mostly textual is source of confusing. In this paper, we present the results of a classification of Latin characters as a pattern on the ARToolkit markers to know the most distinguishable among them.

Keywords: ARToolkit library, augmented reality, K-means, patterns

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1782
1498 Web Log Mining by an Improved AprioriAll Algorithm

Authors: Wang Tong, He Pi-lian

Abstract:

This paper sets forth the possibility and importance about applying Data Mining in Web logs mining and shows some problems in the conventional searching engines. Then it offers an improved algorithm based on the original AprioriAll algorithm which has been used in Web logs mining widely. The new algorithm adds the property of the User ID during the every step of producing the candidate set and every step of scanning the database by which to decide whether an item in the candidate set should be put into the large set which will be used to produce next candidate set. At the meantime, in order to reduce the number of the database scanning, the new algorithm, by using the property of the Apriori algorithm, limits the size of the candidate set in time whenever it is produced. Test results show the improved algorithm has a more lower complexity of time and space, better restrain noise and fit the capacity of memory.

Keywords: Candidate Sets Pruning, Data Mining, ImprovedAlgorithm, Noise Restrain, Web Log

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2238
1497 Opinion Mining and Sentiment Analysis on DEFT

Authors: Najiba Ouled Omar, Azza Harbaoui, Henda Ben Ghezala

Abstract:

Current research practices sentiment analysis with a focus on social networks, DEfi Fouille de Texte (DEFT) (Text Mining Challenge) evaluation campaign focuses on opinion mining and sentiment analysis on social networks, especially social network Twitter. It aims to confront the systems produced by several teams from public and private research laboratories. DEFT offers participants the opportunity to work on regularly renewed themes and proposes to work on opinion mining in several editions. The purpose of this article is to scrutinize and analyze the works relating to opinions mining and sentiment analysis in the Twitter social network realized by DEFT. It examines the tasks proposed by the organizers of the challenge and the methods used by the participants.

Keywords: Opinion mining, sentiment analysis, emotion, polarity, annotation, OSEE, figurative language, DEFT, Twitter, Tweet.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 760
1496 Investigation of Different Stimulation Patterns to Reduce Muscle Fatigue during Functional Electrical Stimulation

Authors: R. Ruslee, H. Gollee

Abstract:

Functional electrical stimulation (FES) is a commonly used technique in rehabilitation and often associated with rapid muscle fatigue which becomes the limiting factor in its applications. The objective of this study is to investigate the effects on the onset of fatigue of conventional synchronous stimulation, as well as asynchronous stimulation that mimic voluntary muscle activation targeting different motor units which are activated sequentially or randomly via multiple pairs of stimulation electrodes. We investigate three different approaches with various electrode configurations, as well as different patterns of stimulation applied to the gastrocnemius muscle: Conventional Synchronous Stimulation (CSS), Asynchronous Sequential Stimulation (ASS) and Asynchronous Random Stimulation (ARS). Stimulation was applied repeatedly for 300 ms followed by 700 ms of no-stimulation with 40 Hz effective frequency for all protocols. Ten able-bodied volunteers (28±3 years old) participated in this study. As fatigue indicators, we focused on the analysis of Normalized Fatigue Index (NFI), Fatigue Time Interval (FTI) and pre-post Twitch-Tetanus Ratio (ΔTTR). The results demonstrated that ASS and ARS give higher NFI and longer FTI confirming less fatigue for asynchronous stimulation. In addition, ASS and ARS resulted in higher ΔTTR than conventional CSS. In this study, we proposed a randomly distributed stimulation method for the application of FES and investigated its suitability for reducing muscle fatigue compared to previously applied methods. The results validated that asynchronous stimulation reduces fatigue, and indicates that random stimulation may improve fatigue resistance in some conditions.

Keywords: Asynchronous stimulation, electrode configuration, functional electrical stimulation, muscle fatigue, pattern stimulation, random stimulation, sequential stimulation, synchronous stimulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1198
1495 Finding Fuzzy Association Rules Using FWFP-Growth with Linguistic Supports and Confidences

Authors: Chien-Hua Wang, Chin-Tzong Pang

Abstract:

In data mining, the association rules are used to search for the relations of items of the transactions database. Following the data is collected and stored, it can find rules of value through association rules, and assist manager to proceed marketing strategy and plan market framework. In this paper, we attempt fuzzy partition methods and decide membership function of quantitative values of each transaction item. Also, by managers we can reflect the importance of items as linguistic terms, which are transformed as fuzzy sets of weights. Next, fuzzy weighted frequent pattern growth (FWFP-Growth) is used to complete the process of data mining. The method above is expected to improve Apriori algorithm for its better efficiency of the whole association rules. An example is given to clearly illustrate the proposed approach.

Keywords: Association Rule, Fuzzy Partition Methods, FWFP-Growth, Apiroir algorithm

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1603
1494 Discovering User Behaviour Patterns from Web Log Analysis to Enhance the Accessibility and Usability of Website

Authors: Harpreet Singh

Abstract:

Finding relevant information on the World Wide Web is becoming highly challenging day by day. Web usage mining is used for the extraction of relevant and useful knowledge, such as user behaviour patterns, from web access log records. Web access log records all the requests for individual files that the users have requested from the website. Web usage mining is important for Customer Relationship Management (CRM), as it can ensure customer satisfaction as far as the interaction between the customer and the organization is concerned. Web usage mining is helpful in improving website structure or design as per the user’s requirement by analyzing the access log file of a website through a log analyzer tool. The focus of this paper is to enhance the accessibility and usability of a guitar selling web site by analyzing their access log through Deep Log Analyzer tool. The results show that the maximum number of users is from the United States and that they use Opera 9.8 web browser and the Windows XP operating system.

Keywords: Web usage mining, log file, web mining, data mining, deep log analyser.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1018
1493 STATISTICA Software: A State of the Art Review

Authors: S. Sarumathi, N. Shanthi, S. Vidhya, P. Ranjetha

Abstract:

Data mining idea is mounting rapidly in admiration and also in their popularity. The foremost aspire of data mining method is to extract data from a huge data set into several forms that could be comprehended for additional use. The data mining is a technology that contains with rich potential resources which could be supportive for industries and businesses that pay attention to collect the necessary information of the data to discover their customer’s performances. For extracting data there are several methods are available such as Classification, Clustering, Association, Discovering, and Visualization… etc., which has its individual and diverse algorithms towards the effort to fit an appropriate model to the data. STATISTICA mostly deals with excessive groups of data that imposes vast rigorous computational constraints. These results trials challenge cause the emergence of powerful STATISTICA Data Mining technologies. In this survey an overview of the STATISTICA software is illustrated along with their significant features.

Keywords: Data Mining, STATISTICA Data Miner, Text Miner, Enterprise Server, Classification, Association, Clustering, Regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2560
1492 Time-Derivative Estimation of Noisy Movie Data using Adaptive Control Theory

Authors: Soon-Hyun Park, Takami Matsuo

Abstract:

This paper presents an adaptive differentiator of sequential data based on the adaptive control theory. The algorithm is applied to detect moving objects by estimating a temporal gradient of sequential data at a specified pixel. We adopt two nonlinear intensity functions to reduce the influence of noises. The derivatives of the nonlinear intensity functions are estimated by an adaptive observer with σ-modification update law.

Keywords: Adaptive estimation, parameter adjustmentlaw, motion detection, temporal gradient, differential filter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1826
1491 The Data Mining usage in Production System Management

Authors: Pavel Vazan, Pavol Tanuska, Michal Kebisek

Abstract:

The paper gives the pilot results of the project that is oriented on the use of data mining techniques and knowledge discoveries from production systems through them. They have been used in the management of these systems. The simulation models of manufacturing systems have been developed to obtain the necessary data about production. The authors have developed the way of storing data obtained from the simulation models in the data warehouse. Data mining model has been created by using specific methods and selected techniques for defined problems of production system management. The new knowledge has been applied to production management system. Gained knowledge has been tested on simulation models of the production system. An important benefit of the project has been proposal of the new methodology. This methodology is focused on data mining from the databases that store operational data about the production process.

Keywords: data mining, data warehousing, management of production system, simulation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3441
1490 Bayesian Online Learning of Corresponding Points of Objects with Sequential Monte Carlo

Authors: Miika Toivanen, Jouko Lampinen

Abstract:

This paper presents an online method that learns the corresponding points of an object from un-annotated grayscale images containing instances of the object. In the first image being processed, an ensemble of node points is automatically selected which is matched in the subsequent images. A Bayesian posterior distribution for the locations of the nodes in the images is formed. The likelihood is formed from Gabor responses and the prior assumes the mean shape of the node ensemble to be similar in a translation and scale free space. An association model is applied for separating the object nodes and background nodes. The posterior distribution is sampled with Sequential Monte Carlo method. The matched object nodes are inferred to be the corresponding points of the object instances. The results show that our system matches the object nodes as accurately as other methods that train the model with annotated training images.

Keywords: Bayesian modeling, Gabor filters, Online learning, Sequential Monte Carlo.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1544
1489 Pattern Matching Based on Regular Tree Grammars

Authors: Riad S. Jabri

Abstract:

Pattern matching based on regular tree grammars have been widely used in many areas of computer science. In this paper, we propose a pattern matcher within the framework of code generation, based on a generic and a formalized approach. According to this approach, parsers for regular tree grammars are adapted to a general pattern matching solution, rather than adapting the pattern matching according to their parsing behavior. Hence, we first formalize the construction of the pattern matches respective to input trees drawn from a regular tree grammar in a form of the so-called match trees. Then, we adopt a recently developed generic parser and tightly couple its parsing behavior with such construction. In addition to its generality, the resulting pattern matcher is characterized by its soundness and efficient implementation. This is demonstrated by the proposed theory and by the derived algorithms for its implementation. A comparison with similar and well-known approaches, such as the ones based on tree automata and LR parsers, has shown that our pattern matcher can be applied to a broader class of grammars, and achieves better approximation of pattern matches in one pass. Furthermore, its use as a machine code selector is characterized by a minimized overhead, due to the balanced distribution of the cost computations into static ones, during parser generation time, and into dynamic ones, during parsing time.

Keywords: Bottom-up automata, Code selection, Pattern matching, Regular tree grammars, Match trees.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1224
1488 A Combined Approach of a Sequential Life Testing and an Accelerated Life Testing Applied to a Low-Alloy High Strength Steel Component

Authors: D. I. De Souza, D. R. Fonseca, G. P. Azevedo

Abstract:

Sometimes the amount of time available for testing could be considerably less than the expected lifetime of the component. To overcome such a problem, there is the accelerated life-testing alternative aimed at forcing components to fail by testing them at much higher-than-intended application conditions. These models are known as acceleration models. One possible way to translate test results obtained under accelerated conditions to normal using conditions could be through the application of the “Maxwell Distribution Law.” In this paper we will apply a combined approach of a sequential life testing and an accelerated life testing to a low alloy high-strength steel component used in the construction of overpasses in Brazil. The underlying sampling distribution will be three-parameter Inverse Weibull model. To estimate the three parameters of the Inverse Weibull model we will use a maximum likelihood approach for censored failure data. We will be assuming a linear acceleration condition. To evaluate the accuracy (significance) of the parameter values obtained under normal conditions for the underlying Inverse Weibull model we will apply to the expected normal failure times a sequential life testing using a truncation mechanism. An example will illustrate the application of this procedure.

Keywords: Sequential Life Testing, Accelerated Life Testing, Underlying Three-Parameter Weibull Model, Maximum Likelihood Approach, Hypothesis Testing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1607
1487 MATLAB-Based Graphical User Interface (GUI) for Data Mining as a Tool for Environment Management

Authors: M. Awawdeh, A. Fedi

Abstract:

The application of data mining to environmental monitoring has become crucial for a number of tasks related to emergency management. Over recent years, many tools have been developed for decision support system (DSS) for emergency management. In this article a graphical user interface (GUI) for environmental monitoring system is presented. This interface allows accomplishing (i) data collection and observation and (ii) extraction for data mining. This tool may be the basis for future development along the line of the open source software paradigm.

Keywords: Data Mining, Environmental data, Mathematical Models, Matlab Graphical User Interface.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4693
1486 A New Pattern for Handwritten Persian/Arabic Digit Recognition

Authors: A. Harifi, A. Aghagolzadeh

Abstract:

The main problem for recognition of handwritten Persian digits using Neural Network is to extract an appropriate feature vector from image matrix. In this research an asymmetrical segmentation pattern is proposed to obtain the feature vector. This pattern can be adjusted as an optimum model thanks to its one degree of freedom as a control point. Since any chosen algorithm depends on digit identity, a Neural Network is used to prevail over this dependence. Inputs of this Network are the moment of inertia and the center of gravity which do not depend on digit identity. Recognizing the digit is carried out using another Neural Network. Simulation results indicate the high recognition rate of 97.6% for new introduced pattern in comparison to the previous models for recognition of digits.

Keywords: Pattern recognition, Persian digits, NeuralNetwork.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1633
1485 Deep Web Content Mining

Authors: Shohreh Ajoudanian, Mohammad Davarpanah Jazi

Abstract:

The rapid expansion of the web is causing the constant growth of information, leading to several problems such as increased difficulty of extracting potentially useful knowledge. Web content mining confronts this problem gathering explicit information from different web sites for its access and knowledge discovery. Query interfaces of web databases share common building blocks. After extracting information with parsing approach, we use a new data mining algorithm to match a large number of schemas in databases at a time. Using this algorithm increases the speed of information matching. In addition, instead of simple 1:1 matching, they do complex (m:n) matching between query interfaces. In this paper we present a novel correlation mining algorithm that matches correlated attributes with smaller cost. This algorithm uses Jaccard measure to distinguish positive and negative correlated attributes. After that, system matches the user query with different query interfaces in special domain and finally chooses the nearest query interface with user query to answer to it.

Keywords: Content mining, complex matching, correlation mining, information extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2241
1484 An Improved k Nearest Neighbor Classifier Using Interestingness Measures for Medical Image Mining

Authors: J. Alamelu Mangai, Satej Wagle, V. Santhosh Kumar

Abstract:

The exponential increase in the volume of medical image database has imposed new challenges to clinical routine in maintaining patient history, diagnosis, treatment and monitoring. With the advent of data mining and machine learning techniques it is possible to automate and/or assist physicians in clinical diagnosis. In this research a medical image classification framework using data mining techniques is proposed. It involves feature extraction, feature selection, feature discretization and classification. In the classification phase, the performance of the traditional kNN k nearest neighbor classifier is improved using a feature weighting scheme and a distance weighted voting instead of simple majority voting. Feature weights are calculated using the interestingness measures used in association rule mining. Experiments on the retinal fundus images show that the proposed framework improves the classification accuracy of traditional kNN from 78.57 % to 92.85 %.

Keywords: Medical Image Mining, Data Mining, Feature Weighting, Association Rule Mining, k nearest neighbor classifier.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3260
1483 Syntactic Recognition of Distorted Patterns

Authors: Marek Skomorowski

Abstract:

In syntactic pattern recognition a pattern can be represented by a graph. Given an unknown pattern represented by a graph g, the problem of recognition is to determine if the graph g belongs to a language L(G) generated by a graph grammar G. The so-called IE graphs have been defined in [1] for a description of patterns. The IE graphs are generated by so-called ETPL(k) graph grammars defined in [1]. An efficient, parsing algorithm for ETPL(k) graph grammars for syntactic recognition of patterns represented by IE graphs has been presented in [1]. In practice, structural descriptions may contain pattern distortions, so that the assignment of a graph g, representing an unknown pattern, to a graph language L(G) generated by an ETPL(k) graph grammar G is rejected by the ETPL(k) type parsing. Therefore, there is a need for constructing effective parsing algorithms for recognition of distorted patterns. The purpose of this paper is to present a new approach to syntactic recognition of distorted patterns. To take into account all variations of a distorted pattern under study, a probabilistic description of the pattern is needed. A random IE graph approach is proposed here for such a description ([2]).

Keywords: Syntactic pattern recognition, Distorted patterns, Random graphs, Graph grammars.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1352
1482 A Hybrid Data Mining Method for the Medical Classification of Chest Pain

Authors: Sung Ho Ha, Seong Hyeon Joo

Abstract:

Data mining techniques have been used in medical research for many years and have been known to be effective. In order to solve such problems as long-waiting time, congestion, and delayed patient care, faced by emergency departments, this study concentrates on building a hybrid methodology, combining data mining techniques such as association rules and classification trees. The methodology is applied to real-world emergency data collected from a hospital and is evaluated by comparing with other techniques. The methodology is expected to help physicians to make a faster and more accurate classification of chest pain diseases.

Keywords: Data mining, medical decisions, medical domainknowledge, chest pain.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2157
1481 Developing Structured Sizing Systems for Manufacturing Ready-Made Garments of Indian Females Using Decision Tree-Based Data Mining

Authors: Hina Kausher, Sangita Srivastava

Abstract:

In India, there is a lack of standard, systematic sizing approach for producing readymade garments. Garments manufacturing companies use their own created size tables by modifying international sizing charts of ready-made garments. The purpose of this study is to tabulate the anthropometric data which cover the variety of figure proportions in both height and girth. 3,000 data have been collected by an anthropometric survey undertaken over females between the ages of 16 to 80 years from the some states of India to produce the sizing system suitable for clothing manufacture and retailing. The data are used for the statistical analysis of body measurements, the formulation of sizing systems and body measurements tables. Factor analysis technique is used to filter the control body dimensions from the large number of variables. Decision tree-based data mining is used to cluster the data. The standard and structured sizing system can facilitate pattern grading and garment production. Moreover, it can exceed buying ratios and upgrade size allocations to retail segments.

Keywords: Anthropometric data, data mining, decision tree, garments manufacturing, ready-made garments, sizing systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 881
1480 Efficient Web Usage Mining Based on K-Medoids Clustering Technique

Authors: P. Sengottuvelan, T. Gopalakrishnan

Abstract:

Web Usage Mining is the application of data mining techniques to find usage patterns from web log data, so as to grasp required patterns and serve the requirements of Web-based applications. User’s expertise on the internet may be improved by minimizing user’s web access latency. This may be done by predicting the future search page earlier and the same may be prefetched and cached. Therefore, to enhance the standard of web services, it is needed topic to research the user web navigation behavior. Analysis of user’s web navigation behavior is achieved through modeling web navigation history. We propose this technique which cluster’s the user sessions, based on the K-medoids technique.

Keywords: Clustering, K-medoids, Recommendation, User Session, Web Usage Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1358
1479 Pattern Recognition of Partial Discharge by Using Simplified Fuzzy ARTMAP

Authors: S. Boonpoke, B. Marungsri

Abstract:

This paper presents the effectiveness of artificial intelligent technique to apply for pattern recognition and classification of Partial Discharge (PD). Characteristics of PD signal for pattern recognition and classification are computed from the relation of the voltage phase angle, the discharge magnitude and the repeated existing of partial discharges by using statistical and fractal methods. The simplified fuzzy ARTMAP (SFAM) is used for pattern recognition and classification as artificial intelligent technique. PDs quantities, 13 parameters from statistical method and fractal method results, are inputted to Simplified Fuzzy ARTMAP to train system for pattern recognition and classification. The results confirm the effectiveness of purpose technique.

Keywords: Partial discharges, PD Pattern recognition, PDClassification, Artificial intelligent, Simplified Fuzzy ARTMAP

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3022
1478 Sign Pattern Matrices that Admit P0 Matrices

Authors: Ling Zhang, Ting-Zhu Huang

Abstract:

A P0-matrix is a real square matrix all of whose principle minors are nonnegative. In this paper, we consider the class of P0-matrix. Our main aim is to determine which sign pattern matrices are admissible for this class of real matrices.

Keywords: Sign pattern matrices, P0 matrices, graph, digraph.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1166
1477 Analysis of Student Motivation Behavior on e-Learning Based on Association Rule Mining

Authors: Kunyanuth Kularbphettong, Phanu Waraporn, Cholticha Tongsiri

Abstract:

This research aims to create a model for analysis of student motivation behavior on e-Learning based on association rule mining techniques in case of the Information Technology for Communication and Learning Course at Suan Sunandha Rajabhat University. The model was created under association rules, one of the data mining techniques with minimum confidence. The results showed that the student motivation behavior model by using association rule technique can indicate the important variables that influence the student motivation behavior on e-Learning.

Keywords: Motivation behavior, e-learning, moodle log, association rule mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1832
1476 Data Mining Classification Methods Applied in Drug Design

Authors: Mária Stachová, Lukáš Sobíšek

Abstract:

Data mining incorporates a group of statistical methods used to analyze a set of information, or a data set. It operates with models and algorithms, which are powerful tools with the great potential. They can help people to understand the patterns in certain chunk of information so it is obvious that the data mining tools have a wide area of applications. For example in the theoretical chemistry data mining tools can be used to predict moleculeproperties or improve computer-assisted drug design. Classification analysis is one of the major data mining methodologies. The aim of thecontribution is to create a classification model, which would be able to deal with a huge data set with high accuracy. For this purpose logistic regression, Bayesian logistic regression and random forest models were built using R software. TheBayesian logistic regression in Latent GOLD software was created as well. These classification methods belong to supervised learning methods. It was necessary to reduce data matrix dimension before construct models and thus the factor analysis (FA) was used. Those models were applied to predict the biological activity of molecules, potential new drug candidates.

Keywords: data mining, classification, drug design, QSAR

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2798
1475 Cirrhosis Mortality Prediction as Classification Using Frequent Subgraph Mining

Authors: Abdolghani Ebrahimi, Diego Klabjan, Chenxi Ge, Daniela Ladner, Parker Stride

Abstract:

In this work, we use machine learning and data analysis techniques to predict the one-year mortality of cirrhotic patients. Data from 2,322 patients with liver cirrhosis are collected at a single medical center. Different machine learning models are applied to predict one-year mortality. A comprehensive feature space including demographic information, comorbidity, clinical procedure and laboratory tests is being analyzed. A temporal pattern mining technic called Frequent Subgraph Mining (FSM) is being used. Model for End-stage liver disease (MELD) prediction of mortality is used as a comparator. All of our models statistically significantly outperform the MELD-score model and show an average 10% improvement of the area under the curve (AUC). The FSM technic itself does not improve the model significantly, but FSM, together with a machine learning technique called an ensemble, further improves the model performance. With the abundance of data available in healthcare through electronic health records (EHR), existing predictive models can be refined to identify and treat patients at risk for higher mortality. However, due to the sparsity of the temporal information needed by FSM, the FSM model does not yield significant improvements. Our work applies modern machine learning algorithms and data analysis methods on predicting one-year mortality of cirrhotic patients and builds a model that predicts one-year mortality significantly more accurate than the MELD score. We have also tested the potential of FSM and provided a new perspective of the importance of clinical features.

Keywords: machine learning, liver cirrhosis, subgraph mining, supervised learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 380