Search results for: association rule mining.
1031 Constraint Based Frequent Pattern Mining Technique for Solving GCS Problem
Authors: First G.M. Karthik, Second Ramachandra.V.Pujeri, Dr.
Abstract:
Generalized Center String (GCS) problem are generalized from Common Approximate Substring problem and Common substring problems. GCS are known to be NP-hard allowing the problems lies in the explosion of potential candidates. Finding longest center string without concerning the sequence that may not contain any motifs is not known in advance in any particular biological gene process. GCS solved by frequent pattern-mining techniques and known to be fixed parameter tractable based on the fixed input sequence length and symbol set size. Efficient method known as Bpriori algorithms can solve GCS with reasonable time/space complexities. Bpriori 2 and Bpriori 3-2 algorithm are been proposed of any length and any positions of all their instances in input sequences. In this paper, we reduced the time/space complexity of Bpriori algorithm by Constrained Based Frequent Pattern mining (CBFP) technique which integrates the idea of Constraint Based Mining and FP-tree mining. CBFP mining technique solves the GCS problem works for all center string of any length, but also for the positions of all their mutated copies of input sequence. CBFP mining technique construct TRIE like with FP tree to represent the mutated copies of center string of any length, along with constraints to restraint growth of the consensus tree. The complexity analysis for Constrained Based FP mining technique and Bpriori algorithm is done based on the worst case and average case approach. Algorithm's correctness compared with the Bpriori algorithm using artificial data is shown.Keywords: Constraint Based Mining, FP tree, Data mining, GCS problem, CBFP mining technique.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17031030 Deep Web Content Mining
Authors: Shohreh Ajoudanian, Mohammad Davarpanah Jazi
Abstract:
The rapid expansion of the web is causing the constant growth of information, leading to several problems such as increased difficulty of extracting potentially useful knowledge. Web content mining confronts this problem gathering explicit information from different web sites for its access and knowledge discovery. Query interfaces of web databases share common building blocks. After extracting information with parsing approach, we use a new data mining algorithm to match a large number of schemas in databases at a time. Using this algorithm increases the speed of information matching. In addition, instead of simple 1:1 matching, they do complex (m:n) matching between query interfaces. In this paper we present a novel correlation mining algorithm that matches correlated attributes with smaller cost. This algorithm uses Jaccard measure to distinguish positive and negative correlated attributes. After that, system matches the user query with different query interfaces in special domain and finally chooses the nearest query interface with user query to answer to it.Keywords: Content mining, complex matching, correlation mining, information extraction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22781029 Mining Genes Relations in Microarray Data Combined with Ontology in Colon Cancer Automated Diagnosis System
Authors: A. Gruzdz, A. Ihnatowicz, J. Siddiqi, B. Akhgar
Abstract:
MATCH project [1] entitle the development of an automatic diagnosis system that aims to support treatment of colon cancer diseases by discovering mutations that occurs to tumour suppressor genes (TSGs) and contributes to the development of cancerous tumours. The constitution of the system is based on a) colon cancer clinical data and b) biological information that will be derived by data mining techniques from genomic and proteomic sources The core mining module will consist of the popular, well tested hybrid feature extraction methods, and new combined algorithms, designed especially for the project. Elements of rough sets, evolutionary computing, cluster analysis, self-organization maps and association rules will be used to discover the annotations between genes, and their influence on tumours [2]-[11]. The methods used to process the data have to address their high complexity, potential inconsistency and problems of dealing with the missing values. They must integrate all the useful information necessary to solve the expert's question. For this purpose, the system has to learn from data, or be able to interactively specify by a domain specialist, the part of the knowledge structure it needs to answer a given query. The program should also take into account the importance/rank of the particular parts of data it analyses, and adjusts the used algorithms accordingly.Keywords: Bioinformatics, gene expression, ontology, selforganizingmaps.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19741028 Efficient Web Usage Mining Based on K-Medoids Clustering Technique
Authors: P. Sengottuvelan, T. Gopalakrishnan
Abstract:
Web Usage Mining is the application of data mining techniques to find usage patterns from web log data, so as to grasp required patterns and serve the requirements of Web-based applications. User’s expertise on the internet may be improved by minimizing user’s web access latency. This may be done by predicting the future search page earlier and the same may be prefetched and cached. Therefore, to enhance the standard of web services, it is needed topic to research the user web navigation behavior. Analysis of user’s web navigation behavior is achieved through modeling web navigation history. We propose this technique which cluster’s the user sessions, based on the K-medoids technique.Keywords: Clustering, K-medoids, Recommendation, User Session, Web Usage Mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13961027 Comprehensive Analysis of Data Mining Tools
Authors: S. Sarumathi, N. Shanthi
Abstract:
Due to the fast and flawless technological innovation there is a tremendous amount of data dumping all over the world in every domain such as Pattern Recognition, Machine Learning, Spatial Data Mining, Image Analysis, Fraudulent Analysis, World Wide Web etc., This issue turns to be more essential for developing several tools for data mining functionalities. The major aim of this paper is to analyze various tools which are used to build a resourceful analytical or descriptive model for handling large amount of information more efficiently and user friendly. In this survey the diverse tools are illustrated with their extensive technical paradigm, outstanding graphical interface and inbuilt multipath algorithms in which it is very useful for handling significant amount of data more indeed.
Keywords: Classification, Clustering, Data Mining, Machine learning, Visualization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24391026 On the Analysis of Localization Accuracy of Wireless Indoor Positioning Systems using Cramer's Rule
Authors: Kriangkrai Maneerat, Chutima Prommak
Abstract:
This paper presents an analysis of the localization accuracy of indoor positioning systems using Cramer-s rule via IEEE 802.15.4 wireless sensor networks. The objective is to study the impact of the methods used to convert the received signal strength into the distance that is used to compute the object location in the wireless indoor positioning system. Various methods were tested and the localization accuracy was analyzed. The experimental results show that the method based on the empirical data measured in the non line-of-sight (NLOS) environment yield the highest localization accuracy; with the minimum error distance less than 3 m.
Keywords: Indoor positioning systems, localization accuracy, wireless networks, Cramer's rule.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19681025 Data Mining Classification Methods Applied in Drug Design
Authors: Mária Stachová, Lukáš Sobíšek
Abstract:
Data mining incorporates a group of statistical methods used to analyze a set of information, or a data set. It operates with models and algorithms, which are powerful tools with the great potential. They can help people to understand the patterns in certain chunk of information so it is obvious that the data mining tools have a wide area of applications. For example in the theoretical chemistry data mining tools can be used to predict moleculeproperties or improve computer-assisted drug design. Classification analysis is one of the major data mining methodologies. The aim of thecontribution is to create a classification model, which would be able to deal with a huge data set with high accuracy. For this purpose logistic regression, Bayesian logistic regression and random forest models were built using R software. TheBayesian logistic regression in Latent GOLD software was created as well. These classification methods belong to supervised learning methods. It was necessary to reduce data matrix dimension before construct models and thus the factor analysis (FA) was used. Those models were applied to predict the biological activity of molecules, potential new drug candidates.Keywords: data mining, classification, drug design, QSAR
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 28511024 Fuzzy Rules Generation and Extraction from Support Vector Machine Based on Kernel Function Firing Signals
Authors: Prasan Pitiranggon, Nunthika Benjathepanun, Somsri Banditvilai, Veera Boonjing
Abstract:
Our study proposes an alternative method in building Fuzzy Rule-Based System (FRB) from Support Vector Machine (SVM). The first set of fuzzy IF-THEN rules is obtained through an equivalence of the SVM decision network and the zero-ordered Sugeno FRB type of the Adaptive Network Fuzzy Inference System (ANFIS). The second set of rules is generated by combining the first set based on strength of firing signals of support vectors using Gaussian kernel. The final set of rules is then obtained from the second set through input scatter partitioning. A distinctive advantage of our method is the guarantee that the number of final fuzzy IFTHEN rules is not more than the number of support vectors in the trained SVM. The final FRB system obtained is capable of performing classification with results comparable to its SVM counterpart, but it has an advantage over the black-boxed SVM in that it may reveal human comprehensible patterns.Keywords: Fuzzy Rule Base, Rule Extraction, Rule Generation, Support Vector Machine.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19021023 Comparing the Efficiency of Simpson’s 1/3 and 3/8 Rules for the Numerical Solution of First Order Volterra Integro-Differential Equations
Authors: N. M. Kamoh, D. G. Gyemang, M. C. Soomiyol
Abstract:
This paper compared the efficiency of Simpson’s 1/3 and 3/8 rules for the numerical solution of first order Volterra integro-differential equations. In developing the solution, collocation approximation method was adopted using the shifted Legendre polynomial as basis function. A block method approach is preferred to the predictor corrector method for being self-starting. Experimental results confirmed that the Simpson’s 3/8 rule is more efficient than the Simpson’s 1/3 rule.
Keywords: Collocation shifted Legendre polynomials, Simpson’s rule and Volterra integro-differential equations.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9771022 A Study of Soil Heavy Metal Pollution in the Manganese Mining in Drama, Greece
Authors: A. Argiri, A. Molla, Tzouvalekas, E. Skoufogianni, N. Danalatos
Abstract:
The release of heavy metals into the environment has increased over the last years. In this study, 25 soil samples (0-15 cm) from the fields near the mining area in Drama region were selected. The samples were analyzed in the laboratory for their physicochemical properties and for seven “pseudo-total’’ heavy metals content, namely Pb, Zn, Cd, Cr, Cu, Ni, and Mn. The total metal concentrations (Pb, Zn, Cd, Cr, Cu, Ni and Mn) in digests were determined by using the atomic absorption spectrophotometer. According to the results, the mean concentration of the listed heavy metals in 25 soil samples are Cd 1.1 mg/kg, Cr 15 mg/kg, Cu 21.7 mg/kg, Ni 30.1 mg/kg, Pd 50.8 mg/kg, Zn 99.5 mg/kg and Mn 815.3 mg/kg. The results show that the heavy metals remain in the soil even if the mining closed many years ago.
Keywords: Greece, heavy metals, mining, pollution
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5851021 Distributed Data-Mining by Probability-Based Patterns
Authors: M. Kargar, F. Gharbalchi
Abstract:
In this paper a new method is suggested for distributed data-mining by the probability patterns. These patterns use decision trees and decision graphs. The patterns are cared to be valid, novel, useful, and understandable. Considering a set of functions, the system reaches to a good pattern or better objectives. By using the suggested method we will be able to extract the useful information from massive and multi-relational data bases.Keywords: Data-mining, Decision tree, Decision graph, Pattern, Relationship.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15571020 M2LGP: Mining Multiple Level Gradual Patterns
Authors: Yogi Satrya Aryadinata, Anne Laurent, Michel Sala
Abstract:
Gradual patterns have been studied for many years as they contain precious information. They have been integrated in many expert systems and rule-based systems, for instance to reason on knowledge such as “the greater the number of turns, the greater the number of car crashes”. In many cases, this knowledge has been considered as a rule “the greater the number of turns → the greater the number of car crashes” Historically, works have thus been focused on the representation of such rules, studying how implication could be defined, especially fuzzy implication. These rules were defined by experts who were in charge to describe the systems they were working on in order to turn them to operate automatically. More recently, approaches have been proposed in order to mine databases for automatically discovering such knowledge. Several approaches have been studied, the main scientific topics being: how to determine what is an relevant gradual pattern, and how to discover them as efficiently as possible (in terms of both memory and CPU usage). However, in some cases, end-users are not interested in raw level knowledge, and are rather interested in trends. Moreover, it may be the case that no relevant pattern can be discovered at a low level of granularity (e.g. city), whereas some can be discovered at a higher level (e.g. county). In this paper, we thus extend gradual pattern approaches in order to consider multiple level gradual patterns. For this purpose, we consider two aggregation policies, namely horizontal and vertical.Keywords: Gradual Pattern.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15001019 Quantification of GHGs Emissions from Electricity and Diesel Fuel Consumption in Basalt Mining Industry in Thailand
Authors: S. Kittipongvises, A. Dubsok
Abstract:
The mineral and mining industry is necessary for countries to have an adequate and reliable supply of materials to meet their socio-economic development. Despite its importance, the environmental impacts from mineral exploration are hugely significant. This study aimed to investigate and quantify the amount of GHGs emissions emitted from both electricity and diesel vehicle fuel consumption in basalt mining in Thailand. Plant A, located in the northeastern region of Thailand, was selected as a case study. Results indicated that total GHGs emissions from basalt mining and operation (Plant A) were approximately 2,501,086 kgCO2e and 1,997,412 kgCO2e in 2014 and 2015, respectively. The estimated carbon intensity ranged between 1.824 kgCO2e to 2.284 kgCO2e per ton of rock product. Scope 1 (direct emissions) was the dominant driver of its total GHGs compared to scope 2 (indirect emissions). As such, transport related combustion of diesel fuels generated the highest GHGs emission (65%) compared to emissions from purchased electricity (35%). Some of the potential implications for mining entities were also presented.
Keywords: Basalt mining, diesel fuel, electricity, GHGs emissions, Thailand.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10551018 A Data Mining Model for Detecting Financial and Operational Risk Indicators of SMEs
Authors: Ali Serhan Koyuncugil, Nermin Ozgulbas
Abstract:
In this paper, a data mining model to SMEs for detecting financial and operational risk indicators by data mining is presenting. The identification of the risk factors by clarifying the relationship between the variables defines the discovery of knowledge from the financial and operational variables. Automatic and estimation oriented information discovery process coincides the definition of data mining. During the formation of model; an easy to understand, easy to interpret and easy to apply utilitarian model that is far from the requirement of theoretical background is targeted by the discovery of the implicit relationships between the data and the identification of effect level of every factor. In addition, this paper is based on a project which was funded by The Scientific and Technological Research Council of Turkey (TUBITAK).
Keywords: Risk Management, Financial Risk, Operational Risk, Financial Early Warning System, Data Mining, CHAID Decision Tree Algorithm, SMEs.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 31241017 Studies of Rule Induction by STRIM from the Decision Table with Contaminated Attribute Values from Missing Data and Noise — In the Case of Critical Dataset Size —
Authors: Tetsuro Saeki, Yuichi Kato, Shoutarou Mizuno
Abstract:
STRIM (Statistical Test Rule Induction Method) has been proposed as a method to effectively induct if-then rules from the decision table which is considered as a sample set obtained from the population of interest. Its usefulness has been confirmed by simulation experiments specifying rules in advance, and by comparison with conventional methods. However, scope for future development remains before STRIM can be applied to the analysis of real-world data sets. The first requirement is to determine the size of the dataset needed for inducting true rules, since finding statistically significant rules is the core of the method. The second is to examine the capacity of rule induction from datasets with contaminated attribute values created by missing data and noise, since real-world datasets usually contain such contaminated data. This paper examines the first problem theoretically, in connection with the rule length. The second problem is then examined in a simulation experiment, utilizing the critical size of dataset derived from the first step. The experimental results show that STRIM is highly robust in the analysis of datasets with contaminated attribute values, and hence is applicable to real-world data
Keywords: Rule induction, decision table, missing data, noise.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14641016 Signed Approach for Mining Web Content Outliers
Authors: G. Poonkuzhali, K.Thiagarajan, K.Sarukesi, G.V.Uma
Abstract:
The emergence of the Internet has brewed the revolution of information storage and retrieval. As most of the data in the web is unstructured, and contains a mix of text, video, audio etc, there is a need to mine information to cater to the specific needs of the users without loss of important hidden information. Thus developing user friendly and automated tools for providing relevant information quickly becomes a major challenge in web mining research. Most of the existing web mining algorithms have concentrated on finding frequent patterns while neglecting the less frequent ones that are likely to contain outlying data such as noise, irrelevant and redundant data. This paper mainly focuses on Signed approach and full word matching on the organized domain dictionary for mining web content outliers. This Signed approach gives the relevant web documents as well as outlying web documents. As the dictionary is organized based on the number of characters in a word, searching and retrieval of documents takes less time and less space.Keywords: Outliers, Relevant document, , Signed Approach, Web content mining, Web documents..
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23491015 Predictions Using Data Mining and Case-based Reasoning: A Case Study for Retinopathy
Authors: Vimala Balakrishnan, Mohammad R. Shakouri, Hooman Hoodeh, Loo, Huck-Soo
Abstract:
Diabetes is one of the high prevalence diseases worldwide with increased number of complications, with retinopathy as one of the most common one. This paper describes how data mining and case-based reasoning were integrated to predict retinopathy prevalence among diabetes patients in Malaysia. The knowledge base required was built after literature reviews and interviews with medical experts. A total of 140 diabetes patients- data were used to train the prediction system. A voting mechanism selects the best prediction results from the two techniques used. It has been successfully proven that both data mining and case-based reasoning can be used for retinopathy prediction with an improved accuracy of 85%.Keywords: Case-Based Reasoning, Data Mining, Prediction, Retinopathy.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 30231014 Optimal Classifying and Extracting Fuzzy Relationship from Query Using Text Mining Techniques
Authors: Faisal Alshuwaier, Ali Areshey
Abstract:
Text mining techniques are generally applied for classifying the text, finding fuzzy relations and structures in data sets. This research provides plenty text mining capabilities. One common application is text classification and event extraction, which encompass deducing specific knowledge concerning incidents referred to in texts. The main contribution of this paper is the clarification of a concept graph generation mechanism, which is based on a text classification and optimal fuzzy relationship extraction. Furthermore, the work presented in this paper explains the application of fuzzy relationship extraction and branch and bound (BB) method to simplify the texts.
Keywords: Extraction, Max-Prod, Fuzzy Relations, Text Mining, Memberships, Classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21841013 Benefits and Issues of Open-Cut Coal Mining on the Socio-Economic Environment - The Iban Community in Mukah, Sarawak, Malaysia
Authors: Edward Lim
Abstract:
This paper deals principally with the socio-economic impact on the local Iban community in Mukah Division, Sarawak; with the commencement of the open-cut coal mining industry since 2003. To-date there are no actual studies being carried out by either the public or private sector to truly analyze how the Iban community is coping with the advent of a large influx of cash into their society. The Iban community has traditionally been practicing shifting cultivation and farming of domesticated animals; with a portion of the younger generation working as laborers and professional. This paper represents the views and observations of the author supported by some statistical facts extracted from published articles and non-published reports. The paper deals primarily in the following areas: • Background of the coal mining industry in Mukah Division, Sarawak; • Benefits of the coal mining industry towards the Iban community; • Issues / Problems arise in the Iban community because of the presence of the coal mining industry; and • Possible actions that need to be taken to overcome these issues/ problems.
Keywords: Coal Mining, Iban Community, Malaysia, Sub-Bituminous Coal.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24431012 The Application of Data Mining Technology in Building Energy Consumption Data Analysis
Authors: Liang Zhao, Jili Zhang, Chongquan Zhong
Abstract:
Energy consumption data, in particular those involving public buildings, are impacted by many factors: the building structure, climate/environmental parameters, construction, system operating condition, and user behavior patterns. Traditional methods for data analysis are insufficient. This paper delves into the data mining technology to determine its application in the analysis of building energy consumption data including energy consumption prediction, fault diagnosis, and optimal operation. Recent literature are reviewed and summarized, the problems faced by data mining technology in the area of energy consumption data analysis are enumerated, and research points for future studies are given.
Keywords: Data mining, data analysis, prediction, optimization, building operational performance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 37091011 An Application for Web Mining Systems with Services Oriented Architecture
Authors: Thiago M. R. Dias, Gray F. Moita, Paulo E. M. Almeida
Abstract:
Although the World Wide Web is considered the largest source of information there exists nowadays, due to its inherent dynamic characteristics, the task of finding useful and qualified information can become a very frustrating experience. This study presents a research on the information mining systems in the Web; and proposes an implementation of these systems by means of components that can be built using the technology of Web services. This implies that they can encompass features offered by a services oriented architecture (SOA) and specific components may be used by other tools, independent of platforms or programming languages. Hence, the main objective of this work is to provide an architecture to Web mining systems, divided into stages, where each step is a component that will incorporate the characteristics of SOA. The separation of these steps was designed based upon the existing literature. Interesting results were obtained and are shown here.Keywords: Web Mining, Service Oriented Architecture, WebServices.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14721010 AniMoveMineR: Animal Behavior Exploratory Analysis Using Association Rules Mining
Authors: Suelane Garcia Fontes, Silvio Luiz Stanzani, Pedro L. Pizzigatti Corrła Ronaldo G. Morato
Abstract:
Environmental changes and major natural disasters are most prevalent in the world due to the damage that humanity has caused to nature and these damages directly affect the lives of animals. Thus, the study of animal behavior and their interactions with the environment can provide knowledge that guides researchers and public agencies in preservation and conservation actions. Exploratory analysis of animal movement can determine the patterns of animal behavior and with technological advances the ability of animals to be tracked and, consequently, behavioral studies have been expanded. There is a lot of research on animal movement and behavior, but we note that a proposal that combines resources and allows for exploratory analysis of animal movement and provide statistical measures on individual animal behavior and its interaction with the environment is missing. The contribution of this paper is to present the framework AniMoveMineR, a unified solution that aggregates trajectory analysis and data mining techniques to explore animal movement data and provide a first step in responding questions about the animal individual behavior and their interactions with other animals over time and space. We evaluated the framework through the use of monitored jaguar data in the city of Miranda Pantanal, Brazil, in order to verify if the use of AniMoveMineR allows to identify the interaction level between these jaguars. The results were positive and provided indications about the individual behavior of jaguars and about which jaguars have the highest or lowest correlation.Keywords: Data mining, data science, trajectory, animal behavior.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9191009 Three Computational Mathematics Techniques: Comparative Determination of Area under Curve
Authors: Khalid Pervaiz Akhter, Mahmood Ahmad, Ghulam Murtaza, Ishrat Shafi, Zafar Javed
Abstract:
The objective of this manuscript is to find area under the plasma concentration- time curve (AUC) for multiple doses of salbutamol sulphate sustained release tablets (Ventolin® oral tablets SR 8 mg, GSK, Pakistan) in the group of 18 healthy adults by using computational mathematics techniques. Following the administration of 4 doses of Ventolin® tablets 12 hourly to 24 healthy human subjects and bioanalysis of obtained plasma samples, plasma drug concentration-time profile was constructed. AUC, an important pharmacokinetic parameter, was measured using integrated equation of multiple oral dose regimens. The approximated AUC was also calculated by using computational mathematics techniques such as repeated rectangular, repeated trapezium and repeated Simpson's rule and compared with exact value of AUC calculated by using integrated equation of multiple oral dose regimens to find best computational mathematics method that gives AUC values closest to exact. The exact values of AUC for four consecutive doses of Ventolin® oral tablets were 150.5819473, 157.8131756, 164.4178231 and 162.78 ng.h/ml while the closest values approximated AUC values were 149.245962, 157.336171, 164.2585768 and 162.289224 ng.h/ml, respectively as found by repeated rectangular rule. The errors in the approximated values of AUC were negligible. It is concluded that all computational tools approximated values of AUC accurately but the repeated rectangular rule gives slightly better approximated values of AUC as compared to repeated trapezium and repeated Simpson's rules.
Keywords: Salbutamol sulphate, Area under curve (AUC), repeated rectangular rule, repeated trapezium rule, repeated Simpson's rule.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18421008 A Web Designer Agent, Based On Usage Mining Online Behavior of Visitors
Authors: Babak Abedin, Babak Sohrabi
Abstract:
Website plays a significant role in success of an e-business. It is the main start point of any organization and corporation for its customers, so it's important to customize and design it according to the visitors' preferences. Also, websites are a place to introduce services of an organization and highlight new service to the visitors and audiences. In this paper, we will use web usage mining techniques, as a new field of research in data mining and knowledge discovery, in an Iranian government website. Using the results, a framework for web content layour is proposed. An agent is designed to dynamically update and improve web links locations and layout. Then, we will explain how it is used to directly enable top managers of the organization to influence on the arrangement of web contents and also to enhance customization of web site navigation due to online users' behaviors.
Keywords: Web usage mining, website design, agent, website customization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19311007 Class Outliers Mining: Distance-Based Approach
Authors: Nabil M. Hewahi, Motaz K. Saad
Abstract:
In large datasets, identifying exceptional or rare cases with respect to a group of similar cases is considered very significant problem. The traditional problem (Outlier Mining) is to find exception or rare cases in a dataset irrespective of the class label of these cases, they are considered rare events with respect to the whole dataset. In this research, we pose the problem that is Class Outliers Mining and a method to find out those outliers. The general definition of this problem is “given a set of observations with class labels, find those that arouse suspicions, taking into account the class labels". We introduce a novel definition of Outlier that is Class Outlier, and propose the Class Outlier Factor (COF) which measures the degree of being a Class Outlier for a data object. Our work includes a proposal of a new algorithm towards mining of the Class Outliers, presenting experimental results applied on various domains of real world datasets and finally a comparison study with other related methods is performed.Keywords: Class Outliers, Distance-Based Approach, Outliers Mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 33881006 Modelling of Powered Roof Supports Work
Authors: Marcin Michalak
Abstract:
Due to the increasing efforts on saving our natural environment a change in the structure of energy resources can be observed - an increasing fraction of a renewable energy sources. In many countries traditional underground coal mining loses its significance but there are still countries, like Poland or Germany, in which the coal based technologies have the greatest fraction in a total energy production. This necessitates to make an effort to limit the costs and negative effects of underground coal mining. The longwall complex is as essential part of the underground coal mining. The safety and the effectiveness of the work is strongly dependent of the diagnostic state of powered roof supports. The building of a useful and reliable diagnostic system requires a lot of data. As the acquisition of a data of any possible operating conditions it is important to have a possibility to generate a demanded artificial working characteristics. In this paper a new approach of modelling a leg pressure in the single unit of powered roof support. The model is a result of the analysis of a typical working cycles.Keywords: Machine modelling, underground mining, coal mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19261005 Modified Data Mining Approach for Defective Diagnosis in Hard Disk Drive Industry
Authors: S. Soommat, S. Patamatamkul, T. Prempridi, M. Sritulyachot, P. Ineure, S. Yimman
Abstract:
Currently, slider process of Hard Disk Drive Industry become more complex, defective diagnosis for yield improvement becomes more complicated and time-consumed. Manufacturing data analysis with data mining approach is widely used for solving that problem. The existing mining approach from combining of the KMean clustering, the machine oriented Kruskal-Wallis test and the multivariate chart were applied for defective diagnosis but it is still be a semiautomatic diagnosis system. This article aims to modify an algorithm to support an automatic decision for the existing approach. Based on the research framework, the new approach can do an automatic diagnosis and help engineer to find out the defective factors faster than the existing approach about 50%.Keywords: Slider process, Defective diagnosis and Data mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11991004 Roof Material Detection Based on Object-Based Approach Using WorldView-2 Satellite Imagery
Authors: Ebrahim Taherzadeh, Helmi Z. M. Shafri, Kaveh Shahi
Abstract:
One of the most important tasks in urban remote sensing is the detection of impervious surfaces (IS), such as roofs and roads. However, detection of IS in heterogeneous areas still remains one of the most challenging tasks. In this study, detection of concrete roof using an object-based approach was proposed. A new rule-based classification was developed to detect concrete roof tile. This proposed rule-based classification was applied to WorldView-2 image and results showed that the proposed rule has good potential to predict concrete roof material from WorldView-2 images, with 85% accuracy.
Keywords: Urban remote sensing, impervious surface, Object- Based, Roof Material, Concrete tile, WorldView-2.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 37931003 Discovery of Fuzzy Censored Production Rules from Large Set of Discovered Fuzzy if then Rules
Authors: Tamanna Siddiqui, M. Afshar Alam
Abstract:
Censored Production Rule is an extension of standard production rule, which is concerned with problems of reasoning with incomplete information, subject to resource constraints and problem of reasoning efficiently with exceptions. A CPR has a form: IF A (Condition) THEN B (Action) UNLESS C (Censor), Where C is the exception condition. Fuzzy CPR are obtained by augmenting ordinary fuzzy production rule “If X is A then Y is B with an exception condition and are written in the form “If X is A then Y is B Unless Z is C. Such rules are employed in situation in which the fuzzy conditional statement “If X is A then Y is B" holds frequently and the exception condition “Z is C" holds rarely. Thus “If X is A then Y is B" part of the fuzzy CPR express important information while the unless part acts only as a switch that changes the polarity of “Y is B" to “Y is not B" when the assertion “Z is C" holds. The proposed approach is an attempt to discover fuzzy censored production rules from set of discovered fuzzy if then rules in the form: A(X) ÔçÆ B(Y) || C(Z).Keywords: Uncertainty Quantification, Fuzzy if then rules, Fuzzy Censored Production Rules, Learning algorithm.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14861002 Semantically Enriched Web Usage Mining for Personalization
Authors: Suresh Shirgave, Prakash Kulkarni, José Borges
Abstract:
The continuous growth in the size of the World Wide Web has resulted in intricate Web sites, demanding enhanced user skills and more sophisticated tools to help the Web user to find the desired information. In order to make Web more user friendly, it is necessary to provide personalized services and recommendations to the Web user. For discovering interesting and frequent navigation patterns from Web server logs many Web usage mining techniques have been applied. The recommendation accuracy of usage based techniques can be improved by integrating Web site content and site structure in the personalization process.
Herein, we propose semantically enriched Web Usage Mining method for Personalization (SWUMP), an extension to solely usage based technique. This approach is a combination of the fields of Web Usage Mining and Semantic Web. In the proposed method, we envisage enriching the undirected graph derived from usage data with rich semantic information extracted from the Web pages and the Web site structure. The experimental results show that the SWUMP generates accurate recommendations and is able to achieve 10-20% better accuracy than the solely usage based model. The SWUMP addresses the new item problem inherent to solely usage based techniques.
Keywords: Prediction, Recommendation, Semantic Web Usage Mining, Web Usage Mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3023