Search results for: Association rules mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1266

Search results for: Association rules mining

966 Enhance the Power of Sentiment Analysis

Authors: Yu Zhang, Pedro Desouza

Abstract:

Since big data has become substantially more accessible and manageable due to the development of powerful tools for dealing with unstructured data, people are eager to mine information from social media resources that could not be handled in the past. Sentiment analysis, as a novel branch of text mining, has in the last decade become increasingly important in marketing analysis, customer risk prediction and other fields. Scientists and researchers have undertaken significant work in creating and improving their sentiment models. In this paper, we present a concept of selecting appropriate classifiers based on the features and qualities of data sources by comparing the performances of five classifiers with three popular social media data sources: Twitter, Amazon Customer Reviews, and Movie Reviews. We introduced a couple of innovative models that outperform traditional sentiment classifiers for these data sources, and provide insights on how to further improve the predictive power of sentiment analysis. The modeling and testing work was done in R and Greenplum in-database analytic tools.

Keywords: Sentiment Analysis, Social Media, Twitter, Amazon, Data Mining, Machine Learning, Text Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3476
965 A Novel Approach to Improve Users Search Goal in Web Usage Mining

Authors: R. Lokeshkumar, P. Sengottuvelan

Abstract:

Web mining is to discover and extract useful Information. Different users may have different search goals when they search by giving queries and submitting it to a search engine. The inference and analysis of user search goals can be very useful for providing an experience result for a user search query. In this project, we propose a novel approach to infer user search goals by analyzing search web logs. First, we propose a novel approach to infer user search goals by analyzing search engine query logs, the feedback sessions are constructed from user click-through logs and it efficiently reflect the information needed for users. Second we propose a preprocessing technique to clean the unnecessary data’s from web log file (feedback session). Third we propose a technique to generate pseudo-documents to representation of feedback sessions for clustering. Finally we implement k-medoids clustering algorithm to discover different user search goals and to provide a more optimal result for a search query based on feedback sessions for the user.

Keywords: Data Preprocessing, Session Identification, Web log mining, Web Personalization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1985
964 Total and Leachable Concentration of Trace Elements in Soil towards Human Health Risk, Related with Coal Mine in Jorong, South Kalimantan, Indonesia

Authors: Arie Pujiwati, Kengo Nakamura, Noriaki Watanabe, Takeshi Komai

Abstract:

Coal mining is well known to cause considerable environmental impacts, including trace element contamination of soil. This study aimed to assess the trace element (As, Cd, Co, Cu, Ni, Pb, Sb, and Zn) contamination of soil in the vicinity of coal mining activities, using the case study of Asam-asam River basin, South Kalimantan, Indonesia, and to assess the human health risk, incorporating total and bioavailable (water-leachable and acid-leachable) concentrations. The results show the enrichment of As and Co in soil, surpassing the background soil value. Contamination was evaluated based on the index of geo-accumulation, Igeo and the pollution index, PI. Igeo values showed that the soil was generally uncontaminated (Igeo ≤ 0), except for elevated As and Co. Mean PI for Ni and Cu indicated slight contamination. Regarding the assessment of health risks, the Hazard Index, HI showed adverse risks (HI > 1) for Ni, Co, and As. Further, Ni and As were found to pose unacceptable carcinogenic risk (risk > 1.10-5). Farming, settlement, and plantation were found to present greater risk than coal mines. These results show that coal mining activity in the study area contaminates the soils by particular elements and may pose potential human health risk in its surrounding area. This study is important for setting appropriate countermeasure actions and improving basic coal mining management in Indonesia.

Keywords: Coal mine, risk, soil, trace elements.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1128
963 The Power of Indigenous Peoples in Decision-Making Processes of Mining Projects: The Pilbara Region

Authors: K. N. Penna, J. P. English

Abstract:

The destruction of the Juukan Gorge rock shelters in 2020 has catalysed impetus within Australian society for a significant change in engagement with Indigenous Peoples, and the approach to Indigenous cultural heritage, both within the Pilbara region and more broadly across Australia. Culture-based and people-centred approaches are inherent to inclusive sustainable development and Free, Prior, Informed Consent, outcomes encouraged by international and local recommendations on the human rights and cultural heritage preservation of Indigenous peoples. In this paper, we present an interpretive model of an evolved process for mining project development, incorporating culture-based and people-centred approaches, based on the Theory U system change method. The evolved process advocates a change in organisational mindset and culture, and a comprehensive understanding of Indigenous Peoples’ culture and values, as the foundations for increasing their influence and achieving mutually beneficial developments.

Keywords: Indigenous Engagement, mining industry, culture-based approach, people-centred approach, Theory U.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 352
962 Implementing an Intuitive Reasoner with a Large Weather Database

Authors: Yung-Chien Sun, O. Grant Clark

Abstract:

In this paper, the implementation of a rule-based intuitive reasoner is presented. The implementation included two parts: the rule induction module and the intuitive reasoner. A large weather database was acquired as the data source. Twelve weather variables from those data were chosen as the “target variables" whose values were predicted by the intuitive reasoner. A “complex" situation was simulated by making only subsets of the data available to the rule induction module. As a result, the rules induced were based on incomplete information with variable levels of certainty. The certainty level was modeled by a metric called "Strength of Belief", which was assigned to each rule or datum as ancillary information about the confidence in its accuracy. Two techniques were employed to induce rules from the data subsets: decision tree and multi-polynomial regression, respectively for the discrete and the continuous type of target variables. The intuitive reasoner was tested for its ability to use the induced rules to predict the classes of the discrete target variables and the values of the continuous target variables. The intuitive reasoner implemented two types of reasoning: fast and broad where, by analogy to human thought, the former corresponds to fast decision making and the latter to deeper contemplation. . For reference, a weather data analysis approach which had been applied on similar tasks was adopted to analyze the complete database and create predictive models for the same 12 target variables. The values predicted by the intuitive reasoner and the reference approach were compared with actual data. The intuitive reasoner reached near-100% accuracy for two continuous target variables. For the discrete target variables, the intuitive reasoner predicted at least 70% as accurately as the reference reasoner. Since the intuitive reasoner operated on rules derived from only about 10% of the total data, it demonstrated the potential advantages in dealing with sparse data sets as compared with conventional methods.

Keywords: Artificial intelligence, intuition, knowledge acquisition, limited certainty.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1346
961 Comparative Study of Universities’ Web Structure Mining

Authors: Z. Abdullah, A. R. Hamdan

Abstract:

This paper is meant to analyze the ranking of University of Malaysia Terengganu, UMT’s website in the World Wide Web. There are only few researches have been done on comparing the ranking of universities’ websites so this research will be able to determine whether the existing UMT’s website is serving its purpose which is to introduce UMT to the world. The ranking is based on hub and authority values which are accordance to the structure of the website. These values are computed using two websearching algorithms, HITS and SALSA. Three other universities’ websites are used as the benchmarks which are UM, Harvard and Stanford. The result is clearly showing that more work has to be done on the existing UMT’s website where important pages according to the benchmarks, do not exist in UMT’s pages. The ranking of UMT’s website will act as a guideline for the web-developer to develop a more efficient website.

Keywords: Algorithm, ranking, website, web structure mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1632
960 A Testbed for the Experiments Performed in Missing Value Treatments

Authors: Dias de J. C. Lilian, Lobato M. F. Fábio, de Santana L. Ádamo

Abstract:

The occurrence of missing values in database is a serious problem for Data Mining tasks, responsible for degrading data quality and accuracy of analyses. In this context, the area has shown a lack of standardization for experiments to treat missing values, introducing difficulties to the evaluation process among different researches due to the absence in the use of common parameters. This paper proposes a testbed intended to facilitate the experiments implementation and provide unbiased parameters using available datasets and suited performance metrics in order to optimize the evaluation and comparison between the state of art missing values treatments.

Keywords: Data imputation, data mining, missing values treatment, testbed.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1471
959 Data Mining Techniques in Computer-Aided Diagnosis: Non-Invasive Cancer Detection

Authors: Florin Gorunescu

Abstract:

Diagnosis can be achieved by building a model of a certain organ under surveillance and comparing it with the real time physiological measurements taken from the patient. This paper deals with the presentation of the benefits of using Data Mining techniques in the computer-aided diagnosis (CAD), focusing on the cancer detection, in order to help doctors to make optimal decisions quickly and accurately. In the field of the noninvasive diagnosis techniques, the endoscopic ultrasound elastography (EUSE) is a recent elasticity imaging technique, allowing characterizing the difference between malignant and benign tumors. Digitalizing and summarizing the main EUSE sample movies features in a vector form concern with the use of the exploratory data analysis (EDA). Neural networks are then trained on the corresponding EUSE sample movies vector input in such a way that these intelligent systems are able to offer a very precise and objective diagnosis, discriminating between benign and malignant tumors. A concrete application of these Data Mining techniques illustrates the suitability and the reliability of this methodology in CAD.

Keywords: Endoscopic ultrasound elastography, exploratorydata analysis, neural networks, non-invasive cancer detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1816
958 Lateral Torsional Buckling of an Eccentrically Loaded Channel Section Beam

Authors: L. Dahmani, S. Drizi, M. Djemai, A. Boudjemia, M. O. Mechiche

Abstract:

Channel sections are widely used in practice as beams. However, design rules for eccentrically loaded (not through shear center) beams with channel cross- sections are not available in Eurocode 3. This paper compares the ultimate loads based on the adjusted design rules for lateral torsional buckling of eccentrically loaded channel beams in bending to the ultimate loads obtained with Finite Element (FE) simulations on the basis of a parameter study. Based on the proposed design rule, this study has led to a new design rule which conforms to Eurocode 3.

Keywords: ANSYS, Eurocode 3, finite element method, lateral torsional buckling, steel channel beam.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2996
957 The Association between C-Reactive Protein and Hypertension of Different United States Participants Categorized by Ethnicity: Applying the National Health and Nutrition Examination Survey from 1999-2010

Authors: Ghada Abo-Zaid

Abstract:

Objectives: The main objective of this study was to examine the association between the elevated level of C-reactive protein (CRP) and incidence of hypertension before and after adjustments for age, BMI, gender, SES, smoking, diabetes, cholesterol LDL and cholesterol HDL, and to determine whether the association differs by race. Method: Cross sectional data for participants from aged 17 years to 74 years, included in The National Health and Nutrition Examination Survey (NHANES) from 1999 to 2010 were analyzed. The CRP level was classified into three categories (> 3 mg/L, between 1 mg/L and 3 mg/L, and < 3 mg/L). Blood pressure categorization was done using JNC 7 indicator. Hypertension is defined as either systolic blood pressure (SBP) of 140 mmHg or more and diastolic blood pressure (DBP) of 90 mmHg or more, otherwise a self-reported prior diagnosis by a physician. Pre-hypertension was defined as 139 ≥ SBP > 120 or 89 ≥ DBP >80. Multinominal regression model was undertaken to measure the association between CRP level and hypertension. Results: In univariable models, CRP concentrations > 3 mg/L were associated with a 73% greater risk of incident hypertension compared with CRP concentrations < 1 mg/L (Hypertension: odds ratio [OR] = 1.73; 95% confidence interval [CI], 1.50-1.99). Ethnic comparisons showed that American Mexicans had the highest risk of incident hypertension (OR = 2.39; 95% CI, 2.21-2.58). This risk was statistically insignificant after controlling by other variables (Hypertension: OR = 0.75; 95% CI, 0.52-1.08), or categorized by race [American Mexican: OR= 1.58; 95% CI, 0.58-4.26, Other Hispanic: OR = 0.87; 95% CI, 0.19-4.42, Non-Hispanic white: OR = 0.90; 95% CI, 0.50-1.59, Non-Hispanic Black: OR = 0.44; 95% CI, 0.22-0.87. The same results were found for pre-hypertension, and the Non-Hispanic black segment showed the highest significant risk for Pre-Hypertension (OR = 1.60; 95% CI, 1.26-2.03). When CRP concentrations were between 1.0 and 3.0 mg/L in unadjusted models, prehypertension was associated with higher likelihood of elevated CRP (OR = 1.37; 95% CI, 1.15-1.62). The same relationship was maintained in Non-Hispanic white, Non-Hispanic black, and other race (Non-Hispanic white: OR = 1.24; 95% CI, 1.03-1.48, Non-Hispanic black: OR = 1.60; 95% CI, 1.27-2.03, other race: OR = 2.50; 95% CI, 1.32-4.74) while the association was insignificant with American Mexican and other Hispanic. In the adjusted model, the relationship between CRP and prehypertension were no longer available. Contrary, hypertension was not independently associated with elevated CRP, and the results were the same after being grouped by race or adjustments for the possible confounder variables. The same results were obtained when SBP or DBP were on a continuous measure. Conclusions: This study confirmed the existence of an association between hypertension, prehypertension and elevated level of CRP, however this association was no longer available after adjusting by other variables. Ethic group differences were statistically significant at the univariable models, while it disappeared after controlling by other variables. 

Keywords: CRP, hypertension, ethnicity, NHANES, blood pressure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1313
956 Why Do Pakistani Customers Patronize Islamic Banks- An Empirical Analysis

Authors: Farjana Mumu, Jia Guozho

Abstract:

Throughout the world, the Islamic way of banking and financing is increasing. The same trend is also visible in Pakistan, where the Islamic banking sector is increasing in size and volume each year. The question immediately arises as why the Pakistanis patronize the Islamic banking system? This study was carried out to find whether following the Islamic rules in finance is the main factor for such selection or whether other factors such as customer service, location, banking hour, physical facilities of the bank etc also have importance. The study was carried by distributing questionnaire and 200 responses were collected from the clients of Islamic banks. The result showed that the service quality and other factors are as important as following the Islamic rules for finance to retain old ustomers and catch new customers. The result is important and Islamic banks can take actions accordingly to look after both the factors

Keywords: Customers' perception, customer satisfaction, customer service, Islamic banking

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2150
955 Improving Convergence of Parameter Tuning Process of the Additive Fuzzy System by New Learning Strategy

Authors: Thi Nguyen, Lee Gordon-Brown, Jim Peterson, Peter Wheeler

Abstract:

An additive fuzzy system comprising m rules with n inputs and p outputs in each rule has at least t m(2n + 2 p + 1) parameters needing to be tuned. The system consists of a large number of if-then fuzzy rules and takes a long time to tune its parameters especially in the case of a large amount of training data samples. In this paper, a new learning strategy is investigated to cope with this obstacle. Parameters that tend toward constant values at the learning process are initially fixed and they are not tuned till the end of the learning time. Experiments based on applications of the additive fuzzy system in function approximation demonstrate that the proposed approach reduces the learning time and hence improves convergence speed considerably.

Keywords: Additive fuzzy system, improving convergence, parameter learning process, unsupervised learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1473
954 Automatic Generation of OWL Ontologies from UML Class Diagrams Based on Meta- Modelling and Graph Grammars

Authors: Aissam Belghiat, Mustapha Bourahla

Abstract:

Models are placed by modeling paradigm at the center of development process. These models are represented by languages, like UML the language standardized by the OMG which became necessary for development. Moreover the ontology engineering paradigm places ontologies at the center of development process; in this paradigm we find OWL the principal language for knowledge representation. Building ontologies from scratch is generally a difficult task. The bridging between UML and OWL appeared on several regards such as the classes and associations. In this paper, we have to profit from convergence between UML and OWL to propose an approach based on Meta-Modelling and Graph Grammars and registered in the MDA architecture for the automatic generation of OWL ontologies from UML class diagrams. The transformation is based on transformation rules; the level of abstraction in these rules is close to the application in order to have usable ontologies. We illustrate this approach by an example.

Keywords: ATOM3, MDA, Ontology, OWL, UML

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24858
953 Dimensional Modeling of HIV Data Using Open Source

Authors: Charles D. Otine, Samuel B. Kucel, Lena Trojer

Abstract:

Selecting the data modeling technique for an information system is determined by the objective of the resultant data model. Dimensional modeling is the preferred modeling technique for data destined for data warehouses and data mining, presenting data models that ease analysis and queries which are in contrast with entity relationship modeling. The establishment of data warehouses as components of information system landscapes in many organizations has subsequently led to the development of dimensional modeling. This has been significantly more developed and reported for the commercial database management systems as compared to the open sources thereby making it less affordable for those in resource constrained settings. This paper presents dimensional modeling of HIV patient information using open source modeling tools. It aims to take advantage of the fact that the most affected regions by the HIV virus are also heavily resource constrained (sub-Saharan Africa) whereas having large quantities of HIV data. Two HIV data source systems were studied to identify appropriate dimensions and facts these were then modeled using two open source dimensional modeling tools. Use of open source would reduce the software costs for dimensional modeling and in turn make data warehousing and data mining more feasible even for those in resource constrained settings but with data available.

Keywords: About Database, Data Mining, Data warehouse, Dimensional Modeling, Open Source.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1914
952 A Neurofuzzy Learning and its Application to Control System

Authors: Seema Chopra, R. Mitra, Vijay Kumar

Abstract:

A neurofuzzy approach for a given set of input-output training data is proposed in two phases. Firstly, the data set is partitioned automatically into a set of clusters. Then a fuzzy if-then rule is extracted from each cluster to form a fuzzy rule base. Secondly, a fuzzy neural network is constructed accordingly and parameters are tuned to increase the precision of the fuzzy rule base. This network is able to learn and optimize the rule base of a Sugeno like Fuzzy inference system using Hybrid learning algorithm, which combines gradient descent, and least mean square algorithm. This proposed neurofuzzy system has the advantage of determining the number of rules automatically and also reduce the number of rules, decrease computational time, learns faster and consumes less memory. The authors also investigate that how neurofuzzy techniques can be applied in the area of control theory to design a fuzzy controller for linear and nonlinear dynamic systems modelling from a set of input/output data. The simulation analysis on a wide range of processes, to identify nonlinear components on-linely in a control system and a benchmark problem involving the prediction of a chaotic time series is carried out. Furthermore, the well-known examples of linear and nonlinear systems are also simulated under the Matlab/Simulink environment. The above combination is also illustrated in modeling the relationship between automobile trips and demographic factors.

Keywords: Fuzzy control, neuro-fuzzy techniques, fuzzy subtractive clustering, extraction of rules, and optimization of membership functions.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2555
951 A Recognition Method for Spatio-Temporal Background in Korean Historical Novels

Authors: Seo-Hee Kim, Kee-Won Kim, Seung-Hoon Kim

Abstract:

The most important elements of a novel are the characters, events and background. The background represents the time, place and situation that character appears, and conveys event and atmosphere more realistically. If readers have the proper knowledge about background of novels, it may be helpful for understanding the atmosphere of a novel and choosing a novel that readers want to read. In this paper, we are targeting Korean historical novels because spatio-temporal background especially performs an important role in historical novels among the genre of Korean novels. To the best of our knowledge, we could not find previous study that was aimed at Korean novels. In this paper, we build a Korean historical national dictionary. Our dictionary has historical places and temple names of kings over many generations as well as currently existing spatial words or temporal words in Korean history. We also present a method for recognizing spatio-temporal background based on patterns of phrasal words in Korean sentences. Our rules utilize postposition for spatial background recognition and temple names for temporal background recognition. The knowledge of the recognized background can help readers to understand the flow of events and atmosphere, and can use to visualize the elements of novels.

Keywords: Data mining, Korean historical novels, Korean linguistic feature, spatio-temporal background.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1085
950 Integrated Method for Detection of Unknown Steganographic Content

Authors: Magdalena Pejas

Abstract:

This article concerns the presentation of an integrated method for detection of steganographic content embedded by new unknown programs. The method is based on data mining and aggregated hypothesis testing. The article contains the theoretical basics used to deploy the proposed detection system and the description of improvement proposed for the basic system idea. Further main results of experiments and implementation details are collected and described. Finally example results of the tests are presented.

Keywords: Steganography, steganalysis, data embedding, data mining, feature extraction, knowledge base, system learning, hypothesis testing, error estimation, black box program, file structure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1524
949 Application of Computer Aided Engineering Tools in Performance Prediction and Fault Detection of Mechanical Equipment of Mining Process Line

Authors: K. Jahani, J. Razavi

Abstract:

Nowadays, to decrease the number of downtimes in the industries such as metal mining, petroleum and chemical industries, predictive maintenance is crucial. In order to have efficient predictive maintenance, knowing the performance of critical equipment of production line such as pumps and hydro-cyclones under variable operating parameters, selecting best indicators of this equipment health situations, best locations for instrumentation, and also measuring of these indicators are very important. In this paper, computer aided engineering (CAE) tools are implemented to study some important elements of copper process line, namely slurry pumps and cyclone to predict the performance of these components under different working conditions. These modeling and simulations can be used in predicting, for example, the damage tolerance of the main shaft of the slurry pump or wear rate and location of cyclone wall or pump case and impeller. Also, the simulations can suggest best-measuring parameters, measuring intervals, and their locations.

Keywords: Computer aided engineering, predictive maintenance, fault detection, mining process line, slurry pump, hydrocyclone.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1794
948 A Review and Comparative Analysis on Cluster Ensemble Methods

Authors: S. Sarumathi, P. Ranjetha, C. Saraswathy, M. Vaishnavi, S. Geetha

Abstract:

Clustering is an unsupervised learning technique for aggregating data objects into meaningful classes so that intra cluster similarity is maximized and inter cluster similarity is minimized in data mining. However, no single clustering algorithm proves to be the most effective in producing the best result. As a result, a new challenging technique known as the cluster ensemble approach has blossomed in order to determine the solution to this problem. For the cluster analysis issue, this new technique is a successful approach. The cluster ensemble's main goal is to combine similar clustering solutions in a way that achieves the precision while also improving the quality of individual data clustering. Because of the massive and rapid creation of new approaches in the field of data mining, the ongoing interest in inventing novel algorithms necessitates a thorough examination of current techniques and future innovation. This paper presents a comparative analysis of various cluster ensemble approaches, including their methodologies, formal working process, and standard accuracy and error rates. As a result, the society of clustering practitioners will benefit from this exploratory and clear research, which will aid in determining the most appropriate solution to the problem at hand.

Keywords: Clustering, cluster ensemble methods, consensus function, data mining, unsupervised learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 742
947 Improving Classification Accuracy with Discretization on Datasets Including Continuous Valued Features

Authors: Mehmet Hacibeyoglu, Ahmet Arslan, Sirzat Kahramanli

Abstract:

This study analyzes the effect of discretization on classification of datasets including continuous valued features. Six datasets from UCI which containing continuous valued features are discretized with entropy-based discretization method. The performance improvement between the dataset with original features and the dataset with discretized features is compared with k-nearest neighbors, Naive Bayes, C4.5 and CN2 data mining classification algorithms. As the result the classification accuracies of the six datasets are improved averagely by 1.71% to 12.31%.

Keywords: Data mining classification algorithms, entropy-baseddiscretization method

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2421
946 Cirrhosis Mortality Prediction as Classification Using Frequent Subgraph Mining

Authors: Abdolghani Ebrahimi, Diego Klabjan, Chenxi Ge, Daniela Ladner, Parker Stride

Abstract:

In this work, we use machine learning and data analysis techniques to predict the one-year mortality of cirrhotic patients. Data from 2,322 patients with liver cirrhosis are collected at a single medical center. Different machine learning models are applied to predict one-year mortality. A comprehensive feature space including demographic information, comorbidity, clinical procedure and laboratory tests is being analyzed. A temporal pattern mining technic called Frequent Subgraph Mining (FSM) is being used. Model for End-stage liver disease (MELD) prediction of mortality is used as a comparator. All of our models statistically significantly outperform the MELD-score model and show an average 10% improvement of the area under the curve (AUC). The FSM technic itself does not improve the model significantly, but FSM, together with a machine learning technique called an ensemble, further improves the model performance. With the abundance of data available in healthcare through electronic health records (EHR), existing predictive models can be refined to identify and treat patients at risk for higher mortality. However, due to the sparsity of the temporal information needed by FSM, the FSM model does not yield significant improvements. Our work applies modern machine learning algorithms and data analysis methods on predicting one-year mortality of cirrhotic patients and builds a model that predicts one-year mortality significantly more accurate than the MELD score. We have also tested the potential of FSM and provided a new perspective of the importance of clinical features.

Keywords: machine learning, liver cirrhosis, subgraph mining, supervised learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 381
945 Analytical Design of IMC-PID Controller for Ideal Decoupling Embedded in Multivariable Smith Predictor Control System

Authors: Le Hieu Giang, Truong Nguyen Luan Vu, Le Linh

Abstract:

In this paper, the analytical tuning rules of IMC-PID controller are presented for the multivariable Smith predictor that involved the ideal decoupling. Accordingly, the decoupler is first introduced into the multivariable Smith predictor control system by a well-known approach of ideal decoupling, which is compactly extended for general nxn multivariable processes and the multivariable Smith predictor controller is then obtained in terms of the multiple single-loop Smith predictor controllers. The tuning rules of PID controller in series with filter are found by using Maclaurin approximation. Many multivariable industrial processes are employed to demonstrate the simplicity and effectiveness of the presented method. The simulation results show the superior performances of presented method in compared with the other methods.

Keywords: Ideal decoupler, IMC-PID controller, multivariable Smith predictor, Maclaurin approximation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1344
944 A Comparative Study of GTC and PSP Algorithms for Mining Sequential Patterns Embedded in Database with Time Constraints

Authors: Safa Adi

Abstract:

This paper will consider the problem of sequential mining patterns embedded in a database by handling the time constraints as defined in the GSP algorithm (level wise algorithms). We will compare two previous approaches GTC and PSP, that resumes the general principles of GSP. Furthermore this paper will discuss PG-hybrid algorithm, that using PSP and GTC. The results show that PSP and GTC are more efficient than GSP. On the other hand, the GTC algorithm performs better than PSP. The PG-hybrid algorithm use PSP algorithm for the two first passes on the database, and GTC approach for the following scans. Experiments show that the hybrid approach is very efficient for short, frequent sequences.

Keywords: Database, GTC algorithm, PSP algorithm, sequential patterns, time constraints.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 649
943 Response of the Residential Building Structureon Load Technical Seismicity due to Mining Activities

Authors: V. Salajka, Z. Kaláb, J. Kala, P. Hradil

Abstract:

In the territories where high-intensity earthquakes are frequent is paid attention to the solving of the seismic problems. In the paper are described two computational model variants based on finite element method of the construction with different subsoil simulation (rigid or elastic subsoil) is used. For simulation and calculations program system based on method final elements ANSYS was used. Seismic responses calculations of residential building structure were effected on loading characterized by accelerogram for comparing with the responses spectra method.

Keywords: Accelerogram, ANSYS, mining induced seismic, residential building structure, spectra, subsoil.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1482
942 Association of the p53 Codon 72 Polymorphism with Colorectal Cancer in South West of Iran

Authors: A. Doosti, P. Ghasemi Dehkordi, M. Zamani, S. Taheri, M. Banitalebi, M. Mahmoudzadeh

Abstract:

The p53 tumor suppressor gene plays two important roles in genomic stability: blocking cell proliferation after DNA damage until it has been repaired, and starting apoptosis if the damage is too critical. Codon 72 exon4 polymorphism (Arg72Pro) of the P53 gene has been implicated in cancer risk. Various studies have been done to investigate the status of p53 at codon 72 for arginine (Arg) and proline (Pro) alleles in different populations and also the association of this codon 72 polymorphism with various tumors. Our objective was to investigate the possible association between P53 Arg72Pro polymorphism and susceptibility to colorectal cancer among Isfahan and Chaharmahal Va Bakhtiari (a part of south west of Iran) population. We investigated the status of p53 at codon 72 for Arg/Arg, Arg/Pro and Pro/Pro allele polymorphisms in blood samples from 145 colorectal cancer patients and 140 controls by Nested-PCR of p53 exon 4 and digestion with BstUI restriction enzyme and the DNA fragments were then resolved by electrophoresis in 2% agarose gel. The Pro allele was 279 bp, while the Arg allele was restricted into two fragments of 160 and 119 bp. Among the 145 colorectal cancer cases 49 cases (33.79%) were homozygous for the Arg72 allele (Arg/Arg), 18 cases (12.41%) were homozygous for the Pro72 allele (Pro/Pro) and 78 cases (53.8%) found in heterozygous (Arg/Pro). In conclusion, it can be said that p53Arg/Arg genotype may be correlated with possible increased risk of this kind of cancers in south west of Iran.

Keywords: TP53, Polymorphism, Colorectal Cancer, Iran

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2340
941 Scheduling for a Reconfigurable Manufacturing System with Multiple Process Plans and Limited Pallets/Fixtures

Authors: Jae-Min Yu, Hyoung-Ho Doh, Ji-Su Kim, Dong-Ho Lee, Sung-Ho Nam

Abstract:

A reconfigurable manufacturing system (RMS) is an advanced system designed at the outset for rapid changes in its hardware and software components in order to quickly adjust its production capacity and functionally. Among various operational decisions, this study considers the scheduling problem that determines the input sequence and schedule at the same time for a given set of parts. In particular, we consider the practical constraints that the numbers of pallets/fixtures are limited and hence a part can be released into the system only when the fixture required for the part is available. To solve the integrated input sequencing and scheduling problems, we suggest a priority rule based approach in which the two sub-problems are solved using a combination of priority rules. To show the effectiveness of various rule combinations, a simulation experiment was done on the data for a real RMS, and the test results are reported.

Keywords: Reconfigurable manufacturing system, scheduling, priority rules, multiple process plans, pallets/fixtures

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1851
940 Using Swarm Intelligence for Improving Accuracy of Fuzzy Classifiers

Authors: Hassan M. Elragal

Abstract:

This paper discusses a method for improving accuracy of fuzzy-rule-based classifiers using particle swarm optimization (PSO). Two different fuzzy classifiers are considered and optimized. The first classifier is based on Mamdani fuzzy inference system (M_PSO fuzzy classifier). The second classifier is based on Takagi- Sugeno fuzzy inference system (TS_PSO fuzzy classifier). The parameters of the proposed fuzzy classifiers including premise (antecedent) parameters, consequent parameters and structure of fuzzy rules are optimized using PSO. Experimental results show that higher classification accuracy can be obtained with a lower number of fuzzy rules by using the proposed PSO fuzzy classifiers. The performances of M_PSO and TS_PSO fuzzy classifiers are compared to other fuzzy based classifiers

Keywords: Fuzzy classifier, Optimization of fuzzy systemparameters, Particle swarm optimization, Pattern classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2297
939 A New DIDS Design Based on a Combination Feature Selection Approach

Authors: Adel Sabry Eesa, Adnan Mohsin Abdulazeez Brifcani, Zeynep Orman

Abstract:

Feature selection has been used in many fields such as classification, data mining and object recognition and proven to be effective for removing irrelevant and redundant features from the original dataset. In this paper, a new design of distributed intrusion detection system using a combination feature selection model based on bees and decision tree. Bees algorithm is used as the search strategy to find the optimal subset of features, whereas decision tree is used as a judgment for the selected features. Both the produced features and the generated rules are used by Decision Making Mobile Agent to decide whether there is an attack or not in the networks. Decision Making Mobile Agent will migrate through the networks, moving from node to another, if it found that there is an attack on one of the nodes, it then alerts the user through User Interface Agent or takes some action through Action Mobile Agent. The KDD Cup 99 dataset is used to test the effectiveness of the proposed system. The results show that even if only four features are used, the proposed system gives a better performance when it is compared with the obtained results using all 41 features.

Keywords: Distributed intrusion detection system, mobile agent, feature selection, Bees Algorithm, decision tree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1884
938 Seismic Response Reduction of Structures using Smart Base Isolation System

Authors: H.S. Kim

Abstract:

In this study, control performance of a smart base isolation system consisting of a friction pendulum system (FPS) and a magnetorheological (MR) damper has been investigated. A fuzzy logic controller (FLC) is used to modulate the MR damper so as to minimize structural acceleration while maintaining acceptable base displacement levels. To this end, a multi-objective optimization scheme is used to optimize parameters of membership functions and find appropriate fuzzy rules. To demonstrate effectiveness of the proposed multi-objective genetic algorithm for FLC, a numerical study of a smart base isolation system is conducted using several historical earthquakes. It is shown that the proposed method can find optimal fuzzy rules and that the optimized FLC outperforms not only a passive control strategy but also a human-designed FLC and a conventional semi-active control algorithm.

Keywords: Fuzzy logic controller, genetic algorithm, MR damper, smart base isolation system

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2153
937 Application of Granular Computing Paradigm in Knowledge Induction

Authors: Iftikhar U. Sikder

Abstract:

This paper illustrates an application of granular computing approach, namely rough set theory in data mining. The paper outlines the formalism of granular computing and elucidates the mathematical underpinning of rough set theory, which has been widely used by the data mining and the machine learning community. A real-world application is illustrated, and the classification performance is compared with other contending machine learning algorithms. The predictive performance of the rough set rule induction model shows comparative success with respect to other contending algorithms.

Keywords: Concept approximation, granular computing, reducts, rough set theory, rule induction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 781