Search results for: Association rule mining

840 Knowledge Discovery Techniques for Talent Forecasting in Human Resource Application

Authors: Hamidah Jantan, Abdul Razak Hamdan, Zulaiha Ali Othman

Abstract:

Human Resource (HR) applications can be used to provide fair and consistent decisions, and to improve the effectiveness of decision making processes. Besides that, among the challenge for HR professionals is to manage organization talents, especially to ensure the right person for the right job at the right time. For that reason, in this article, we attempt to describe the potential to implement one of the talent management tasks i.e. identifying existing talent by predicting their performance as one of HR application for talent management. This study suggests the potential HR system architecture for talent forecasting by using past experience knowledge known as Knowledge Discovery in Database (KDD) or Data Mining. This article consists of three main parts; the first part deals with the overview of HR applications, the prediction techniques and application, the general view of Data mining and the basic concept of talent management in HRM. The second part is to understand the use of Data Mining technique in order to solve one of the talent management tasks, and the third part is to propose the potential HR system architecture for talent forecasting.

Keywords: HR Application, Knowledge Discovery inDatabase (KDD), Talent Forecasting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4436

839 Building an Integrated Relational Database from Swiss Nutrition National Survey and Swiss Health Datasets for Data Mining Purposes

Authors: Ilona Mewes, Helena Jenzer, Farshideh Einsele

Abstract:

Objective: The objective of the study was to integrate two big databases from Swiss nutrition national survey (menuCH) and Swiss health national survey 2012 for data mining purposes. Each database has a demographic base data. An integrated Swiss database is built to later discover critical food consumption patterns linked with lifestyle diseases known to be strongly tied with food consumption. Design: Swiss nutrition national survey (menuCH) with approx. 2000 respondents from two different surveys, one by Phone and the other by questionnaire along with Swiss health national survey 2012 with 21500 respondents were pre-processed, cleaned and finally integrated to a unique relational database. Results: The result of this study is an integrated relational database from the Swiss nutritional and health databases.

Keywords: Health informatics, data mining, nutritional and health databases, nutritional and chronical databases.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1619

838 Attribute Selection Methods Comparison for Classification of Diffuse Large B-Cell Lymphoma

Authors: Helyane Bronoski Borges, Júlio Cesar Nievola

Abstract:

The most important subtype of non-Hodgkin-s lymphoma is the Diffuse Large B-Cell Lymphoma. Approximately 40% of the patients suffering from it respond well to therapy, whereas the remainder needs a more aggressive treatment, in order to better their chances of survival. Data Mining techniques have helped to identify the class of the lymphoma in an efficient manner. Despite that, thousands of genes should be processed to obtain the results. This paper presents a comparison of the use of various attribute selection methods aiming to reduce the number of genes to be searched, looking for a more effective procedure as a whole.

Keywords: Attribute selection, data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1370

837 Online Forums Hotspot Detection and Analysis Using Aging Theory

Authors: K. Nirmala Devi, V. Murali Bhaskaran

Abstract:

The exponential growth of social media arouses much attention on public opinion information. The online forums, blogs, micro blogs are proving to be extremely valuable resources and are having bulk volume of information. However, most of the social media data is unstructured and semi structured form. So that it is more difficult to decipher automatically. Therefore, it is very much essential to understand and analyze those data for making a right decision. The online forums hotspot detection is a promising research field in the web mining and it guides to motivate the user to take right decision in right time. The proposed system consist of a novel approach to detect a hotspot forum for any given time period. It uses aging theory to find the hot terms and E-K-means for detecting the hotspot forum. Experimental results demonstrate that the proposed approach outperforms k-means for detecting the hotspot forums with the improved accuracy.

Keywords: Hotspot forums, Micro blog, Blog, Sentiment Analysis, Opinion Mining, Social media, Twitter, Web mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2138

836 Enhance the Power of Sentiment Analysis

Authors: Yu Zhang, Pedro Desouza

Abstract:

Since big data has become substantially more accessible and manageable due to the development of powerful tools for dealing with unstructured data, people are eager to mine information from social media resources that could not be handled in the past. Sentiment analysis, as a novel branch of text mining, has in the last decade become increasingly important in marketing analysis, customer risk prediction and other fields. Scientists and researchers have undertaken significant work in creating and improving their sentiment models. In this paper, we present a concept of selecting appropriate classifiers based on the features and qualities of data sources by comparing the performances of five classifiers with three popular social media data sources: Twitter, Amazon Customer Reviews, and Movie Reviews. We introduced a couple of innovative models that outperform traditional sentiment classifiers for these data sources, and provide insights on how to further improve the predictive power of sentiment analysis. The modeling and testing work was done in R and Greenplum in-database analytic tools.

Keywords: Sentiment Analysis, Social Media, Twitter, Amazon, Data Mining, Machine Learning, Text Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3473

835 A Novel Approach to Improve Users Search Goal in Web Usage Mining

Authors: R. Lokeshkumar, P. Sengottuvelan

Abstract:

Web mining is to discover and extract useful Information. Different users may have different search goals when they search by giving queries and submitting it to a search engine. The inference and analysis of user search goals can be very useful for providing an experience result for a user search query. In this project, we propose a novel approach to infer user search goals by analyzing search web logs. First, we propose a novel approach to infer user search goals by analyzing search engine query logs, the feedback sessions are constructed from user click-through logs and it efficiently reflect the information needed for users. Second we propose a preprocessing technique to clean the unnecessary data’s from web log file (feedback session). Third we propose a technique to generate pseudo-documents to representation of feedback sessions for clustering. Finally we implement k-medoids clustering algorithm to discover different user search goals and to provide a more optimal result for a search query based on feedback sessions for the user.

Keywords: Data Preprocessing, Session Identification, Web log mining, Web Personalization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1982

834 Physical Verification Flow on Multiple Foundries

Authors: R. Abdul Wahab, R. Mohd Fuad Tengku Aziz, N. Othman, S. Saleh, N. Razali, M. Al Baqir Zinal Abidin, M. Hanif Md Nasir

Abstract:

This paper will discuss how we optimize our physical verification flow in our IC Design Department having various rule decks from multiple foundries. Our ultimate goal is to achieve faster time to tape-out and avoid schedule delay. Currently the physical verification runtimes and memory usage have drastically increased with the increasing number of design rules, design complexity, and the size of the chips to be verified. To manage design violations, we use a number of solutions to reduce the amount of violations needed to be checked by physical verification engineers. The most important functions in physical verifications are DRC (design rule check), LVS (layout vs. schematic), and XRC (extraction). Since we have a multiple number of foundries for our design tape-outs, we need a flow that improve the overall turnaround time and ease of use of the physical verification process. The demand for fast turnaround time is even more critical since the physical design is the last stage before sending the layout to the foundries.

Keywords: Physical verification, DRC, LVS, XRC, flow, foundry, runset.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3169

833 Total and Leachable Concentration of Trace Elements in Soil towards Human Health Risk, Related with Coal Mine in Jorong, South Kalimantan, Indonesia

Authors: Arie Pujiwati, Kengo Nakamura, Noriaki Watanabe, Takeshi Komai

Abstract:

Coal mining is well known to cause considerable environmental impacts, including trace element contamination of soil. This study aimed to assess the trace element (As, Cd, Co, Cu, Ni, Pb, Sb, and Zn) contamination of soil in the vicinity of coal mining activities, using the case study of Asam-asam River basin, South Kalimantan, Indonesia, and to assess the human health risk, incorporating total and bioavailable (water-leachable and acid-leachable) concentrations. The results show the enrichment of As and Co in soil, surpassing the background soil value. Contamination was evaluated based on the index of geo-accumulation, I_geo and the pollution index, PI. I_geo values showed that the soil was generally uncontaminated (I_geo ≤ 0), except for elevated As and Co. Mean PI for Ni and Cu indicated slight contamination. Regarding the assessment of health risks, the Hazard Index, HI showed adverse risks (HI > 1) for Ni, Co, and As. Further, Ni and As were found to pose unacceptable carcinogenic risk (risk > 1.10^-5). Farming, settlement, and plantation were found to present greater risk than coal mines. These results show that coal mining activity in the study area contaminates the soils by particular elements and may pose potential human health risk in its surrounding area. This study is important for setting appropriate countermeasure actions and improving basic coal mining management in Indonesia.

Keywords: Coal mine, risk, soil, trace elements.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1127

832 The Power of Indigenous Peoples in Decision-Making Processes of Mining Projects: The Pilbara Region

Authors: K. N. Penna, J. P. English

Abstract:

The destruction of the Juukan Gorge rock shelters in 2020 has catalysed impetus within Australian society for a significant change in engagement with Indigenous Peoples, and the approach to Indigenous cultural heritage, both within the Pilbara region and more broadly across Australia. Culture-based and people-centred approaches are inherent to inclusive sustainable development and Free, Prior, Informed Consent, outcomes encouraged by international and local recommendations on the human rights and cultural heritage preservation of Indigenous peoples. In this paper, we present an interpretive model of an evolved process for mining project development, incorporating culture-based and people-centred approaches, based on the Theory U system change method. The evolved process advocates a change in organisational mindset and culture, and a comprehensive understanding of Indigenous Peoples’ culture and values, as the foundations for increasing their influence and achieving mutually beneficial developments.

Keywords: Indigenous Engagement, mining industry, culture-based approach, people-centred approach, Theory U.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 350

831 Flexible, Adaptable and Scaleable Business Rules Management System for Data Validation

Authors: Kashif Kamran, Farooque Azam

Abstract:

The policies governing the business of any organization are well reflected in her business rules. The business rules are implemented by data validation techniques, coded during the software development process. Any change in business policies results in change in the code written for data validation used to enforce the business policies. Implementing the change in business rules without changing the code is the objective of this paper. The proposed approach enables users to create rule sets at run time once the software has been developed. The newly defined rule sets by end users are associated with the data variables for which the validation is required. The proposed approach facilitates the users to define business rules using all the comparison operators and Boolean operators. Multithreading is used to validate the data entered by end user against the business rules applied. The evaluation of the data is performed by a newly created thread using an enhanced form of the RPN (Reverse Polish Notation) algorithm.

Keywords: Business Rules, data validation, multithreading, Reverse Polish Notation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2229

830 Comparative Study of Universities’ Web Structure Mining

Authors: Z. Abdullah, A. R. Hamdan

Abstract:

This paper is meant to analyze the ranking of University of Malaysia Terengganu, UMT’s website in the World Wide Web. There are only few researches have been done on comparing the ranking of universities’ websites so this research will be able to determine whether the existing UMT’s website is serving its purpose which is to introduce UMT to the world. The ranking is based on hub and authority values which are accordance to the structure of the website. These values are computed using two websearching algorithms, HITS and SALSA. Three other universities’ websites are used as the benchmarks which are UM, Harvard and Stanford. The result is clearly showing that more work has to be done on the existing UMT’s website where important pages according to the benchmarks, do not exist in UMT’s pages. The ranking of UMT’s website will act as a guideline for the web-developer to develop a more efficient website.

Keywords: Algorithm, ranking, website, web structure mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1631

829 A Testbed for the Experiments Performed in Missing Value Treatments

Authors: Dias de J. C. Lilian, Lobato M. F. Fábio, de Santana L. Ádamo

Abstract:

The occurrence of missing values in database is a serious problem for Data Mining tasks, responsible for degrading data quality and accuracy of analyses. In this context, the area has shown a lack of standardization for experiments to treat missing values, introducing difficulties to the evaluation process among different researches due to the absence in the use of common parameters. This paper proposes a testbed intended to facilitate the experiments implementation and provide unbiased parameters using available datasets and suited performance metrics in order to optimize the evaluation and comparison between the state of art missing values treatments.

Keywords: Data imputation, data mining, missing values treatment, testbed.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1468

828 Data Mining Techniques in Computer-Aided Diagnosis: Non-Invasive Cancer Detection

Authors: Florin Gorunescu

Abstract:

Diagnosis can be achieved by building a model of a certain organ under surveillance and comparing it with the real time physiological measurements taken from the patient. This paper deals with the presentation of the benefits of using Data Mining techniques in the computer-aided diagnosis (CAD), focusing on the cancer detection, in order to help doctors to make optimal decisions quickly and accurately. In the field of the noninvasive diagnosis techniques, the endoscopic ultrasound elastography (EUSE) is a recent elasticity imaging technique, allowing characterizing the difference between malignant and benign tumors. Digitalizing and summarizing the main EUSE sample movies features in a vector form concern with the use of the exploratory data analysis (EDA). Neural networks are then trained on the corresponding EUSE sample movies vector input in such a way that these intelligent systems are able to offer a very precise and objective diagnosis, discriminating between benign and malignant tumors. A concrete application of these Data Mining techniques illustrates the suitability and the reliability of this methodology in CAD.

Keywords: Endoscopic ultrasound elastography, exploratorydata analysis, neural networks, non-invasive cancer detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1815

827 The Association between C-Reactive Protein and Hypertension of Different United States Participants Categorized by Ethnicity: Applying the National Health and Nutrition Examination Survey from 1999-2010

Authors: Ghada Abo-Zaid

Abstract:

Objectives: The main objective of this study was to examine the association between the elevated level of C-reactive protein (CRP) and incidence of hypertension before and after adjustments for age, BMI, gender, SES, smoking, diabetes, cholesterol LDL and cholesterol HDL, and to determine whether the association differs by race. Method: Cross sectional data for participants from aged 17 years to 74 years, included in The National Health and Nutrition Examination Survey (NHANES) from 1999 to 2010 were analyzed. The CRP level was classified into three categories (> 3 mg/L, between 1 mg/L and 3 mg/L, and < 3 mg/L). Blood pressure categorization was done using JNC 7 indicator. Hypertension is defined as either systolic blood pressure (SBP) of 140 mmHg or more and diastolic blood pressure (DBP) of 90 mmHg or more, otherwise a self-reported prior diagnosis by a physician. Pre-hypertension was defined as 139 ≥ SBP > 120 or 89 ≥ DBP >80. Multinominal regression model was undertaken to measure the association between CRP level and hypertension. Results: In univariable models, CRP concentrations > 3 mg/L were associated with a 73% greater risk of incident hypertension compared with CRP concentrations < 1 mg/L (Hypertension: odds ratio [OR] = 1.73; 95% confidence interval [CI], 1.50-1.99). Ethnic comparisons showed that American Mexicans had the highest risk of incident hypertension (OR = 2.39; 95% CI, 2.21-2.58). This risk was statistically insignificant after controlling by other variables (Hypertension: OR = 0.75; 95% CI, 0.52-1.08), or categorized by race [American Mexican: OR= 1.58; 95% CI, 0.58-4.26, Other Hispanic: OR = 0.87; 95% CI, 0.19-4.42, Non-Hispanic white: OR = 0.90; 95% CI, 0.50-1.59, Non-Hispanic Black: OR = 0.44; 95% CI, 0.22-0.87. The same results were found for pre-hypertension, and the Non-Hispanic black segment showed the highest significant risk for Pre-Hypertension (OR = 1.60; 95% CI, 1.26-2.03). When CRP concentrations were between 1.0 and 3.0 mg/L in unadjusted models, prehypertension was associated with higher likelihood of elevated CRP (OR = 1.37; 95% CI, 1.15-1.62). The same relationship was maintained in Non-Hispanic white, Non-Hispanic black, and other race (Non-Hispanic white: OR = 1.24; 95% CI, 1.03-1.48, Non-Hispanic black: OR = 1.60; 95% CI, 1.27-2.03, other race: OR = 2.50; 95% CI, 1.32-4.74) while the association was insignificant with American Mexican and other Hispanic. In the adjusted model, the relationship between CRP and prehypertension were no longer available. Contrary, hypertension was not independently associated with elevated CRP, and the results were the same after being grouped by race or adjustments for the possible confounder variables. The same results were obtained when SBP or DBP were on a continuous measure. Conclusions: This study confirmed the existence of an association between hypertension, prehypertension and elevated level of CRP, however this association was no longer available after adjusting by other variables. Ethic group differences were statistically significant at the univariable models, while it disappeared after controlling by other variables.

Keywords: CRP, hypertension, ethnicity, NHANES, blood pressure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1311

826 OCR for Script Identification of Hindi (Devnagari) Numerals using Feature Sub Selection by Means of End-Point with Neuro-Memetic Model

Authors: Banashree N. P., R. Vasanta

Abstract:

Recognition of Indian languages scripts is challenging problems. In Optical Character Recognition [OCR], a character or symbol to be recognized can be machine printed or handwritten characters/numerals. There are several approaches that deal with problem of recognition of numerals/character depending on the type of feature extracted and different way of extracting them. This paper proposes a recognition scheme for handwritten Hindi (devnagiri) numerals; most admired one in Indian subcontinent. Our work focused on a technique in feature extraction i.e. global based approach using end-points information, which is extracted from images of isolated numerals. These feature vectors are fed to neuro-memetic model [18] that has been trained to recognize a Hindi numeral. The archetype of system has been tested on varieties of image of numerals. . In proposed scheme data sets are fed to neuro-memetic algorithm, which identifies the rule with highest fitness value of nearly 100 % & template associates with this rule is nothing but identified numerals. Experimentation result shows that recognition rate is 92-97 % compared to other models.

Keywords: OCR, Global Feature, End-Points, Neuro-Memetic model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1719

825 Dimensional Modeling of HIV Data Using Open Source

Authors: Charles D. Otine, Samuel B. Kucel, Lena Trojer

Abstract:

Selecting the data modeling technique for an information system is determined by the objective of the resultant data model. Dimensional modeling is the preferred modeling technique for data destined for data warehouses and data mining, presenting data models that ease analysis and queries which are in contrast with entity relationship modeling. The establishment of data warehouses as components of information system landscapes in many organizations has subsequently led to the development of dimensional modeling. This has been significantly more developed and reported for the commercial database management systems as compared to the open sources thereby making it less affordable for those in resource constrained settings. This paper presents dimensional modeling of HIV patient information using open source modeling tools. It aims to take advantage of the fact that the most affected regions by the HIV virus are also heavily resource constrained (sub-Saharan Africa) whereas having large quantities of HIV data. Two HIV data source systems were studied to identify appropriate dimensions and facts these were then modeled using two open source dimensional modeling tools. Use of open source would reduce the software costs for dimensional modeling and in turn make data warehousing and data mining more feasible even for those in resource constrained settings but with data available.

Keywords: About Database, Data Mining, Data warehouse, Dimensional Modeling, Open Source.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1912

824 Genetic Folding: Analyzing the Mercer-s Kernels Effect in Support Vector Machine using Genetic Folding

Authors: Mohd A. Mezher, Maysam F. Abbod

Abstract:

Genetic Folding (GF) a new class of EA named as is introduced for the first time. It is based on chromosomes composed of floating genes structurally organized in a parent form and separated by dots. Although, the genotype/phenotype system of GF generates a kernel expression, which is the objective function of superior classifier. In this work the question of the satisfying mapping-s rules in evolving populations is addressed by analyzing populations undergoing either Mercer-s or none Mercer-s rule. The results presented here show that populations undergoing Mercer-s rules improve practically models selection of Support Vector Machine (SVM). The experiment is trained multi-classification problem and tested on nonlinear Ionosphere dataset. The target of this paper is to answer the question of evolving Mercer-s rule in SVM addressed using either genetic folding satisfied kernel-s rules or not applied to complicated domains and problems.

Keywords: Genetic Folding, GF, Evolutionary Algorithms, Support Vector Machine, Genetic Algorithm, Genetic Programming, Multi-Classification, Mercer's Rules

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1575

823 Change Management in Business Process Modeling Based on Object Oriented Petri Net

Authors: Bassam Atieh Rajabi, Sai Peck Lee

Abstract:

Business Process Modeling (BPM) is the first and most important step in business process management lifecycle. Graph based formalism and rule based formalism are the two most predominant formalisms on which process modeling languages are developed. BPM technology continues to face challenges in coping with dynamic business environments where requirements and goals are constantly changing at the execution time. Graph based formalisms incur problems to react to dynamic changes in Business Process (BP) at the runtime instances. In this research, an adaptive and flexible framework based on the integration between Object Oriented diagramming technique and Petri Net modeling language is proposed in order to support change management techniques for BPM and increase the representation capability for Object Oriented modeling for the dynamic changes in the runtime instances. The proposed framework is applied in a higher education environment to achieve flexible, updatable and dynamic BP.

Keywords: Business Process Modeling, Change Management, Graph Based Modeling, Rule Based Modeling, Object Oriented PetriNet.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1988

822 Integrated Method for Detection of Unknown Steganographic Content

Authors: Magdalena Pejas

Abstract:

This article concerns the presentation of an integrated method for detection of steganographic content embedded by new unknown programs. The method is based on data mining and aggregated hypothesis testing. The article contains the theoretical basics used to deploy the proposed detection system and the description of improvement proposed for the basic system idea. Further main results of experiments and implementation details are collected and described. Finally example results of the tests are presented.

Keywords: Steganography, steganalysis, data embedding, data mining, feature extraction, knowledge base, system learning, hypothesis testing, error estimation, black box program, file structure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1523

821 Application of Data Mining Techniques for Tourism Knowledge Discovery

Authors: Teklu Urgessa, Wookjae Maeng, Joong Seek Lee

Abstract:

Application of five implementations of three data mining classification techniques was experimented for extracting important insights from tourism data. The aim was to find out the best performing algorithm among the compared ones for tourism knowledge discovery. Knowledge discovery process from data was used as a process model. 10-fold cross validation method is used for testing purpose. Various data preprocessing activities were performed to get the final dataset for model building. Classification models of the selected algorithms were built with different scenarios on the preprocessed dataset. The outperformed algorithm tourism dataset was Random Forest (76%) before applying information gain based attribute selection and J48 (C4.5) (75%) after selection of top relevant attributes to the class (target) attribute. In terms of time for model building, attribute selection improves the efficiency of all algorithms. Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented, which showed intricate, non-trivial knowledge/insight that would otherwise not be discovered by simple statistical analysis with mediocre accuracy of the machine using classification algorithms.

Keywords: Classification algorithms; data mining; tourism; knowledge discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2476

820 Application of Computer Aided Engineering Tools in Performance Prediction and Fault Detection of Mechanical Equipment of Mining Process Line

Authors: K. Jahani, J. Razavi

Abstract:

Nowadays, to decrease the number of downtimes in the industries such as metal mining, petroleum and chemical industries, predictive maintenance is crucial. In order to have efficient predictive maintenance, knowing the performance of critical equipment of production line such as pumps and hydro-cyclones under variable operating parameters, selecting best indicators of this equipment health situations, best locations for instrumentation, and also measuring of these indicators are very important. In this paper, computer aided engineering (CAE) tools are implemented to study some important elements of copper process line, namely slurry pumps and cyclone to predict the performance of these components under different working conditions. These modeling and simulations can be used in predicting, for example, the damage tolerance of the main shaft of the slurry pump or wear rate and location of cyclone wall or pump case and impeller. Also, the simulations can suggest best-measuring parameters, measuring intervals, and their locations.

Keywords: Computer aided engineering, predictive maintenance, fault detection, mining process line, slurry pump, hydrocyclone.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1792

819 A Review and Comparative Analysis on Cluster Ensemble Methods

Authors: S. Sarumathi, P. Ranjetha, C. Saraswathy, M. Vaishnavi, S. Geetha

Abstract:

Clustering is an unsupervised learning technique for aggregating data objects into meaningful classes so that intra cluster similarity is maximized and inter cluster similarity is minimized in data mining. However, no single clustering algorithm proves to be the most effective in producing the best result. As a result, a new challenging technique known as the cluster ensemble approach has blossomed in order to determine the solution to this problem. For the cluster analysis issue, this new technique is a successful approach. The cluster ensemble's main goal is to combine similar clustering solutions in a way that achieves the precision while also improving the quality of individual data clustering. Because of the massive and rapid creation of new approaches in the field of data mining, the ongoing interest in inventing novel algorithms necessitates a thorough examination of current techniques and future innovation. This paper presents a comparative analysis of various cluster ensemble approaches, including their methodologies, formal working process, and standard accuracy and error rates. As a result, the society of clustering practitioners will benefit from this exploratory and clear research, which will aid in determining the most appropriate solution to the problem at hand.

Keywords: Clustering, cluster ensemble methods, consensus function, data mining, unsupervised learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 736

818 Improving Classification Accuracy with Discretization on Datasets Including Continuous Valued Features

Authors: Mehmet Hacibeyoglu, Ahmet Arslan, Sirzat Kahramanli

Abstract:

This study analyzes the effect of discretization on classification of datasets including continuous valued features. Six datasets from UCI which containing continuous valued features are discretized with entropy-based discretization method. The performance improvement between the dataset with original features and the dataset with discretized features is compared with k-nearest neighbors, Naive Bayes, C4.5 and CN2 data mining classification algorithms. As the result the classification accuracies of the six datasets are improved averagely by 1.71% to 12.31%.

Keywords: Data mining classification algorithms, entropy-baseddiscretization method

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2419

817 Cirrhosis Mortality Prediction as Classification Using Frequent Subgraph Mining

Authors: Abdolghani Ebrahimi, Diego Klabjan, Chenxi Ge, Daniela Ladner, Parker Stride

Abstract:

In this work, we use machine learning and data analysis techniques to predict the one-year mortality of cirrhotic patients. Data from 2,322 patients with liver cirrhosis are collected at a single medical center. Different machine learning models are applied to predict one-year mortality. A comprehensive feature space including demographic information, comorbidity, clinical procedure and laboratory tests is being analyzed. A temporal pattern mining technic called Frequent Subgraph Mining (FSM) is being used. Model for End-stage liver disease (MELD) prediction of mortality is used as a comparator. All of our models statistically significantly outperform the MELD-score model and show an average 10% improvement of the area under the curve (AUC). The FSM technic itself does not improve the model significantly, but FSM, together with a machine learning technique called an ensemble, further improves the model performance. With the abundance of data available in healthcare through electronic health records (EHR), existing predictive models can be refined to identify and treat patients at risk for higher mortality. However, due to the sparsity of the temporal information needed by FSM, the FSM model does not yield significant improvements. Our work applies modern machine learning algorithms and data analysis methods on predicting one-year mortality of cirrhotic patients and builds a model that predicts one-year mortality significantly more accurate than the MELD score. We have also tested the potential of FSM and provided a new perspective of the importance of clinical features.

Keywords: machine learning, liver cirrhosis, subgraph mining, supervised learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 380

816 A Comparative Study of GTC and PSP Algorithms for Mining Sequential Patterns Embedded in Database with Time Constraints

Authors: Safa Adi

Abstract:

This paper will consider the problem of sequential mining patterns embedded in a database by handling the time constraints as defined in the GSP algorithm (level wise algorithms). We will compare two previous approaches GTC and PSP, that resumes the general principles of GSP. Furthermore this paper will discuss PG-hybrid algorithm, that using PSP and GTC. The results show that PSP and GTC are more efficient than GSP. On the other hand, the GTC algorithm performs better than PSP. The PG-hybrid algorithm use PSP algorithm for the two first passes on the database, and GTC approach for the following scans. Experiments show that the hybrid approach is very efficient for short, frequent sequences.

Keywords: Database, GTC algorithm, PSP algorithm, sequential patterns, time constraints.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 646

815 Development of Online Islamic Medication Expert System (OIMES)

Authors: Hanita Daud, Noorhana Yahya, Low Tan Jung, Azizuddin Abd Aziz, Rabibah Ahmad Samawe

Abstract:

This paper presents an overview of the design and implementation of an online rule-based Expert Systems for Islamic medication. T his Online Islamic Medication Expert System (OIMES) focuses on physical illnesses only. Knowledge base of this Expert System contains exhaustively the types of illness together with their related cures or treatments/therapies, obtained exclusively from the Quran and Hadith. Extensive research and study are conducted to ensure that the Expert System is able to provide the most suitable treatment with reference to the relevant verses cited in Quran or Hadith. These verses come together with their related 'actions' (bodily actions/gestures or some acts) to be performed by the patient to treat a particular illness/sickness. These verses and the instructions for the 'actions' are to be displayed unambiguously on the computer screen. The online platform provides the advantage for patient getting treatment practically anytime and anywhere as long as the computer and Internet facility exist. Patient does not need to make appointment to see an expert for a therapy.

Keywords: Expert System, Quran and Hadith, Islamic Medication, Rule-Based.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1801

814 Media Regulation and Public Sphere in the Digital Age: An Analysis in the Light of Constructive Democracy

Authors: J. Bolzan, C. Marden

Abstract:

The article proposed intends to analyze the possibility (and conditions) of a media regulation law in a democratic rule of law in the twenty-first century. To do so, will be presented initially the idea of the public sphere (by Jürgen Habermas), showing how it is presented as an interface between the citizen and the state (or the private and public) and how important is it in a deliberative democracy. Based on this paradigm, the traditional perception of the role of public information (such as system functional element) and on the possibility of media regulation will be exposed, due to the public nature of their activity. A critical argument will then be displayed from two different perspectives: a) the formal function of the current media information, considering that the digital age has fragmented the information access; b) the concept of a constructive democracy, which reduces the need for representation, changing the strategic importance of the public sphere. The question to be addressed (based on the comparative law) is if the regulation is justified in a polycentric democracy, especially when it operates under the digital age (with immediate and virtual communication). The proposal is to be presented in the sense that even in a twenty-first century the media in a democratic rule of law still has an extremely important role and may be subject to regulation, but this should be on terms very different (and narrower) from those usually defended.

Keywords: Media regulation, public sphere, digital age, constructive democracy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2399

813 Response of the Residential Building Structureon Load Technical Seismicity due to Mining Activities

Authors: V. Salajka, Z. Kaláb, J. Kala, P. Hradil

Abstract:

In the territories where high-intensity earthquakes are frequent is paid attention to the solving of the seismic problems. In the paper are described two computational model variants based on finite element method of the construction with different subsoil simulation (rigid or elastic subsoil) is used. For simulation and calculations program system based on method final elements ANSYS was used. Seismic responses calculations of residential building structure were effected on loading characterized by accelerogram for comparing with the responses spectra method.

Keywords: Accelerogram, ANSYS, mining induced seismic, residential building structure, spectra, subsoil.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1482

812 Association of the p53 Codon 72 Polymorphism with Colorectal Cancer in South West of Iran

Authors: A. Doosti, P. Ghasemi Dehkordi, M. Zamani, S. Taheri, M. Banitalebi, M. Mahmoudzadeh

Abstract:

The p53 tumor suppressor gene plays two important roles in genomic stability: blocking cell proliferation after DNA damage until it has been repaired, and starting apoptosis if the damage is too critical. Codon 72 exon4 polymorphism (Arg72Pro) of the P53 gene has been implicated in cancer risk. Various studies have been done to investigate the status of p53 at codon 72 for arginine (Arg) and proline (Pro) alleles in different populations and also the association of this codon 72 polymorphism with various tumors. Our objective was to investigate the possible association between P53 Arg72Pro polymorphism and susceptibility to colorectal cancer among Isfahan and Chaharmahal Va Bakhtiari (a part of south west of Iran) population. We investigated the status of p53 at codon 72 for Arg/Arg, Arg/Pro and Pro/Pro allele polymorphisms in blood samples from 145 colorectal cancer patients and 140 controls by Nested-PCR of p53 exon 4 and digestion with BstUI restriction enzyme and the DNA fragments were then resolved by electrophoresis in 2% agarose gel. The Pro allele was 279 bp, while the Arg allele was restricted into two fragments of 160 and 119 bp. Among the 145 colorectal cancer cases 49 cases (33.79%) were homozygous for the Arg72 allele (Arg/Arg), 18 cases (12.41%) were homozygous for the Pro72 allele (Pro/Pro) and 78 cases (53.8%) found in heterozygous (Arg/Pro). In conclusion, it can be said that p53Arg/Arg genotype may be correlated with possible increased risk of this kind of cancers in south west of Iran.

Keywords: TP53, Polymorphism, Colorectal Cancer, Iran

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2340

811 A Decision Support System for Predicting Hospitalization of Hemodialysis Patients

Authors: Jinn-Yi Yeh, Tai-Hsi Wu

Abstract:

Hemodialysis patients might suffer from unhealthy care behaviors or long-term dialysis treatments. Ultimately they need to be hospitalized. If the hospitalization rate of a hemodialysis center is high, its quality of service would be low. Therefore, how to decrease hospitalization rate is a crucial problem for health care. In this study we combined temporal abstraction with data mining techniques for analyzing the dialysis patients' biochemical data to develop a decision support system. The mined temporal patterns are helpful for clinicians to predict hospitalization of hemodialysis patients and to suggest them some treatments immediately to avoid hospitalization.

Keywords: Hemodialysis, Temporal abstract, Data mining, Healthcare quality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1692