Search results for: Association Rule Mining.
790 Hormones and Mineral Elements Associated with Osteoporosis in Postmenopausal Women in Eastern Slovakia
Authors: M. Mydlárová Blaščáková, J. Poráčová, Z. Tomková, Ľ. Blaščáková, M. Nagy, M. Konečná, E. Petrejčíková, Z. Gogaľová, V. Sedlák, J. Mydlár, M. Zahatňanská, K. Hricová
Abstract:
Osteoporosis is a multifactorial disease that results in reduced quality of life, causes decreased bone strength, and changes in their microarchitecture. Mostly postmenopausal women are at risk. In our study, we measured anthropometric parameters of postmenopausal women (104 women of control group – CG and 105 women of osteoporotic group - OG) and determined TSH hormone levels and PTH as well as mineral elements - Ca, P, Mg and enzyme alkaline phosphatase. Through the correlation analysis in CG, we have found association based on age and BMI, P and Ca, as well as Mg and Ca; in OG we determined interdependence based on an association of age and BMI, age and Ca. Using the Student's t test, we found significantly important differences in biochemical parameters of Mg (p ˂ 0,001) and TSH (p ˂ 0,05) between CG and OG.
Keywords: Factors, bone mass density, Central Europe, biomarkers.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 877789 A Novel Approach to Optimal Cutting Tool Replacement
Authors: Cem Karacal, Sohyung Cho, William Yu
Abstract:
In metal cutting industries, mathematical/statistical models are typically used to predict tool replacement time. These off-line methods usually result in less than optimum replacement time thereby either wasting resources or causing quality problems. The few online real-time methods proposed use indirect measurement techniques and are prone to similar errors. Our idea is based on identifying the optimal replacement time using an electronic nose to detect the airborne compounds released when the tool wear reaches to a chemical substrate doped into tool material during the fabrication. The study investigates the feasibility of the idea, possible doping materials and methods along with data stream mining techniques for detection and monitoring different phases of tool wear.Keywords: Tool condition monitoring, cutting tool replacement, data stream mining, e-Nose.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1882788 On the Use of Correlated Binary Model in Social Network Analysis
Authors: Elsayed A. Habib Elamir
Abstract:
In social network analysis the mean nodal degree and density of the graph can be considered as a measure of the activity of all actors in the network and this is an important property of a graph and for making comparisons among networks. Since subjects in a family or organization are subject to common environment factors, it is prime interest to study the association between responses. Therefore, we study the distribution of the mean nodal degree and density of the graph under correlated binary units. The cross product ratio is used to capture the intra-units association among subjects. Computer program and an application are given to show the benefits of the method.Keywords: Correlated Binary data, cross product ratio, densityof the graph, multiplicative binomial distribution.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1451787 Road Traffic Accidents Analysis in Mexico City through Crowdsourcing Data and Data Mining Techniques
Authors: Gabriela V. Angeles Perez, Jose Castillejos Lopez, Araceli L. Reyes Cabello, Emilio Bravo Grajales, Adriana Perez Espinosa, Jose L. Quiroz Fabian
Abstract:
Road traffic accidents are among the principal causes of traffic congestion, causing human losses, damages to health and the environment, economic losses and material damages. Studies about traditional road traffic accidents in urban zones represents very high inversion of time and money, additionally, the result are not current. However, nowadays in many countries, the crowdsourced GPS based traffic and navigation apps have emerged as an important source of information to low cost to studies of road traffic accidents and urban congestion caused by them. In this article we identified the zones, roads and specific time in the CDMX in which the largest number of road traffic accidents are concentrated during 2016. We built a database compiling information obtained from the social network known as Waze. The methodology employed was Discovery of knowledge in the database (KDD) for the discovery of patterns in the accidents reports. Furthermore, using data mining techniques with the help of Weka. The selected algorithms was the Maximization of Expectations (EM) to obtain the number ideal of clusters for the data and k-means as a grouping method. Finally, the results were visualized with the Geographic Information System QGIS.Keywords: Data mining, K-means, road traffic accidents, Waze, Weka.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1215786 Detecting Interactions between Behavioral Requirements with OWL and SWRL
Authors: Haibo Hu, Dan Yang, Chunxiao Ye, Chunlei Fu, Ren Li
Abstract:
High quality requirements analysis is one of the most crucial activities to ensure the success of a software project, so that requirements verification for software system becomes more and more important in Requirements Engineering (RE) and it is one of the most helpful strategies for improving the quality of software system. Related works show that requirement elicitation and analysis can be facilitated by ontological approaches and semantic web technologies. In this paper, we proposed a hybrid method which aims to verify requirements with structural and formal semantics to detect interactions. The proposed method is twofold: one is for modeling requirements with the semantic web language OWL, to construct a semantic context; the other is a set of interaction detection rules which are derived from scenario-based analysis and represented with semantic web rule language (SWRL). SWRL based rules are working with rule engines like Jess to reason in semantic context for requirements thus to detect interactions. The benefits of the proposed method lie in three aspects: the method (i) provides systematic steps for modeling requirements with an ontological approach, (ii) offers synergy of requirements elicitation and domain engineering for knowledge sharing, and (3)the proposed rules can systematically assist in requirements interaction detection.Keywords: Requirements Engineering, Semantic Web, OWL, Requirements Interaction Detection, SWRL.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1797785 Design of Personal Job Recommendation Framework on Smartphone Platform
Authors: Chayaporn Kaensar
Abstract:
Recently, Job Recommender Systems have gained much attention in industries since they solve the problem of information overload on the recruiting website. Therefore, we proposed Extended Personalized Job System that has the capability of providing the appropriate jobs for job seeker and recommending some suitable information for them using Data Mining Techniques and Dynamic User Profile. On the other hands, company can also interact to the system for publishing and updating job information. This system have emerged and supported various platforms such as web application and android mobile application. In this paper, User profiles, Implicit User Action, User Feedback, and Clustering Techniques in WEKA libraries were applied and implemented. In additions, open source tools like Yii Web Application Framework, Bootstrap Front End Framework and Android Mobile Technology were also applied.Keywords: Recommendation, user profile, data mining, web technology, mobile technology.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2151784 Apolipoprotein E Gene Polymorphism and Its Association with Cardiovascular Heart Disease Risk Factors in Type 2 Diabetes Mellitus
Authors: Amani Ashari, Julia Omar, Arif Hashim, Shahrul Hamid
Abstract:
Apolipoprotein E (APOE) gene polymorphism has influence on serum lipids which relates to cardiovascular risk. The purpose of this study was to determine the frequency distribution of APOE alleles among Malaysian Type 2 Diabetes Mellitus (DM) patients with and without coronary artery disease (CAD) and their association with serum lipid profiles. A total of 115 patients were recruited in which 78 patients had Type 2 DM without CAD and 37 patients had Type 2 DM with CAD. The APOE polymorphism was detected by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP). The APOE ɛ3 allele was the most common one in both groups. There was no significant association between the APOE genotypes and the CAD status in Type 2 DM using Pearson χ2 test. Further analysis indicated there were no significant differences in all lipid parameters between E2, E3 and E4 subgroups in both groups. The study showed that the E4 allele carriers of Type 2 DM with CAD patients had higher LDL-C level and lower HDL-C level compared to the other allele carriers. However, analyses showed these levels were not statistically different. The study also showed that the Type 2 DM with CAD group with E2 allele had higher triglyceride (TG). In conclusion, further study with larger sample size is needed to confirm role of E4 as a marker of CAD among Type 2 DM patients in Malaysian population.
Keywords: Apolipoprotein E, diabetes mellitus, cardiovascular disease, lipids.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1263783 Feature Selection Approaches with Missing Values Handling for Data Mining - A Case Study of Heart Failure Dataset
Authors: N.Poolsawad, C.Kambhampati, J. G. F. Cleland
Abstract:
In this paper, we investigated the characteristic of a clinical dataseton the feature selection and classification measurements which deal with missing values problem.And also posed the appropriated techniques to achieve the aim of the activity; in this research aims to find features that have high effect to mortality and mortality time frame. We quantify the complexity of a clinical dataset. According to the complexity of the dataset, we proposed the data mining processto cope their complexity; missing values, high dimensionality, and the prediction problem by using the methods of missing value replacement, feature selection, and classification.The experimental results will extend to develop the prediction model for cardiology.Keywords: feature selection, missing values, classification, clinical dataset, heart failure.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3211782 Seasonal Variation of the Impact of Mining Activities on Ga-Selati River in Limpopo Province, South Africa
Authors: Joshua N. Edokpayi, John O. Odiyo, Patience P. Shikwambana
Abstract:
Water is a very rare natural resource in South Africa. Ga-Selati River is used for both domestic and industrial purposes. This study was carried out in order to assess the quality of Ga-Selati River in a mining area of Limpopo Province-Phalaborwa. The pH, Electrical Conductivity (EC) and Total Dissolved Solids (TDS) were determined using a Crinson multimeter while turbidity was measured using a Labcon Turbidimeter. The concentrations of Al, Ca, Cd, Cr, Fe, K, Mg, Mn, Na and Pb were analysed in triplicate using a Varian 520 flame atomic absorption spectrometer (AAS) supplied by PerkinElmer, after acid digestion with nitric acid in a fume cupboard. The average pH of the river from eight different sampling sites was 8.00 and 9.38 in wet and dry season respectively. Higher EC values were determined in the dry season (138.7 mS/m) than in the wet season (96.93 mS/m). Similarly, TDS values were higher in dry (929.29 mg/L) than in the wet season (640.72 mg/L) season. These values exceeded the recommended guideline of South Africa Department of Water Affairs and Forestry (DWAF) for domestic water use (70 mS/m) and that of the World Health Organization (WHO) (600 mS/m), respectively. Turbidity varied between 1.78-5.20 and 0.95-2.37 NTU in both wet and dry seasons. Total hardness of 312.50 mg/L and 297.75 mg/L as the concentration of CaCO3 was computed for the river in both the wet and the dry seasons and the river water was categorised as very hard. Mean concentration of the metals studied in both the wet and the dry seasons are: Na (94.06 mg/L and 196.3 mg/L), K (11.79 mg/L and 13.62 mg/L), Ca (45.60 mg/L and 41.30 mg/L), Mg (48.41 mg/L and 44.71 mg/L), Al (0.31 mg/L and 0.38 mg/L), Cd (0.01 mg/L and 0.01 mg/L), Cr (0.02 mg/L and 0.09 mg/L), Pb (0.05 mg/L and 0.06 mg/L), Mn (0.31 mg/L and 0.11 mg/L) and Fe (0.76 mg/L and 0.69 mg/L). Results from this study reveal that most of the metals were present in concentrations higher than the recommended guidelines of DWAF and WHO for domestic use and the protection of aquatic life.Keywords: Contamination, mining activities, surface water, trace metals.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1986781 Analysis of Secondary School Students’ Perceptions about Information Technologies through a Word Association Test
Authors: Fetah Eren, Ismail Sahin, Ismail Celik, Ahmet Oguz Akturk
Abstract:
The aim of this study is to discover secondary school students’ perceptions related to information technologies and the connections between concepts in their cognitive structures. A word association test consisting of six concepts related to information technologies is used to collect data from 244 secondary school students. Concept maps that present students’ cognitive structures are drawn with the help of frequency data. Data are analyzed and interpreted according to the connections obtained as a result of the concept maps. It is determined students associate most with these concepts—computer, Internet, and communication of the given concepts, and associate least with these concepts—computer-assisted education and information technologies. These results show the concepts, Internet, communication, and computer, are an important part of students’ cognitive structures. In addition, students mostly answer computer, phone, game, Internet and Facebook as the key concepts. These answers show students regard information technologies as a means for entertainment and free time activity, not as a means for education.
Keywords: Word association test, cognitive structure, information technology, secondary school.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2079780 Efficient Implementation of Serial and Parallel Support Vector Machine Training with a Multi-Parameter Kernel for Large-Scale Data Mining
Authors: Tatjana Eitrich, Bruno Lang
Abstract:
This work deals with aspects of support vector learning for large-scale data mining tasks. Based on a decomposition algorithm that can be run in serial and parallel mode we introduce a data transformation that allows for the usage of an expensive generalized kernel without additional costs. In order to speed up the decomposition algorithm we analyze the problem of working set selection for large data sets and analyze the influence of the working set sizes onto the scalability of the parallel decomposition scheme. Our modifications and settings lead to improvement of support vector learning performance and thus allow using extensive parameter search methods to optimize classification accuracy.
Keywords: Support Vector Machines, Shared Memory Parallel Computing, Large Data
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1577779 Integration of Educational Data Mining Models to a Web-Based Support System for Predicting High School Student Performance
Authors: Sokkhey Phauk, Takeo Okazaki
Abstract:
The challenging task in educational institutions is to maximize the high performance of students and minimize the failure rate of poor-performing students. An effective method to leverage this task is to know student learning patterns with highly influencing factors and get an early prediction of student learning outcomes at the timely stage for setting up policies for improvement. Educational data mining (EDM) is an emerging disciplinary field of data mining, statistics, and machine learning concerned with extracting useful knowledge and information for the sake of improvement and development in the education environment. The study is of this work is to propose techniques in EDM and integrate it into a web-based system for predicting poor-performing students. A comparative study of prediction models is conducted. Subsequently, high performing models are developed to get higher performance. The hybrid random forest (Hybrid RF) produces the most successful classification. For the context of intervention and improving the learning outcomes, a feature selection method MICHI, which is the combination of mutual information (MI) and chi-square (CHI) algorithms based on the ranked feature scores, is introduced to select a dominant feature set that improves the performance of prediction and uses the obtained dominant set as information for intervention. By using the proposed techniques of EDM, an academic performance prediction system (APPS) is subsequently developed for educational stockholders to get an early prediction of student learning outcomes for timely intervention. Experimental outcomes and evaluation surveys report the effectiveness and usefulness of the developed system. The system is used to help educational stakeholders and related individuals for intervening and improving student performance.
Keywords: Academic performance prediction system, prediction model, educational data mining, dominant factors, feature selection methods, student performance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 975778 Mining User-Generated Contents to Detect Service Failures with Topic Model
Authors: Kyung Bae Park, Sung Ho Ha
Abstract:
Online user-generated contents (UGC) significantly change the way customers behave (e.g., shop, travel), and a pressing need to handle the overwhelmingly plethora amount of various UGC is one of the paramount issues for management. However, a current approach (e.g., sentiment analysis) is often ineffective for leveraging textual information to detect the problems or issues that a certain management suffers from. In this paper, we employ text mining of Latent Dirichlet Allocation (LDA) on a popular online review site dedicated to complaint from users. We find that the employed LDA efficiently detects customer complaints, and a further inspection with the visualization technique is effective to categorize the problems or issues. As such, management can identify the issues at stake and prioritize them accordingly in a timely manner given the limited amount of resources. The findings provide managerial insights into how analytics on social media can help maintain and improve their reputation management. Our interdisciplinary approach also highlights several insights by applying machine learning techniques in marketing research domain. On a broader technical note, this paper illustrates the details of how to implement LDA in R program from a beginning (data collection in R) to an end (LDA analysis in R) since the instruction is still largely undocumented. In this regard, it will help lower the boundary for interdisciplinary researcher to conduct related research.Keywords: Latent Dirichlet allocation, R program, text mining, topic model, user generated contents, visualization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1216777 An Architectural Study on the Railway Station Buildings in Malaysia during British Era, 1885-1957
Authors: Nor Hafizah Anuar, M. Gul Akdeniz
Abstract:
This paper attempted on emphasize on the station buildings façade elements. Station buildings were essential part of the transportation that reflected the technology. Comparative analysis on architectural styles will also be made between the railway station buildings of Malaysia and any railway station buildings which have similarities. The Malay Peninsula which is strategically situated between the Straits of Malacca and the South China Sea makes it an ideal location for trade. Malacca became an important trading port whereby merchants from around the world stopover to exchange various products. The Portuguese ruled Malacca for 130 years (1511–1641) and for the next century and a half (1641–1824), the Dutch endeavoured to maintain an economic monopoly along the coasts of Malaya. Malacca came permanently under British rule under the Anglo-Dutch Treaty, 1824. Up to Malaysian independence in 1957, Malaya saw a great influx of Chinese and Indian migrants as workers to support its growing industrial needs facilitated by the British. The growing tin ore mining and rubber industry resulted as the reason of the development of the railways as urgency to transport it from one place to another. The existence of railway transportation becomes more significant when the city started to bloom and the British started to build grandeur buildings that have different functions; administrative buildings, town and city halls, railway stations, public works department, courts, and post offices.
Keywords: Malaysia, railway station, architectural design, façade elements.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1663776 Feature Based Unsupervised Intrusion Detection
Authors: Deeman Yousif Mahmood, Mohammed Abdullah Hussein
Abstract:
The goal of a network-based intrusion detection system is to classify activities of network traffics into two major categories: normal and attack (intrusive) activities. Nowadays, data mining and machine learning plays an important role in many sciences; including intrusion detection system (IDS) using both supervised and unsupervised techniques. However, one of the essential steps of data mining is feature selection that helps in improving the efficiency, performance and prediction rate of proposed approach. This paper applies unsupervised K-means clustering algorithm with information gain (IG) for feature selection and reduction to build a network intrusion detection system. For our experimental analysis, we have used the new NSL-KDD dataset, which is a modified dataset for KDDCup 1999 intrusion detection benchmark dataset. With a split of 60.0% for the training set and the remainder for the testing set, a 2 class classifications have been implemented (Normal, Attack). Weka framework which is a java based open source software consists of a collection of machine learning algorithms for data mining tasks has been used in the testing process. The experimental results show that the proposed approach is very accurate with low false positive rate and high true positive rate and it takes less learning time in comparison with using the full features of the dataset with the same algorithm.
Keywords: Information Gain (IG), Intrusion Detection System (IDS), K-means Clustering, Weka.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2776775 Using Copulas to Measure Association between Air Pollution and Respiratory Diseases
Authors: Snezhana P. Kostova, Krassi V. Rumchev, Todor Vlaev, Silviya B. Popova
Abstract:
Air pollution is still considered as one of the major environmental and health issues. There is enough research evidence to show a strong relationship between exposure to air contaminants and respiratory illnesses among children and adults. In this paper we used the Copula approach to study a potential relationship between selected air pollutants (PM10 and NO2) and hospital admissions for respiratory diseases. Kendall-s tau and Spearman-s rho rank correlation coefficients are calculated and used in Copula method. This paper demonstrates that copulas can be used to provide additional information as a measure of an association when compared to the standard correlation coefficients. The results find a significant correlation between the selected air pollutants and hospital admissions for most of the selected respiratory illnesses.Keywords: Air pollution, Copula, Respiratory Health.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1827774 A Supervised Learning Data Mining Approach for Object Recognition and Classification in High Resolution Satellite Data
Authors: Mais Nijim, Rama Devi Chennuboyina, Waseem Al Aqqad
Abstract:
Advances in spatial and spectral resolution of satellite images have led to tremendous growth in large image databases. The data we acquire through satellites, radars, and sensors consists of important geographical information that can be used for remote sensing applications such as region planning, disaster management. Spatial data classification and object recognition are important tasks for many applications. However, classifying objects and identifying them manually from images is a difficult task. Object recognition is often considered as a classification problem, this task can be performed using machine-learning techniques. Despite of many machine-learning algorithms, the classification is done using supervised classifiers such as Support Vector Machines (SVM) as the area of interest is known. We proposed a classification method, which considers neighboring pixels in a region for feature extraction and it evaluates classifications precisely according to neighboring classes for semantic interpretation of region of interest (ROI). A dataset has been created for training and testing purpose; we generated the attributes by considering pixel intensity values and mean values of reflectance. We demonstrated the benefits of using knowledge discovery and data-mining techniques, which can be on image data for accurate information extraction and classification from high spatial resolution remote sensing imagery.Keywords: Remote sensing, object recognition, classification, data mining, waterbody identification, feature extraction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2053773 A 10 Giga VPN Accelerator Board for Trust Channel Security System
Authors: Ki Hyun Kim, Jang-Hee Yoo, Kyo Il Chung
Abstract:
This paper proposes a VPN Accelerator Board (VPN-AB), a virtual private network (VPN) protocol designed for trust channel security system (TCSS). TCSS supports safety communication channel between security nodes in internet. It furnishes authentication, confidentiality, integrity, and access control to security node to transmit data packets with IPsec protocol. TCSS consists of internet key exchange block, security association block, and IPsec engine block. The internet key exchange block negotiates crypto algorithm and key used in IPsec engine block. Security Association blocks setting-up and manages security association information. IPsec engine block treats IPsec packets and consists of networking functions for communication. The IPsec engine block should be embodied by H/W and in-line mode transaction for high speed IPsec processing. Our VPN-AB is implemented with high speed security processor that supports many cryptographic algorithms and in-line mode. We evaluate a small TCSS communication environment, and measure a performance of VPN-AB in the environment. The experiment results show that VPN-AB gets a performance throughput of maximum 15.645Gbps when we set the IPsec protocol with 3DES-HMAC-MD5 tunnel mode.Keywords: TCSS(Trust Channel Security System), VPN(VirtualPrivate Network), IPsec, SSL, Security Processor, Securitycommunication.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2099772 Content Based Sampling over Transactional Data Streams
Authors: Mansour Tarafdar, Mohammad Saniee Abade
Abstract:
This paper investigates the problem of sampling from transactional data streams. We introduce CFISDS as a content based sampling algorithm that works on a landmark window model of data streams and preserve more informed sample in sample space. This algorithm that work based on closed frequent itemset mining tasks, first initiate a concept lattice using initial data, then update lattice structure using an incremental mechanism.Incremental mechanism insert, update and delete nodes in/from concept lattice in batch manner. Presented algorithm extracts the final samples on demand of user. Experimental results show the accuracy of CFISDS on synthetic and real datasets, despite on CFISDS algorithm is not faster than exist sampling algorithms such as Z and DSS.
Keywords: Sampling, data streams, closed frequent item set mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1709771 Bullies and Their Mothers: Who Influence Whom?
Authors: Kostas A. Fanti, Stelios Georgiou
Abstract:
Even though most researchers would agree that in symbiotic relationships, like the one between parent and child, influences become reciprocal over time, empirical evidence supporting this claim is limited. The aim of the current study was to develop and test a model describing the reciprocal influence between characteristics of the parent-child relationship, such as closeness and conflict, and the child-s bullying and victimization experiences at school. The study used data from the longitudinal Study of Early Child-Care, conducted by the National Institute of Child Health and Human Development. The participants were dyads of early adolescents (5th and 6th graders during the two data collection waves) and their mothers (N=1364). Supporting our hypothesis, the findings suggested a reciprocal association between bullying and positive parenting, although this association was only significant for boys. Victimization and positive parenting were not significantly interrelated.Keywords: bullying, parenting, reciprocal associations, victimization
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1643770 Mining Implicit Knowledge to Predict Political Risk by Providing Novel Framework with Using Bayesian Network
Authors: Siavash Asadi Ghajarloo
Abstract:
Nowadays predicting political risk level of country has become a critical issue for investors who intend to achieve accurate information concerning stability of the business environments. Since, most of the times investors are layman and nonprofessional IT personnel; this paper aims to propose a framework named GECR in order to help nonexpert persons to discover political risk stability across time based on the political news and events. To achieve this goal, the Bayesian Networks approach was utilized for 186 political news of Pakistan as sample dataset. Bayesian Networks as an artificial intelligence approach has been employed in presented framework, since this is a powerful technique that can be applied to model uncertain domains. The results showed that our framework along with Bayesian Networks as decision support tool, predicted the political risk level with a high degree of accuracy.Keywords: Bayesian Networks, Data mining, GECRframework, Predicting political risk.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2174769 Appraisal of Methods for Identifying, Mapping, and Modelling of Fluvial Erosion in a Mining Environment
Authors: F. F. Howard, I. Yakubu, C. B. Boye, J. S. Y. Kuma
Abstract:
Natural and human activities, such as mining operations, expose the natural soil to adverse environmental conditions, leading to contamination of soil, groundwater, and surface water, which has negative effects on humans, flora, and fauna. Bare or partly exposed soil is most liable to fluvial erosion. This paper enumerates various methods used to identify, map, and model fluvial erosion in a mining environment. Classical, Artificial Intelligence (AI), and GIS methods have been reviewed. One of the many classical methods used to estimate river erosion is the Revised Universal Soil Loss Equation (RUSLE) model. The RUSLE model is easy to use. Its reliance on empirical relationships that may not always be applicable to specific circumstances or locations is a flaw. Other classical models for estimating fluvial erosion are the Soil and Water Assessment Tool (SWAT) and the Universal Soil Loss Equation (USLE). These models offer a more complete understanding of the underlying physical processes and encompass a wider range of situations. Although more difficult to utilise, they depend on the availability and dependability of input data for correctness. AI can help deal with multivariate and complex difficulties and predict soil loss with higher accuracy than traditional methods, and also be used to build unique models for identifying degraded areas. AI techniques have become popular as an alternative predictor for degraded environments. However, this research proposed a hybrid of classical, AI, and GIS methods for efficient and effective modelling of fluvial erosion.
Keywords: Fluvial erosion, classical methods, Artificial Intelligence, Geographic Information System.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 185768 Remarks on Some Properties of Decision Rules
Authors: Songlin Yang, Ying Ge
Abstract:
This paper shows that some properties of the decision rules in the literature do not hold by presenting a counterexample. We give sufficient and necessary conditions under which these properties are valid. These results will be helpful when one tries to choose the right decision rules in the research of rough set theory.Keywords: set, Decision table, Decision rule, coverage factor.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1413767 The Use of Classifiers in Image Analysis of Oil Wells Profiling Process and the Automatic Identification of Events
Authors: Jaqueline M. R. Vieira
Abstract:
Different strategies and tools are available at the oil and gas industry for detecting and analyzing tension and possible fractures in borehole walls. Most of these techniques are based on manual observation of the captured borehole images. While this strategy may be possible and convenient with small images and few data, it may become difficult and suitable to errors when big databases of images must be treated. While the patterns may differ among the image area, depending on many characteristics (drilling strategy, rock components, rock strength, etc.). In this work we propose the inclusion of data-mining classification strategies in order to create a knowledge database of the segmented curves. These classifiers allow that, after some time using and manually pointing parts of borehole images that correspond to tension regions and breakout areas, the system will indicate and suggest automatically new candidate regions, with higher accuracy. We suggest the use of different classifiers methods, in order to achieve different knowledge dataset configurations.
Keywords: Brazil, classifiers, data-mining, Image Segmentation, oil well visualization, classifiers.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2544766 Q-Map: Clinical Concept Mining from Clinical Documents
Authors: Sheikh Shams Azam, Manoj Raju, Venkatesh Pagidimarri, Vamsi Kasivajjala
Abstract:
Over the past decade, there has been a steep rise in the data-driven analysis in major areas of medicine, such as clinical decision support system, survival analysis, patient similarity analysis, image analytics etc. Most of the data in the field are well-structured and available in numerical or categorical formats which can be used for experiments directly. But on the opposite end of the spectrum, there exists a wide expanse of data that is intractable for direct analysis owing to its unstructured nature which can be found in the form of discharge summaries, clinical notes, procedural notes which are in human written narrative format and neither have any relational model nor any standard grammatical structure. An important step in the utilization of these texts for such studies is to transform and process the data to retrieve structured information from the haystack of irrelevant data using information retrieval and data mining techniques. To address this problem, the authors present Q-Map in this paper, which is a simple yet robust system that can sift through massive datasets with unregulated formats to retrieve structured information aggressively and efficiently. It is backed by an effective mining technique which is based on a string matching algorithm that is indexed on curated knowledge sources, that is both fast and configurable. The authors also briefly examine its comparative performance with MetaMap, one of the most reputed tools for medical concepts retrieval and present the advantages the former displays over the latter.Keywords: Information retrieval (IR), unified medical language system (UMLS), Syntax Based Analysis, natural language processing (NLP), medical informatics.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 779765 Systematic Identification and Quantification of Substrate Specificity Determinants in Human Protein Kinases
Authors: Manuel A. Alonso-Tarajano, Roberto Mosca, Patrick Aloy
Abstract:
Protein kinases participate in a myriad of cellular processes of major biomedical interest. The in vivo substrate specificity of these enzymes is a process determined by several factors, and despite several years of research on the topic, is still far from being totally understood. In the present work, we have quantified the contributions to the kinase substrate specificity of i) the phosphorylation sites and their surrounding residues in the sequence and of ii) the association of kinases to adaptor or scaffold proteins. We have used position-specific scoring matrices (PSSMs), to represent the stretches of sequences phosphorylated by 93 families of kinases. We have found negative correlations between the number of sequences from which a PSSM is generated and the statistical significance and the performance of that PSSM. Using a subset of 22 statistically significant PSSMs, we have identified specificity determinant residues (SDRs) for 86% of the corresponding kinase families. Our results suggest that different SDRs can function as positive or negative elements of substrate recognition by the different families of kinases. Additionally, we have found that human proteins with known function as adaptors or scaffolds (kAS) tend to interact with a significantly large fraction of the substrates of the kinases to which they associate. Based on this characteristic we have identified a set of 279 potential adaptors/scaffolds (pAS) for human kinases, which is enriched in Pfam domains and functional terms tightly related to the proposed function. Moreover, our results show that for 74.6% of the kinase–pAS association found, the pAS colocalize with the substrates of the kinases they are associated to. Finally, we have found evidence suggesting that the association of kinases to adaptors and scaffolds, may contribute significantly to diminish the in vivo substrate crossed-specificity of protein kinases. In general, our results indicate the relevance of several SDRs for both the positive and negative selection of phosphorylation sites by kinase families and also suggest that the association of kinases to pAS proteins may be an important factor for the localization of the enzymes with their set of substrates.
Keywords: Kinase, phosphorylation, substrate specificity, adaptors, scaffolds, cellular colocalization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1535764 Genetic-based Anomaly Detection in Logs of Process Aware Systems
Authors: Hanieh Jalali, Ahmad Baraani
Abstract:
Nowaday-s, many organizations use systems that support business process as a whole or partially. However, in some application domains, like software development and health care processes, a normative Process Aware System (PAS) is not suitable, because a flexible support is needed to respond rapidly to new process models. On the other hand, a flexible Process Aware System may be vulnerable to undesirable and fraudulent executions, which imposes a tradeoff between flexibility and security. In order to make this tradeoff available, a genetic-based anomaly detection model for logs of Process Aware Systems is presented in this paper. The detection of an anomalous trace is based on discovering an appropriate process model by using genetic process mining and detecting traces that do not fit the appropriate model as anomalous trace; therefore, when used in PAS, this model is an automated solution that can support coexistence of flexibility and security.Keywords: Anomaly Detection, Genetic Algorithm, ProcessAware Systems, Process Mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1925763 Clustering Unstructured Text Documents Using Fading Function
Authors: Pallav Roxy, Durga Toshniwal
Abstract:
Clustering unstructured text documents is an important issue in data mining community and has a number of applications such as document archive filtering, document organization and topic detection and subject tracing. In the real world, some of the already clustered documents may not be of importance while new documents of more significance may evolve. Most of the work done so far in clustering unstructured text documents overlooks this aspect of clustering. This paper, addresses this issue by using the Fading Function. The unstructured text documents are clustered. And for each cluster a statistics structure called Cluster Profile (CP) is implemented. The cluster profile incorporates the Fading Function. This Fading Function keeps an account of the time-dependent importance of the cluster. The work proposes a novel algorithm Clustering n-ary Merge Algorithm (CnMA) for unstructured text documents, that uses Cluster Profile and Fading Function. Experimental results illustrating the effectiveness of the proposed technique are also included.Keywords: Clustering, Text Mining, Unstructured TextDocuments, Fading Function.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1985762 Discovery of Time Series Event Patterns based on Time Constraints from Textual Data
Authors: Shigeaki Sakurai, Ken Ueno, Ryohei Orihara
Abstract:
This paper proposes a method that discovers time series event patterns from textual data with time information. The patterns are composed of sequences of events and each event is extracted from the textual data, where an event is characteristic content included in the textual data such as a company name, an action, and an impression of a customer. The method introduces 7 types of time constraints based on the analysis of the textual data. The method also evaluates these constraints when the frequency of a time series event pattern is calculated. We can flexibly define the time constraints for interesting combinations of events and can discover valid time series event patterns which satisfy these conditions. The paper applies the method to daily business reports collected by a sales force automation system and verifies its effectiveness through numerical experiments.
Keywords: Text mining, sequential mining, time constraints, daily business reports.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1488761 Knowledge Mining in Web-based Learning Environments
Authors: Nittaya Kerdprasop, Kittisak Kerdprasop
Abstract:
The state of the art in instructional design for computer-assisted learning has been strongly influenced by advances in information technology, Internet and Web-based systems. The emphasis of educational systems has shifted from training to learning. The course delivered has also been changed from large inflexible content to sequential small chunks of learning objects. The concepts of learning objects together with the advanced technologies of Web and communications support the reusability, interoperability, and accessibility design criteria currently exploited by most learning systems. These concepts enable just-in-time learning. We propose to extend theses design criteria further to include the learnability concept that will help adapting content to the needs of learners. The learnability concept offers a better personalization leading to the creation and delivery of course content more appropriate to performance and interest of each learner. In this paper we present a new framework of learning environments containing knowledge discovery as a tool to automatically learn patterns of learning behavior from learners' profiles and history.Keywords: Knowledge mining, Web-based learning, Learning environments.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1786