Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3

data-mining Related Abstracts

3 Analysis and Rule Extraction of Coronary Artery Disease Data Using Data Mining

Authors: Rezaei Hachesu Peyman, Oliyaee Azadeh, Salahzadeh Zahra, Alizadeh Somayyeh, Safaei Naser

Abstract:

Coronary Artery Disease (CAD) is one major cause of disability in adults and one main cause of death in developed. In this study, data mining techniques including Decision Trees, Artificial neural networks (ANNs), and Support Vector Machine (SVM) analyze CAD data. Data of 4948 patients who had suffered from heart diseases were included in the analysis. CAD is the target variable, and 24 inputs or predictor variables are used for the classification. The performance of these techniques is compared in terms of sensitivity, specificity, and accuracy. The most significant factor influencing CAD is chest pain. Elderly males (age > 53) have a high probability to be diagnosed with CAD. SVM algorithm is the most useful way for evaluation and prediction of CAD patients as compared to non-CAD ones. Application of data mining techniques in analyzing coronary artery diseases is a good method for investigating the existing relationships between variables.

Keywords: Knowledge Discovery, classification, coronary artery disease, data-mining, extract

Procedia PDF Downloads 330
2 The Use of Classifiers in Image Analysis of Oil Wells Profiling Process and the Automatic Identification of Events

Authors: Jaqueline Maria Ribeiro Vieira

Abstract:

Different strategies and tools are available at the oil and gas industry for detecting and analyzing tension and possible fractures in borehole walls. Most of these techniques are based on manual observation of the captured borehole images. While this strategy may be possible and convenient with small images and few data, it may become difficult and suitable to errors when big databases of images must be treated. While the patterns may differ among the image area, depending on many characteristics (drilling strategy, rock components, rock strength, etc.). Previously we developed and proposed a novel strategy capable of detecting patterns at borehole images that may point to regions that have tension and breakout characteristics, based on segmented images. In this work we propose the inclusion of data-mining classification strategies in order to create a knowledge database of the segmented curves. These classifiers allow that, after some time using and manually pointing parts of borehole images that correspond to tension regions and breakout areas, the system will indicate and suggest automatically new candidate regions, with higher accuracy. We suggest the use of different classifiers methods, in order to achieve different knowledge data set configurations.

Keywords: Image Segmentation, data-mining, oil well visualization, classifiers, visual computer

Procedia PDF Downloads 182
1 Recurrent Neural Networks for Classifying Outliers in Electronic Health Record Clinical Text

Authors: M-Tahar Kechadi, Duncan Wallace

Abstract:

In recent years, Machine Learning (ML) approaches have been successfully applied to an analysis of patient symptom data in the context of disease diagnosis, at least where such data is well codified. However, much of the data present in Electronic Health Records (EHR) are unlikely to prove suitable for classic ML approaches. Furthermore, as scores of data are widely spread across both hospitals and individuals, a decentralized, computationally scalable methodology is a priority. The focus of this paper is to develop a method to predict outliers in an out-of-hours healthcare provision center (OOHC). In particular, our research is based upon the early identification of patients who have underlying conditions which will cause them to repeatedly require medical attention. OOHC act as an ad-hoc delivery of triage and treatment, where interactions occur without recourse to a full medical history of the patient in question. Medical histories, relating to patients contacting an OOHC, may reside in several distinct EHR systems in multiple hospitals or surgeries, which are unavailable to the OOHC in question. As such, although a local solution is optimal for this problem, it follows that the data under investigation is incomplete, heterogeneous, and comprised mostly of noisy textual notes compiled during routine OOHC activities. Through the use of Deep Learning methodologies, the aim of this paper is to provide the means to identify patient cases, upon initial contact, which are likely to relate to such outliers. To this end, we compare the performance of Long Short-Term Memory, Gated Recurrent Units, and combinations of both with Convolutional Neural Networks. A further aim of this paper is to elucidate the discovery of such outliers by examining the exact terms which provide a strong indication of positive and negative case entries. While free-text is the principal data extracted from EHRs for classification, EHRs also contain normalized features. Although the specific demographical features treated within our corpus are relatively limited in scope, we examine whether it is beneficial to include such features among the inputs to our neural network, or whether these features are more successfully exploited in conjunction with a different form of a classifier. In this section, we compare the performance of randomly generated regression trees and support vector machines and determine the extent to which our classification program can be improved upon by using either of these machine learning approaches in conjunction with the output of our Recurrent Neural Network application. The output of our neural network is also used to help determine the most significant lexemes present within the corpus for determining high-risk patients. By combining the confidence of our classification program in relation to lexemes within true positive and true negative cases, with an inverse document frequency of the lexemes related to these cases, we can determine what features act as the primary indicators of frequent-attender and non-frequent-attender cases, providing a human interpretable appreciation of how our program classifies cases.

Keywords: Artificial Neural Networks, Machine Learning, Medical informatics, data-mining

Procedia PDF Downloads 5