Search results for: Dataset production
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2374

2284 On Identity Disclosure Risk Measurement for Shared Microdata

Authors: M. N. Huda, S. Yamada, N. Sonehara

Abstract:

Probability-based identity disclosure risk measurement may give the same overall risk for different anonymization strategies applied to the same dataset. Some entities in the anonymous dataset may have higher identification risks than others. Individuals are more concerned about above-average risks and are more interested in knowing whether they may be exposed to a higher risk. A notion of overall risk in the above measurement method does not indicate whether some of the involved entities have a higher identity disclosure risk than others. In this paper, we introduce an identity disclosure risk measurement method that not only conveys the overall risk but also indicates whether some members have a higher risk than others. The proposed method quantifies the overall risk based on the individual risk values, the percentage of records that have a risk value higher than the average, and how much larger the higher risk values are compared to the average. We have analyzed the disclosure risks for different disclosure control techniques applied to original microdata and present the results.
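
The three ingredients named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' formula: the abstract does not say how the quantities are combined, so the sketch only computes them separately. It assumes, purely for illustration, that a record's risk is 1 divided by the number of records sharing its quasi-identifier values; the function name `risk_summary` and the sample data are hypothetical.

```python
from collections import Counter

def risk_summary(quasi_identifiers):
    """Summarize per-record identification risk for an anonymized table.

    Assumption (not from the paper): a record's risk is 1 / (size of its
    equivalence class), i.e. the number of records sharing its
    quasi-identifier value.
    """
    class_sizes = Counter(quasi_identifiers)
    risks = [1.0 / class_sizes[q] for q in quasi_identifiers]
    average = sum(risks) / len(risks)
    above = [r for r in risks if r > average]
    # Fraction of records whose risk exceeds the average.
    share_above = len(above) / len(risks)
    # How much larger the above-average risks are, relative to the average.
    mean_excess_ratio = (sum(above) / len(above)) / average if above else 1.0
    return {
        "average": average,
        "share_above": share_above,
        "mean_excess_ratio": mean_excess_ratio,
    }

# Two hypothetical anonymizations of six records, each with three classes.
even = ["a", "a", "b", "b", "c", "c"]      # class sizes 2, 2, 2
skewed = ["a", "a", "a", "a", "b", "c"]    # class sizes 4, 1, 1

print(risk_summary(even))
print(risk_summary(skewed))
```

Both tables have the same average risk under this assumption, yet only the second one contains records whose risk is above the average, which is the situation the abstract describes.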

Keywords: Anonymization, microdata, disclosure risk, privacy.

Downloads: 1327
2283 Energy Production Potential from Co-Digestion between Frozen Seafood Wastewater and Decanter Cake in Thailand

Authors: Thaniya Kaosol, Narumol Sohgrathok

Abstract:

In this paper, a Biochemical Methane Potential (BMP) test provides a measure of the energy production potential from co-digestion of frozen seafood wastewater and decanter cake. The experiments were conducted at laboratory scale. A suitable ratio of frozen seafood wastewater to decanter cake was determined in the BMP test. This ratio affects both the biogas production and the energy production potential. The best performance for energy production potential in the BMP test was observed with 180 ml of frozen seafood wastewater and 10 g of decanter cake. This ratio provided the maximum methane production of 0.351 L CH4 per g of TCOD removed. The removal efficiencies were 76.18%, 83.55%, 43.16% and 56.76% for TCOD, SCOD, TS and VS, respectively. It can be concluded that decanter cake can improve the energy production potential of frozen seafood wastewater. The energy provided by co-digestion of frozen seafood wastewater and decanter cake is approximately 19 × 10^9 MJ/year in Thailand.

Keywords: Frozen seafood wastewater, decanter cake, biogas, methane, BMP test.

Downloads: 2226
2282 Feature Selection for Web Page Classification Using Swarm Optimization

Authors: B. Leela Devi, A. Sankar

Abstract:

The web’s increased popularity has brought a huge amount of information, which makes automated web page classification systems essential for improving search engines’ performance. Web pages have many features, such as HTML or XML tags, hyperlinks, URLs and text content, which can be considered during an automated classification process. It is known that web page classification is enhanced by hyperlinks, as they reflect web page linkages. The aim of this study is to reduce the number of features used in order to improve the accuracy of web page classification. In this paper, a novel feature selection method based on an improved Particle Swarm Optimization (PSO) that uses the principle of evolution is proposed. The extracted features were tested on the WebKB dataset using a parallel neural network to reduce the computational cost.

Keywords: Web page classification, WebKB Dataset, Term Frequency-Inverse Document Frequency (TF-IDF), Particle Swarm Optimization (PSO).

Downloads: 3216
2281 Use of Linear Programming for Optimal Production in a Production Line in Saudi Food Co.

Authors: Qasim M. Kriri

Abstract:

A few production companies in Saudi Arabia still face profitability issues. This work presents a linear integer programming model that solves a production problem of a Saudi food company. The optimal solution to this problem is obtained by linear programming, and the main purpose of the project is to maximize profit. The linear programming technique was used to derive the maximum profit from the production of natural juice at Saudi Food Co. The company's production operations were formulated as a model, and optimal results were found using Lindo software, which employed sensitivity analysis and parametric linear programming to develop the linear programming model. In addition, when the parameter values are increased, the value of the objective function also increases.
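
As an illustration of the kind of model this abstract describes, the following Python sketch maximizes profit for a tiny integer production plan by exhaustive search. It is a sketch under stated assumptions, not the paper's model: the product names, profits, resource usages and capacities are all hypothetical, and brute-force enumeration stands in for the Lindo solver used by the author.

```python
from itertools import product

# Hypothetical data: profit per batch and resource use per batch.
PROFIT = {"orange": 5, "mango": 4}
USAGE = {
    "fruit_kg": {"orange": 6, "mango": 4},
    "machine_h": {"orange": 1, "mango": 2},
}
CAPACITY = {"fruit_kg": 24, "machine_h": 6}

def solve(profit, usage, capacity, max_batches=10):
    """Return the most profitable integer plan that respects every capacity."""
    names = list(profit)
    best_plan, best_profit = None, -1
    # Enumerate every combination of batch counts (fine for tiny models only).
    for counts in product(range(max_batches + 1), repeat=len(names)):
        plan = dict(zip(names, counts))
        feasible = all(
            sum(usage[res][n] * plan[n] for n in names) <= capacity[res]
            for res in capacity
        )
        if feasible:
            value = sum(profit[n] * plan[n] for n in names)
            if value > best_profit:
                best_plan, best_profit = plan, value
    return best_plan, best_profit

print(solve(PROFIT, USAGE, CAPACITY))
# A crude sensitivity check: relax one capacity and solve again.
print(solve(PROFIT, USAGE, {"fruit_kg": 30, "machine_h": 6}))
```

Re-solving with a relaxed capacity is a simple stand-in for sensitivity analysis: in this toy model the best achievable profit rises when the fruit capacity is increased.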

Keywords: Parameter linear programming, objective function, sensitivity analysis, optimize profit.

Downloads: 2821
2280 Neural Network Based Approach for Face Detection cum Face Recognition

Authors: Kesari Verma, Aniruddha S. Thoke, Pritam Singh

Abstract:

Automatic face detection is a complex problem in image processing. Many methods exist to solve this problem, such as template matching, Fisher Linear Discriminant, Neural Networks, SVM, and MRC. Success has been achieved with each method to varying degrees and complexities. The proposed algorithm uses upright, frontal faces in single grayscale images with decent resolution and under good lighting conditions. In face recognition, a single face is matched with a single face from the training dataset. The authors propose a neural network based face detection algorithm for photographs; when test data appear, they are checked against the online scanned training dataset. Experimental results show that the algorithm achieved a detection accuracy of up to 95% for any image.

Keywords: Face Detection, Face Recognition, NN Approach, PCA Algorithm.

Downloads: 2243
2279 Optimized Brain Computer Interface System for Unspoken Speech Recognition: Role of Wernicke Area

Authors: Nassib Abdallah, Pierre Chauvet, Abd El Salam Hajjar, Bassam Daya

Abstract:

In this paper, we propose an optimized brain computer interface (BCI) system for unspoken speech recognition, based on the fact that the construction of unspoken words relies strongly on the Wernicke area, situated in the temporal lobe. Our BCI system has four modules: (i) the EEG Acquisition module based on a non-invasive headset with 14 electrodes; (ii) the Preprocessing module to remove noise and artifacts, using the Common Average Reference method; (iii) the Features Extraction module, using Wavelet Packet Transform (WPT); (iv) the Classification module based on a one-hidden layer artificial neural network. The present study compares the recognition accuracy of 5 Arabic words when using all the headset electrodes or only the 4 electrodes situated near the Wernicke area, as well as the effect of selecting among the subbands produced by the WPT module. After applying the artificial neural network to the produced database, we obtain, on the test dataset, an accuracy of 83.4% with all the electrodes and all the subbands of 8 levels of the WPT decomposition. However, by using only the 4 electrodes near the Wernicke area and the 6 middle subbands of the WPT, we obtain a high reduction of the dataset size, equal to approximately 19% of the total dataset, with an accuracy rate of 67.5%. This reduction appears particularly important for improving the design of a low-cost and simple-to-use BCI trained for several words.
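
The Common Average Reference step mentioned in module (ii) can be sketched in a few lines of Python. This is a generic illustration of the technique, not the authors' preprocessing code: at each time point the mean over all electrodes is subtracted from every electrode. The function name and the three-electrode sample are hypothetical.

```python
def common_average_reference(samples):
    """Re-reference EEG samples to the average of all electrodes.

    samples: list of time points, each a list with one value per electrode.
    Returns a new list in which, at every time point, the mean across
    electrodes has been subtracted from each electrode's value.
    """
    referenced = []
    for row in samples:
        mean = sum(row) / len(row)
        referenced.append([value - mean for value in row])
    return referenced

# Hypothetical recording: two time points, three electrodes.
recording = [[1.0, 2.0, 3.0], [10.0, 10.0, 13.0]]
print(common_average_reference(recording))
```

After re-referencing, the values at each time point sum to zero, so activity common to all electrodes is removed.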

Keywords: Brain-computer interface, speech recognition, electroencephalography EEG, Wernicke area, artificial neural network.

Downloads: 846
2278 Knowledge Discovery and Data Mining Techniques in Textile Industry

Authors: Filiz Ersoz, Taner Ersoz, Erkin Guler

Abstract:

This paper addresses issues in the textile industry using data mining techniques. Data mining was applied to data on the stitching of garment products obtained from a textile company. The techniques applied were the CHAID algorithm, the CART algorithm, Regression Analysis and Artificial Neural Networks. Classification-based analyses were used for the data mining, and a decision model for production per person and the variables affecting production was derived by this method. The results show that as the daily working time increases, the production per person decreases. In addition, the relationship between total daily working time and production per person is negative, and it is the strongest relationship observed.

Keywords: Data mining, textile production, decision trees, classification.

Downloads: 1492
2277 Cost Sensitive Analysis of Production Logistics Measures: A Decision Making Support System for Evaluating Measures in the Production

Authors: Michael Grigutsch, Peter Nyhuis

Abstract:

Due to the volatile global economy, enterprises are increasingly focusing on logistics. By investing in suitable measures, a company can increase its logistic performance and assert itself over the competition. However, enterprises are also faced with the challenge of investing available capital for maximum profit. In order to create an informed and quantifiably comprehensible basis for a decision, enterprises need a suitable model for logistically and monetarily evaluating measures in production. Previously, within the frame of Collaborative Research Centre 489 (SFB 489) at the Institute for Production Systems and Logistics (IFA), a Logistic Information System was developed specifically to support enterprises in the forging industry when making decisions. Based on this research, a new initiative referred to as ‘Transfer Project T7’ aims to develop a universal approach for logistically and monetarily evaluating production measures. This paper focuses on the structural measure of echelon storage and its impact on the entire production system.

Keywords: Logistic Operating Curves, Transfer Functions, Production Logistics, Storages Echelon.

Downloads: 1295
2276 Mucus Secretion Responses to Various Sublethal Copper (II) Concentrations in the Mussel Perna perna

Authors: Kamleshan Pillay

Abstract:

The purpose of this study was to evaluate the use of mucus production as a biomarker. This was done by exposing the mussel Perna perna to various sublethal concentrations of Cu. Mussels are effective as a bioindicator species as they accumulate Cu in their tissues. Differences in mucus production rates were evaluated at different Cu concentrations. The findings of this study indicate that increasing Cu concentrations had a significant effect on the mucus production rates over a 24 hour exposure. There were also significant differences between the mucus production rates at different Cu concentrations (p < 0.05). Thus, mucus is an essential detoxification mechanism.

Keywords: Copper, Mucus, Depuration, Perna perna.

Downloads: 2667
2275 Microbial Production of Levan using Date Syrup and Investigation of Its Properties

Authors: Marzieh Moosavi-Nasab, Behnaz Layegh, Ladan Aminlari, Mohammad B. Hashemi

Abstract:

Levan, an exopolysaccharide, was produced by Microbacterium laevaniformans, and its yield was characterized as a function of the concentrations of date syrup and sucrose and of the fermentation time. The optimum condition for levan production from sucrose was a concentration of 20% sucrose for 48 h, and from date syrup it was 25% for 48 h. The results show that an increase in fermentation time caused a decrease in levan production at all concentrations of date syrup tested. Under these conditions, after 48 h levan production reached 48.9 g/L in sucrose medium and 10.48 g/L in date syrup. The effect of pH on the yield of the purified levan was examined, and the optimum pH for levan production was determined to be 6.0. Levan was composed mainly of fructose residues when analyzed by TLC and FT-IR spectroscopy. Date syrup is a cheap substrate widely available in Iran and has potential for levan production. The thermal stability of levan was assessed by Thermo Gravimetric Analysis (TGA), which revealed the onset of decomposition near 49°C for the levan produced from sucrose and 51°C for the levan from date syrup. DSC results showed a single Tg at 98°C for levan produced from sucrose and 206°C for levan from date syrup.

Keywords: Date syrup, Fermentation, Levan, Microbacterium laevaniformans

Downloads: 2687
2274 Dynamically Monitoring Production Methods for Identifying Structural Changes relevant to Logistics

Authors: Marco Kennemann, Steffen C. Eickemeyer, Peter Nyhuis

Abstract:

Due to the growing dynamics and complexity of the market environment, production enterprises in particular are faced with new logistic challenges. Moreover, it is in this dynamic environment that the Logistic Operating Curve Theory reaches its limits as a method for describing the correlations between the logistic objectives. In order to convert this theory into a method for dynamically monitoring production, this paper introduces methods for reliably and quickly identifying structural changes relevant to logistics.

Keywords: Dynamics, Logistic Operating Curves, Production Logistics, Production Planning and Control

Downloads: 1330
2273 Fusion of ETM+ Multispectral and Panchromatic Texture for Remote Sensing Classification

Authors: Mahesh Pal

Abstract:

This paper proposes to use ETM+ multispectral data and the panchromatic band, as well as texture features derived from the panchromatic band, for land cover classification. Four texture features, including one 'internal texture' and three GLCM based textures, namely correlation, entropy, and inverse difference moment, were used in combination with ETM+ multispectral data. Two datasets involving combinations of the multispectral data, the panchromatic band and its texture were used, and the results were compared with those obtained by using multispectral data alone. A decision tree classifier with and without boosting was used to classify the different datasets. Results from this study suggest that the dataset consisting of the panchromatic band, four of its texture features and multispectral data was able to increase the classification accuracy by about 2%. In comparison, a boosted decision tree was able to increase the classification accuracy by about 3% with the same dataset.

Keywords: Internal texture; GLCM; decision tree; boosting; classification accuracy.

Downloads: 1690
2272 Improving the Performance of Deep Learning in Facial Emotion Recognition with Image Sharpening

Authors: Ksheeraj Sai Vepuri, Nada Attar

Abstract:

We as humans use words with accompanying visual and facial cues to communicate effectively. Classifying facial emotion using computer vision methodologies has been an active research area in the computer vision field. In this paper, we propose a simple method for facial expression recognition that enhances accuracy. We tested our method on the FER-2013 dataset, which contains static images. Instead of using histogram equalization to preprocess the dataset, we used Unsharp Mask to emphasize texture and details and to sharpen the edges. We also used ImageDataGenerator from the Keras library for data augmentation. Then we used a Convolutional Neural Network (CNN) model to classify the images into 7 different facial expressions, yielding an accuracy of 69.46% on the test set. Our results show that image preprocessing such as the sharpening technique can improve the performance of a CNN model, even when the model is relatively simple.
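
The unsharp mask idea can be sketched in plain Python: blur the image, then add a scaled copy of the difference between the original and the blurred version. This is an illustrative sketch, not the authors' pipeline. The abstract does not state the blur kernel or the sharpening amount, so the 3x3 box blur (a Gaussian blur is the more common choice), the default `amount=1.0` and the sample image are assumptions.

```python
def box_blur(image):
    """Blur a 2D grayscale image with a 3x3 box filter (edges replicated)."""
    height, width = len(image), len(image[0])
    blurred = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Clamp coordinates so border pixels reuse their neighbors.
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    total += image[yy][xx]
            blurred[y][x] = total / 9.0
    return blurred

def unsharp_mask(image, amount=1.0):
    """Sharpen: original + amount * (original - blurred), clipped to 0..255."""
    blurred = box_blur(image)
    return [
        [min(255.0, max(0.0, p + amount * (p - b))) for p, b in zip(row, brow)]
        for row, brow in zip(image, blurred)
    ]

# Hypothetical image with a vertical edge between dark and bright columns.
edge = [[50, 50, 200, 200] for _ in range(3)]
for row in unsharp_mask(edge):
    print(row)
```

On the sample image, the pixels next to the edge are pushed further apart while flat regions are left unchanged, which is the edge-emphasizing effect the abstract refers to.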

Keywords: Facial expression recognition, image pre-processing, deep learning, CNN.

Downloads: 492
2271 Assessing and Visualizing the Stability of Feature Selectors: A Case Study with Spectral Data

Authors: R. Guzman-Martinez, Oscar Garcia-Olalla, R. Alaiz-Rodriguez

Abstract:

Feature selection plays an important role in applications with high dimensional data. The assessment of the stability of feature selection/ranking algorithms becomes an important issue when the dataset is small and the aim is to gain insight into the underlying process by analyzing the most relevant features. In this work, we propose a graphical approach that makes it possible to analyze the similarity between feature ranking techniques as well as their individual stability. Moreover, it works with any stability metric (Canberra distance, Spearman's rank correlation coefficient, Kuncheva's stability index, etc.). We illustrate this visualization technique by evaluating the stability of several feature selection techniques on a spectral binary dataset. Experimental results with a neural-based classifier show that stability and ranking quality may not be linked together and that both issues have to be studied jointly in order to offer answers to the domain experts.
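
Two of the stability metrics named in this abstract can be computed directly from a pair of feature rankings. The Python sketch below uses the textbook definitions of the Canberra distance and of Spearman's rank correlation for rankings without ties; it does not reproduce the authors' graphical approach, and the sample rankings are hypothetical.

```python
def canberra_distance(a, b):
    """Canberra distance between two equally long vectors of ranks."""
    # Terms where both entries are zero are skipped to avoid dividing by zero.
    return sum(abs(x - y) / (abs(x) + abs(y)) for x, y in zip(a, b) if x or y)

def spearman_rho(a, b):
    """Spearman's rank correlation for two rankings without ties."""
    n = len(a)
    squared_diffs = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1 - 6 * squared_diffs / (n * (n * n - 1))

# Hypothetical ranks assigned to the same five features in two runs.
run_1 = [1, 2, 3, 4, 5]
run_2 = [2, 1, 3, 4, 5]

print(canberra_distance(run_1, run_2))
print(spearman_rho(run_1, run_2))
```

A Canberra distance of zero and a Spearman coefficient of one both indicate identical rankings; swapping the two top features, as in the sample, moves each metric only slightly away from those values.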

Keywords: Feature Selection Stability, Spectral data, Data visualization

Downloads: 1480
2270 Evaluating the Logistic Performance Capability of Regeneration Processes

Authors: Thorben Kuprat, Julian Becker, Jonas Mayer, Peter Nyhuis

Abstract:

For years now, it has been recognized that logistic performance capability contributes enormously to a production enterprise’s competitiveness and as such is a critical control lever. In doing so, the orientation on customer wishes (e.g. delivery dates) represents a key parameter not only in the value-adding production but also in product regeneration. Since production and regeneration processes have different characteristics, production planning and control measures cannot be directly transferred to regeneration processes. As part of a special research project, the Institute of Production Systems and Logistics Hannover is focused on increasing the logistic performance capability of regeneration processes for complex capital goods. The aim is to ensure logistic targets are met by implementing a model specifically designed to align the capacities and load in regeneration processes.

Keywords: Capacity planning, complex capital goods, logistic performance, regeneration process.

Downloads: 1707
2269 Microalgal Lipid Production by Microalgae Chlorella sp. KKU-S2

Authors: Ratanaporn Leesing, Supaporn Kookkhunthod, Ngarmnit Nontaso

Abstract:

The objective of this work is to produce heterotrophic microalgal lipid in flask-batch fermentation. Chlorella sp. KKU-S2 supported maximum values of 0.374 g/L/d, 0.478 g lipid/g cells, and 0.112 g/L/d for volumetric lipid production rate, specific yield of lipid, and specific rate of lipid production, respectively, when the culture was grown on BG-11 medium supplemented with 50 g/L glucose. Among the carbon sources tested, the maximum cell yield coefficient (YX/S, g/L), maximum specific yield of lipid (YP/X, g lipid/g cells) and volumetric lipid production rate (QP, g/L/d) were 0.728, 0.237, and 0.619, respectively, using sugarcane molasses as the carbon source. The main fatty acid components of the extracted lipid were palmitic acid, stearic acid, oleic acid and linoleic acid, which are similar to those of vegetable oils and suitable for biodiesel production.

Keywords: Microalgal lipid, Chlorella sp. KKU-S2, kinetic parameters, biodiesel.

Downloads: 2665
2268 Environmental Potentials within the Production of Asphalt Mixtures

Authors: Florian Gschösser, Walter Purrer

Abstract:

The paper shows examples of the (environmental) optimization of production processes for asphalt mixtures applied in typical road pavements in Austria and Switzerland. The conducted “from-cradle-to-gate” LCA analyzes firstly the production of one cubic meter of asphalt and secondly all material production processes for exemplary highway pavements applied in Austria and Switzerland. It is shown that environmental impacts can be reduced by the application of reclaimed asphalt pavement (RAP) and by the optimization of specific production characteristics, e.g. the reduction of the initial moisture of the mineral aggregate and the reduction of the mixing temperature by the application of low-viscosity and foam bitumen. The results of the LCA study demonstrate reduction potentials per cubic meter of asphalt of up to 57% (Global Warming Potential, GWP) and 77% (Ozone Depletion Potential, ODP). The analysis per square meter of asphalt pavement determined environmental potentials of up to 40% (GWP) and 56% (ODP).

Keywords: Asphalt mixtures, environmental potentials, life cycle assessment, material production.

Downloads: 1051
2267 Innovation to Protect the Smoke and Odor Pollutions in Benjarong Ceramic Production

Authors: Chonmapat Torasa, Witthaya Mekhum

Abstract:

The improvement of a filter case used to purify the smoke and smell released in the production of Benjarong ceramic is studied through Participatory Action Research (PAR). This research aims to prevent the smell, dirty smoke, and air pollution that result from incomplete combustion in the production of Benjarong ceramic. The research was conducted at Jongjint Benjarong Ceramic Factory in Plai Bang, Bang Kruai, Nonthaburi Province, Thailand, where 12 employees were interviewed for data collection. All collected data were analyzed to develop a solution that prevents smoke and smell pollution from Benjarong ceramic production. The results revealed that the employees who used the developed filter cases were moderately satisfied. Overall, the respondents were also moderately satisfied with the efficiency of the modified smoke-and-smell filter cases.

Keywords: Benjarong Ceramic, Community Economy, OTOP Production, Production.

Downloads: 1734
2266 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has, over recent years, seen big advances because of the spread of the internet, which generates a tremendous volume of data every day, and also because of the immense advances in technologies that facilitate the analysis of these data. In particular, classification is a subdomain of data mining that determines to which group each data instance belongs within a given dataset. It is used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or based on machine learning, and each type has its own limits. Data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach that differs from existing algorithms in its reliability computations. The proposed approach outperformed the most common classification techniques, with an f-measure exceeding 97% on the IRIS dataset.

Keywords: Data mining, knowledge discovery, machine learning, similarity measurement, supervised classification.

Downloads: 1478
2265 Egg Production Performance of Old Laying Hen Fed Dietary Turmeric Powder

Authors: D. P. Rahardja, M. Rahman Hakim, V. Sri Lestari

Abstract:

An experiment was conducted to elucidate the effects of turmeric powder supplementation on the egg production performance of old laying hens (80 weeks of age). Forty hens of the Hysex Brown strain were used in the study. They were caged individually and randomly divided into 4 treatment groups fed diets containing 0 (control), 1, 2 and 4% oven-dried turmeric powder for 3 periods of 4 weeks. Egg production (% hen day) and feed intake of the 4 treatment groups at the commencement of the experiment were not significantly different. In addition to egg production performance (% and egg weight), feed and water intakes were measured daily, and the cholesterol content of the whole egg was determined. The results indicated that the feed intake of the hens was significantly lowered when 4% turmeric powder was supplemented, while there were no significant changes in water intake. Egg production was significantly increased and maintained at a higher level by turmeric powder supplementation of up to 4% compared with the control, while egg weight was not significantly affected. The research demonstrated that supplementation of turmeric powder at up to 4% could improve and maintain the egg production performance of old laying hens at a higher level, with a lower cholesterol content.

Keywords: Curcumin, feed and water intake, old laying hen, egg production.

Downloads: 3471
2264 Image Ranking to Assist Object Labeling for Training Detection Models

Authors: Tonislav Ivanov, Oleksii Nedashkivskyi, Denis Babeshko, Vadim Pinskiy, Matthew Putman

Abstract:

Training a machine learning model for object detection that generalizes well is known to benefit from a training dataset with diverse examples. However, training datasets usually contain many repeats of common examples of a class and lack rarely seen examples. This is due to the process commonly used during human annotation where a person would proceed sequentially through a list of images labeling a sufficiently high total number of examples. Instead, the method presented involves an active process where, after the initial labeling of several images is completed, the next subset of images for labeling is selected by an algorithm. This process of algorithmic image selection and manual labeling continues in an iterative fashion. The algorithm used for the image selection is a deep learning algorithm, based on the U-shaped architecture, which quantifies the presence of unseen data in each image in order to find images that contain the most novel examples. Moreover, the location of the unseen data in each image is highlighted, aiding the labeler in spotting these examples. Experiments performed using semiconductor wafer data show that labeling a subset of the data, curated by this algorithm, resulted in a model with a better performance than a model produced from sequentially labeling the same amount of data. Also, similar performance is achieved compared to a model trained on exhaustive labeling of the whole dataset. Overall, the proposed approach results in a dataset that has a diverse set of examples per class as well as more balanced classes, which proves beneficial when training a deep learning model.

Keywords: Computer vision, deep learning, object detection, semiconductor.

Downloads: 747
2263 Semi-Supervised Outlier Detection Using a Generative and Adversary Framework

Authors: Jindong Gu, Matthias Schubert, Volker Tresp

Abstract:

In many outlier detection tasks, only training data belonging to one class, i.e., the positive class, is available. The task is then to predict a new data point as belonging either to the positive class or to the negative class, in which case the data point is considered an outlier. For this task, we propose a novel corrupted Generative Adversarial Network (CorGAN). In the adversarial process of training CorGAN, the Generator generates outlier samples for the negative class, and the Discriminator is trained to distinguish the positive training data from the generated negative data. The proposed framework is evaluated using an image dataset and a real-world network intrusion dataset. Our outlier-detection method achieves state-of-the-art performance on both tasks.

Keywords: Outlier detection, generative adversary networks, semi-supervised learning.

Downloads: 1014
2262 Upgraded Rough Clustering and Outlier Detection Method on Yeast Dataset by Entropy Rough K-Means Method

Authors: P. Ashok, G. M. Kadhar Nawaz

Abstract:

Rough set theory is used to handle uncertainty and incomplete information by applying two accurate sets, the lower approximation and the upper approximation. In this paper, rough clustering algorithms are improved by adopting Similarity, Dissimilarity–Similarity and Entropy based initial centroid selection methods. The three resulting clustering algorithms, namely Entropy based Rough K-Means (ERKM), Similarity based Rough K-Means (SRKM) and Dissimilarity-Similarity based Rough K-Means (DSRKM), were developed and executed on the yeast dataset. The rough clustering algorithms are validated by cluster validity indexes, namely the Rand and Adjusted Rand indexes. Experimental results show that the ERKM clustering algorithm performs effectively and delivers better results than the other clustering methods. Outlier detection is an important task in data mining, since outliers are very different from the rest of the objects in the clusters. The Entropy based Rough Outlier Factor (EROF) method is well suited to detecting outliers effectively in the yeast dataset. In the rough K-Means method, tuning the epsilon (ε) value from 0.8 to 1.08 makes it possible to detect outliers in the boundary region, and the RKM algorithm delivers better results when the value of epsilon (ε) is chosen in the specified range. Experimental results show that the EROF method applied to the clustering algorithm performed very well and is suitable for detecting outliers effectively in all datasets. Further, experimental readings show that the ERKM clustering method outperformed the other methods.
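
The Rand index used for validation here has a short textbook definition: the fraction of object pairs on which two partitions agree, meaning that both place the pair in the same cluster or both place it in different clusters. The Python sketch below implements that definition only; it does not cover the Adjusted Rand index or the rough clustering algorithms, and the sample labelings are hypothetical.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of object pairs on which two clusterings agree."""
    agreements = 0
    pairs = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_in_a = labels_a[i] == labels_a[j]
        same_in_b = labels_b[i] == labels_b[j]
        # A pair counts as an agreement if it is together in both
        # clusterings or separated in both.
        if same_in_a == same_in_b:
            agreements += 1
        pairs += 1
    return agreements / pairs

# Hypothetical cluster labels for four objects.
reference = [0, 0, 1, 1]
relabeled = [1, 1, 0, 0]   # same partition, different label names
crossed = [0, 1, 0, 1]     # a different partition

print(rand_index(reference, relabeled))
print(rand_index(reference, crossed))
```

Because only co-membership matters, renaming the clusters leaves the index at 1.0, while a genuinely different partition lowers it.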

Keywords: Clustering, Entropy, Outlier, Rough K-Means, validity index.

Downloads: 1364
2261 Comparative Evaluation of Accuracy of Selected Machine Learning Classification Techniques for Diagnosis of Cancer: A Data Mining Approach

Authors: Rajvir Kaur, Jeewani Anupama Ginige

Abstract:

With recent trends in Big Data and advancements in Information and Communication Technologies, the healthcare industry is at the stage of its transition from clinician oriented to technology oriented. Many people around the world die of cancer because the diagnosis of disease was not done at an early stage. Nowadays, the computational methods in the form of Machine Learning (ML) are used to develop automated decision support systems that can diagnose cancer with high confidence in a timely manner. This paper aims to carry out the comparative evaluation of a selected set of ML classifiers on two existing datasets: breast cancer and cervical cancer. The ML classifiers compared in this study are Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Logistic Regression, Ensemble (Bagged Tree) and Artificial Neural Networks (ANN). The evaluation is carried out based on standard evaluation metrics Precision (P), Recall (R), F1-score and Accuracy. The experimental results based on the evaluation metrics show that ANN showed the highest-level accuracy (99.4%) when tested with breast cancer dataset. On the other hand, when these ML classifiers are tested with the cervical cancer dataset, Ensemble (Bagged Tree) technique gave better accuracy (93.1%) in comparison to other classifiers.
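
The evaluation metrics listed in this abstract follow directly from the counts of true and false positives and negatives. The Python sketch below uses the standard definitions for a binary task; it is not the authors' evaluation code, and the labels and predictions are hypothetical rather than taken from the breast or cervical cancer datasets.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, F1-score and accuracy for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

# Hypothetical ground truth and classifier output (1 = malignant, 0 = benign).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

print(binary_metrics(y_true, y_pred))
```

Reporting precision and recall alongside accuracy matters in diagnostic settings, because a classifier can reach high accuracy on an imbalanced dataset while still missing many positive cases.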

Keywords: Artificial neural networks, breast cancer, cancer dataset, classifiers, cervical cancer, F-score, logistic regression, machine learning, precision, recall, support vector machine.

Downloads 1492
2260 Steam Assisted Gravity Drainage: A Recipe for Success

Authors: Mohsen Ebrahimi

Abstract:

In this paper, Steam Assisted Gravity Drainage (SAGD) is introduced and its advantages over ordinary steam injection are demonstrated. A simple simulation model is built, and three scenarios (natural production, ordinary steam injection, and SAGD) are compared in terms of their cumulative oil production and cumulative oil-steam ratio. The results show that SAGD can significantly enhance oil production in quite a short period of time. However, since the distance between the injection and production wells is short, the oil-to-steam ratio decreases gradually over time.
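The cumulative oil-steam ratio used for the comparison is, under the usual definition, cumulative oil produced divided by cumulative steam injected. A minimal Python sketch follows, assuming per-period volumes in consistent units; the function name and the numbers are hypothetical and are not taken from the paper's simulation.

```python
from itertools import accumulate


def cumulative_oil_steam_ratio(oil_per_period, steam_per_period):
    """Cumulative oil produced / cumulative steam injected, per period (hypothetical helper)."""
    if len(oil_per_period) != len(steam_per_period):
        raise ValueError("both series must cover the same periods")
    cum_oil = accumulate(oil_per_period)
    cum_steam = accumulate(steam_per_period)
    # The ratio is undefined until some steam has been injected.
    return [o / s if s > 0 else float("nan")
            for o, s in zip(cum_oil, cum_steam)]


# Made-up volumes per period, same units for oil and steam.
oil = [40.0, 30.0, 20.0]
steam = [100.0, 100.0, 100.0]
ratios = cumulative_oil_steam_ratio(oil, steam)
```

With these made-up inputs the ratio falls from one period to the next, which is the kind of gradual decline the abstract describes.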

Keywords: Thermal recovery, Steam injection, SAGD, Enhanced oil recovery

Downloads 2140
2259 The Role of Business Survey Measures in Forecasting Croatian Industrial Production

Authors: M. Cizmesija, N. Erjavec, V. Bahovec

Abstract:

While the European Union (EU) harmonized methodology is a benchmark for the business survey (BS) methodology used worldwide, the choice of variables that are components of the confidence indicators, as the leading indicators, is not strictly determined or unique. Therefore, the aim of this paper is to investigate and quantify the relationship between all business survey variables in the manufacturing industry and industrial production as a reference macroeconomic series in Croatia. The assumption is that there are variables in the business survey that are not components of the Industrial Confidence Indicator (ICI) and that can accurately (and sometimes better than the ICI) predict changes in Croatian industrial production. Empirical analyses are conducted using quarterly data on BS variables in the manufacturing industry and on Croatian industrial production over the period from the first quarter of 2005 to the first quarter of 2013. The research results confirmed the assumption: three BS variables that are not components of the ICI (competitive position, demand and liquidity) are better leading indicators than the ICI in forecasting changes in Croatian industrial production instantaneously, as well as one, two or three quarters ahead.

Keywords: Balance, Business Survey, Confidence Indicators, Industrial Production, Forecasting.

Downloads 1871
2258 Modeling Erosion Control in Oil Production Wells

Authors: Kenneth I. Eshiet, Yong Sheng

Abstract:

The sand production problem has led researchers to make various attempts to understand the phenomenon. The generally accepted concept is that sanding occurs because the in-situ stress conditions and the induced changes in stress result in the failure of the reservoir sandstone during hydrocarbon production from wellbores. Using a hypothetical cased (perforated) well, an approach to the problem is presented here based on Finite Element numerical modelling techniques. In addition to the examination of the erosion problem, the influence of certain key parameters is studied in order to ascertain their effect on the failure and subsequent erosion process. The major variables investigated include drawdown, perforation depth, and the erosion criterion. Also included is the determination of the optimal mud pressure for given operational and reservoir conditions. The improved understanding of the relationships between parameters enables the choice of optimal values to minimize sanding during oil production.

Keywords: Equivalent Plastic Strain, Erosion, Hydrocarbon Production.

Downloads 1410
2257 A Neural Network Approach in Predicting the Blood Glucose Level for Diabetic Patients

Authors: Zarita Zainuddin, Ong Pauline, C. Ardil

Abstract:

Diabetes Mellitus is a chronic metabolic disorder in which improper management of the blood glucose level in diabetic patients leads to the risk of heart attack, kidney disease and renal failure. This paper attempts to enhance the diagnostic accuracy of the advancing blood glucose levels of diabetic patients by combining principal component analysis and a wavelet neural network. The proposed system makes separate blood glucose predictions for the morning, afternoon, evening and night intervals, using a dataset from one patient covering a period of 77 days. The diagnostic accuracy is compared with that of other neural network models that use the same dataset. The comparison results show an overall improvement in accuracy, which indicates the effectiveness of the proposed system.

Keywords: Diabetes Mellitus, principal component analysis, time-series, wavelet neural network.

Downloads 2943
2256 Estimation of Methane from Hydrocarbon Exploration and Production in India

Authors: A. K. Pathak, K. Ojha

Abstract:

Methane is the second most important greenhouse gas (GHG) after carbon dioxide. The amount of methane emitted from the energy sector is increasing day by day with various activities. In the present work, various sources of methane emission from the upstream, midstream and downstream oil & gas sectors are identified and categorised as per the IPCC-2006 guidelines. Data were collected from various oil & gas sectors, such as (i) exploration & production of oil & gas, (ii) supply through pipelines, (iii) refinery throughput & production, (iv) storage & transportation and (v) usage. Methane emission factors for the various categories were determined by applying the Tier-II and Tier-I approaches to the collected data. Total methane emission from the Indian oil & gas sectors was thus estimated for the years 1990 to 2007.

Keywords: Carbon credit, Climate change, Methane emission, Oil & Gas production

Downloads 2099
2255 Analysis of a Population of Diabetic Patients Databases with Classifiers

Authors: Murat Koklu, Yavuz Unal

Abstract:

Data mining can be described as a technique for extracting information from data. It is the process of obtaining hidden information and then turning it into qualified knowledge by means of statistical and artificial intelligence techniques. One of its application areas is medicine, where it is used to form decision support systems for diagnosis by deriving meaningful information from given medical data. This study presents a decision support system for the diagnosis of illness that makes use of data mining and three different artificial intelligence classifier algorithms, namely Multilayer Perceptron, Naive Bayes Classifier and J48. The Pima Indian dataset from the UCI Machine Learning Repository was used. This dataset includes urinary and blood test results of 768 patients. These test results consist of 8 different features. The classification results obtained were compared with those of previous studies, and suggestions for future studies are presented.

Keywords: Artificial Intelligence, Classifiers, Data Mining, Diabetic Patients.

Downloads 5386