Search results for: distributed data stream mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 26767

26527 Arabic Light Stemmer for Better Search Accuracy

Authors: Sahar Khedr, Dina Sayed, Ayman Hanafy

Abstract:

Arabic is one of the most ancient and critical languages in the world. It has more than 250 million native speakers, and more than twenty countries have Arabic as one of their official languages. In the past decade, we have witnessed a rapid evolution in smart devices, social networks, and the technology sector, which has led to the need for tools and libraries that properly handle the Arabic language in different domains. Stemming is one of the most crucial linguistic fundamentals. It is used in many applications, especially in the information extraction and text mining fields. The motivation behind this work is to enhance the Arabic light stemmer to serve the data mining industry and to leverage it in the open source community. The presented implementation enhances the Arabic light stemmer by utilizing and extending an algorithm with a new set of rules and patterns, accompanied by an adjusted procedure. This study shows a significant enhancement in search accuracy, with an average 10% improvement in comparison with previous works.

Keywords: Arabic data mining, Arabic Information extraction, Arabic Light stemmer, Arabic stemmer

Procedia PDF Downloads 301
26526 A General Framework for Knowledge Discovery from Echocardiographic and Natural Images

Authors: S. Nandagopalan, N. Pradeep

Abstract:

The aim of this paper is to propose a general framework for storing, analyzing, and extracting knowledge from two-dimensional echocardiographic images, color Doppler images, non-medical images, and general data sets. A number of high performance data mining algorithms have been used to carry out this task. Our framework encompasses four layers, namely physical storage, object identification, knowledge discovery, and user level. Techniques such as an active contour model to identify the cardiac chambers, pixel classification to segment the color Doppler echo image, a universal model for image retrieval, a Bayesian method for classification, and parallel algorithms for image segmentation were employed. Using the feature vector database that has been efficiently constructed, one can perform various data mining tasks such as clustering and classification with efficient algorithms, along with image mining given a query image. All these facilities are included in the framework, which is supported by a state-of-the-art user interface (UI). The algorithms were tested with actual patient data and the Coral image database, and the results show that their performance is better than the results already reported.

Keywords: active contour, Bayesian, echocardiographic image, feature vector

Procedia PDF Downloads 434
26525 Linguistic Summarization of Structured Patent Data

Authors: E. Y. Igde, S. Aydogan, F. E. Boran, D. Akay

Abstract:

Patent data have an increasingly important role in economic growth, innovation, technical advantage, business strategy, and even competition between countries. Analyzing patent data is crucial since patents cover a large part of the world's technological information. In this paper, we have used the linguistic summarization technique to prove the validity of hypotheses related to patent data stated in the literature.

Keywords: data mining, fuzzy sets, linguistic summarization, patent data

Procedia PDF Downloads 265
26524 Numerical Modeling of Artisanal and Small Scale Mining of Coltan in the African Great Lakes Region

Authors: Sergio Perez Rodriguez

Abstract:

Coltan Artisanal and Small-Scale Mining (ASM) production from Africa's Great Lakes region has previously been addressed at large scales, notably from regional to country levels. The current findings address the unresolved issue of a production model of ASM of coltan ore by an average Democratic Republic of Congo (DRC) mineworker, which can be used as a reference for a similar characterization of the daily labor of counterparts from other countries in the region. To that end, the Fundamental Equation of Mineral Production has been applied, considering a miner's average daily output of coltan, estimated on the basis of gross statistical data gathered from reputable sources. Results indicate daily yields of individual miners on the order of 300 g of coltan ore, with hourly peaks of production in the range of 30 to 40 g of the mineral. Yields are expected to be on the order of 5 g or less during the least productive hours. These outputs are expected to be achieved during half of the eight to ten hours of daily working sessions that these artisanal laborers can attend during the mining season.

Keywords: coltan, mineral production, production to reserve ratio, artisanal mining, small-scale mining, ASM, human work, Great Lakes region, Democratic Republic of Congo

Procedia PDF Downloads 72
26523 Digitalization in Aggregate Quarries

Authors: José Eugenio Ortiz, Pierre Plaza, Josefa Herrero, Iván Cabria, José Luis Blanco, Javier Gavilanes, José Ignacio Escavy, Ignacio López-Cilla, Virginia Yagüe, César Pérez, Silvia Rodríguez, Jorge Rico, Cecilia Serrano, Jesús Bernat

Abstract:

The development of Artificial Intelligence services in mining processes, specifically in aggregate quarries, is facilitating automation and improving numerous aspects of operations. Ultimately, AI is transforming the mining industry by improving efficiency, safety and sustainability. With the ability to analyze large amounts of data and make autonomous decisions, AI offers great opportunities to optimize mining operations and maximize the economic and social benefits of this vital industry. Within the framework of the European DIGIECOQUARRY project, various services were developed for the automatic identification of material quality, estimation of production, detection of anomalies, and prediction of consumption and production, with good results.

Keywords: aggregates, artificial intelligence, automatization, mining operations

Procedia PDF Downloads 82
26522 Performance Evaluation of Distributed and Co-Located MIMO LTE Physical Layer Using Wireless Open-Access Research Platform

Authors: Ishak Suleiman, Ahmad Kamsani Samingan, Yeoh Chun Yeow, Abdul Aziz Bin Abdul Rahman

Abstract:

In this paper, we evaluate the benefits of a distributed 4x4 MIMO LTE downlink system compared to those of a co-located 4x4 MIMO LTE downlink system. The performance evaluation was carried out experimentally using the Wireless Open-Access Research Platform (WARP), comparing 4x4 MIMO LTE downlink transmission in the distributed and co-located arrangements. The measured Error Vector Magnitude (EVM) results showed that the distributed technique achieved better system performance than the co-located arrangement.

Keywords: multiple-input-multiple-output (MIMO), distributed MIMO, co-located MIMO, LTE

Procedia PDF Downloads 413
26521 Using Mining Methods of WEKA to Predict Quran Verb Tense and Aspect in Translations from Arabic to English: Experimental Results and Analysis

Authors: Jawharah Alasmari

Abstract:

In verb inflection, tense marks past/present/future action, and aspect marks progressive/continuous or perfect/completed actions. The usage and meaning of tense and aspect differ between Arabic and English. In this research, we applied data mining methods to test the predictive function of candidate features using our dataset of Arabic verbs in context and their seven translations. Weka machine learning classifiers are used in this experiment to examine the key features that can guide a translator toward an appropriate English translation of the Arabic verb tense and aspect.

Keywords: Arabic verb, English translations, mining methods, Weka software

Procedia PDF Downloads 266
26520 A General Framework for Knowledge Discovery Using High Performance Machine Learning Algorithms

Authors: S. Nandagopalan, N. Pradeep

Abstract:

The aim of this paper is to propose a general framework for storing, analyzing, and extracting knowledge from two-dimensional echocardiographic images, color Doppler images, non-medical images, and general data sets. A number of high performance data mining algorithms have been used to carry out this task. Our framework encompasses four layers, namely physical storage, object identification, knowledge discovery, and user level. Techniques such as an active contour model to identify the cardiac chambers, pixel classification to segment the color Doppler echo image, a universal model for image retrieval, a Bayesian method for classification, and parallel algorithms for image segmentation were employed. Using the feature vector database that has been efficiently constructed, one can perform various data mining tasks such as clustering and classification with efficient algorithms, along with image mining given a query image. All these facilities are included in the framework, which is supported by a state-of-the-art user interface (UI). The algorithms were tested with actual patient data and the Coral image database, and the results show that their performance is better than the results already reported.

Keywords: active contour, Bayesian, echocardiographic image, feature vector

Procedia PDF Downloads 413
26519 Sensitivity Analysis for 14 Bus Systems in a Distribution Network with Distribution Generators

Authors: Lakshya Bhat, Anubhav Shrivastava, Shivarudraswamy

Abstract:

There has been considerable interest in the area of Distributed Generation in recent times. Distributed Generators can serve a large number of loads and offer better efficiency too. The major disadvantage of Distributed Generation, voltage control, is highlighted in this paper. The paper addresses voltage control at the buses of the IEEE 14 bus system by regulating reactive power. An analysis is carried out by selecting the optimal location for placing the Distributed Generators through load flow analysis and observing where the voltage profile rises. MATLAB programming is used to simulate the voltage profile at the respective buses after the introduction of DGs. A tolerance limit of +/-5% of the base value has to be maintained. To maintain the tolerance limit, three methods are used. A sensitivity analysis of the three voltage control methods is carried out to determine the priority among them.
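
As an illustration of the +/-5% tolerance limit mentioned in this abstract, the following minimal Python sketch flags buses whose per-unit voltage leaves the band. It is not the authors' MATLAB code; the function name, the per-unit base of 1.0, and the sample voltages are assumptions made up for this example.

```python
def buses_outside_tolerance(voltages_pu, base=1.0, tolerance=0.05):
    """Return {bus: voltage} for buses whose voltage leaves base +/- tolerance."""
    lower = base * (1.0 - tolerance)
    upper = base * (1.0 + tolerance)
    return {bus: v for bus, v in voltages_pu.items() if v < lower or v > upper}


# Hypothetical per-unit bus voltages, invented for illustration only.
example_voltages = {1: 1.000, 2: 1.045, 3: 1.062, 4: 0.948, 5: 0.990}
violations = buses_outside_tolerance(example_voltages)
print(violations)
```

In a study like the one described, the voltages would come from a load flow solution rather than from a hand-written dictionary.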

Keywords: distributed generators, distributed system, reactive power, voltage control, sensitivity analysis

Procedia PDF Downloads 581
26518 Assessment of Escherichia coli along Nakibiso Stream in Mbale Municipality, Uganda

Authors: Abdul Walusansa

Abstract:

The aim of this study was to assess the level of microbial pollution along Nakibiso stream. The study was carried out in the polluted waters of Nakibiso stream, which originates from Mbale municipality and runs through ADRA Estates to Namatala Wetlands in Eastern Uganda. Four sites along the stream were selected based on the activities in their vicinity. A total of 120 samples were collected in sterile bottles from the four sampling locations during the wet and dry seasons of 2011. The samples were taken to the National Water and Sewerage Corporation Laboratory for analysis. The membrane filter technique was used to test for Escherichia coli. Nitrogen, phosphorus, pH, dissolved oxygen, electrical conductivity, total suspended solids, turbidity and temperature were also measured. Results for nitrogen and phosphorus at sites 1, 2, 3 and 4 were 1.8, 8.8, 7.7 and 13.8 mg/L NH4-N and 1.8, 2.1, 1.8 and 2.3 mg/L PO4-P respectively. Based on these results, it was estimated that farmers use 115 and 24 kg/acre of nitrogen and phosphorus respectively per month. Taking the results for nitrogen, the same amount of nutrients in artificial fertilizers would cost $88. This shows that reuse of wastewater has potential in terms of nutrients. The results for E. coli at sites 1, 2, 3 and 4 were 1.1 × 10^7, 9.1 × 10^5, 7.4 × 10^5, and 3.4 × 10^5 respectively. E. coli hence decreased downstream, with statistically significant variations between sites 1 and 4. Site 1 had the highest mean E. coli counts. The bacterial contamination was significantly higher during the dry season, when more water was needed for irrigation. Although the water had the potential for reuse in farming, bacterial contamination during both seasons was higher than the 10^3 FC/100 mL recommended by WHO for unrestricted agriculture.

Keywords: E. coli, nitrogen, phosphorus, water reuse, waste water

Procedia PDF Downloads 241
26517 Distributed Leadership and Emergency Response: A Study on Seafarers

Authors: Delna Shroff

Abstract:

Merchant shipping is an occupation with a high rate of fatal injuries caused by organizational accidents and maritime disasters. In most accident investigations, the leader’s actions come under scrutiny, which points to the need to investigate the leader’s decisions in critical conditions. While several leadership studies have been carried out in the past, most research tends to focus on holders of formal positions. The unit of analysis in most studies has been the ‘individual.’ There is, therefore, a need to adopt a practice-based perspective of leadership and to understand how leadership emerges to affect maritime safety. This paper explores the phenomenon of distributed leadership among seafarers more holistically. It further examines the role of one form of distributed leadership, namely planfully aligned leadership, in the emergency response of the team. A mixed design will be applied. In the first phase, the data gathered by way of semi-structured interviews will be used to explore the seafarers’ implicit understanding of leadership. The data will be used to develop a conceptual framework of distributed leadership specific to the maritime context. This framework will be used to develop a simulation. An experimental design will be used to examine the relationship between planfully aligned leadership and the emergency response of the team members during navigation. Findings show that planfully aligned leadership significantly and positively predicts the emergency response of team members. Planfully aligned leadership leads to a better emergency response of the team members as compared to authoritarian leadership. In the third, qualitative phase, additional data will be gathered through semi-structured interviews to further validate the findings and to gain a more complete understanding of distributed leadership and its relation to emergency response.
The above are predicted results; the study is expected to be a cornerstone of safety leadership research and to have important implications for leadership development and training within the maritime industry.

Keywords: authoritarian leadership, distributed leadership, emergency response, planfully aligned leadership

Procedia PDF Downloads 165
26516 Analysis of Changes Being Done of the Mine Legislation of Turkey: Mining Operation Activity Process

Authors: Taşkın Deniz Yıldız, Mustafa Topaloğlu, Orhan Kural

Abstract:

In Turkey, the period for which a mine may be operated, which was fairly long in earlier periods, has been observed to shorten after Mining Law No. 3213. When an operating permit (or concession) is requested for a "found mine", the financial and technical capability of the holder of the operating right, as well as the principle of equality, is important for ensuring that the mine is operated in the best way. In this context in particular, it is noted that arrangements for the "negligence" (downsizing) of license fields have existed in all periods. However, in the period after Mining Law No. 3213, negligence has been laid down so that it is implemented more effectively within the framework of the operating permit.

Keywords: mining legislation, operation, permit, Turkey

Procedia PDF Downloads 398
26515 Isolation Preserving Medical Conclusion Hold Structure via C5 Algorithm

Authors: Swati Kishor Zode, Rahul Ambekar

Abstract:

Data mining is the extraction of interesting patterns or information from large amounts of data, and decisions are made according to the relevant information extracted. Recently, with the explosive growth of the internet, data storage, and processing techniques, privacy preservation has become one of the major concerns in data mining. Various techniques and methods have been developed for privacy-preserving data mining. In a Clinical Decision Support System, the decision is made on the basis of data retrieved from remote servers via the Internet in order to diagnose the patient. In this paper, the main idea is to increase the accuracy of the Decision Support System for multiple diseases and, in addition, to protect patient information during communication between the clinician side (client side) and the server side. A privacy-preserving protocol for a clinical decision support network is proposed so that patient information always remains encrypted during the diagnosis process while accuracy is maintained. To improve the accuracy of the Decision Support System for multiple diseases, C5.0 classifiers are used, and to preserve privacy, a homomorphic encryption algorithm, the Paillier cryptosystem, is used.
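
The additive homomorphic property of the Paillier cryptosystem named in this abstract can be sketched as follows. This is a toy Python illustration under stated assumptions, not the authors' protocol: the primes are tiny and insecure, the generator is fixed to g = n + 1, all function names are made up for the example, and `pow(x, -1, n)` requires Python 3.8 or later.

```python
import math
import random


def generate_keys(p, q):
    """Build a toy Paillier key pair from two primes (far too small to be secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # inverse of lam mod n; this shortcut assumes g = n + 1
    return n, (lam, mu)


def encrypt(n, m, rng=random):
    """Encrypt an integer 0 <= m < n under public key n, with generator g = n + 1."""
    n_sq = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:  # the random blinding factor must be invertible mod n
            break
    return (pow(n + 1, m, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(n, private_key, c):
    """Recover the plaintext from ciphertext c."""
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts yields an encryption of the sum of the plaintexts."""
    return (c1 * c2) % (n * n)


n, private_key = generate_keys(61, 53)  # hypothetical demo primes
c_sum = add_encrypted(n, encrypt(n, 12), encrypt(n, 30))
print(decrypt(n, private_key, c_sum))
```

The point of the sketch is that a server can combine encrypted values without ever seeing them in the clear; only the holder of the private key can read the sum.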

Keywords: classification, homomorphic encryption, clinical decision support, privacy

Procedia PDF Downloads 327
26514 Comparative Analysis of Classification Methods in Determining Non-Active Student Characteristics in Indonesia Open University

Authors: Dewi Juliah Ratnaningsih, Imas Sukaesih Sitanggang

Abstract:

Classification is one of the data mining techniques that aim to discover, from training data, a model that assigns records to the appropriate category or class. Data mining classification methods can be applied in education, for example, to classify non-active students at Indonesia Open University. This paper presents a comparison of three classification methods: Naïve Bayes, Bagging, and C4.5. The criteria used to evaluate the performance of the three methods are stratified cross-validation, the confusion matrix, the area under the ROC curve (AUC), recall, precision, and F-measure. The data used for this paper are from non-active Indonesia Open University students in the registration periods 2004.1 to 2012.2. For the target analysis, non-active students were divided into 3 groups: C1, C2, and C3. The data analyzed cover 4173 students. Results of the study show: (1) the Bagging method gave higher classification accuracy than Naïve Bayes and C4.5, (2) the Bagging classification accuracy rate is 82.99%, while those of Naïve Bayes and C4.5 are 80.04% and 82.74% respectively, (3) the Bagging classification tree has a large number of nodes, so it is quite difficult to use in decision making, (4) the characteristics of non-active Indonesia Open University students were classified using the C4.5 algorithm, (5) based on the C4.5 algorithm, there are 5 interesting rules which can describe the characteristics of non-active Indonesia Open University students.

Keywords: comparative analysis, data mining, classification, Bagging, Naïve Bayes, C4.5, non-active students, Indonesia Open University
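
The evaluation criteria listed in this abstract (accuracy, precision, recall, F-measure) can all be derived from a confusion matrix. The following Python sketch shows one way to do so for a three-class problem; the function names and the matrix values are hypothetical and are not taken from the study.

```python
def per_class_metrics(confusion, labels):
    """confusion[i][j] = number of records of true class i predicted as class j."""
    metrics = {}
    for k, label in enumerate(labels):
        true_positive = confusion[k][k]
        predicted_k = sum(row[k] for row in confusion)  # column total
        actual_k = sum(confusion[k])                    # row total
        precision = true_positive / predicted_k if predicted_k else 0.0
        recall = true_positive / actual_k if actual_k else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f_measure)
    return metrics


def accuracy(confusion):
    """Share of records on the diagonal, i.e. classified correctly."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Hypothetical confusion matrix for groups C1, C2, C3 (invented numbers).
example = [
    [50, 5, 5],
    [10, 30, 0],
    [0, 5, 45],
]
print(accuracy(example))
print(per_class_metrics(example, ["C1", "C2", "C3"]))
```

Under stratified cross-validation, such a matrix would typically be accumulated over all folds before the metrics are computed.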

Procedia PDF Downloads 308
26513 Comparing Performance of Neural Network and Decision Tree in Prediction of Myocardial Infarction

Authors: Reza Safdari, Goli Arji, Robab Abdolkhani, Maryam Zahmatkeshan

Abstract:

Background and purpose: Cardiovascular diseases are among the most common diseases in all societies. The most important step in minimizing myocardial infarction and its complications is to minimize its risk factors. The amount of medical data is growing rapidly. Medical data mining has great potential for transforming these data into information. Using data mining techniques to generate predictive models that identify those at risk is very helpful in reducing the effects of the disease. The present study aimed to collect data related to risk factors of heart infarction from patients’ medical records and to develop predictive models using data mining algorithms. Methods: The present work was an analytical study conducted on a database containing 350 records. Data were related to patients admitted to Shahid Rajaei specialized cardiovascular hospital, Iran, in 2011. Data were collected using a four-sectioned data collection form. Data analysis was performed using SPSS and Clementine version 12. Seven predictive algorithms and one algorithm-based model for predicting association rules were applied to the data. Accuracy, precision, sensitivity, specificity, as well as positive and negative predictive values were determined and the final model was obtained. Results: Five parameters, including hypertension, DLP, tobacco smoking, diabetes, and A+ blood group, were the most critical risk factors of myocardial infarction. Among the models, the neural network model was found to have the highest sensitivity, indicating its ability to successfully diagnose the disease. Conclusion: Risk prediction models have great potential in facilitating the management of a patient with a specific disease. Therefore, health interventions or lifestyle changes can be based on these models to improve the health conditions of the individuals at risk.

Keywords: decision trees, neural network, myocardial infarction, data mining

Procedia PDF Downloads 426
26512 Performance Study of Classification Algorithms for Consumer Online Shopping Attitudes and Behavior Using Data Mining

Authors: Rana Alaa El-Deen Ahmed, M. Elemam Shehab, Shereen Morsy, Nermeen Mekawie

Abstract:

With the growing popularity and acceptance of e-commerce platforms, users face an ever-increasing burden in choosing the right product from the large number of online offers. Thus, users need techniques for personalization and shopping guidance. For a pleasant and successful shopping experience, users need to know easily, and with high confidence, which products to buy. Since online stores have made it easier to sell a wide variety of products, online retailers are able to sell more products than a physical store. The disadvantage is that customers might not find the products they need. Recommender systems, which are used in some e-commerce web sites, help customers find the products they are searching for. A recommender system learns from information about customers and products and provides appropriate personalized recommendations to help customers find the needed product. In this paper, eleven classification algorithms are comparatively tested to find the best classifier for consumer online shopping attitudes and behavior in the experimental dataset. The WEKA knowledge analysis tool, an open source data mining workbench, was used to compare conventional classifiers and identify the best one. The results obtained with WEKA show that the decision table and filtered classifier give the highest accuracy, while classification via clustering and simple CART give the lowest.

Keywords: classification, data mining, machine learning, online shopping, WEKA

Procedia PDF Downloads 345
26511 Identification of Soft Faults in Branched Wire Networks by Distributed Reflectometry and Multi-Objective Genetic Algorithm

Authors: Soumaya Sallem, Marc Olivas

Abstract:

This contribution presents a method for detecting, locating, and characterizing soft faults in a complex wired network. The proposed method is based on multi-carrier reflectometry, MCTDR (Multi-Carrier Time Domain Reflectometry), combined with a multi-objective genetic algorithm. In order to ensure complete network coverage and eliminate diagnosis ambiguities, the MCTDR test signal is injected at several points on the network, and the data are merged between the different reflectometers (sensors) distributed on the network. An adapted multi-objective genetic algorithm is used to merge the data in order to obtain more accurate fault location and characterization. The performance of the proposed method is evaluated using numerical and experimental results.

Keywords: wired network, reflectometry, network distributed diagnosis, multi-objective genetic algorithm

Procedia PDF Downloads 188
26510 A Comparative Analysis of Classification Models with Wrapper-Based Feature Selection for Predicting Student Academic Performance

Authors: Abdullah Al Farwan, Ya Zhang

Abstract:

In today’s educational arena, it is critical to understand educational data and be able to evaluate important aspects, particularly data on student achievement. Educational Data Mining (EDM) is a research area that focuses on uncovering patterns and information in data from educational institutions. Teachers, if they are able to predict their students' class performance, can use this information to improve their teaching abilities. It has evolved into valuable knowledge that can be used for a wide range of objectives; for example, a strategic plan can be used to generate high-quality education. Based on previous data, this paper recommends employing data mining techniques to forecast students' final grades. In this study, five data mining methods, Decision Tree, JRip, Naive Bayes, Multi-layer Perceptron, and Random Forest, with wrapper feature selection, were used on two datasets relating to Portuguese language and mathematics classes. The results showed the effectiveness of using data mining learning methodologies in predicting student academic success. The classification accuracy achieved with selected algorithms lies in the range of 80-94%. Among all the selected classification algorithms, the lowest accuracy is achieved by the Multi-layer Perceptron algorithm, which is close to 70.45%, and the highest accuracy is achieved by the Random Forest algorithm, which is close to 94.10%. This proposed work can help educational administrators identify poorly performing students at an early stage and perhaps implement motivational interventions to improve their academic success and prevent educational dropout.

Keywords: classification algorithms, decision tree, feature selection, multi-layer perceptron, Naïve Bayes, random forest, students’ academic performance

Procedia PDF Downloads 159
26509 Medical Knowledge Management since the Integration of Heterogeneous Data until the Knowledge Exploitation in a Decision-Making System

Authors: Nadjat Zerf Boudjettou, Fahima Nader, Rachid Chalal

Abstract:

Knowledge management consists of acquiring and representing knowledge relevant to a domain, a task or a specific organization in order to facilitate access, reuse and evolution. This usually means building, maintaining and evolving an explicit representation of knowledge. The next step is to provide access to that knowledge, that is, to disseminate it in order to enable effective use. Knowledge management in the medical field aims to improve the performance of the medical organization by allowing individuals in the care facility (doctors, nurses, paramedics, etc.) to capture, share and apply collective knowledge in order to make optimal decisions in real time. In this paper, we propose a knowledge management approach based on a technique for integrating heterogeneous medical data by creating a data warehouse, a technique for extracting knowledge from medical data using a chosen data mining technique, and finally a technique for exploiting that knowledge in a case-based reasoning system.

Keywords: data warehouse, data mining, knowledge discovery in database, KDD, medical knowledge management, Bayesian networks

Procedia PDF Downloads 386
26508 Coordinated Voltage Control in a Radial Distribution System

Authors: Shivarudraswamy, Anubhav Shrivastava, Lakshya Bhat

Abstract:

Distributed generation has become a major area of interest in recent years. Distributed Generation can serve a large number of loads on a power line and hence has better efficiency than conventional methods. However, there are certain drawbacks associated with it, the increase in voltage being the major one. This paper addresses voltage control at the buses of an IEEE 30 bus system by regulating reactive power. For the analysis, a suitable location for placing distributed generators (DGs) is identified through load flow analysis and by observing where the voltage profile dips. MATLAB programming is used to regulate the voltage at all buses within +/-5% of the base value even after the introduction of DGs. Three methods for voltage regulation are discussed. A sensitivity-based analysis is then carried out to determine the priority among the methods listed in the paper.

Keywords: distributed generators, distributed system, reactive power, voltage control

Procedia PDF Downloads 489
26507 A Supervised Learning Data Mining Approach for Object Recognition and Classification in High Resolution Satellite Data

Authors: Mais Nijim, Rama Devi Chennuboyina, Waseem Al Aqqad

Abstract:

Advances in the spatial and spectral resolution of satellite images have led to tremendous growth in large image databases. The data we acquire through satellites, radars and sensors contain important geographical information that can be used for remote sensing applications such as regional planning and disaster management. Spatial data classification and object recognition are important tasks for many applications. However, classifying objects and identifying them manually from images is a difficult task. Object recognition is often considered a classification problem, so this task can be performed using machine-learning techniques. Although many machine-learning algorithms exist, the classification is done using supervised classifiers such as Support Vector Machines (SVM) because the area of interest is known. We proposed a classification method that considers neighboring pixels in a region for feature extraction and evaluates classifications precisely according to neighboring classes for semantic interpretation of the region of interest (ROI). A dataset has been created for training and testing purposes; we generated the attributes by considering pixel intensity values and mean values of reflectance. We demonstrated the benefits of using knowledge discovery and data-mining techniques, which can be applied to image data for accurate information extraction and classification from high spatial resolution remote sensing imagery.

Keywords: remote sensing, object recognition, classification, data mining, waterbody identification, feature extraction

Procedia PDF Downloads 333
26506 Improved FP-Growth Algorithm with Multiple Minimum Supports Using Maximum Constraints

Authors: Elsayeda M. Elgaml, Dina M. Ibrahim, Elsayed A. Sallam

Abstract:

Association rule mining is one of the most important fields of data mining and knowledge discovery. In this paper, we propose an efficient multiple support frequent pattern growth algorithm, which we call “MSFP-growth”, that enhances the FP-growth algorithm by adding an infrequent child node pruning step with multiple minimum supports using maximum constraints. The algorithm is implemented and compared with other common algorithms: Apriori with multiple minimum supports using maximum constraints, and FP-growth. The experimental results show that the rules mined by the proposed algorithm are interesting and that our algorithm achieves better performance than the other algorithms without sacrificing accuracy.
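
To illustrate what "multiple minimum supports using maximum constraints" usually means, the Python sketch below treats an itemset as frequent only if its support reaches the largest of the per-item minimum supports of its members. This reading of the constraint is an assumption based on common usage, and the code is a brute-force enumeration for clarity, not the MSFP-growth algorithm from the abstract; the function name, transactions, and thresholds are invented.

```python
from itertools import combinations


def frequent_itemsets_max_constraint(transactions, min_support, max_size=3):
    """Brute-force search for itemsets that meet the maximum-constraint threshold.

    transactions: list of sets of items
    min_support:  dict mapping each item to its own minimum support (0..1)
    """
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            support = count / n
            # Maximum constraint: the itemset must satisfy the strictest
            # (largest) minimum support among its items.
            threshold = max(min_support[item] for item in candidate)
            if support >= threshold:
                result[candidate] = support
    return result


# Hypothetical market-basket data and per-item thresholds.
transactions = [
    {"bread", "milk"},
    {"bread", "caviar"},
    {"bread", "milk", "caviar"},
    {"milk"},
    {"bread", "milk"},
]
min_support = {"bread": 0.5, "milk": 0.5, "caviar": 0.2}
frequent = frequent_itemsets_max_constraint(transactions, min_support)
print(sorted(frequent))
```

A rare item such as "caviar" is frequent on its own because of its lower threshold, but any itemset that pairs it with a common item must meet that item's higher threshold.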

Keywords: association rules, FP-growth, multiple minimum supports, Weka tool

Procedia PDF Downloads 476
26505 A Novel Microcontroller Based Islanding Protection of Distributed Generation Systems

Authors: Saeid Jalilzadeh, Majid Pakdel

Abstract:

Customer demand for better power quality and higher reliability has pushed the power industry to use distributed generation (DG) sources such as wind power and photovoltaic arrays. Islanding is a phenomenon that occurs when a power grid becomes electrically isolated from the power system and the distribution system is energized by distributed generators. It is necessary to disconnect all distributed generators immediately after islanding occurs. Therefore, a DG system should have the capability to detect islanding. In this paper, a novel microcontroller-based relay for anti-islanding protection of a typical DG system is proposed. The simulation results using Proteus software verify the proper operation and effectiveness of the proposed protective relay.

Keywords: islanding, distributed generation (DG), protective relay, microcontroller, Proteus software

Procedia PDF Downloads 570
26504 Determination of Frequency Relay Setting during Distributed Generators Islanding

Authors: Tarek Kandil, Ameen Ali

Abstract:

Distributed generation (DG) has recently gained a lot of momentum in the power industry due to market deregulation and environmental concerns. One of the main technical challenges facing DG is the islanding of distributed generators. The current industry practice is to disconnect all distributed generators immediately after an island forms, within 200 to 350 ms of the loss of main supply. To achieve this goal, each DG must be equipped with an islanding detection device. Frequency relays are among the most commonly used loss-of-mains detection methods. However, distribution utilities may face false operation of these frequency relays due to improper settings. Commercially available frequency relays use standard tight settings. This paper investigates some factors related to the relays' internal algorithms that contribute to their different operating responses. Further, relay operation in the presence of multiple distributed generators on the same network is analyzed. Finally, the relay setting can be accurately determined based on this investigation and analysis.

Keywords: frequency relay, distributed generation, islanding detection, relay setting

Procedia PDF Downloads 530
26503 Implementation of Lean Tools (Value Stream Mapping and ECRS) in an Oil Refinery

Authors: Ronita Singh, Yaman Pattanaik, Soham Lalwala

Abstract:

In today’s highly competitive business environment, every organization is striving towards lean manufacturing systems to achieve lower production lead times, lower costs, less inventory and an overall improvement in supply chain efficiency. Based on a similar idea, this paper presents the practical application of the Value Stream Mapping (VSM) tool and the ECRS (Eliminate, Combine, Reduce, and Simplify) technique in the receipt section of the material management center of an oil refinery. A value stream is the assortment of all actions (value-added as well as non-value-added) that are required to bring a product through the essential flows, starting with raw material and ending with the customer. To draw the current state value stream map, all relevant data of the receipt cycle were collected and analyzed. The current state map was then analyzed to determine the type and quantum of waste at every stage, which helped in ascertaining how far the warehouse is from the concept of lean manufacturing. From the current VSM, it was observed that two processes, Preparation of GRN (Goods Receipt Number) and Preparation of UD (Usage Decision), are both bottleneck operations and have higher cycle times. This root cause analysis of the various types of waste helped in designing a strategy for step-wise implementation of lean tools. The future state thus created a lean flow of materials at the warehouse center, reducing the lead time of the receipt cycle from 11 days to 7 days and increasing overall efficiency by 27.27%.

Keywords: current VSM, ECRS, future VSM, receipt cycle, supply chain, VSM

Procedia PDF Downloads 301
26502 Model for Introducing Products to New Customers through Decision Tree Using Algorithm C4.5 (J-48)

Authors: Komol Phaisarn, Anuphan Suttimarn, Vitchanan Keawtong, Kittisak Thongyoun, Chaiyos Jamsawang

Abstract:

This article analyzes insurance information that records customers' decisions when purchasing a life insurance pay package. The data were analyzed in order to present new customers with the Life Insurance Perfect Pay package that meets their needs as closely as possible. The basic data of the insurance pay package were collected for data mining, thereby reducing the scattering of information. The data were then classified in order to obtain a decision model, or decision tree, using Algorithm C4.5 (J-48). In the classification, WEKA tools were used to form the model, and testing datasets were used to test the accuracy of the decision tree. The validation of this model showed that the accurate prediction was 68.43% while 31.25% were errors. The same set of data was then tested with other models, i.e. Naive Bayes and Zero R. The results showed that the J-48 method could predict more accurately. The researchers therefore applied the decision tree in writing a program that introduces the product to new customers, to support their decision to purchase the insurance package that meets their needs as closely as possible.
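C4.5, which WEKA implements as J48, chooses split attributes by gain ratio (information gain divided by split information). The following is a minimal sketch of that criterion only, not the study's model or the WEKA implementation; the customer records and attribute names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in Counter(values).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of splitting `rows` on `attribute` with respect to `target`."""
    labels = [row[target] for row in rows]
    # Group the target labels by the value of the candidate attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical customer records; "bought" is the class label.
customers = [
    {"income": "high", "age": "young", "bought": "yes"},
    {"income": "high", "age": "old", "bought": "yes"},
    {"income": "low", "age": "young", "bought": "no"},
    {"income": "low", "age": "old", "bought": "no"},
]

for attr in ("income", "age"):
    print(attr, gain_ratio(customers, attr, "bought"))
```

In this toy data, income separates buyers from non-buyers perfectly while age carries no information, so a C4.5-style tree would split on income first.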

Keywords: decision tree, data mining, customers, life insurance pay package

Procedia PDF Downloads 422
26501 Impacts of Land Use and Land Cover Change on Stream Flow and Sediment Yield of Genale Dawa Dam III Watershed, Ethiopia

Authors: Aklilu Getahun Sulito

Abstract:

Land use and land cover change dynamics are the result of complex interactions between several biophysical and socio-economic conditions. The impacts of land cover change on stream flow and sediment yield were analyzed statistically using the hydrological model SWAT. The Genale Dawa Dam III watershed is highly affected by deforestation, overgrazing, and agricultural land expansion. This study used the SWAT model to assess the impacts of land use and land cover change on sediment yield, to evaluate stream flow in the wet and dry seasons, and to examine the spatial distribution of sediment yield from the sub-basins of the Genale Dawa Dam III watershed. Land use land cover (LULC) maps of 2000, 2008 and 2016 were used with the same corresponding climate data. During the study period, most parts of the forest, dense evergreen forest and grassland changed to cultivated land. The cultivated land increased by 26.2%, but forest land, evergreen forest land and grassland decreased by 21.33%, 11.59% and 7.28%, respectively; following that, the mean annual sediment yield of the watershed increased by 7.37 ton/ha over the 16-year period (2000 to 2016). The analysis of stream flow for the wet and dry seasons showed that stream flow increased by 25.5% during the wet season but decreased by 29.6% in the dry season. The average annual spatial distribution of sediment yield increased by 7.73 ton/ha/yr from 2000 to 2016. The calibration results for both stream flow and sediment yield showed good agreement between observed and simulated data, with coefficients of determination of 0.87 and 0.84, Nash-Sutcliffe efficiencies of 0.83 and 0.78, and percentage biases of -7.39% and -10.90%, respectively. The validation results for both stream flow and sediment yield were also good, with coefficients of determination of 0.83 and 0.80, Nash-Sutcliffe efficiencies of 0.78 and 0.75, and percentage biases of 7.09% and 3.95%. The model results indicate that the mean annual sediment load at the Genale Dawa Dam III watershed increased from 2000 to 2016 because of land use change. So, to use Genale Dawa Dam III, land use management practices are needed in the future to prevent a further increase in the sediment yield of the watershed.
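The three goodness-of-fit statistics reported above can be computed as in the sketch below. It assumes the commonly used definitions: the coefficient of determination as the squared Pearson correlation, the Nash-Sutcliffe efficiency as one minus the ratio of squared error to observed variance, and percentage bias as 100 × Σ(observed − simulated) / Σ(observed). The abstract does not state which formulas or sign convention the study used, and the flow values here are made up for illustration.

```python
from math import sqrt


def r_squared(obs, sim):
    """Coefficient of determination as the squared Pearson correlation."""
    n = len(obs)
    mean_o, mean_s = sum(obs) / n, sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return (cov / sqrt(var_o * var_s)) ** 2


def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_o = sum(obs) / len(obs)
    squared_error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1 - squared_error / sum((o - mean_o) ** 2 for o in obs)


def percent_bias(obs, sim):
    """Percentage bias; under this assumed convention, negative values mean
    the simulation overestimates the observations on average."""
    return 100 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


# Made-up observed and simulated values, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
simulated = [12.0, 18.0, 33.0, 39.0]

print(r_squared(observed, simulated))
print(nash_sutcliffe(observed, simulated))
print(percent_bias(observed, simulated))
```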

Keywords: Genale Dawa Dam III watershed, land use land cover change, SWAT, spatial distribution, sediment yield, stream flow

Procedia PDF Downloads 48
26500 High Performance Computing and Big Data Analytics

Authors: Branci Sarra, Branci Saadia

Abstract:

Because of rapid data growth, many computer science tools have been developed to process and analyze Big Data. High-performance computing architectures have been designed to meet the processing needs of Big Data (from the standpoints of transaction processing and of strategic and tactical analytics). The purpose of this article is to provide a historical and global perspective on the recent trend in high-performance computing architectures, especially those related to analytics and data mining.

Keywords: high performance computing, HPC, big data, data analysis

Procedia PDF Downloads 514
26499 Distributed Processing for Content Based Lecture Video Retrieval on Hadoop Framework

Authors: U. S. N. Raju, Kothuri Sai Kiran, Meena G. Kamal, Vinay Nikhil Pabba, Suresh Kanaparthi

Abstract:

There is a huge amount of lecture video data available for public use, and many more lecture videos are being created and uploaded every day. Searching for videos on required topics in this huge database is a challenging task. Therefore, an efficient method for video retrieval is needed. An approach for automated video indexing and video search in large lecture video archives is presented. As the amount of video lecture data is huge, it is very inefficient to do the processing in a centralized computation framework. Hence, the Hadoop framework for distributed computing is used for Big Video Data. The first step in the process is automatic video segmentation and key-frame detection, which offers a visual guideline for navigating the video content. In the next step, we extract textual metadata by applying video Optical Character Recognition (OCR) technology to the key-frames. The OCR output and the detected slide text line types are used for keyword extraction, by which both video-level and segment-level keywords are extracted for content-based video browsing and search. The performance of the indexing process can be improved for a large database by using distributed computing on the Hadoop framework.

Keywords: video lectures, big video data, video retrieval, Hadoop

Procedia PDF Downloads 523
26498 Mood Recognition Using Indian Music

Authors: Vishwa Joshi

Abstract:

The study of mood recognition in music has gained a lot of momentum in recent years, with machine learning and data mining techniques and many audio features contributing considerably to analyzing and identifying the relation between mood and music. In this paper, we carry the same idea forward and build a system for automatic recognition of the mood underlying audio song clips by mining their audio features. We evaluated several data classification algorithms in order to learn, train and test the model describing the moods of these audio songs, and developed an open source framework. Before classification, preprocessing and feature extraction phases are necessary for removing noise and gathering features, respectively.

Keywords: music, mood, features, classification

Procedia PDF Downloads 491