Search results for: data mining analytics
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25601

Search results for: data mining analytics

25241 Heterogeneous-Resolution and Multi-Source Terrain Builder for CesiumJS WebGL Virtual Globe

Authors: Umberto Di Staso, Marco Soave, Alessio Giori, Federico Prandi, Raffaele De Amicis

Abstract:

The increasing availability of information about earth surface elevation (Digital Elevation Models DEM) generated from different sources (remote sensing, Aerial Images, Lidar) poses the question about how to integrate and make available to the most than possible audience this huge amount of data. In order to exploit the potential of 3D elevation representation the quality of data management plays a fundamental role. Due to the high acquisition costs and the huge amount of generated data, highresolution terrain surveys tend to be small or medium sized and available on limited portion of earth. Here comes the need to merge large-scale height maps that typically are made available for free at worldwide level, with very specific high resolute datasets. One the other hand, the third dimension increases the user experience and the data representation quality, unlocking new possibilities in data analysis for civil protection, real estate, urban planning, environment monitoring, etc. The open-source 3D virtual globes, which are trending topics in Geovisual Analytics, aim at improving the visualization of geographical data provided by standard web services or with proprietary formats. Typically, 3D Virtual globes like do not offer an open-source tool that allows the generation of a terrain elevation data structure starting from heterogeneous-resolution terrain datasets. This paper describes a technological solution aimed to set up a so-called “Terrain Builder”. This tool is able to merge heterogeneous-resolution datasets, and to provide a multi-resolution worldwide terrain services fully compatible with CesiumJS and therefore accessible via web using traditional browser without any additional plug-in.

Keywords: Terrain Builder, WebGL, Virtual Globe, CesiumJS, Tiled Map Service, TMS, Height-Map, Regular Grid, Geovisual Analytics, DTM

Procedia PDF Downloads 426
25240 An Improved K-Means Algorithm for Gene Expression Data Clustering

Authors: Billel Kenidra, Mohamed Benmohammed

Abstract:

Data mining technique used in the field of clustering is a subject of active research and assists in biological pattern recognition and extraction of new knowledge from raw data. Clustering means the act of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar between themselves and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers and a poor choice of centers may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy to initiate K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results prove how the efficiency has been significantly improved.

Keywords: microarray data mining, biological pattern recognition, partitional clustering, k-means algorithm, centroid initialization

Procedia PDF Downloads 190
25239 Integrating of Multi-Criteria Decision Making and Spatial Data Warehouse in Geographic Information System

Authors: Zohra Mekranfar, Ahmed Saidi, Abdellah Mebrek

Abstract:

This work aims to develop multi-criteria decision making (MCDM) and spatial data warehouse (SDW) methods, which will be integrated into a GIS according to a ‘GIS dominant’ approach. The GIS operating tools will be operational to operate the SDW. The MCDM methods can provide many solutions to a set of problems with various and multiple criteria. When the problem is so complex, integrating spatial dimension, it makes sense to combine the MCDM process with other approaches like data mining, ascending analyses, we present in this paper an experiment showing a geo-decisional methodology of SWD construction, On-line analytical processing (OLAP) technology which combines both basic multidimensional analysis and the concepts of data mining provides powerful tools to highlight inductions and information not obvious by traditional tools. However, these OLAP tools become more complex in the presence of the spatial dimension. The integration of OLAP with a GIS is the future geographic and spatial information solution. GIS offers advanced functions for the acquisition, storage, analysis, and display of geographic information. However, their effectiveness for complex spatial analysis is questionable due to their determinism and their decisional rigor. A prerequisite for the implementation of any analysis or exploration of spatial data requires the construction and structuring of a spatial data warehouse (SDW). This SDW must be easily usable by the GIS and by the tools offered by an OLAP system.

Keywords: data warehouse, GIS, MCDM, SOLAP

Procedia PDF Downloads 178
25238 Integration of Educational Data Mining Models to a Web-Based Support System for Predicting High School Student Performance

Authors: Sokkhey Phauk, Takeo Okazaki

Abstract:

The challenging task in educational institutions is to maximize the high performance of students and minimize the failure rate of poor-performing students. An effective method to leverage this task is to know student learning patterns with highly influencing factors and get an early prediction of student learning outcomes at the timely stage for setting up policies for improvement. Educational data mining (EDM) is an emerging disciplinary field of data mining, statistics, and machine learning concerned with extracting useful knowledge and information for the sake of improvement and development in the education environment. The study is of this work is to propose techniques in EDM and integrate it into a web-based system for predicting poor-performing students. A comparative study of prediction models is conducted. Subsequently, high performing models are developed to get higher performance. The hybrid random forest (Hybrid RF) produces the most successful classification. For the context of intervention and improving the learning outcomes, a feature selection method MICHI, which is the combination of mutual information (MI) and chi-square (CHI) algorithms based on the ranked feature scores, is introduced to select a dominant feature set that improves the performance of prediction and uses the obtained dominant set as information for intervention. By using the proposed techniques of EDM, an academic performance prediction system (APPS) is subsequently developed for educational stockholders to get an early prediction of student learning outcomes for timely intervention. Experimental outcomes and evaluation surveys report the effectiveness and usefulness of the developed system. The system is used to help educational stakeholders and related individuals for intervening and improving student performance.

Keywords: academic performance prediction system, educational data mining, dominant factors, feature selection method, prediction model, student performance

Procedia PDF Downloads 106
25237 Mine Production Index (MPi): New Method to Evaluate Effectiveness of Mining Machinery

Authors: Amol Lanke, Hadi Hoseinie, Behzad Ghodrati

Abstract:

OEE has been used in many industries as measure of performance. However due to limitations of original OEE, it has been modified by various researchers. OEE for mining application is special version of classic equation, carries these limitation over. In this paper it has been aimed to modify the OEE for mining application by introducing the weights to the elements of it and termed as Mine Production index (MPi). As a special application of new index MPi shovel has been developed by team of experts and researchers for evaluating the shovel effectiveness. Based on analysis, utilization followed by performance and availability were ranked in this order. To check the applicability of this index, a case study was done on four electrical and one hydraulic shovel in a Swedish mine. The results shows that MPishovelcan properly evaluate production effectiveness of shovels and determine effectiveness values in optimistic view compared to OEE. MPi with calculation not only give the effectiveness but also can predict which elements should be focused for improving the productivity.

Keywords: mining, overall equipment efficiency (OEE), mine production index, shovels

Procedia PDF Downloads 463
25236 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can either contain categorical or numeric data. Each type of data has its own specific clustering algorithm. In this context, two algorithms are proposed: the k-means for clustering numeric datasets and the k-modes for categorical datasets. The main encountered problem in data mining applications is clustering categorical dataset so relevant in the datasets. One main issue to achieve the clustering process on categorical values is to transform the categorical attributes into numeric measures and directly apply the k-means algorithm instead the k-modes. In this paper, it is proposed to experiment an approach based on the previous issue by transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previously method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are experimented. The obtained results show that our proposed method outperforms the binary method in all cases.

Keywords: clustering, unsupervised learning, pattern recognition, categorical datasets, knowledge discovery, k-means

Procedia PDF Downloads 259
25235 The Acquisition of Case in Biological Domain Based on Text Mining

Authors: Shen Jian, Hu Jie, Qi Jin, Liu Wei Jie, Chen Ji Yi, Peng Ying Hong

Abstract:

In order to settle the problem of acquiring case in biological related to design problems, a biometrics instance acquisition method based on text mining is presented. Through the construction of corpus text vector space and knowledge mining, the feature selection, similarity measure and case retrieval method of text in the field of biology are studied. First, we establish a vector space model of the corpus in the biological field and complete the preprocessing steps. Then, the corpus is retrieved by using the vector space model combined with the functional keywords to obtain the biological domain examples related to the design problems. Finally, we verify the validity of this method by taking the example of text.

Keywords: text mining, vector space model, feature selection, biologically inspired design

Procedia PDF Downloads 262
25234 Discerning Divergent Nodes in Social Networks

Authors: Mehran Asadi, Afrand Agah

Abstract:

In data mining, partitioning is used as a fundamental tool for classification. With the help of partitioning, we study the structure of data, which allows us to envision decision rules, which can be applied to classification trees. In this research, we used online social network dataset and all of its attributes (e.g., Node features, labels, etc.) to determine what constitutes an above average chance of being a divergent node. We used the R statistical computing language to conduct the analyses in this report. The data were found on the UC Irvine Machine Learning Repository. This research introduces the basic concepts of classification in online social networks. In this work, we utilize overfitting and describe different approaches for evaluation and performance comparison of different classification methods. In classification, the main objective is to categorize different items and assign them into different groups based on their properties and similarities. In data mining, recursive partitioning is being utilized to probe the structure of a data set, which allow us to envision decision rules and apply them to classify data into several groups. Estimating densities is hard, especially in high dimensions, with limited data. Of course, we do not know the densities, but we could estimate them using classical techniques. First, we calculated the correlation matrix of the dataset to see if any predictors are highly correlated with one another. By calculating the correlation coefficients for the predictor variables, we see that density is strongly correlated with transitivity. We initialized a data frame to easily compare the quality of the result classification methods and utilized decision trees (with k-fold cross validation to prune the tree). The method performed on this dataset is decision trees. Decision tree is a non-parametric classification method, which uses a set of rules to predict that each observation belongs to the most commonly occurring class label of the training data. Our method aggregates many decision trees to create an optimized model that is not susceptible to overfitting. When using a decision tree, however, it is important to use cross-validation to prune the tree in order to narrow it down to the most important variables.

Keywords: online social networks, data mining, social cloud computing, interaction and collaboration

Procedia PDF Downloads 157
25233 Feature Based Unsupervised Intrusion Detection

Authors: Deeman Yousif Mahmood, Mohammed Abdullah Hussein

Abstract:

The goal of a network-based intrusion detection system is to classify activities of network traffics into two major categories: normal and attack (intrusive) activities. Nowadays, data mining and machine learning plays an important role in many sciences; including intrusion detection system (IDS) using both supervised and unsupervised techniques. However, one of the essential steps of data mining is feature selection that helps in improving the efficiency, performance and prediction rate of proposed approach. This paper applies unsupervised K-means clustering algorithm with information gain (IG) for feature selection and reduction to build a network intrusion detection system. For our experimental analysis, we have used the new NSL-KDD dataset, which is a modified dataset for KDDCup 1999 intrusion detection benchmark dataset. With a split of 60.0% for the training set and the remainder for the testing set, a 2 class classifications have been implemented (Normal, Attack). Weka framework which is a java based open source software consists of a collection of machine learning algorithms for data mining tasks has been used in the testing process. The experimental results show that the proposed approach is very accurate with low false positive rate and high true positive rate and it takes less learning time in comparison with using the full features of the dataset with the same algorithm.

Keywords: information gain (IG), intrusion detection system (IDS), k-means clustering, Weka

Procedia PDF Downloads 296
25232 Clustering Ethno-Informatics of Naming Village in Java Island Using Data Mining

Authors: Atje Setiawan Abdullah, Budi Nurani Ruchjana, I. Gede Nyoman Mindra Jaya, Eddy Hermawan

Abstract:

Ethnoscience is used to see the culture with a scientific perspective, which may help to understand how people develop various forms of knowledge and belief, initially focusing on the ecology and history of the contributions that have been there. One of the areas studied in ethnoscience is etno-informatics, is the application of informatics in the culture. In this study the science of informatics used is data mining, a process to automatically extract knowledge from large databases, to obtain interesting patterns in order to obtain a knowledge. While the application of culture described by naming database village on the island of Java were obtained from Geographic Indonesia Information Agency (BIG), 2014. The purpose of this study is; first, to classify the naming of the village on the island of Java based on the structure of the word naming the village, including the prefix of the word, syllable contained, and complete word. Second to classify the meaning of naming the village based on specific categories, as well as its role in the community behavioral characteristics. Third, how to visualize the naming of the village to a map location, to see the similarity of naming villages in each province. In this research we have developed two theorems, i.e theorems area as a result of research studies have collected intersection naming villages in each province on the island of Java, and the composition of the wedge theorem sets the provinces in Java is used to view the peculiarities of a location study. The methodology in this study base on the method of Knowledge Discovery in Database (KDD) on data mining, the process includes preprocessing, data mining and post processing. The results showed that the Java community prioritizes merit in running his life, always working hard to achieve a more prosperous life, and love as well as water and environmental sustainment. Naming villages in each location adjacent province has a high degree of similarity, and influence each other. Cultural similarities in the province of Central Java, East Java and West Java-Banten have a high similarity, whereas in Jakarta-Yogyakarta has a low similarity. This research resulted in the cultural character of communities within the meaning of the naming of the village on the island of Java, this character is expected to serve as a guide in the behavior of people's daily life on the island of Java.

Keywords: ethnoscience, ethno-informatics, data mining, clustering, Java island culture

Procedia PDF Downloads 283
25231 Compliance with the Health and Safety Standards/Regulations in the South African Mining Industry: A Literature Review

Authors: Livhuwani Muthelo, Tebogo Maria Mothiba, Rambelani Nancy Malema

Abstract:

Background: Despite occupational legislation/standards being in place in the industry, there are many reported health and safety incidents, including both occupational injuries and illnesses in the South African mining industry. Purpose: This systematic literature review aimed to describe and identify the existing gaps in health and safety compliance within the South African mining industry and propose future research areas. Methodology: A systematic literature review was conducted using the key concepts of health and safety, compliance, standards, and mining. A total of 102 papers issued from 1994 to April 2020 were extracted from an online database search, which included a combination of South African and international government OHS legislation documents, policies, standards, reports from the mineral departments and international labour office, qualitative and quantitative journal articles, dissertations, seminars and conference proceedings. Results: The literature review revealed that, though there are laws, regulations, standards to guide the industry on health and safety issues in South Africa, the main challenge is with the compliance with the existing health and safety systems, wherein systems are not being implemented. Conclusion: Gaps between research, policy, and implementation in occupational health practice in the South African mining industry were also identified.

Keywords: circumstances, non-compliance, health and safety, standards, mining industry

Procedia PDF Downloads 288
25230 Forest Risk and Vulnerability Assessment: A Case Study from East Bokaro Coal Mining Area in India

Authors: Sujata Upgupta, Prasoon Kumar Singh

Abstract:

The expansion of large scale coal mining into forest areas is a potential hazard for the local biodiversity and wildlife. The objective of this study is to provide a picture of the threat that coal mining poses to the forests of the East Bokaro landscape. The vulnerable forest areas at risk have been assessed and the priority areas for conservation have been presented. The forested areas at risk in the current scenario have been assessed and compared with the past conditions using classification and buffer based overlay approach. Forest vulnerability has been assessed using an analytical framework based on systematic indicators and composite vulnerability index values. The results indicate that more than 4 km2 of forests have been lost from 1973 to 2016. Large patches of forests have been diverted for coal mining projects. Forests in the northern part of the coal field within 1-3 km radius around the coal mines are at immediate risk. The original contiguous forests have been converted into fragmented and degraded forest patches. Most of the collieries are located within or very close to the forests thus threatening the biodiversity and hydrology of the surrounding regions. Based on the vulnerability values estimated, it was concluded that more than 90% of the forested grids in East Bokaro are highly vulnerable to mining. The forests in the sub-districts of Bermo and Chandrapura have been identified as the most vulnerable to coal mining activities. This case study would add to the capacity of the forest managers and mine managers to address the risk and vulnerability of forests at a small landscape level in order to achieve sustainable development.

Keywords: forest, coal mining, indicators, vulnerability

Procedia PDF Downloads 389
25229 Planning Urban Sprawl in Mining Areas in Africa: How to Ensure Coherent Development

Authors: Pascal Rey, Anaïs Weber

Abstract:

Many mining projects are being developed in Africa the last decades. Due to the economic opportunities they offer, these projects result in a massive and rapid influx of migrants to the surrounding area. In areas where central government representation is low and local administration lack financial resources, urban development is often anarchical, beyond all public control. It leads to socio-spatial segregation, insecurity and the risk of social conflicts rising. Aware that their economic development is very correlated with local situation, mining companies get more and more involved in regional planning in setting up tools and Strategic Directions document. One of the commonly used tools in this regard is the “Influx Management Plan”. It consists in looking at the region’s absorption capacities in order to ensure its coherent development and by developing several urban centers than one macrocephalic city. It includes many other measures such as urban governance support, skills transfer, creation of strategic guidelines, financial support (local taxes, mining taxes, development funds etc.) local development projects. Through various examples of mining projects in Guinea, A country that is host to many large mining projects, we will look at the implications of regional and urban planning of which mining companies are key playor as well as public authorities. While their investment capacity offers advantages and accelerates development, their actions raise questions of the unilaterality of interests and local governance. By interfering in public affairs are mining companies not increasing the risk of central and local government shirking their responsibilities in terms of regional development, or even calling their legitimacy into question? Is such public-private collaboration really sustainable for the region as a whole and for all stakeholders?

Keywords: Africa, guinea, mine, urban planning

Procedia PDF Downloads 498
25228 Assessment for the Backfill Using the Run of the Mine Tailings and Portland Cement

Authors: Javad Someehneshin, Weizhou Quan, Abdelsalam Abugharara, Stephen Butt

Abstract:

Narrow vein mining (NVM) is exploiting very thin but valuable ore bodies that are uneconomical to extract by conventional mining methods. NVM applies the technique of Sustainable Mining by Drilling (SMD). The SMD method is used to mine stranded, steeply dipping ore veins, which are too small or isolated to mine economically using conventional methods since the dilution is minimized. This novel mining technique uses drilling rigs to extract the ore through directional drilling surgically. This paper is focusing on utilizing the run of the mine tailings and Portland cement as backfill material to support the hanging wall for providing safe mine operation. Cemented paste backfill (CPB) is designed by mixing waste tailings, water, and cement of the precise percentage for optimal outcomes. It is a non-homogenous material that contains 70-85% solids. Usually, a hydraulic binder is added to the mixture to increase the strength of the CPB. The binder fraction mostly accounts for 2–10% of the total weight. In the mining industry, CPB has been improved and expanded gradually because it provides safety and support for the mines. Furthermore, CPB helps manage the waste tailings in an economical method and plays a significant role in environmental protection.

Keywords: backfilling, cement backfill, tailings, Portland cement

Procedia PDF Downloads 138
25227 Developing an Advanced Algorithm Capable of Classifying News, Articles and Other Textual Documents Using Text Mining Techniques

Authors: R. B. Knudsen, O. T. Rasmussen, R. A. Alphinas

Abstract:

The reason for conducting this research is to develop an algorithm that is capable of classifying news articles from the automobile industry, according to the competitive actions that they entail, with the use of Text Mining (TM) methods. It is needed to test how to properly preprocess the data for this research by preparing pipelines which fits each algorithm the best. The pipelines are tested along with nine different classification algorithms in the realm of regression, support vector machines, and neural networks. Preliminary testing for identifying the optimal pipelines and algorithms resulted in the selection of two algorithms with two different pipelines. The two algorithms are Logistic Regression (LR) and Artificial Neural Network (ANN). These algorithms are optimized further, where several parameters of each algorithm are tested. The best result is achieved with the ANN. The final model yields an accuracy of 0.79, a precision of 0.80, a recall of 0.78, and an F1 score of 0.76. By removing three of the classes that created noise, the final algorithm is capable of reaching an accuracy of 94%.

Keywords: Artificial Neural network, Competitive dynamics, Logistic Regression, Text classification, Text mining

Procedia PDF Downloads 121
25226 Twitter Ego Networks and the Capital Markets: A Social Network Analysis Perspective of Market Reactions to Earnings Announcement Events

Authors: Gregory D. Saxton

Abstract:

Networks are everywhere: lunch ties among co-workers, golfing partnerships among employees, interlocking board-of-director connections, Facebook friendship ties, etc. Each network varies in terms of its structure -its size, how inter-connected network members are, and the prevalence of sub-groups and cliques. At the same time, within any given network, some network members will have a more important, more central position on account of their greater number of connections or their capacity as “bridges” connecting members of different network cliques. The logic of network structure and position is at the heart of what is known as social network analysis, and this paper applies this logic to the study of the stock market. Using an array of data analytics and machine learning tools, this study will examine 17 million Twitter messages discussing the stocks of the firms in the S&P 1,500 index in 2018. Each of these 1,500 stocks has a distinct Twitter discussion network that varies in terms of core network characteristics such as size, density, influence, norms and values, level of activity, and embedded resources. The study’s core proposition is that the ultimate effect of any market-relevant information is contingent on the characteristics of the network through which it flows. To test this proposition, this study operationalizes each of the core network characteristics and examines their influence on market reactions to 2018 quarterly earnings announcement events.

Keywords: data analytics, investor-to-investor communication, social network analysis, Twitter

Procedia PDF Downloads 121
25225 Use of Quasi-3D Inversion of VES Data Based on Lateral Constraints to Characterize the Aquifer and Mining Sites of an Area Located in the North-East of Figuil, North Cameroon

Authors: Fofie Kokea Ariane Darolle, Gouet Daniel Hervé, Koumetio Fidèle, Yemele David

Abstract:

The electrical resistivity method is successfully used in this paper in order to have a clearer picture of the subsurface of the North-East ofFiguil in northern Cameroon. It is worth noting that this method is most often used when the objective of the study is to image the shallow subsoils by considering them as a set of stratified ground layers. The problem to be solved is very often environmental, and in this case, it is necessary to perform an inversion of the data in order to have a complete and accurate picture of the parameters of the said layers. In the case of this work, thirty-three (33) Schlumberger VES have been carried out on an irregular grid to investigate the subsurface of the study area. The 1D inversion applied as a preliminary modeling tool and in correlation with the mechanical drillings results indicates a complex subsurface lithology distribution mainly consisting of marbles and schists. Moreover, the quasi-3D inversion with lateral constraint shows that the misfit between the observed field data and the model response is quite good and acceptable with a value low than 10%. The method also reveals existence of two water bearing in the considered area. The first is the schist or weathering aquifer (unsuitable), and the other is the marble or the fracturing aquifer (suitable). The final quasi 3D inversion results and geological models indicate proper sites for groundwaters prospecting and for mining exploitation, thus allowing the economic development of the study area.

Keywords: electrical resistivity method, 1D inversion, quasi 3D inversion, groundwaters, mining

Procedia PDF Downloads 155
25224 The Use of Classifiers in Image Analysis of Oil Wells Profiling Process and the Automatic Identification of Events

Authors: Jaqueline Maria Ribeiro Vieira

Abstract:

Different strategies and tools are available at the oil and gas industry for detecting and analyzing tension and possible fractures in borehole walls. Most of these techniques are based on manual observation of the captured borehole images. While this strategy may be possible and convenient with small images and few data, it may become difficult and suitable to errors when big databases of images must be treated. While the patterns may differ among the image area, depending on many characteristics (drilling strategy, rock components, rock strength, etc.). Previously we developed and proposed a novel strategy capable of detecting patterns at borehole images that may point to regions that have tension and breakout characteristics, based on segmented images. In this work we propose the inclusion of data-mining classification strategies in order to create a knowledge database of the segmented curves. These classifiers allow that, after some time using and manually pointing parts of borehole images that correspond to tension regions and breakout areas, the system will indicate and suggest automatically new candidate regions, with higher accuracy. We suggest the use of different classifiers methods, in order to achieve different knowledge data set configurations.

Keywords: image segmentation, oil well visualization, classifiers, data-mining, visual computer

Procedia PDF Downloads 303
25223 Application of Association Rule Using Apriori Algorithm for Analysis of Industrial Accidents in 2013-2014 in Indonesia

Authors: Triano Nurhikmat

Abstract:

Along with the progress of science and technology, the development of the industrialized world in Indonesia took place very rapidly. This leads to a process of industrialization of society Indonesia faster with the establishment of the company and the workplace are diverse. Development of the industry relates to the activity of the worker. Where in these work activities do not cover the possibility of an impending crash on either the workers or on a construction project. The cause of the occurrence of industrial accidents was the fault of electrical damage, work procedures, and error technique. The method of an association rule is one of the main techniques in data mining and is the most common form used in finding the patterns of data collection. In this research would like to know how relations of the association between the incidence of any industrial accidents. Therefore, by using methods of analysis association rule patterns associated with combination obtained two iterations item set (2 large item set) when every factor of industrial accidents with a West Jakarta so industrial accidents caused by the occurrence of an electrical value damage = 0.2 support and confidence value = 1, and the reverse pattern with value = 0.2 support and confidence = 0.75.

Keywords: association rule, data mining, industrial accidents, rules

Procedia PDF Downloads 299
25222 Information Management Approach in the Prediction of Acute Appendicitis

Authors: Ahmad Shahin, Walid Moudani, Ali Bekraki

Abstract:

This research aims at presenting a predictive data mining model to handle an accurate diagnosis of acute appendicitis with patients for the purpose of maximizing the health service quality, minimizing morbidity/mortality, and reducing cost. However, acute appendicitis is the most common disease which requires timely accurate diagnosis and needs surgical intervention. Although the treatment of acute appendicitis is simple and straightforward, its diagnosis is still difficult because no single sign, symptom, laboratory or image examination accurately confirms the diagnosis of acute appendicitis in all cases. This contributes in increasing morbidity and negative appendectomy. In this study, the authors propose to generate an accurate model in prediction of patients with acute appendicitis which is based, firstly, on the segmentation technique associated to ABC algorithm to segment the patients; secondly, on applying fuzzy logic to process the massive volume of heterogeneous and noisy data (age, sex, fever, white blood cell, neutrophilia, CRP, urine, ultrasound, CT, appendectomy, etc.) in order to express knowledge and analyze the relationships among data in a comprehensive manner; and thirdly, on applying dynamic programming technique to reduce the number of data attributes. The proposed model is evaluated based on a set of benchmark techniques and even on a set of benchmark classification problems of osteoporosis, diabetes and heart obtained from the UCI data and other data sources.

Keywords: healthcare management, acute appendicitis, data mining, classification, decision tree

Procedia PDF Downloads 350
25221 Hybrid Energy System for the German Mining Industry: An Optimized Model

Authors: Kateryna Zharan, Jan C. Bongaerts

Abstract:

In recent years, economic attractiveness of renewable energy (RE) for the mining industry, especially for off-grid mines, and a negative environmental impact of fossil energy are stimulating to use RE for mining needs. Being that remote area mines have higher energy expenses than mines connected to a grid, integration of RE may give a mine economic benefits. Regarding the literature review, there is a lack of business models for adopting of RE at mine. The main aim of this paper is to develop an optimized model of RE integration into the German mining industry (GMI). Hereby, the GMI with amount of around 800 mill. t. annually extracted resources is included in the list of the 15 major mining country in the world. Accordingly, the mining potential of Germany is evaluated in this paper as a perspective market for RE implementation. The GMI has been classified in order to find out the location of resources, quantity and types of the mines, amount of extracted resources, and access of the mines to the energy resources. Additionally, weather conditions have been analyzed in order to figure out where wind and solar generation technologies can be integrated into a mine with the highest efficiency. Despite the fact that the electricity demand of the GMI is almost completely covered by a grid connection, the hybrid energy system (HES) based on a mix of RE and fossil energy is developed due to show environmental and economic benefits. The HES for the GMI consolidates a combination of wind turbine, solar PV, battery and diesel generation. The model has been calculated using the HOMER software. Furthermore, the demonstrated HES contains a forecasting model that predicts solar and wind generation in advance. The main result from the HES such as CO2 emission reduction is estimated in order to make the mining processing more environmental friendly.

Keywords: diesel generation, German mining industry, hybrid energy system, hybrid optimization model for electric renewables, optimized model, renewable energy

Procedia PDF Downloads 343
25220 Sensing to Respond & Recover in Emergency

Authors: Alok Kumar, Raviraj Patil

Abstract:

The ability to respond to an incident of a disastrous event in a vulnerable area is very crucial an aspect of emergency management. The ability to constantly predict the likelihood of an event along with its severity in an area and react to those significant events which are likely to have a high impact allows the authorities to respond by allocating resources optimally in a timely manner. It provides for measuring, monitoring, and modeling facilities that integrate underlying systems into one solution to improve operational efficiency, planning, and coordination. We were particularly involved in this innovative incubation work on the current state of research and development in collaboration. technologies & systems for a disaster.

Keywords: predictive analytics, advanced analytics, area flood likelihood model, area flood severity model, level of impact model, mortality score, economic loss score, resource allocation, crew allocation

Procedia PDF Downloads 321
25219 Multi-Source Data Fusion for Urban Comprehensive Management

Authors: Bolin Hua

Abstract:

In city governance, various data are involved, including city component data, demographic data, housing data and all kinds of business data. These data reflects different aspects of people, events and activities. Data generated from various systems are different in form and data source are different because they may come from different sectors. In order to reflect one or several facets of an event or rule, data from multiple sources need fusion together. Data from different sources using different ways of collection raised several issues which need to be resolved. Problem of data fusion include data update and synchronization, data exchange and sharing, file parsing and entry, duplicate data and its comparison, resource catalogue construction. Governments adopt statistical analysis, time series analysis, extrapolation, monitoring analysis, value mining, scenario prediction in order to achieve pattern discovery, law verification, root cause analysis and public opinion monitoring. The result of Multi-source data fusion is to form a uniform central database, which includes people data, location data, object data, and institution data, business data and space data. We need to use meta data to be referred to and read when application needs to access, manipulate and display the data. A uniform meta data management ensures effectiveness and consistency of data in the process of data exchange, data modeling, data cleansing, data loading, data storing, data analysis, data search and data delivery.

Keywords: multi-source data fusion, urban comprehensive management, information fusion, government data

Procedia PDF Downloads 393
25218 Voice of Customer: Mining Customers' Reviews on On-Line Car Community

Authors: Kim Dongwon, Yu Songjin

Abstract:

This study identifies the business value of VOC (Voice of Customer) on the business. Precisely, we intend to demonstrate how much negative and positive sentiment of VOC has an influence on car sales market share in the unites states. We extract 7 emotions such as sadness, shame, anger, fear, frustration, delight and satisfaction from the VOC data, 23,204 pieces of opinions, that had been posted on car-related on-line community from 2007 to 2009(a part of data collection from 2007 to 2015), and intend to clarify the correlation between negative and positive sentimental keywords and contribution to market share. In order to develop a lexicon for each category of negative and positive sentiment, we took advantage of Corpus program, Antconc 3.4.1.w and on-line sentimental data, SentiWordNet and identified the part of speech(POS) information of words in the customers' opinion by using a part-of-speech tagging function provided by TextAnalysisOnline. For the purpose of this present study, a total of 45,741 pieces of customers' opinions of 28 car manufacturing companies had been collected including titles and status information. We conducted an experiment to examine whether the inclusion, frequency and intensity of terms with negative and positive emotions in each category affect the adoption of customer opinions for vehicle organizations' market share. In the experiment, we statistically verified that there is correlation between customer ideas containing negative and positive emotions and variation of marker share. Particularly, "Anger," a domain of negative domains, is significantly influential to car sales market share. The domain "Delight" and "Satisfaction" increased in proportion to growth of market share.

Keywords: data mining, opinion mining, sentiment analysis, VOC

Procedia PDF Downloads 214
25217 Neural Networks Models for Measuring Hotel Users Satisfaction

Authors: Asma Ameur, Dhafer Malouche

Abstract:

Nowadays, user comments on the Internet have an important impact on hotel bookings. This confirms that the e-reputation issue can influence the likelihood of customer loyalty to a hotel. In this way, e-reputation has become a real differentiator between hotels. For this reason, we have a unique opportunity in the opinion mining field to analyze the comments. In fact, this field provides the possibility of extracting information related to the polarity of user reviews. This sentimental study (Opinion Mining) represents a new line of research for analyzing the unstructured textual data. Knowing the score of e-reputation helps the hotelier to better manage his marketing strategy. The score we then obtain is translated into the image of hotels to differentiate between them. Therefore, this present research highlights the importance of hotel satisfaction ‘scoring. To calculate the satisfaction score, the sentimental analysis can be manipulated by several techniques of machine learning. In fact, this study treats the extracted textual data by using the Artificial Neural Networks Approach (ANNs). In this context, we adopt the aforementioned technique to extract information from the comments available in the ‘Trip Advisor’ website. This actual paper details the description and the modeling of the ANNs approach for the scoring of online hotel reviews. In summary, the validation of this used method provides a significant model for hotel sentiment analysis. So, it provides the possibility to determine precisely the polarity of the hotel users reviews. The empirical results show that the ANNs are an accurate approach for sentiment analysis. The obtained results show also that this proposed approach serves to the dimensionality reduction for textual data’ clustering. Thus, this study provides researchers with a useful exploration of this technique. Finally, we outline guidelines for future research in the hotel e-reputation field as comparing the ANNs with other technique.

Keywords: clustering, consumer behavior, data mining, e-reputation, machine learning, neural network, online hotel ‘reviews, opinion mining, scoring

Procedia PDF Downloads 136
25216 Study of the Stability of Underground Mines by Numerical Method: The Mine Chaabet El Hamra, Algeria

Authors: Nakache Radouane, M. Boukelloul, M. Fredj

Abstract:

Method room and pillar sizes are key factors for safe mining and their recovery in open-stop mining. This method is advantageous due to its simplicity and requirement of little information to be used. It is probably the most representative method among the total load approach methods although it also remains a safe design method. Using a finite element software (PLAXIS 3D), analyses were carried out with an elasto-plastic model and comparisons were made with methods based on the total load approach. The results were presented as the optimization for improving the ore recovery rate while maintaining a safe working environment.

Keywords: room and pillar, mining, total load approach, elasto-plastic

Procedia PDF Downloads 330
25215 Evaluation of the Performance of ACTIFLO® Clarifier in the Treatment of Mining Wastewaters: Case Study of Costerfield Mining Operations, Victoria, Australia

Authors: Seyed Mohsen Samaei, Shirley Gato-Trinidad

Abstract:

A pre-treatment stage prior to reverse osmosis (RO) is very important to ensure the long-term performance of the RO membranes in any wastewater treatment using RO. This study aims to evaluate the application of the Actiflo® clarifier as part of a pre-treatment unit in mining operations. It involves performing analytical testing on RO feed water before and after installation of Actiflo® unit. Water samples prior to RO plant stage were obtained on different dates from Costerfield mining operations in Victoria, Australia. Tests were conducted in an independent laboratory to determine the concentration of various compounds in RO feed water before and after installation of Actiflo® unit during the entire evaluated period from December 2015 to June 2018. Water quality analysis shows that the quality of RO feed water has remarkably improved since installation of Actiflo® clarifier. Suspended solids (SS) and turbidity removal efficiencies has been improved by 91 and 85 percent respectively in pre-treatment system since the installation of Actiflo®. The Actiflo® clarifier proved to be a valuable part of pre-treatment system prior to RO. It has the potential to conveniently condition the mining wastewater prior to RO unit, and reduce the risk of RO physical failure and irreversible fouling. Consequently, reliable and durable operation of RO unit with minimum requirement for RO membrane replacement is expected with Actiflo® in use.

Keywords: ACTIFLO ® clarifier, mining wastewater, reverse osmosis, water treatment

Procedia PDF Downloads 193
25214 Use of In-line Data Analytics and Empirical Model for Early Fault Detection

Authors: Hyun-Woo Cho

Abstract:

Automatic process monitoring schemes are designed to give early warnings for unusual process events or abnormalities as soon as possible. For this end, various techniques have been developed and utilized in various industrial processes. It includes multivariate statistical methods, representation skills in reduced spaces, kernel-based nonlinear techniques, etc. This work presents a nonlinear empirical monitoring scheme for batch type production processes with incomplete process measurement data. While normal operation data are easy to get, unusual fault data occurs infrequently and thus are difficult to collect. In this work, noise filtering steps are added in order to enhance monitoring performance by eliminating irrelevant information of the data. The performance of the monitoring scheme was demonstrated using batch process data. The results showed that the monitoring performance was improved significantly in terms of detection success rate of process fault.

Keywords: batch process, monitoring, measurement, kernel method

Procedia PDF Downloads 323
25213 A Fuzzy Kernel K-Medoids Algorithm for Clustering Uncertain Data Objects

Authors: Behnam Tavakkol

Abstract:

Uncertain data mining algorithms use different ways to consider uncertainty in data such as by representing a data object as a sample of points or a probability distribution. Fuzzy methods have long been used for clustering traditional (certain) data objects. They are used to produce non-crisp cluster labels. For uncertain data, however, besides some uncertain fuzzy k-medoids algorithms, not many other fuzzy clustering methods have been developed. In this work, we develop a fuzzy kernel k-medoids algorithm for clustering uncertain data objects. The developed fuzzy kernel k-medoids algorithm is superior to existing fuzzy k-medoids algorithms in clustering data sets with non-linearly separable clusters.

Keywords: clustering algorithm, fuzzy methods, kernel k-medoids, uncertain data

Procedia PDF Downloads 215
25212 From Industry 4.0 to Agriculture 4.0: A Framework to Manage Product Data in Agri-Food Supply Chain for Voluntary Traceability

Authors: Angelo Corallo, Maria Elena Latino, Marta Menegoli

Abstract:

Agri-food value chain involves various stakeholders with different roles. All of them abide by national and international rules and leverage marketing strategies to advance their products. Food products and related processing phases carry with it a big mole of data that are often not used to inform final customer. Some data, if fittingly identified and used, can enhance the single company, and/or the all supply chain creates a math between marketing techniques and voluntary traceability strategies. Moreover, as of late, the world has seen buying-models’ modification: customer is careful on wellbeing and food quality. Food citizenship and food democracy was born, leveraging on transparency, sustainability and food information needs. Internet of Things (IoT) and Analytics, some of the innovative technologies of Industry 4.0, have a significant impact on market and will act as a main thrust towards a genuine ‘4.0 change’ for agriculture. But, realizing a traceability system is not simple because of the complexity of agri-food supply chain, a lot of actors involved, different business models, environmental variations impacting products and/or processes, and extraordinary climate changes. In order to give support to the company involved in a traceability path, starting from business model analysis and related business process a Framework to Manage Product Data in Agri-Food Supply Chain for Voluntary Traceability was conceived. Studying each process task and leveraging on modeling techniques lead to individuate information held by different actors during agri-food supply chain. IoT technologies for data collection and Analytics techniques for data processing supply information useful to increase the efficiency intra-company and competitiveness in the market. The whole information recovered can be shown through IT solutions and mobile application to made accessible to the company, the entire supply chain and the consumer with the view to guaranteeing transparency and quality.

Keywords: agriculture 4.0, agri-food suppy chain, industry 4.0, voluntary traceability

Procedia PDF Downloads 147