Search results for: Spatial temporal data mining
7763 Framework for the Modeling of the Supply Chain Collaborative Planning Process
Authors: D. Pérez, M. M. E. Alemany
Abstract:
In this work, a framework to model the Supply Chain (SC) Collaborative Planning (CP) process is proposed. The main contributions of this framework concern 1) the presentation of the decision view, the most important one due to the characteristics of the process, jointly within the physical, organisation and information views, and 2) the simultaneous consideration of the spatial and temporal integration among the different supply chain decision centres. This framework provides the basis for a realistic and integrated perspective of the supply chain collaborative planning process and also the analytical modeling of each of its decisional activities.Keywords: Collaborative Planning, Decision View, Distributed Decision-Making, Framework.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13247762 Social and Economic Effects of Mining Industry Restructuring in Romania -Case Studies
Authors: Andra Costache, Gica Pehoiu
Abstract:
As in other countries from Central and Eastern Europe, the economic restructuring occurred in the last decade of the twentieth century affected the mining industry in Romania, an oversize and heavily subsidized sector before 1989. After more than a decade since the beginning of mining restructuring, an evaluation of current social implications of the process it is required, together with an efficiency analysis of the adaptation mechanisms developed at governmental level. This article aims to provide an insight into these issues through case studies conducted in the most important coal basin of Romania, Petroşani Depression.Keywords: case studies, government programs, miningrestructuring, social effects.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 26607761 Integration of Educational Data Mining Models to a Web-Based Support System for Predicting High School Student Performance
Authors: Sokkhey Phauk, Takeo Okazaki
Abstract:
The challenging task in educational institutions is to maximize the high performance of students and minimize the failure rate of poor-performing students. An effective method to leverage this task is to know student learning patterns with highly influencing factors and get an early prediction of student learning outcomes at the timely stage for setting up policies for improvement. Educational data mining (EDM) is an emerging disciplinary field of data mining, statistics, and machine learning concerned with extracting useful knowledge and information for the sake of improvement and development in the education environment. The study is of this work is to propose techniques in EDM and integrate it into a web-based system for predicting poor-performing students. A comparative study of prediction models is conducted. Subsequently, high performing models are developed to get higher performance. The hybrid random forest (Hybrid RF) produces the most successful classification. For the context of intervention and improving the learning outcomes, a feature selection method MICHI, which is the combination of mutual information (MI) and chi-square (CHI) algorithms based on the ranked feature scores, is introduced to select a dominant feature set that improves the performance of prediction and uses the obtained dominant set as information for intervention. By using the proposed techniques of EDM, an academic performance prediction system (APPS) is subsequently developed for educational stockholders to get an early prediction of student learning outcomes for timely intervention. Experimental outcomes and evaluation surveys report the effectiveness and usefulness of the developed system. The system is used to help educational stakeholders and related individuals for intervening and improving student performance.
Keywords: Academic performance prediction system, prediction model, educational data mining, dominant factors, feature selection methods, student performance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9807760 An Improved K-Means Algorithm for Gene Expression Data Clustering
Authors: Billel Kenidra, Mohamed Benmohammed
Abstract:
Data mining technique used in the field of clustering is a subject of active research and assists in biological pattern recognition and extraction of new knowledge from raw data. Clustering means the act of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar between themselves and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers and a poor choice of centers may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy to initiate K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results prove how the efficiency has been significantly improved.
Keywords: Microarray data mining, biological pattern recognition, partitional clustering, k-means algorithm, centroid initialization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12847759 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency
Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami
Abstract:
Clustering is a well known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can either contain categorical or numeric data. Each type of data has its own specific clustering algorithm. In this context, two algorithms are proposed: the k-means for clustering numeric datasets and the k-modes for categorical datasets. The main encountered problem in data mining applications is clustering categorical dataset so relevant in the datasets. One main issue to achieve the clustering process on categorical values is to transform the categorical attributes into numeric measures and directly apply the k-means algorithm instead the k-modes. In this paper, it is proposed to experiment an approach based on the previous issue by transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previously method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are experimented. The obtained results show that our proposed method outperforms the binary method in all cases.
Keywords: Clustering, k-means, categorical datasets, pattern recognition, unsupervised learning, knowledge discovery.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 35457758 Recognition Machine (RM) for On-line and Isolated Flight Deck Officer (FDO) Gestures
Authors: Deniz T. Sodiri, Venkat V S S Sastry
Abstract:
The paper presents an on-line recognition machine (RM) for continuous/isolated, dynamic and static gestures that arise in Flight Deck Officer (FDO) training. RM is based on generic pattern recognition framework. Gestures are represented as templates using summary statistics. The proposed recognition algorithm exploits temporal and spatial characteristics of gestures via dynamic programming and Markovian process. The algorithm predicts corresponding index of incremental input data in the templates in an on-line mode. Accumulated consistency in the sequence of prediction provides a similarity measurement (Score) between input data and the templates. The algorithm provides an intuitive mechanism for automatic detection of start/end frames of continuous gestures. In the present paper, we consider isolated gestures. The performance of RM is evaluated using four datasets - artificial (W TTest), hand motion (Yang) and FDO (tracker, vision-based ). RM achieves comparable results which are in agreement with other on-line and off-line algorithms such as hidden Markov model (HMM) and dynamic time warping (DTW). The proposed algorithm has the additional advantage of providing timely feedback for training purposes.Keywords: On-line Recognition Algorithm, IsolatedDynamic/Static Gesture Recognition, On-line Markovian/DynamicProgramming, Training in Virtual Environments.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14637757 Feature Based Unsupervised Intrusion Detection
Authors: Deeman Yousif Mahmood, Mohammed Abdullah Hussein
Abstract:
The goal of a network-based intrusion detection system is to classify activities of network traffics into two major categories: normal and attack (intrusive) activities. Nowadays, data mining and machine learning plays an important role in many sciences; including intrusion detection system (IDS) using both supervised and unsupervised techniques. However, one of the essential steps of data mining is feature selection that helps in improving the efficiency, performance and prediction rate of proposed approach. This paper applies unsupervised K-means clustering algorithm with information gain (IG) for feature selection and reduction to build a network intrusion detection system. For our experimental analysis, we have used the new NSL-KDD dataset, which is a modified dataset for KDDCup 1999 intrusion detection benchmark dataset. With a split of 60.0% for the training set and the remainder for the testing set, a 2 class classifications have been implemented (Normal, Attack). Weka framework which is a java based open source software consists of a collection of machine learning algorithms for data mining tasks has been used in the testing process. The experimental results show that the proposed approach is very accurate with low false positive rate and high true positive rate and it takes less learning time in comparison with using the full features of the dataset with the same algorithm.
Keywords: Information Gain (IG), Intrusion Detection System (IDS), K-means Clustering, Weka.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 27767756 Forest Risk and Vulnerability Assessment: A Case Study from East Bokaro Coal Mining Area in India
Authors: Sujata Upgupta, Prasoon Kumar Singh
Abstract:
The expansion of large scale coal mining into forest areas is a potential hazard for the local biodiversity and wildlife. The objective of this study is to provide a picture of the threat that coal mining poses to the forests of the East Bokaro landscape. The vulnerable forest areas at risk have been assessed and the priority areas for conservation have been presented. The forested areas at risk in the current scenario have been assessed and compared with the past conditions using classification and buffer based overlay approach. Forest vulnerability has been assessed using an analytical framework based on systematic indicators and composite vulnerability index values. The results indicate that more than 4 km2 of forests have been lost from 1973 to 2016. Large patches of forests have been diverted for coal mining projects. Forests in the northern part of the coal field within 1-3 km radius around the coal mines are at immediate risk. The original contiguous forests have been converted into fragmented and degraded forest patches. Most of the collieries are located within or very close to the forests thus threatening the biodiversity and hydrology of the surrounding regions. Based on the vulnerability values estimated, it was concluded that more than 90% of the forested grids in East Bokaro are highly vulnerable to mining. The forests in the sub-districts of Bermo and Chandrapura have been identified as the most vulnerable to coal mining activities. This case study would add to the capacity of the forest managers and mine managers to address the risk and vulnerability of forests at a small landscape level in order to achieve sustainable development.
Keywords: Coal mining, forest, indicators, vulnerability.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11607755 A Semantic Recommendation Procedure for Electronic Product Catalog
Authors: Hadi Khosravi Farsani, Mohammadali Nematbakhsh
Abstract:
To overcome the product overload of Internet shoppers, we introduce a semantic recommendation procedure which is more efficient when applied to Internet shopping malls. The suggested procedure recommends the semantic products to the customers and is originally based on Web usage mining, product classification, association rule mining, and frequently purchasing. We applied the procedure to the data set of MovieLens Company for performance evaluation, and some experimental results are provided. The experimental results have shown superior performance in terms of coverage and precision.Keywords: Personalization, Recommendation, OWL Ontology, Electronic Catalogs, Classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19257754 Text-Mining Approach for Evaluation of Affective Management Practices
Authors: Masaaki Saito, Qin Tang, Hiroyuki Umemuro
Abstract:
The purpose of this paper is to propose a text mining approach to evaluate companies- practices on affective management. Affective management argues that it is critical to take stakeholders- affects into consideration during decision-making process, along with the traditional numerical and rational indices. CSR reports published by companies were collected as source information. Indices were proposed based on the frequency and collocation of words relevant to affective management concept using text mining approach to analyze the text information of CSR reports. In addition, the relationships between the results obtained using proposed indices and traditional indicators of business performance were investigated using correlation analysis. Those correlations were also compared between manufacturing and non-manufacturing companies. The results of this study revealed the possibility to evaluate affective management practices of companies based on publicly available text documents.Keywords: Affective management, Affect, Stakeholder, Text mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18457753 Hybrid Intelligent Intrusion Detection System
Authors: Norbik Bashah, Idris Bharanidharan Shanmugam, Abdul Manan Ahmed
Abstract:
Intrusion Detection Systems are increasingly a key part of systems defense. Various approaches to Intrusion Detection are currently being used, but they are relatively ineffective. Artificial Intelligence plays a driving role in security services. This paper proposes a dynamic model Intelligent Intrusion Detection System, based on specific AI approach for intrusion detection. The techniques that are being investigated includes neural networks and fuzzy logic with network profiling, that uses simple data mining techniques to process the network data. The proposed system is a hybrid system that combines anomaly, misuse and host based detection. Simple Fuzzy rules allow us to construct if-then rules that reflect common ways of describing security attacks. For host based intrusion detection we use neural-networks along with self organizing maps. Suspicious intrusions can be traced back to its original source path and any traffic from that particular source will be redirected back to them in future. Both network traffic and system audit data are used as inputs for both.Keywords: Intrusion Detection, Network Security, Data mining, Fuzzy Logic.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21327752 Automata Theory Approach for Solving Frequent Pattern Discovery Problems
Authors: Renáta Iváncsy, István Vajk
Abstract:
The various types of frequent pattern discovery problem, namely, the frequent itemset, sequence and graph mining problems are solved in different ways which are, however, in certain aspects similar. The main approach of discovering such patterns can be classified into two main classes, namely, in the class of the levelwise methods and in that of the database projection-based methods. The level-wise algorithms use in general clever indexing structures for discovering the patterns. In this paper a new approach is proposed for discovering frequent sequences and tree-like patterns efficiently that is based on the level-wise issue. Because the level-wise algorithms spend a lot of time for the subpattern testing problem, the new approach introduces the idea of using automaton theory to solve this problem.Keywords: Frequent pattern discovery, graph mining, pushdownautomaton, sequence mining, state machine, tree mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16287751 An Analysis on the Appropriateness and Effectiveness of CCTV Location for Crime Prevention
Authors: Tae-Heon Moon, Sun-Young Heo, Sang-Ho Lee, Youn-Taik Leem, Kwang-Woo Nam
Abstract:
This study aims to investigate the possibility of crime prevention through CCTV by analyzing the appropriateness of the CCTV location, whether it is installed in the hotspot of crime-prone areas, and exploring the crime prevention effect and transition effect. The real crime and CCTV locations of case city were converted into the spatial data by using GIS. The data was analyzed by hotspot analysis and weighted displacement quotient (WDQ). As study methods, it analyzed existing relevant studies for identifying the trends of CCTV and crime studies based on big data from 1800 to 2014 and understanding the relation between CCTV and crime. Second, it investigated the current situation of nationwide CCTVs and analyzed the guidelines of CCTV installation and operation to draw attention to the problems and indicating points of CCTV use. Third, it investigated the crime occurrence in case areas and the current situation of CCTV installation in the spatial aspects, and analyzed the appropriateness and effectiveness of CCTV installation to suggest a rational installation of CCTV and the strategic direction of crime prevention. The results demonstrate that there was no significant effect in the installation of CCTV on crime prevention in the case area. This indicates that CCTV should be installed and managed in a more scientific way reflecting local crime situations. In terms of CCTV, the methods of spatial analysis such as GIS, which can evaluate the installation effect, and the methods of economic analysis like cost-benefit analysis should be developed. In addition, these methods should be distributed to local governments across the nation for the appropriate installation of CCTV and operation. This study intended to find a design guideline of the optimum CCTV installation. In this regard, this study is meaningful in that it will contribute to the creation of a safe city.
Keywords: CCTV, Safe City, Crime Prevention, Spatial Analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 26847750 Spatial Distribution of Local Sheep Breeds in Antalya Province
Authors: Serife Gulden Yilmaz, Suleyman Karaman
Abstract:
Sheep breeding is important in terms of meeting both the demand of red meat consumption and the availability of industrial raw materials and the employment of the rural sector in Turkey. It is also very important to ensure the selection and continuity of the breeds that are raised in order to increase quality and productive products related to sheep breeding. The protection of local breeds and crossbreds also enables the development of the sector in the region and the reduction of imports. In this study, the data were obtained from the records of the Turkish Statistical Institute and Antalya Sheep & Goat Breeders' Association. Spatial distribution of sheep breeds in Antalya is reviewed statistically in terms of concentration at the local level for 2015 period spatially. For this reason; mapping, box plot, linear regression are used in this study. Concentration is introduced by means of studbook data on sheep breeding as locals and total sheep farm by mapping. It is observed that Pırlak breed (17.5%) and Merinos crossbreed (16.3%) have the highest concentration in the region. These breeds are respectively followed by Akkaraman breed (11%), Pirlak crossbreed (8%), Merinos breed (7.9%) Akkaraman crossbreed (7.9%) and Ivesi breed (7.2%).
Keywords: Antalya, sheep breeds, spatial distribution, local.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12317749 The Role of People and Data in Complex Spatial-Related Long-Term Decisions: A Case Study of Capital Project Management Groups
Authors: Peter Boyes, Sarah Sharples, Paul Tennent, Gary Priestnall, Jeremy Morley
Abstract:
Significant long-term investment projects can involve complex decisions. These are often described as capital projects and the factors that contribute to their complexity include budgets, motivating reasons for investment, stakeholder involvement, interdependent projects, and the delivery phases required. The complexity of these projects often requires management groups to be established involving stakeholder representatives, these teams are inherently multidisciplinary. This study uses two university campus capital projects as case studies for this type of management group. Due to the interaction of projects with wider campus infrastructure and users, decisions are made at varying spatial granularity throughout the project lifespan. This spatial-related context brings complexity to the group decisions. Sensemaking is the process used to achieve group situational awareness of a complex situation, enabling the team to arrive at a consensus and make a decision. The purpose of this study is to understand the role of people and data in complex spatial related long-term decision and sensemaking processes. The paper aims to identify and present issues experienced in practical settings of these types of decision. A series of exploratory semi-structured interviews with members of the two projects elicit an understanding of their operation. From two stages of thematic analysis, inductive and deductive, emergent themes are identified around the group structure, the data usage, and the decision making within these groups. When data were made available to the group, there were commonly issues with perception of veracity and validity of the data presented; this impacted the ability of the group to reach consensus and therefore for decision to be made. Similarly, there were different responses to forecasted or modelled data, shaped by the experience and occupation of the individuals within the multidisciplinary management group. This paper provides an understanding of further support required for team sensemaking and decision making in complex capital projects. The paper also discusses the barriers found to effective decision making in this setting and suggests opportunities to develop decision support systems in this team strategic decision-making process. Recommendations are made for further research into the sensemaking and decision-making process of this complex spatial-related setting.
Keywords: decision making, decisions under uncertainty, real decisions, sensemaking, spatial, team decision making
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4937748 An Approach to Concerns and Aspects Mining for Web Applications
Authors: Carlo Bellettini, Alessandro Marchetto, Andrea Trentini
Abstract:
Web applications have become very complex and crucial, especially when combined with areas such as CRM (Customer Relationship Management) and BPR (Business Process Reengineering), the scientific community has focused attention to Web applications design, development, analysis, and testing, by studying and proposing methodologies and tools. This paper proposes an approach to automatic multi-dimensional concern mining for Web Applications, based on concepts analysis, impact analysis, and token-based concern identification. This approach lets the user to analyse and traverse Web software relevant to a particular concern (concept, goal, purpose, etc.) via multi-dimensional separation of concerns, to document, understand and test Web applications. This technique was developed in the context of WAAT (Web Applications Analysis and Testing) project. A semi-automatic tool to support this technique is currently under development.Keywords: Aspect Mining, Concepts Analysis, Concerns Mining, Multi-Dimensional Separation of Concerns, Impact Analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15137747 Comparison of Artificial Neural Network and Multivariate Regression Methods in Prediction of Soil Cation Exchange Capacity
Authors: Ali Keshavarzi, Fereydoon Sarmadian
Abstract:
Investigation of soil properties like Cation Exchange Capacity (CEC) plays important roles in study of environmental reaserches as the spatial and temporal variability of this property have been led to development of indirect methods in estimation of this soil characteristic. Pedotransfer functions (PTFs) provide an alternative by estimating soil parameters from more readily available soil data. 70 soil samples were collected from different horizons of 15 soil profiles located in the Ziaran region, Qazvin province, Iran. Then, multivariate regression and neural network model (feedforward back propagation network) were employed to develop a pedotransfer function for predicting soil parameter using easily measurable characteristics of clay and organic carbon. The performance of the multivariate regression and neural network model was evaluated using a test data set. In order to evaluate the models, root mean square error (RMSE) was used. The value of RMSE and R2 derived by ANN model for CEC were 0.47 and 0.94 respectively, while these parameters for multivariate regression model were 0.65 and 0.88 respectively. Results showed that artificial neural network with seven neurons in hidden layer had better performance in predicting soil cation exchange capacity than multivariate regression.Keywords: Easily measurable characteristics, Feed-forwardback propagation, Pedotransfer functions, CEC.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22127746 Combining Fuzzy Logic and Data Miningto Predict the Result of an EIA Review
Authors: Kevin Fong-Rey Liu, Jia-Shen Chen, Han-Hsi Liang, Cheng-Wu Chen, Yung-Shuen Shen
Abstract:
The purpose of determining impact significance is to place value on impacts. Environmental impact assessment review is a process that judges whether impact significance is acceptable or not in accordance with the scientific facts regarding environmental, ecological and socio-economical impacts described in environmental impact statements (EIS) or environmental impact assessment reports (EIAR). The first aim of this paper is to summarize the criteria of significance evaluation from the past review results and accordingly utilize fuzzy logic to incorporate these criteria into scientific facts. The second aim is to employ data mining technique to construct an EIS or EIAR prediction model for reviewing results which can assist developers to prepare and revise better environmental management plans in advance. The validity of the previous prediction model proposed by authors in 2009 is 92.7%. The enhanced validity in this study can attain 100.0%.Keywords: Environmental impact assessment review, impactsignificance, fuzzy logic, data mining, classification tree.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19447745 Validation and Selection between Machine Learning Technique and Traditional Methods to Reduce Bullwhip Effects: a Data Mining Approach
Authors: Hamid R. S. Mojaveri, Seyed S. Mousavi, Mojtaba Heydar, Ahmad Aminian
Abstract:
The aim of this paper is to present a methodology in three steps to forecast supply chain demand. In first step, various data mining techniques are applied in order to prepare data for entering into forecasting models. In second step, the modeling step, an artificial neural network and support vector machine is presented after defining Mean Absolute Percentage Error index for measuring error. The structure of artificial neural network is selected based on previous researchers' results and in this article the accuracy of network is increased by using sensitivity analysis. The best forecast for classical forecasting methods (Moving Average, Exponential Smoothing, and Exponential Smoothing with Trend) is resulted based on prepared data and this forecast is compared with result of support vector machine and proposed artificial neural network. The results show that artificial neural network can forecast more precisely in comparison with other methods. Finally, forecasting methods' stability is analyzed by using raw data and even the effectiveness of clustering analysis is measured.Keywords: Artificial Neural Networks (ANN), bullwhip effect, demand forecasting, Support Vector Machine (SVM).
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20107744 Developing an Advanced Algorithm Capable of Classifying News, Articles and Other Textual Documents Using Text Mining Techniques
Authors: R. B. Knudsen, O. T. Rasmussen, R. A. Alphinas
Abstract:
The reason for conducting this research is to develop an algorithm that is capable of classifying news articles from the automobile industry, according to the competitive actions that they entail, with the use of Text Mining (TM) methods. It is needed to test how to properly preprocess the data for this research by preparing pipelines which fits each algorithm the best. The pipelines are tested along with nine different classification algorithms in the realm of regression, support vector machines, and neural networks. Preliminary testing for identifying the optimal pipelines and algorithms resulted in the selection of two algorithms with two different pipelines. The two algorithms are Logistic Regression (LR) and Artificial Neural Network (ANN). These algorithms are optimized further, where several parameters of each algorithm are tested. The best result is achieved with the ANN. The final model yields an accuracy of 0.79, a precision of 0.80, a recall of 0.78, and an F1 score of 0.76. By removing three of the classes that created noise, the final algorithm is capable of reaching an accuracy of 94%.
Keywords: Artificial neural network, competitive dynamics, logistic regression, text classification, text mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5367743 Main Cause of Children's Deaths in Indigenous Wayuu Community from Department of La Guajira: A Research Developed through Data Mining Use
Authors: Isaura Esther Solano Núñez, David Suarez
Abstract:
The main purpose of this research is to discover what causes death in children of the Wayuu community, and deeply analyze those results in order to take corrective measures to properly control infant mortality. We consider important to determine the reasons that are producing early death in this specific type of population, since they are the most vulnerable to high risk environmental conditions. In this way, the government, through competent authorities, may develop prevention policies and the right measures to avoid an increase of this tragic fact. The methodology used to develop this investigation is data mining, which consists in gaining and examining large amounts of data to produce new and valuable information. Through this technique it has been possible to determine that the child population is dying mostly from malnutrition. In short, this technique has been very useful to develop this study; it has allowed us to transform large amounts of information into a conclusive and important statement, which has made it easier to take appropriate steps to resolve a particular situation.
Keywords: Malnutrition, datamining, analytical, descriptive, population, wayuu, indigenous.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6967742 Dose due the Incorporation of Radionuclides Using Teeth as Bioindicators nearby Caetité Uranium Mines
Authors: Viviane S. Guimarães, Ícaro M. M. Brasil, Simara S. Campos, Roseli F. Gennari, Márcia R. P. Attie, Susana O. Souza.
Abstract:
Uranium mining and processing in Brazil occur in a northeastern area near to Caetité-BA. Several Non-Governmental Organizations claim that uranium mining in this region is a pollutant causing health risks to the local population,but those in charge of the complex extraction and production of“yellow cake" for generating fuel to the nuclear power plants reject these allegations. This study aimed at identifying potential problems caused by mining to the population of Caetité. In this, work,the concentrations of 238U, 232Th and 40K radioisotopes in the teeth of the Caetité population were determined by ICP-MS. Teeth are used as bioindicators of incorporated radionuclides. Cumulative radiation doses in the skeleton were also determined. The concentration values were below 0.008 ppm, and annual effective dose due to radioisotopes are below to the reference values. Therefore, it is not possible to state that the mining process in Caetité increases pollution or radiation exposure in a meaningful way.Keywords: bioindicators, radiation dose, radioisotopesincorporation, uranium.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 41127741 Improving Worm Detection with Artificial Neural Networks through Feature Selection and Temporal Analysis Techniques
Authors: Dima Stopel, Zvi Boger, Robert Moskovitch, Yuval Shahar, Yuval Elovici
Abstract:
Computer worm detection is commonly performed by antivirus software tools that rely on prior explicit knowledge of the worm-s code (detection based on code signatures). We present an approach for detection of the presence of computer worms based on Artificial Neural Networks (ANN) using the computer's behavioral measures. Identification of significant features, which describe the activity of a worm within a host, is commonly acquired from security experts. We suggest acquiring these features by applying feature selection methods. We compare three different feature selection techniques for the dimensionality reduction and identification of the most prominent features to capture efficiently the computer behavior in the context of worm activity. Additionally, we explore three different temporal representation techniques for the most prominent features. In order to evaluate the different techniques, several computers were infected with five different worms and 323 different features of the infected computers were measured. We evaluated each technique by preprocessing the dataset according to each one and training the ANN model with the preprocessed data. We then evaluated the ability of the model to detect the presence of a new computer worm, in particular, during heavy user activity on the infected computers.Keywords: Artificial Neural Networks, Feature Selection, Temporal Analysis, Worm Detection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17287740 Machine Learning Facing Behavioral Noise Problem in an Imbalanced Data Using One Side Behavioral Noise Reduction: Application to a Fraud Detection
Authors: Salma El Hajjami, Jamal Malki, Alain Bouju, Mohammed Berrada
Abstract:
With the expansion of machine learning and data mining in the context of Big Data analytics, the common problem that affects data is class imbalance. It refers to an imbalanced distribution of instances belonging to each class. This problem is present in many real world applications such as fraud detection, network intrusion detection, medical diagnostics, etc. In these cases, data instances labeled negatively are significantly more numerous than the instances labeled positively. When this difference is too large, the learning system may face difficulty when tackling this problem, since it is initially designed to work in relatively balanced class distribution scenarios. Another important problem, which usually accompanies these imbalanced data, is the overlapping instances between the two classes. It is commonly referred to as noise or overlapping data. In this article, we propose an approach called: One Side Behavioral Noise Reduction (OSBNR). This approach presents a way to deal with the problem of class imbalance in the presence of a high noise level. OSBNR is based on two steps. Firstly, a cluster analysis is applied to groups similar instances from the minority class into several behavior clusters. Secondly, we select and eliminate the instances of the majority class, considered as behavioral noise, which overlap with behavior clusters of the minority class. The results of experiments carried out on a representative public dataset confirm that the proposed approach is efficient for the treatment of class imbalances in the presence of noise.Keywords: Machine learning, Imbalanced data, Data mining, Big data.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11377739 Spatial Indeterminacy: Destabilization of Dichotomies in Modern and Contemporary Architecture
Authors: Adrian Lo
Abstract:
Since the advent of modern architecture, notions of free plan and transparency have proliferated well into current trends. The movement’s notion of a spatially homogeneous, open and limitless ‘free plan’ contrasts with the spatially heterogeneous ‘series of rooms’ defined by load bearing walls, which in turn triggered new notions of transparency created by vast expanses of glazed walls. Similarly, transparency was also dichotomized as something that was physical or optical, as well as something conceptual, akin to spatial organization. As opposed to merely accepting the duality and possible incompatibility of these dichotomies, this paper seeks to ask how can space be both literally and phenomenally transparent, as well as exhibit both homogeneous and heterogeneous qualities? This paper explores this potential destabilization or blurring of spatial phenomena by dissecting the transparent layers and volumes of a series of selected case studies to investigate how different architects have devised strategies of spatial ambiguity and interpenetration. Projects by Peter Eisenman, Sou Fujimoto, and SANAA will be discussed and analyzed to show how the superimposition of geometries and spaces achieve different conditions of layering, transparency, and interstitiality. Their particular buildings will be explored to reveal various innovative kinds of spatial interpenetration produced through the articulate relations of the elements of architecture, which challenge conventional perceptions of interior and exterior whereby visual homogeneity blurs with spatial heterogeneity. The results show how spatial conceptions such as interpenetration and transparency have the ability to subvert not only inside-outside dialectics, but could also produce multiple degrees of interiority within complex and indeterminate spatial dimensions in constant flux as well as present alternative forms of social interaction.
Keywords: interpenetration, literal and phenomenal transparency, spatial heterogeneity, visual homogeneity
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5337738 Educational Data Mining: The Case of Department of Mathematics and Computing in the Period 2009-2018
Authors: M. Sitoe, O. Zacarias
Abstract:
University education is influenced by several factors that range from the adoption of strategies to strengthen the whole process to the academic performance improvement of the students themselves. This work uses data mining techniques to develop a predictive model to identify students with a tendency to evasion and retention. To this end, a database of real students’ data from the Department of University Admission (DAU) and the Department of Mathematics and Informatics (DMI) was used. The data comprised 388 undergraduate students admitted in the years 2009 to 2014. The Weka tool was used for model building, using three different techniques, namely: K-nearest neighbor, random forest, and logistic regression. To allow for training on multiple train-test splits, a cross-validation approach was employed with a varying number of folds. To reduce bias variance and improve the performance of the models, ensemble methods of Bagging and Stacking were used. After comparing the results obtained by the three classifiers, Logistic Regression using Bagging with seven folds obtained the best performance, showing results above 90% in all evaluated metrics: accuracy, rate of true positives, and precision. Retention is the most common tendency.
Keywords: Evasion and retention, cross validation, bagging, stacking.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1217737 Rural – Urban Partnership for Balanced Spatial Development in Latvia
Authors: Zane Bulderberga
Abstract:
Spatial dimension in development planning is becoming more topical in 21st century as a result of changes in population structure. Sustainable spatial development focuses on identifying and using territorial advantages to foster the harmonized development of the entire country, reducing negative effects of population concentration, increasing availability and mobility. EU and national development planning documents state polycentrism as main tool for balance spatial development, including investment concentration in growth centres. If mutual cooperation of growth centres as well as urban–rural cooperation is not fostered, then territorial differences can deepen and create unbalanced development.
The aim of research: to evaluate the urban–rural interaction, elaborating spatial development scenarios in framework of Latvian regional policy. To perform the research monographic, comparison, abstract–logical method, synthesis and analysis will be used when studying the theoretical aspects of research aiming at collecting the ideas of scientists from different countries, concepts, regulations as well as to create meaningful scientific discussion. Hierarchy analysis process (AHP) will be used to state further scenarios of spatial development in Latvia.
Experts from various institutions recognized urban – rural interaction and co-operation as an essential tool for the development. The most important factors for balanced spatial development in Latvia are availability of public transportation and improvement of service availability. Evaluating the three alternative scenarios, it was concluded that the urban – rural partnership will ensure a balanced development in Latvian regions.
Keywords: Rural – urban interaction, rural – urban cooperation, spatial development, AHP.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 28327736 Timescape-Based Panoramic View for Historic Landmarks
Authors: H. Ali, A. Whitehead
Abstract:
Providing a panoramic view of famous landmarks around the world offers artistic and historic value for historians, tourists, and researchers. Exploring the history of famous landmarks by presenting a comprehensive view of a temporal panorama merged with geographical and historical information presents a unique challenge of dealing with images that span a long period, from the 1800’s up to the present. This work presents the concept of temporal panorama through a timeline display of aligned historic and modern images for many famous landmarks. Utilization of this panorama requires a collection of hundreds of thousands of landmark images from the Internet comprised of historic images and modern images of the digital age. These images have to be classified for subset selection to keep the more suitable images that chronologically document a landmark’s history. Processing of historic images captured using older analog technology under various different capturing conditions represents a big challenge when they have to be used with modern digital images. Successful processing of historic images to prepare them for next steps of temporal panorama creation represents an active contribution in cultural heritage preservation through the fulfillment of one of UNESCO goals in preservation and displaying famous worldwide landmarks.
Keywords: Cultural heritage, image registration, image subset selection, registered image similarity, temporal panorama, timescapes.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10517735 Renewed Urban Waterfront: Spatial Conditions of a Contemporary Urban Space Typology
Authors: Beate Niemann, Fabian Pramel
Abstract:
The formerly industrially or militarily used Urban Waterfront is a potential area for urban development. Extensive interventions in the urban space come along with the development of these previously inaccessible areas in the city. The development of the Urban Waterfront in the European City is not subject to any recognizable urban paradigm. In this study, the development of the Urban Waterfront as a new urban space typology is analyzed by case studies of Urban Waterfront developments in European Cities. For humans, perceptible spatial conditions are categorized and it is identified whether the themed Urban Waterfront Developments are congruent or incongruent urban design interventions and which deviations the Urban Waterfront itself induce. As congruent urban design, a design is understood, which fits in the urban fabric regarding its similar spatial conditions to the surrounding. Incongruent urban design, however, shows significantly different conditions in its shape. Finally, the spatial relationship of the themed Urban Waterfront developments and their associated environment are compared in order to identify contrasts between new and old urban space. In this way, conclusions about urban design paradigms of the new urban space typology are tried to be drawn.
Keywords: Composition, congruence, identity, paradigm, spatial condition, urban design, urban development, urban waterfront.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21387734 Redesigning Business Processes: A Method Based on Simulation and Process Mining Techniques
Authors: Zahra Mohammadnazari, Fateme Rostambeygi, Fatemeh Dehrouyeh, Hwang Ki-Soon, Amir Aghsami
Abstract:
Corporations have always prioritized efforts to examine and improve processes. Various metrics, such as the cost and time required to implement the process and can be specified in this regard. Process improvement can be defined as an improvement of these indicators. This is accomplished by looking at prospective adjustments to the current executive process model or the resources allotted to it. Research has been conducted in this paper to the improve the procurement process and aims to explore assessment prospects in the project using a combination of process mining and simulation (benefiting from Play-In and Play-Out methodologies). To run the simulation, we will need to complete the control flow diagram, institution settings, resource settings, and activity settings. The process of mining event logs yields the process control flow. However, both the entry of institutions and the distribution of resources must be modeled. The rate of admission of institutions and the distribution of time for the implementation of activities will be determined in the next step.
Keywords: Business reengineering, Petri net, process-based simulation, process mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 485