Search results for: multidimensional process mining
5630 Performance Comparison of ADTree and Naive Bayes Algorithms for Spam Filtering
Authors: Thanh Nguyen, Andrei Doncescu, Pierre Siegel
Abstract:
Classification is an important data mining technique and could be used as data filtering in artificial intelligence. The broad application of classification for all kind of data leads to be used in nearly every field of our modern life. Classification helps us to put together different items according to the feature items decided as interesting and useful. In this paper, we compare two classification methods Naïve Bayes and ADTree use to detect spam e-mail. This choice is motivated by the fact that Naive Bayes algorithm is based on probability calculus while ADTree algorithm is based on decision tree. The parameter settings of the above classifiers use the maximization of true positive rate and minimization of false positive rate. The experiment results present classification accuracy and cost analysis in view of optimal classifier choice for Spam Detection. It is point out the number of attributes to obtain a tradeoff between number of them and the classification accuracy.Keywords: Classification, data mining, spam filtering, naive Bayes, decision tree.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15025629 Software Engineering Interoperable Environment for University Process Workflow and Document Management
Authors: Bekim Fetaji, Majlinda Fetaji, Mirlinda Ebibi
Abstract:
The objective of the research was focused on the design, development and evaluation of a sustainable web based network system to be used as an interoperable environment for University process workflow and document management. In this manner the most of the process workflows in Universities can be entirely realized electronically and promote integrated University. Definition of the most used University process workflows enabled creating electronic workflows and their execution on standard workflow execution engines. Definition or reengineering of workflows provided increased work efficiency and helped in having standardized process through different faculties. The concept and the process definition as well as the solution applied as Case study are evaluated and findings are reported.Keywords: design process workflows, workflow and documentmanagement, Business Process, software engineering
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13325628 The Robust Clustering with Reduction Dimension
Authors: Dyah E. Herwindiati
Abstract:
A clustering is process to identify a homogeneous groups of object called as cluster. Clustering is one interesting topic on data mining. A group or class behaves similarly characteristics. This paper discusses a robust clustering process for data images with two reduction dimension approaches; i.e. the two dimensional principal component analysis (2DPCA) and principal component analysis (PCA). A standard approach to overcome this problem is dimension reduction, which transforms a high-dimensional data into a lower-dimensional space with limited loss of information. One of the most common forms of dimensionality reduction is the principal components analysis (PCA). The 2DPCA is often called a variant of principal component (PCA), the image matrices were directly treated as 2D matrices; they do not need to be transformed into a vector so that the covariance matrix of image can be constructed directly using the original image matrices. The decomposed classical covariance matrix is very sensitive to outlying observations. The objective of paper is to compare the performance of robust minimizing vector variance (MVV) in the two dimensional projection PCA (2DPCA) and the PCA for clustering on an arbitrary data image when outliers are hiden in the data set. The simulation aspects of robustness and the illustration of clustering images are discussed in the end of paperKeywords: Breakdown point, Consistency, 2DPCA, PCA, Outlier, Vector Variance
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17005627 Spatial Analysis and Statistics for Zoning of Urban Areas
Authors: Benedetto Manganelli, Beniamino Murgante
Abstract:
The use of statistical data and of the neural networks, capable of elaborate a series of data and territorial info, have allowed the making of a model useful in the subdivision of urban places into homogeneous zone under the profile of a social, real estate, environmental and urbanist background of a city. The development of homogeneous zone has fiscal and urbanist advantages. The tools in the model proposed, able to be adapted to the dynamic changes of the city, allow the application of the zoning fast and dynamic.
Keywords: Homogeneous Urban Areas, Multidimensional Scaling, Neural Network, Real Estate Market, Urban Planning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19395626 Acute Coronary Syndrome Prediction Using Data Mining Techniques- An Application
Authors: Tahseen A. Jilani, Huda Yasin, Madiha Yasin, C. Ardil
Abstract:
In this paper we use data mining techniques to investigate factors that contribute significantly to enhancing the risk of acute coronary syndrome. We assume that the dependent variable is diagnosis – with dichotomous values showing presence or absence of disease. We have applied binary regression to the factors affecting the dependent variable. The data set has been taken from two different cardiac hospitals of Karachi, Pakistan. We have total sixteen variables out of which one is assumed dependent and other 15 are independent variables. For better performance of the regression model in predicting acute coronary syndrome, data reduction techniques like principle component analysis is applied. Based on results of data reduction, we have considered only 14 out of sixteen factors.
Keywords: Acute coronary syndrome (ACS), binary logistic regression analyses, myocardial ischemia (MI), principle component analysis, unstable angina (U.A.).
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21175625 Data Mining Determination of Sunlight Average Input for Solar Power Plant
Authors: Fl. Loury, P. Sablonière, C. Lamoureux, G. Magnier, Th. Gutierrez
Abstract:
A method is proposed to extract faithful representative patterns from data set of observations when they are suffering from non-negligible fluctuations. Supposing time interval between measurements to be extremely small compared to observation time, it consists in defining first a subset of intermediate time intervals characterizing coherent behavior. Data projection on these intervals gives a set of curves out of which an ideally “perfect” one is constructed by taking the sup limit of them. Then comparison with average real curve in corresponding interval gives an efficiency parameter expressing the degradation consecutive to fluctuation effect. The method is applied to sunlight data collected in a specific place, where ideal sunlight is the one resulting from direct exposure at location latitude over the year, and efficiency is resulting from action of meteorological parameters, mainly cloudiness, at different periods of the year. The extracted information already gives interesting element of decision, before being used for analysis of plant control.
Keywords: Base Input Reconstruction, Data Mining, Efficiency Factor, Information Pattern Operator.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15335624 The Sequestration of Heavy Metals Contaminating the Wonderfonteinspruit Catchment Area using Natural Zeolite
Authors: P.P. Diale, S.S.L. Mkhize, E. Muzenda, J. Zimba
Abstract:
For more than 120 years, gold mining formed the backbone the South Africa-s economy. The consequence of mine closure was observed in large-scale land degradation and widespread pollution of surface water and groundwater. This paper investigates the feasibility of using natural zeolite in removing heavy metals contaminating the Wonderfonteinspruit Catchment Area (WCA), a water stream with high levels of heavy metals and radionuclide pollution. Batch experiments were conducted to study the adsorption behavior of natural zeolite with respect to Fe2+, Mn2+, Ni2+, and Zn2+. The data was analysed using the Langmuir and Freudlich isotherms. Langmuir was found to correlate the adsorption of Fe2+, Mn2+, Ni2+, and Zn2+ better, with the adsorption capacity of 11.9 mg/g, 1.2 mg/g, 1.3 mg/g, and 14.7 mg/g, respectively. Two kinetic models namely, pseudo-first order and pseudo second order were also tested to fit the data. Pseudo-second order equation was found to be the best fit for the adsorption of heavy metals by natural zeolite. Zeolite functionalization with humic acid increased its uptake ability.Keywords: gold-mining, natural zeolites, water pollution, WestRand.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25305623 Determining Cluster Boundaries Using Particle Swarm Optimization
Authors: Anurag Sharma, Christian W. Omlin
Abstract:
Self-organizing map (SOM) is a well known data reduction technique used in data mining. Data visualization can reveal structure in data sets that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOMs, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of a generic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOMs. The application of our method to unlabeled call data for a mobile phone operator demonstrates its feasibility. PSO algorithm utilizes U-matrix of SOMs to determine cluster boundaries; the results of this novel automatic method correspond well to boundary detection through visual inspection of code vectors and k-means algorithm.
Keywords: Particle swarm optimization, self-organizing maps, clustering, data mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17225622 Application Methodology for the Generation of 3D Thermal Models Using UAV Photogrammety and Dual Sensors for Mining/Industrial Facilities Inspection
Authors: Javier Sedano-Cibrián, Julio Manuel de Luis-Ruiz, Rubén Pérez-Álvarez, Raúl Pereda-García, Beatriz Malagón-Picón
Abstract:
Structural inspection activities are necessary to ensure the correct functioning of infrastructures. UAV techniques have become more popular than traditional techniques. Specifically, UAV Photogrammetry allows time and cost savings. The development of this technology has permitted the use of low-cost thermal sensors in UAVs. The representation of 3D thermal models with this type of equipment is in continuous evolution. The direct processing of thermal images usually leads to errors and inaccurate results. In this paper, a methodology is proposed for the generation of 3D thermal models using dual sensors, which involves the application of RGB and thermal images in parallel. Hence, the RGB images are used as the basis for the generation of the model geometry, and the thermal images are the source of the surface temperature information that is projected onto the model. Mining/industrial facilities representations that are obtained can be used for inspection activities.
Keywords: Aerial thermography, data processing, drone, low-cost, point cloud.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3475621 Control-flow Complexity Measurement of Processes and Weyuker's Properties
Authors: Jorge Cardoso
Abstract:
Process measurement is the task of empirically and objectively assigning numbers to the properties of business processes in such a way as to describe them. Desirable attributes to study and measure include complexity, cost, maintainability, and reliability. In our work we will focus on investigating process complexity. We define process complexity as the degree to which a business process is difficult to analyze, understand or explain. One way to analyze a process- complexity is to use a process control-flow complexity measure. In this paper, an attempt has been made to evaluate the control-flow complexity measure in terms of Weyuker-s properties. Weyuker-s properties must be satisfied by any complexity measure to qualify as a good and comprehensive one.
Keywords: Business process measurement, workflow, complexity.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 26995620 Optimal Performance of Plastic Extrusion Process Using Fuzzy Goal Programming
Authors: Abbas Al-Refaie
Abstract:
This study optimized the performance of plastic extrusion process of drip irrigation pipes using fuzzy goal programming. Two main responses were of main interest; roll thickness and hardness. Four main process factors were studied. The L18 array was then used for experimental design. The individual-moving range control charts were used to assess the stability of the process, while the process capability index was used to assess process performance. Confirmation experiments were conducted at the obtained combination of optimal factor setting by fuzzy goal programming. The results revealed that process capability was improved significantly from -1.129 to 0.8148 for roll thickness and from 0.0965 to 0.714 and hardness. Such improvement results in considerable savings in production and quality costs.
Keywords: Fuzzy goal programming, extrusion process, process capability, irrigation plastic pipes.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9065619 Main Cause of Children's Deaths in Indigenous Wayuu Community from Department of La Guajira: A Research Developed through Data Mining Use
Authors: Isaura Esther Solano Núñez, David Suarez
Abstract:
The main purpose of this research is to discover what causes death in children of the Wayuu community, and deeply analyze those results in order to take corrective measures to properly control infant mortality. We consider important to determine the reasons that are producing early death in this specific type of population, since they are the most vulnerable to high risk environmental conditions. In this way, the government, through competent authorities, may develop prevention policies and the right measures to avoid an increase of this tragic fact. The methodology used to develop this investigation is data mining, which consists in gaining and examining large amounts of data to produce new and valuable information. Through this technique it has been possible to determine that the child population is dying mostly from malnutrition. In short, this technique has been very useful to develop this study; it has allowed us to transform large amounts of information into a conclusive and important statement, which has made it easier to take appropriate steps to resolve a particular situation.
Keywords: Malnutrition, datamining, analytical, descriptive, population, wayuu, indigenous.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7005618 Information Gain Ratio Based Clustering for Investigation of Environmental Parameters Effects on Human Mental Performance
Authors: H. Mehdi, Kh. S. Karimov, A. A. Kavokin
Abstract:
Methods of clustering which were developed in the data mining theory can be successfully applied to the investigation of different kinds of dependencies between the conditions of environment and human activities. It is known, that environmental parameters such as temperature, relative humidity, atmospheric pressure and illumination have significant effects on the human mental performance. To investigate these parameters effect, data mining technique of clustering using entropy and Information Gain Ratio (IGR) K(Y/X) = (H(X)–H(Y/X))/H(Y) is used, where H(Y)=-ΣPi ln(Pi). This technique allows adjusting the boundaries of clusters. It is shown that the information gain ratio (IGR) grows monotonically and simultaneously with degree of connectivity between two variables. This approach has some preferences if compared, for example, with correlation analysis due to relatively smaller sensitivity to shape of functional dependencies. Variant of an algorithm to implement the proposed method with some analysis of above problem of environmental effects is also presented. It was shown that proposed method converges with finite number of steps.Keywords: Clustering, Correlation analysis, EnvironmentalParameters, Information Gain Ratio, Mental Performance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18265617 Finding an Optimized Discriminate Function for Internet Application Recognition
Authors: E. Khorram, S.M. Mirzababaei
Abstract:
Everyday the usages of the Internet increase and simply a world of the data become accessible. Network providers do not want to let the provided services to be used in harmful or terrorist affairs, so they used a variety of methods to protect the special regions from the harmful data. One of the most important methods is supposed to be the firewall. Firewall stops the transfer of such packets through several ways, but in some cases they do not use firewall because of its blind packet stopping, high process power needed and expensive prices. Here we have proposed a method to find a discriminate function to distinguish between usual packets and harmful ones by the statistical processing on the network router logs. So an administrator can alarm to the user. This method is very fast and can be used simply in adjacent with the Internet routers.
Keywords: Data Mining, Firewall, Optimization, Packetclassification, Statistical Pattern Recognition.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14125616 Delineating Concern Ground in Block Caving – Underground Mine Using Ground Penetrating Radar
Authors: Eric Sitorus, Septian Prahastudhi, Turgod Nainggolan, Erwin Riyanto
Abstract:
Mining by block or panel caving is a mining method that takes advantage of fractures within an ore body, coupled with gravity, to extract material from a predetermined column of ore. The caving column is weakened from beneath through the use of undercutting, after which the ore breaks up and is extracted from below in a continuous cycle. The nature of this method induces cyclical stresses on the pillars of excavations as stress is built up and released over time, which has a detrimental effect on both the installed ground support and the rock mass itself. Ground support capacity, especially on the production where excavation void ratio is highest, is subjected to heavy loading. Strain above threshold of the elongation of support capacity can yield resulting in damage to excavations. Geotechnical engineers must evaluate not only the remnant capacity of ground support systems but also investigate depth of rock mass yield within pillars, backs and floors. Ground Penetrating Radar (GPR) is a geophysical method that has the ability to evaluate rock mass damage using electromagnetic waves. This paper illustrates a case study from the Grasberg mining complex where non-invasive information on the depth of damage and condition of the remaining rock mass was required. GPR with 100 MHz antenna resolution was used to obtain images of the subsurface to determine rehabilitation requirements prior to recommencing production activities. The GPR surveys were used to calibrate the reflection coefficient response of varying rock mass conditions to known Rock Quality Designation (RQD) parameters observed at the mine. The calibrated GPR survey allowed site engineers to map subsurface conditions and plan rehabilitation accordingly.
Keywords: Block caving, ground penetrating radar, reflectivity, RQD.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6745615 Optimization of the Transfer Molding Process by Implementation of Online Monitoring Techniques for Electronic Packages
Authors: Burcu Kaya, Jan-Martin Kaiser, Karl-Friedrich Becker, Tanja Braun, Klaus-Dieter Lang
Abstract:
Quality of the molded packages is strongly influenced by the process parameters of the transfer molding. To achieve a better package quality and a stable transfer molding process, it is necessary to understand the influence of the process parameters on the package quality. This work aims to comprehend the relationship between the process parameters, and to identify the optimum process parameters for the transfer molding process in order to achieve less voids and wire sweep. To achieve this, a DoE is executed for process optimization and a regression analysis is carried out. A systematic approach is represented to generate models which enable an estimation of the number of voids and wire sweep. Validation experiments are conducted to verify the model and the results are presented.Keywords: Epoxy molding compounds, optimization, regression analysis, transfer molding process, voids, wire sweep.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15335614 Influence of Dynamic Loads in the Structural Integrity of Underground Rooms
Authors: M. Inmaculada Alvarez-Fernández, Celestino González-Nicieza, M. Belén Prendes-Gero, Fernando López-Gayarre
Abstract:
Among many factors affecting the stability of mining excavations, rock-bursts and tremors play a special role. These dynamic loads occur practically always and have different sources of generation. The most important of them is the commonly used mining technique, which disintegrates a certain area of the rock mass not only in the area of the planned mining, but also creates waves that significantly exceed this area affecting the structural elements. In this work it is analysed the consequences of dynamic loads over the structural elements in an underground room and pillar mine to avoid roof instabilities. With this end, dynamic loads were evaluated through in situ and laboratory tests and simulated with numerical modelling. Initially, the geotechnical characterization of all materials was carried out by mean of large-scale tests. Then, drill holes were done on the roof of the mine and were monitored to determine possible discontinuities in it. Three seismic stations and a triaxial accelerometer were employed to measure the vibrations from blasting tests, establish the dynamic behaviour of roof and pillars and develop the transmission laws. At last, computer simulations by FLAC3D software were done to check the effect of vibrations on the stability of the roofs. The study shows that in-situ tests have a greater reliability than laboratory samples because of eliminating the effect of heterogeneities, that the pillars work decreasing the amplitude of the vibration around them, and that the tensile strength of a beam and depending on its span is overcome with waves in phase and delayed. The obtained transmission law allows designing a blasting which guarantees safety and prevents the risk of future failures.
Keywords: Dynamic modelling, long term instability risks, room and pillar, seismic collapse.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4865613 Three-Stage Mining Metals Supply Chain Coordination and Product Quality Improvement with Revenue Sharing Contract
Authors: Hamed Homaei, Iraj Mahdavi, Ali Tajdin
Abstract:
One of the main concerns of miners is to increase the quality level of their products because the mining metals price depends on their quality level; however, increasing the quality level of these products has different costs at different levels of the supply chain. These costs usually increase after extractor level. This paper studies the coordination issue of a decentralized three-level supply chain with one supplier (extractor), one mineral processor and one manufacturer in which the increasing product quality level cost at the processor level is higher than the supplier and at the level of the manufacturer is more than the processor. We identify the optimal product quality level for each supply chain member by designing a revenue sharing contract. Finally, numerical examples show that the designed contract not only increases the final product quality level but also provides a win-win condition for all supply chain members and increases the whole supply chain profit.
Keywords: Three-stage supply chain, product quality improvement, channel coordination, revenue sharing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11085612 Agile Methodology for Modeling and Design of Data Warehouses -AM4DW-
Authors: Nieto Bernal Wilson, Carmona Suarez Edgar
Abstract:
The organizations have structured and unstructured information in different formats, sources, and systems. Part of these come from ERP under OLTP processing that support the information system, however these organizations in OLAP processing level, presented some deficiencies, part of this problematic lies in that does not exist interesting into extract knowledge from their data sources, as also the absence of operational capabilities to tackle with these kind of projects. Data Warehouse and its applications are considered as non-proprietary tools, which are of great interest to business intelligence, since they are repositories basis for creating models or patterns (behavior of customers, suppliers, products, social networks and genomics) and facilitate corporate decision making and research. The following paper present a structured methodology, simple, inspired from the agile development models as Scrum, XP and AUP. Also the models object relational, spatial data models, and the base line of data modeling under UML and Big data, from this way sought to deliver an agile methodology for the developing of data warehouses, simple and of easy application. The methodology naturally take into account the application of process for the respectively information analysis, visualization and data mining, particularly for patterns generation and derived models from the objects facts structured.
Keywords: Data warehouse, model data, big data, object fact, object relational fact, process developed data warehouse.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14805611 Performance Evaluation of Data Mining Techniques for Predicting Software Reliability
Authors: Pradeep Kumar, Abdul Wahid
Abstract:
Accurate software reliability prediction not only enables developers to improve the quality of software but also provides useful information to help them for planning valuable resources. This paper examines the performance of three well-known data mining techniques (CART, TreeNet and Random Forest) for predicting software reliability. We evaluate and compare the performance of proposed models with Cascade Correlation Neural Network (CCNN) using sixteen empirical databases from the Data and Analysis Center for Software. The goal of our study is to help project managers to concentrate their testing efforts to minimize the software failures in order to improve the reliability of the software systems. Two performance measures, Normalized Root Mean Squared Error (NRMSE) and Mean Absolute Errors (MAE), illustrate that CART model is accurate than the models predicted using Random Forest, TreeNet and CCNN in all datasets used in our study. Finally, we conclude that such methods can help in reliability prediction using real-life failure datasets.
Keywords: Classification, Cascade Correlation Neural Network, Random Forest, Software reliability, TreeNet.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18415610 Software Engineering Inspired Cost Estimation for Process Modelling
Authors: Felix Baumann, Aleksandar Milutinovic, Dieter Roller
Abstract:
Up to this point business process management projects in general and business process modelling projects in particular could not rely on a practical and scientifically validated method to estimate cost and effort. Especially the model development phase is not covered by a cost estimation method or model. Further phases of business process modelling starting with implementation are covered by initial solutions which are discussed in the literature. This article proposes a method of filling this gap by deriving a cost estimation method from available methods in similar domains namely software development or software engineering. Software development is regarded as closely similar to process modelling as we show. After the proposition of this method different ideas for further analysis and validation of the method are proposed. We derive this method from COCOMO II and Function Point which are established methods of effort estimation in the domain of software development. For this we lay out similarities of the software development process and the process of process modelling which is a phase of the Business Process Management life-cycle.Keywords: Cost Estimation, Effort Estimation, Process Modelling, Business Process Management, COCOMO.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22965609 Operational risks Classification for Information Systems with Service-Oriented Architecture (Including Loss Calculation Example)
Authors: Irina Pyrlina
Abstract:
This article presents the results of a study conducted to identify operational risks for information systems (IS) with service-oriented architecture (SOA). Analysis of current approaches to risk and system error classifications revealed that the system error classes were never used for SOA risk estimation. Additionally system error classes are not normallyexperimentally supported with realenterprise error data. Through the study several categories of various existing error classifications systems are applied and three new error categories with sub-categories are identified. As a part of operational risks a new error classification scheme is proposed for SOA applications. It is based on errors of real information systems which are service providers for application with service-oriented architecture. The proposed classification approach has been used to classify SOA system errors for two different enterprises (oil and gas industry, metal and mining industry). In addition we have conducted a research to identify possible losses from operational risks.
Keywords: Enterprise architecture, Error classification, Oil&Gas and Metal&Mining industries, Operational risks, Serviceoriented architecture
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16065608 Destination Port Detection for Vessels: An Analytic Tool for Optimizing Port Authorities Resources
Authors: Lubna Eljabu, Mohammad Etemad, Stan Matwin
Abstract:
Port authorities have many challenges in congested ports to allocate their resources to provide a safe and secure loading/unloading procedure for cargo vessels. Selecting a destination port is the decision of a vessel master based on many factors such as weather, wavelength and changes of priorities. Having access to a tool which leverages Automatic Identification System (AIS) messages to monitor vessel’s movements and accurately predict their next destination port promotes an effective resource allocation process for port authorities. In this research, we propose a method, namely, Reference Route of Trajectory (RRoT) to assist port authorities in predicting inflow and outflow traffic in their local environment by monitoring AIS messages. Our RRo method creates a reference route based on historical AIS messages. It utilizes some of the best trajectory similarity measures to identify the destination of a vessel using their recent movement. We evaluated five different similarity measures such as Discrete Frechet Distance (DFD), Dynamic Time ´ Warping (DTW), Partial Curve Mapping (PCM), Area between two curves (Area) and Curve length (CL). Our experiments show that our method identifies the destination port with an accuracy of 98.97% and an f-measure of 99.08% using Dynamic Time Warping (DTW) similarity measure.
Keywords: Spatial temporal data mining, trajectory mining, trajectory similarity, resource optimization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7055607 Join and Meet Block Based Default Definite Decision Rule Mining from IDT and an Incremental Algorithm
Authors: Chen Wu, Jingyu Yang
Abstract:
Using maximal consistent blocks of tolerance relation on the universe in incomplete decision table, the concepts of join block and meet block are introduced and studied. Including tolerance class, other blocks such as tolerant kernel and compatible kernel of an object are also discussed at the same time. Upper and lower approximations based on those blocks are also defined. Default definite decision rules acquired from incomplete decision table are proposed in the paper. An incremental algorithm to update default definite decision rules is suggested for effective mining tasks from incomplete decision table into which data is appended. Through an example, we demonstrate how default definite decision rules based on maximal consistent blocks, join blocks and meet blocks are acquired and how optimization is done in support of discernibility matrix and discernibility function in the incomplete decision table.Keywords: rough set, incomplete decision table, maximalconsistent block, default definite decision rule, join and meet block.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12905606 Performance Comparison of Particle Swarm Optimization with Traditional Clustering Algorithms used in Self-Organizing Map
Authors: Anurag Sharma, Christian W. Omlin
Abstract:
Self-organizing map (SOM) is a well known data reduction technique used in data mining. It can reveal structure in data sets through data visualization that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOM, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of an adaptive heuristic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOM. The application of our method to several standard data sets demonstrates its feasibility. PSO algorithm utilizes a so-called U-matrix of SOM to determine cluster boundaries; the results of this novel automatic method compare very favorably to boundary detection through traditional algorithms namely k-means and hierarchical based approach which are normally used to interpret the output of SOM.Keywords: cluster boundaries, clustering, code vectors, data mining, particle swarm optimization, self-organizing maps, U-matrix.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19125605 Approximation to the Hardy Operator on Topological Measure Spaces
Authors: Kairat T. Mynbaev, Elena N. Lomakina
Abstract:
We consider a Hardy type operator generated by a family of open subsets of a Hausdorff topological space. The family is indexed with non-negative real numbers and is totally ordered. For this operator, we obtain two-sided bounds of its norm, a compactness criterion and bounds for its approximation numbers. Previously bounds for its approximation numbers have been established only in the one-dimensional case, while we do not impose any restrictions on the dimension of the Hausdorff space. The bounds for the norm and conditions for compactness have been found earlier but our approach is different in that we use domain partitions for all problems under consideration.
Keywords: Approximation numbers, boundedness and compactness, multidimensional Hardy operator, Hausdorff topological space.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1335604 Accessible Business Process Modelling
Authors: D. D. Vaziri, D. DeOliveira
Abstract:
This article concerns with the accessibility of Business process modelling tools (BPMo tools) and business process modelling languages (BPMo languages). Therefore the reader will be introduced to business process management and the authors' motivation behind this inquiry. Afterwards, the paper will reflect problems when applying inaccessible BPMo tools. To illustrate these problems the authors distinguish between two different categories of issues and provide practical examples. Finally the article will present three approaches to improve the accessibility of BPMo tools and BPMo languages.Keywords: Accessibility, Business Process Management, BPM, Event Process Chains, Modelling Languages
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23495603 Predication Model for Leukemia Diseases Based on Data Mining Classification Algorithms with Best Accuracy
Authors: Fahd Sabry Esmail, M. Badr Senousy, Mohamed Ragaie
Abstract:
In recent years, there has been an explosion in the rate of using technology that help discovering the diseases. For example, DNA microarrays allow us for the first time to obtain a "global" view of the cell. It has great potential to provide accurate medical diagnosis, to help in finding the right treatment and cure for many diseases. Various classification algorithms can be applied on such micro-array datasets to devise methods that can predict the occurrence of Leukemia disease. In this study, we compared the classification accuracy and response time among eleven decision tree methods and six rule classifier methods using five performance criteria. The experiment results show that the performance of Random Tree is producing better result. Also it takes lowest time to build model in tree classifier. The classification rules algorithms such as nearest- neighbor-like algorithm (NNge) is the best algorithm due to the high accuracy and it takes lowest time to build model in classification.
Keywords: Data mining, classification techniques, decision tree, classification rule, leukemia diseases, microarray data.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25625602 Multimedia Data Fusion for Event Detection in Twitter by Using Dempster-Shafer Evidence Theory
Authors: Samar M. Alqhtani, Suhuai Luo, Brian Regan
Abstract:
Data fusion technology can be the best way to extract useful information from multiple sources of data. It has been widely applied in various applications. This paper presents a data fusion approach in multimedia data for event detection in twitter by using Dempster-Shafer evidence theory. The methodology applies a mining algorithm to detect the event. There are two types of data in the fusion. The first is features extracted from text by using the bag-ofwords method which is calculated using the term frequency-inverse document frequency (TF-IDF). The second is the visual features extracted by applying scale-invariant feature transform (SIFT). The Dempster - Shafer theory of evidence is applied in order to fuse the information from these two sources. Our experiments have indicated that comparing to the approaches using individual data source, the proposed data fusion approach can increase the prediction accuracy for event detection. The experimental result showed that the proposed method achieved a high accuracy of 0.97, comparing with 0.93 with texts only, and 0.86 with images only.Keywords: Data fusion, Dempster-Shafer theory, data mining, event detection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18015601 Unsupervised Texture Segmentation via Applying Geodesic Active Regions to Gaborian Feature Space
Authors: Yuan He, Yupin Luo, Dongcheng Hu
Abstract:
In this paper, we propose a novel variational method for unsupervised texture segmentation. We use a Gabor filter bank to extract texture features. Some of the filtered channels form a multidimensional Gaborian feature space. To avoid deforming contours directly in a vector-valued space we use a Gaussian mixture model to describe the statistical distribution of this space and get the boundary and region probabilities. Then a framework of geodesic active regions is applied based on them. In the end, experimental results are presented, and show that this method can obtain satisfied boundaries between different texture regions.
Keywords: Texture segmentation, Gabor filter, snakes, Geodesicactive regions
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1776