Search results for: data annotation models
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 9018

8868 Modeling of Random Variable with Digital Probability Hyper Digraph: Data-Oriented Approach

Authors: A. Habibizad Navin, M. Naghian Fesharaki, M. Mirnia, M. Kargar

Abstract:

In this paper, we introduce the Digital Probability Hyper Digraph, a hierarchical data-oriented model for representing a random variable.

Keywords: Data-Oriented Models, Data Structure, Digital Probability Hyper Digraph, Random Variable, Statistic and Probability.

Downloads: 1229
8867 CAD Based Predictive Models of the Undeformed Chip Geometry in Drilling

Authors: Panagiotis Kyratsis, Dr. Ing. Nikolaos Bilalis, Dr. Ing. Aristomenis Antoniadis

Abstract:

Twist drills are geometrically complex tools, and researchers have therefore adopted a variety of mathematical and experimental approaches to simulate them. The present paper acknowledges the increasing use of modern CAD systems and carries out drilling simulations using the API (Application Programming Interface) of a CAD system. The developed DRILL3D software routine creates parametrically controlled tool geometries and, using different cutting conditions, generates solid models for all the relevant data involved (drilling tool, cut workpiece, undeformed chip). The final data derived constitute a platform for further direct simulations regarding the determination of cutting forces, tool wear, drilling optimizations, etc.

Keywords: Drilling, CAD based simulation, 3D-modelling.

Downloads: 1840
8866 Using Historical Data for Stock Prediction of a Tech Company

Authors: Sofia Stoica

Abstract:

In this paper, we use historical data to predict the stock price of a tech company. To this end, we use a dataset consisting of the stock prices over the past five years of 10 major tech companies: Adobe, Amazon, Apple, Facebook, Google, Microsoft, Netflix, Oracle, Salesforce, and Tesla. We implemented and tested three models – a linear regressor model, a k-nearest neighbor model (KNN), and a sequential neural network – and two algorithms – Multiplicative Weight Update and AdaBoost. We found that the sequential neural network performed the best, with a testing error of 0.18%. Interestingly, the linear model performed the second best with a testing error of 0.73%. These results show that using historical data is enough to obtain high accuracies, and a simple algorithm like linear regression has a performance similar to more sophisticated models while taking less time and resources to implement.

Keywords: Finance, machine learning, opening price, stock market.

Downloads: 311
8865 Analyzing and Comparing the Hot-spot Thermal Models of HV/LV Prefabricated and Outdoor Oil-Immersed Power Transformers

Authors: Ali Mamizadeh, Ires Iskender

Abstract:

The most important parameter in transformer life expectancy is the hot-spot temperature level, which accelerates the rate of aging of the insulation. The aim of this paper is to present thermal models for transformers loaded at prefabricated MV/LV transformer substations and in outdoor situations. The hot-spot temperature of transformers is studied using their top-oil temperature rise models. The thermal models proposed for hot-spot and top-oil temperatures in different operating situations are compared. Since heat transfer differs between indoor and outdoor transformers because of their operating conditions, their hot-spot thermal models differ from each other. The proposed thermal models are verified against the results of experiments carried out on a typical 1600 kVA, 30/0.4 kV, ONAN transformer in both indoor and outdoor situations.

Keywords: Hot-spot Temperature, Dynamic Thermal Model, MV/LV Prefabricated, Oil Immersed Transformers

Downloads: 1473
8864 A Forecast Model for Projecting the Amount of Hazardous Waste

Authors: J. Vilgerts, L. Timma, D. Blumberga

Abstract:

The objective of the paper is to develop a forecast model for hazardous waste (HW) flows. The research methodology included 6 modules: historical data, assumptions, choice of indicators, data processing, data analysis with STATGRAPHICS, and forecast models. The proposed methodology was validated in a case study for Latvia. Hypotheses on the changes in HW for the period 2010-2020 have been developed and mathematically described with confidence levels of 95.0% and 50.0%. Sensitivity analysis for the analyzed scenarios was done. The results show that the growth of GDP affects the total amount of HW in the country. The total amount of HW is projected to be within a corridor from -27.7% in the optimistic scenario up to +87.8% in the pessimistic scenario, with a confidence level of 50.0%, for the period 2010-2020. The optimistic scenario was shown to be the least flexible to changes in GDP growth.

Keywords: Forecast models, hazardous waste management, sustainable development, waste management indicators.

Downloads: 1811
8863 Development of a Non-invasive System to Measure the Thickness of the Subcutaneous Adipose Tissue Layer for Human

Authors: Hyuck Ki Hong, Young Chang Jo, Yeon Shik Choi, Beom Joon Kim, Hyo Derk Park

Abstract:

To measure the thickness of the subcutaneous adipose tissue layer, a non-invasive optical measurement system (λ=1300 nm) is introduced. Animal and human subjects are used for the experiments. The results for human subjects are compared with ultrasound device measurements, and a high correlation (r=0.94 for n=11) is observed. There are two modes in the corresponding signals measured by the optical system, which can be explained by two-layered and three-layered tissue models. If the target tissue is thinner than the critical thickness, the data detected using the diffuse reflectance method follow the three-layered tissue model, so the data increase as the thickness increases. On the other hand, if the target tissue is thicker than the critical thickness, the data follow the two-layered tissue model, so they decrease as the thickness increases.

Keywords: Subcutaneous adipose tissue layer, non-invasive measurement system, two-layered and three-layered tissue models.

Downloads: 1807
8862 The Impact of Semantic Web on E-Commerce

Authors: Karim Heidari

Abstract:

Semantic Web technologies enable machines to interpret data published in a machine-interpretable form on the web. At present, only human beings are able to understand the product information published online. The emerging Semantic Web technologies have the potential to deeply influence the further development of the Internet economy. In this paper, we propose a scenario-based research approach to predict the effects of these new technologies on electronic markets and on the business models of traders, intermediaries and customers. Over 300 million searches are conducted every day on the Internet by people trying to find what they need. A majority of these searches are in the domain of consumer e-commerce, where a web user is looking for something to buy. This represents a huge cost in terms of people hours and an enormous drain of resources. Agent-enabled semantic search will have a dramatic impact on the precision of these searches. It will reduce and possibly eliminate information asymmetry, where a better informed buyer gets the best value. By affecting this key determinant of market prices, the Semantic Web will foster the evolution of different business and economic models. We submit that there is a need for developing these futuristic models based on our current understanding of e-commerce models and nascent Semantic Web technologies. We believe these business models will encourage mainstream web developers and businesses to join the "semantic web revolution."

Keywords: E-Commerce, E-Business, Semantic Web, XML.

Downloads: 3411
8861 A New Brazilian Friction-Resistant Low Alloy High Strength Steel – A Life Testing Approach

Authors: D. I. De Souza, G. P. Azevedo, R. Rocha

Abstract:

In this paper we will develop a sequential life test approach applied to a modified low alloy-high strength steel part used in highway overpasses in Brazil. We will consider two possible underlying sampling distributions: the Normal and the Inverse Weibull models. The minimum life will be considered equal to zero. We will use the two underlying models to analyze a fatigue life test situation, comparing the results obtained from both. Since a major chemical component of this low alloy-high strength steel part has been changed, there is little information available about the possible values that the parameters of the corresponding Normal and Inverse Weibull underlying sampling distributions could have. To estimate the shape and the scale parameters of these two sampling models we will use a maximum likelihood approach for censored failure data. We will also develop a truncation mechanism for the Inverse Weibull and Normal models. We will provide rules to truncate a sequential life testing situation making one of the two possible decisions at the moment of truncation; that is, accept or reject the null hypothesis H0. An example will develop the proposed truncated sequential life testing approach for the Inverse Weibull and Normal models.

Keywords: Sequential life testing, normal and inverse Weibull models, maximum likelihood approach, truncation mechanism.

Downloads: 1390
8860 Comparison of Artificial Neural Network Architectures in the Task of Tourism Time Series Forecast

Authors: João Paulo Teixeira, Paula Odete Fernandes

Abstract:

The authors have been developing several models based on artificial neural networks, linear regression models, the Box-Jenkins methodology and ARIMA models to predict tourism time series. The time series consists of the "Monthly Number of Guest Nights in the Hotels" of one region. Several comparisons have been made between the different types of models, as well as between the features used as model inputs. The Artificial Neural Network (ANN) models have consistently ranked among the best-performing models. Usually the feed-forward architecture was used because of its wide application and good results. In this paper, the authors compare different ANN architectures using the same input. The traditional feed-forward architecture, the cascade-forward architecture, a recurrent Elman architecture and a radial basis architecture are discussed and compared on the task of predicting the mentioned time series.

Keywords: Artificial Neural Network Architectures, time series forecast, tourism.

Downloads: 1834
8859 MATLAB-Based Graphical User Interface (GUI) for Data Mining as a Tool for Environment Management

Authors: M. Awawdeh, A. Fedi

Abstract:

The application of data mining to environmental monitoring has become crucial for a number of tasks related to emergency management. Over recent years, many tools have been developed for decision support systems (DSS) for emergency management. In this article, a graphical user interface (GUI) for an environmental monitoring system is presented. This interface supports (i) data collection and observation and (ii) extraction for data mining. This tool may be the basis for future development in line with the open source software paradigm.

Keywords: Data Mining, Environmental data, Mathematical Models, Matlab Graphical User Interface.

Downloads: 4693
8858 Rapid Study on Feature Extraction and Classification Models in Healthcare Applications

Authors: S. Sowmyayani

Abstract:

The advancement of computer-aided design helps the medical force and security force. Some applications include biometric recognition, elderly fall detection, face recognition, cancer recognition, tumor recognition, etc. This paper deals with different machine learning algorithms that are more generically used for any health care system. The most focused problems are classification and regression. With the rise of big data, machine learning has become particularly important for solving problems. Machine learning uses two types of techniques: supervised learning and unsupervised learning. The former trains a model on known input and output data and predicts future outputs. Classification and regression are supervised learning techniques. Unsupervised learning finds hidden patterns in input data. Clustering is one such unsupervised learning technique. The above-mentioned models are discussed briefly in this paper.

Keywords: Supervised learning, unsupervised learning, regression, neural network.

Downloads: 275
8857 3D Point Cloud Model Color Adjustment by Combining Terrestrial Laser Scanner and Close Range Photogrammetry Datasets

Authors: M. Pepe, S. Ackermann, L. Fregonese, C. Achille

Abstract:

3D models obtained with advanced survey techniques such as close-range photogrammetry and laser scanning are nowadays particularly appreciated in the Cultural Heritage and Archaeology fields. To produce high quality models representing archaeological evidence and anthropological artifacts, the appearance of the model (i.e. color), beyond its geometric accuracy, is not a negligible aspect. The integration of close-range photogrammetry survey techniques with the laser scanner is still a topic of study and research. Combining point cloud data sets of the same object generated with both technologies, or with the same technology but registered at different moments and/or under different natural light conditions, can produce a final point cloud with accentuated color dissimilarities. In this paper, a methodology is presented to make the different data sets uniform, to improve the chromatic quality and to highlight further details by balancing the point color.

Keywords: Color models, cultural heritage, laser scanner, photogrammetry, point cloud color.

Downloads: 1582
8856 Currency Exchange Rate Forecasts Using Quantile Regression

Authors: Yuzhi Cai

Abstract:

In this paper, we discuss a Bayesian approach to quantile autoregressive (QAR) time series model estimation and forecasting. Together with a combining forecasts technique, we then predict USD to GBP currency exchange rates. Combined forecasts contain all the information captured by the fitted QAR models at different quantile levels and are therefore better than those obtained from individual models. Our results show that an unequally weighted combining method performs better than other forecasting methodology. We found that a median AR model can perform well in point forecasting when the predictive density functions are symmetric. However, in practice, using the median AR model alone may involve the loss of information about the data captured by other QAR models. We recommend that combined forecasts should be used whenever possible.

Keywords: Exchange rate, quantile regression, combining forecasts.

Downloads: 1726
8855 Review of Models of Consumer Behaviour and Influence of Emotions in the Decision Making

Authors: Mikel Alonso López

Abstract:

To begin studying how consumers make decisions, the main decision models must be analyzed. The objective of this task is to see whether emotions are present in those models, and to analyze how the authors who created them consider the impact of emotions on consumer choices. In this paper, the most important models of consumer behavior are analysed. This review is useful for establishing an unproblematic background knowledge of the literature. The order that has been established for this study is chronological.

Keywords: Consumer behaviour, emotions, decision making, consumer psychology.

Downloads: 2812
8854 Building a Scalable Telemetry Based Multiclass Predictive Maintenance Model in R

Authors: Jaya Mathew

Abstract:

Many organizations are faced with the challenge of how to analyze and build Machine Learning models using their sensitive telemetry data. In this paper, we discuss how users can leverage the power of R without having to move their big data around as well as a cloud based solution for organizations willing to host their data in the cloud. By using ScaleR technology to benefit from parallelization and remote computing or R Services on premise or in the cloud, users can leverage the power of R at scale without having to move their data around.

Keywords: Predictive maintenance, machine learning, big data, cloud, on premise SQL, R.

Downloads: 1873
8853 Automated Process Quality Monitoring with Prediction of Fault Condition Using Measurement Data

Authors: Hyun-Woo Cho

Abstract:

Detection of incipient abnormal events is important to improve the safety and reliability of machine operations and to reduce losses caused by failures. Improper set-up or alignment of parts often leads to severe problems in many machines. The construction of models for predicting faulty conditions is essential in deciding when to perform machine maintenance. This paper presents a multivariate calibration monitoring approach based on the statistical analysis of machine measurement data. The calibration model is used to predict two faulty conditions from historical reference data. This approach utilizes genetic algorithm (GA) based variable selection, and we evaluate the predictive performance of several prediction methods using real data. The results show that the calibration model based on supervised probabilistic principal component analysis (SPPCA) yielded the best performance in this work. By adopting a proper variable selection scheme in calibration models, the prediction performance can be improved by excluding non-informative variables from the model building steps.

Keywords: Prediction, operation monitoring, on-line data, nonlinear statistical methods, empirical model.

Downloads: 1623
8852 Investigation of Layer Thickness and Surface Roughness on Aerodynamic Coefficients of Wind Tunnel RP Models

Authors: S. Daneshmand, A. Ahmadi Nadooshan, C. Aghanajafi

Abstract:

Traditional wind tunnel models are meticulously machined from metal in a process that can take several months. While very precise, the manufacturing process is too slow to assess a new design's feasibility quickly. Rapid prototyping technology makes concurrent study of air vehicle concepts via computer simulation and in the wind tunnel possible. This paper describes the effects of the layer thickness of rapid-prototyped models on the aerodynamic coefficients of wind tunnel test models. Three models were evaluated. The first model had a 0.05 mm layer thickness and a horizontal-plane roughness of 0.1 μm (Ra); the second model had a 0.125 mm layer thickness and a horizontal-plane roughness of 0.22 μm (Ra); the third model had a 0.15 mm layer thickness and a horizontal-plane roughness of 4.6 μm (Ra). These models were fabricated from Somos 18420 by stereolithography (SLA). A wing-body-tail configuration was chosen for the actual study. Testing covered the Mach range of Mach 0.3 to Mach 0.9 at an angle-of-attack range of -2° to +12° at zero sideslip. Coefficients of normal force, axial force, pitching moment, and lift over drag are shown at each of these Mach numbers. Results from this study show that layer thickness does have an effect on the aerodynamic characteristics in general; the data differ between the three models by less than 5%. The layer thickness has more effect on the aerodynamic characteristics as the Mach number decreases, and it has the most effect on the axial force and its derivative coefficients.

Keywords: Aerodynamic characteristics, stereolithography, layer thickness, Rapid prototyping, surface finish.

Downloads: 2884
8851 Wind Power Forecast Error Simulation Model

Authors: Josip Vasilj, Petar Sarajcev, Damir Jakus

Abstract:

One of the major difficulties introduced with wind power penetration is the inherent uncertainty in production originating from uncertain wind conditions. This uncertainty impacts many different aspects of power system operation, especially the balancing power requirements. For this reason, in power system development planning, it is necessary to evaluate the potential uncertainty in future wind power generation. For this purpose, simulation models that reproduce the performance of wind power forecasts are required. This paper presents wind power forecast error simulation models based on stochastic process simulation. The proposed models capture the most important statistical parameters recognized in wind power forecast error time series. Two distinct models are presented, depending on data availability. The first model uses wind speed measurements at potential or existing wind power plant locations, while the second model uses the statistical distribution of wind speeds.

Keywords: Wind power, Uncertainty, Stochastic process, Monte Carlo simulation.

Downloads: 3882
8850 Dynamic Analyses for Passenger Volume of Domestic Airline and High Speed Rail

Authors: Shih-Ching Lo

Abstract:

The discrete choice model is the most widely used methodology for studying travelers' mode choice and demand. However, calibrating a discrete choice model requires extensive questionnaire surveys. In this study, an aggregate model is proposed. Historical data on passenger volumes for high speed rail and domestic civil aviation are employed to calibrate and validate the model. Different models are compared in order to propose the best one. The results show that systems of equations forecast better than a single equation does. Models with an external variable, namely oil price, perform better than models based on a closed-system assumption.

Keywords: forecasting, passenger volume, dynamic competition model, external variable, oil price

Downloads: 1418
8849 Comparison of Methods of Estimation for Use in Goodness of Fit Tests for Binary Multilevel Models

Authors: I. V. Pinto, M. R. Sooriyarachchi

Abstract:

It can frequently be observed that data arising in our environment have a hierarchical or nested structure. Multilevel modelling is a modern approach to handling this kind of data. When multilevel modelling is combined with a binary response, the estimation methods become complex and the usual techniques are derived from the quasi-likelihood method. The estimation methods compared in this study are marginal quasi-likelihood (order 1 & order 2) (MQL1, MQL2) and penalized quasi-likelihood (order 1 & order 2) (PQL1, PQL2). A statistical model is of no use if it does not reflect the given dataset. Therefore, checking the adequacy of the fitted model through a goodness-of-fit (GOF) test is an essential stage in any modelling procedure. However, prior to usage, it is equally important to confirm that the GOF test performs well and is suitable for the given model. This study assesses the suitability of the GOF test developed for binary response multilevel models with respect to the method used in model estimation. An extensive set of simulations was conducted using MLwiN (v 2.19) with varying numbers of clusters, cluster sizes and intra-cluster correlations. The test maintained the desirable Type-I error for models estimated using PQL2, and it failed for almost all the combinations of MQL. The power of the test was adequate for most of the combinations in all estimation methods except MQL1. Moreover, models were fitted using the four methods to a real-life dataset, and the performance of the test was compared for each model.

Keywords: Goodness-of-fit test, marginal quasi-likelihood, multilevel modelling, type-I error, penalized quasi-likelihood, power, quasi-likelihood.

Downloads: 696
8848 Data Envelopment Analysis under Uncertainty and Risk

Authors: P. Beraldi, M. E. Bruni

Abstract:

Data Envelopment Analysis (DEA) is one of the most widely used techniques for evaluating the relative efficiency of a set of homogeneous decision making units. Traditionally, it assumes that input and output variables are known in advance, ignoring the critical issue of data uncertainty. In this paper, we deal with the problem of efficiency evaluation under uncertain conditions by adopting the general framework of stochastic programming. We assume that output parameters are represented by discretely distributed random variables, and we propose two different models defined according to a risk-neutral and a risk-averse perspective. The models have been validated on a real case study concerning the evaluation of the technical efficiency of a sample of individual firms operating in the Italian leather manufacturing industry. Our findings show the validity of the proposed approach as an ex-ante evaluation technique that provides the decision maker with useful insights depending on their degree of risk aversion.

Keywords: DEA, Stochastic Programming, Ex-ante evaluation technique, Conditional Value at Risk.

Downloads: 1924
8847 MIM: A Species Independent Approach for Classifying Coding and Non-Coding DNA Sequences in Bacterial and Archaeal Genomes

Authors: Achraf El Allali, John R. Rose

Abstract:

A number of competing methodologies have been developed to identify genes and classify DNA sequences into coding and non-coding sequences. This classification process is fundamental in gene finding and gene annotation tools and is one of the most challenging tasks in bioinformatics and computational biology. An information theory measure based on mutual information has shown good accuracy in classifying DNA sequences into coding and noncoding. In this paper we describe a species independent iterative approach that distinguishes coding from non-coding sequences using the mutual information measure (MIM). A set of sixty prokaryotes is used to extract universal training data. To facilitate comparisons with the published results of other researchers, a test set of 51 bacterial and archaeal genomes was used to evaluate MIM. These results demonstrate that MIM produces superior results while remaining species independent.

Keywords: Coding Non-coding Classification, Entropy, Gene Recognition, Mutual Information.

Downloads: 1677
8846 Native Language Identification with Cross-Corpus Evaluation Using Social Media Data: 'Reddit'

Authors: Yasmeen Bassas, Sandra Kuebler, Allen Riddell

Abstract:

Native Language Identification (NLI) is one of the growing subfields in Natural Language Processing (NLP). The NLI task is mainly concerned with predicting the native language of an author from their writing in a second language. In this paper, we investigate the performance of two types of features, content-based features and content-independent features, when they are evaluated on a different corpus (using social media data from "Reddit"). In this NLI task, the predefined models are trained on one corpus (TOEFL) and the trained models are then evaluated on different data from an external corpus (Reddit). Three classifiers are used in this task: the baseline, linear SVM, and Logistic Regression. Results show that content-based features are more accurate and robust than content-independent ones when tested both within corpus and across corpus.

Keywords: NLI, NLP, content-based features, content independent features, social media corpus, ML.

Downloads: 325
8845 Innovative Methods of Improving Train Formation in Freight Transport

Authors: Jaroslav Masek, Juraj Camaj, Eva Nedeliakova

Abstract:

The paper focuses on an operational model for transporting single-wagon consignments on a railway network using two different models of train formation. It gives an overview of the possibilities for improving the quality of transport services. The paper deals with two models used for the train formation problem: time-continuous and time-discrete. By applying these models in practice, a transport company can guarantee a higher quality of service and expect an increase in transport performance. The models are also applicable to other transport networks. The models supplement the theoretical problem of train formation with new ways of looking at the organization of wagon flows.

Keywords: Train formation, wagon flows, marshalling yard, railway technology.

Downloads: 1971
8844 Moving Data Mining Tools toward a Business Intelligence System

Authors: Nittaya Kerdprasop, Kittisak Kerdprasop

Abstract:

Data mining (DM) is the process of finding and extracting frequent patterns that can describe the data, or predict unknown or future values. These goals are achieved by using various learning algorithms. Each algorithm may produce a mining result completely different from the others. Some algorithms may find millions of patterns. It is thus a difficult job for data analysts to select appropriate models and interpret the discovered knowledge. In this paper, we describe a framework for an intelligent and complete data mining system called SUT-Miner. Our system comprises a full complement of major DM algorithms together with pre-DM and post-DM functionalities. It is the post-DM packages that ease DM deployment for business intelligence applications.

Keywords: Business intelligence, data mining, functional programming, intelligent system.

Downloads: 1685
8843 Analysis of Mathematical Models and Their Application to Extreme Events

Authors: Avellino I. Mondlane, Karin Hansson, Oliver Popov

Abstract:

This paper discusses the application of extreme event distributions, taking the Limpopo River Basin at Xai-Xai station, in Mozambique, as a case analysis. We analyze the extreme value concepts, namely the Gumbel, Fréchet, Weibull and Generalized Extreme Value distributions. We then extrapolate the original data to 1000, 5000 and 10000 figures for further simulations and compare their outcomes based on these three main distributions.

Keywords: Catastrophes, extreme event, disasters, mathematical models, simulation.

Downloads: 2487
8842 Effect of Assumptions of Normal Shock Location on the Design of Supersonic Ejectors for Refrigeration

Authors: Payam Haghparast, Mikhail V. Sorin, Hakim Nesreddine

Abstract:

The complex oblique shock phenomenon can be simply assumed to be a normal shock at the constant area section in order to simulate a sharp pressure increase and velocity decrease in 1-D thermodynamic models. The assumed normal shock location is one of the greatest sources of error in ejector thermodynamic models. Most researchers consider an arbitrary location without justifying it. Our study compares the effect of the normal shock location on ejector dimensions in 1-D models. To this end, two different ejector experimental test benches, a constant area-mixing (CAM) ejector and a constant pressure-mixing (CPM) ejector, are considered, with different known geometries, operating conditions and working fluids (R245fa, R141b). In the first step, in order to evaluate the real value of the efficiencies in the different ejector parts and the critical back pressure, a CFD model was built and validated against experimental data for the two types of ejectors. These reference data are then used as input to the 1-D model to calculate the lengths and the diameters of the ejectors. Afterwards, the design output geometry calculated by the 1-D model is compared directly with the corresponding experimental geometry. Good agreement was found between the ejector dimensions obtained by the 1-D model, for both CAM and CPM, and the experimental ejector data. Furthermore, it is shown that the normal shock location affects only the constant area length, and that the inlet normal shock assumption results in a more accurate length. Taking into account previous 1-D models, the results suggest assuming the normal shock location at the inlet of the constant area duct when designing supersonic ejectors.

Keywords: 1D model, constant area-mixing, constant pressure-mixing, normal shock location, ejector dimensions.

Downloads: 909
8841 Review of the Road Crash Data Availability in Iraq

Authors: Abeer K. Jameel, Harry Evdorides

Abstract:

Iraq is a middle-income country where road crashes are considered one of the leading causes of death. To control road risk, the Iraqi Ministry of Planning, General Statistical Organization, started to organise a system for collecting traffic accident data, with details of their causes and severity. These data are published as an annual report. This paper presents a review of the available crash data in Iraq. The available data represent the rate of accidents at an aggregated level, classified by accident type, road users' details, crash severity, type of vehicle, causes and number of casualties. The review is organised according to the types of models used in road safety studies and research, and according to the road safety data required in road construction tasks. The available data are also compared with the road safety dataset published in the United Kingdom, as an example of a developed country. It is concluded that the data in Iraq are suitable for descriptive and exploratory models, aggregate-level comparison analysis, and evaluating and monitoring the progress of overall traffic safety performance. However, important traffic safety studies require disaggregated data and details on the factors affecting the likelihood of traffic crashes. Some studies require spatial details such as the location of accidents, which is essential for ranking roads by their level of safety and for identifying the most dangerous roads in Iraq, which require a tactical plan. Global road safety agencies interested in solving this problem in low- and middle-income countries have designed road safety assessment methodologies that are based on road attribute data only. Therefore, this research recommends using one of these methodologies.

Keywords: Data availability, Iraq, road safety.

Downloads: 859
8840 Identification of Nonlinear Predictor and Simulator Models of a Cement Rotary Kiln by Locally Linear Neuro-Fuzzy Technique

Authors: Masoud Sadeghian, Alireza Fatehi

Abstract:

One of the most important parts of a cement factory is the cement rotary kiln, which plays a key role in the quality and quantity of the cement produced. In this part, physical exertion and the bilateral movement of air and materials take place together with chemical reactions. Thus, the system has immensely complex and nonlinear dynamic equations. These equations have not yet been worked out; only in exceptional cases have approximate models been presented, in which a large number of the parameters involved were omitted. This has caused many problems in designing a cement rotary kiln controller. In this paper, we present nonlinear predictor and simulator models for a real cement rotary kiln, obtained by applying a nonlinear identification technique to the Locally Linear Neuro-Fuzzy (LLNF) model. For the first time, a simulator model as well as a predictor model with a precise fifteen-minute prediction horizon is presented for a cement rotary kiln. These models are trained by the LOLIMOT algorithm, an incremental tree-structure algorithm. Finally, the characteristics of these models are described, along with their pros and cons. Data collected from the White Saveh Cement Company are used for modeling.
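To make the LLNF model structure more concrete, here is a minimal sketch of how such a model is commonly evaluated: each local linear model is weighted by a normalised Gaussian validity function and the weighted outputs are summed. This is a generic textbook form and not the kiln model from the paper; the function name `llnf_predict` and all centres, widths and coefficients are hypothetical, and the LOLIMOT training step that would choose them is not shown.

```python
import math


def llnf_predict(x, centers, sigmas, local_params):
    """Evaluate a locally linear neuro-fuzzy model at input vector x.

    centers[i], sigmas[i] -- centre and width (per input) of the i-th
                             Gaussian validity function
    local_params[i]       -- [w0, w1, ..., wp] of the i-th local linear model
    """
    # Unnormalised Gaussian membership of x in each local region.
    memberships = []
    for c, s in zip(centers, sigmas):
        dist_sq = sum(((xj - cj) / sj) ** 2 for xj, cj, sj in zip(x, c, s))
        memberships.append(math.exp(-0.5 * dist_sq))
    total = sum(memberships)

    # Sum of local linear outputs, weighted by the normalised memberships.
    y = 0.0
    for mu, w in zip(memberships, local_params):
        local_output = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
        y += (mu / total) * local_output
    return y


# Two hypothetical local models on a single input.
y = llnf_predict(
    x=[0.5],
    centers=[[0.0], [1.0]],
    sigmas=[[0.5], [0.5]],
    local_params=[[0.0, 1.0], [2.0, 0.0]],
)
print(y)
```

Midway between the two centres both validity functions are equally active, so the prediction is the plain average of the two local linear outputs. An incremental tree-structure algorithm such as LOLIMOT would add and place these local regions step by step instead of fixing them by hand.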

Keywords: Cement rotary kiln, nonlinear identification, Locally Linear Neuro-Fuzzy model.

Downloads: 1977
8839 Dimensional Modeling of HIV Data Using Open Source

Authors: Charles D. Otine, Samuel B. Kucel, Lena Trojer

Abstract:

The choice of data modeling technique for an information system is determined by the objective of the resultant data model. Dimensional modeling is the preferred technique for data destined for data warehouses and data mining; in contrast with entity relationship modeling, it presents data models that ease analysis and queries. The establishment of data warehouses as components of information system landscapes in many organizations has subsequently led to the development of dimensional modeling. It has been significantly more developed and reported for commercial database management systems than for open source ones, making it less affordable for those in resource-constrained settings. This paper presents dimensional modeling of HIV patient information using open source modeling tools. It aims to take advantage of the fact that the regions most affected by HIV (sub-Saharan Africa) are also heavily resource constrained while having large quantities of HIV data. Two HIV data source systems were studied to identify appropriate dimensions and facts, which were then modeled using two open source dimensional modeling tools. Using open source would reduce the software costs of dimensional modeling and in turn make data warehousing and data mining more feasible, even for those in resource-constrained settings who nonetheless have data available.
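The idea of dimensions and facts can be sketched with a tiny star schema. The example below uses Python's built-in `sqlite3` module as a freely available stand-in; it does not reproduce the tools or the schema used in the paper. All table names, column names and rows (`dim_patient`, `dim_date`, `fact_visit`, the CD4 values) are hypothetical toy values, not real patient data.

```python
import sqlite3

# In-memory database, so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables hold the descriptive attributes used to slice the data.
    CREATE TABLE dim_patient (
        patient_key INTEGER PRIMARY KEY,
        sex         TEXT,
        age_group   TEXT
    );
    CREATE TABLE dim_date (
        date_key INTEGER PRIMARY KEY,
        year     INTEGER,
        month    INTEGER
    );
    -- The fact table holds one row per clinic visit, with a numeric measure.
    CREATE TABLE fact_visit (
        patient_key INTEGER REFERENCES dim_patient(patient_key),
        date_key    INTEGER REFERENCES dim_date(date_key),
        cd4_count   INTEGER
    );
    -- Made-up rows for illustration only.
    INSERT INTO dim_patient VALUES (1, 'F', '25-34'), (2, 'M', '35-44');
    INSERT INTO dim_date VALUES (1, 2009, 1), (2, 2009, 2);
    INSERT INTO fact_visit VALUES (1, 1, 350), (1, 2, 410), (2, 1, 280);
    """
)

# A typical analytical query: average of the measure, grouped by a dimension attribute.
rows = conn.execute(
    """
    SELECT p.sex, AVG(f.cd4_count)
    FROM fact_visit AS f
    JOIN dim_patient AS p ON p.patient_key = f.patient_key
    GROUP BY p.sex
    ORDER BY p.sex
    """
).fetchall()
print(rows)
```

Keeping the measures in one fact table and the descriptive attributes in small dimension tables means such queries need only a single join per dimension, which is the ease of analysis the abstract contrasts with entity relationship modeling.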

Keywords: About Database, Data Mining, Data warehouse, Dimensional Modeling, Open Source.

Downloads: 1912