Search results for: regression models.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3040

2890 A General Regression Test Selection Technique

Authors: Walid S. Abd El-hamid, Sherif S. El-etriby, Mohiy M. Hadhoud

Abstract:

This paper presents a new methodology for selecting test cases from regression test suites. The selection strategy is based on analyzing the dynamic behavior of applications written in any programming language. Methods based on dynamic analysis are safer and more efficient. We design a technique that combines the code-based technique and the model-based technique to allow comparison of the object-oriented aspects of an application written in any programming language. We have developed a prototype tool that detects changes and selects test cases from a test suite.

Keywords: Regression testing, Model based testing, Dynamic behavior.

Downloads: 1929
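As a rough illustration of the selection idea in the abstract above (entry 2890), the sketch below keeps only the tests whose recorded coverage touches a changed entity. It is a generic coverage-based selection under stated assumptions, not the authors' prototype tool; the function `select_tests`, the coverage map, and the entity names are all hypothetical.

```python
def select_tests(coverage, changed):
    """Return the names of tests that exercise at least one changed entity.

    coverage: dict mapping a test name to the set of entities (for example
              methods) it executed in a previous run; assumed to come from
              some dynamic analysis step that is not shown here.
    changed:  set of entities that differ between the two program versions.
    """
    return sorted(name for name, covered in coverage.items() if covered & changed)


# Hypothetical coverage data recorded from an earlier test run.
coverage = {
    "test_login": {"Auth.check", "User.load"},
    "test_report": {"Report.render"},
    "test_profile": {"User.load"},
}
selected = select_tests(coverage, changed={"User.load"})
```

Tests that never execute a changed entity are skipped, which is where the time saving comes from; whether the selection is safe depends on how complete the recorded coverage is.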
2889 Power MOSFET Models Including Quasi-Saturation Effect

Authors: Abdelghafour Galadi

Abstract:

In this paper, accurate power MOSFET models including the quasi-saturation effect are presented. These models have no internal node voltages determined by the circuit simulator and use one JFET or one depletion-mode MOSFET transistor controlled by an “effective” gate voltage that takes the quasi-saturation effect into account. The proposed models achieve accurate simulation results with an average error percentage less than 9%, which is an improvement of 21 percentage points compared to the commonly used standard power MOSFET model. In addition, the models can be integrated into any available commercial circuit simulator by using their analytical equations. A description of the models will be provided along with the parameter extraction procedure.

Keywords: Power MOSFET, drift layer, quasi-saturation effect, SPICE model, circuit simulation.

Downloads: 1960
2888 Electron-Impact Excitation of Kr 5s, 5p Levels

Authors: Alla A. Mityureva

Abstract:

The available data on the cross sections of electron-impact excitation of krypton 5s and 5p configuration levels from the ground state are presented in a convenient and compact form. The results are obtained by regression over all known published data related to this process.

Keywords: Cross section, electron excitation, krypton, regression

Downloads: 1048
2887 Enhancing Temporal Extrapolation of Wind Speed Using a Hybrid Technique: A Case Study in West Coast of Denmark

Authors: B. Elshafei, X. Mao

Abstract:

The demand for renewable energy is increasing significantly, and major investments are being made in the wind power generation industry as a leading source of clean energy. The wind energy sector is entirely dependent on and driven by the prediction of wind speed, which by the nature of wind is highly stochastic. This study employs deep multi-fidelity Gaussian process regression to predict wind speeds for medium-term time horizons. Data from the RUNE experiment on the west coast of Denmark were provided by the Technical University of Denmark; they represent the wind speed across the study area between December 2015 and March 2016. The study aims to investigate the effect of pre-processing the data by denoising the signal using the empirical wavelet transform (EWT) and of using the vector components of wind speed to increase the number of input data layers for data fusion using deep multi-fidelity Gaussian process regression (GPR). The outcomes were compared using root mean square error (RMSE). The results demonstrated a significant increase in prediction accuracy: using the vector components of the wind speed as additional predictors gives more accurate predictions than strategies that ignore them, reflecting the importance of including all sub-data and pre-processing signals in wind speed forecasting models.

Keywords: Data fusion, Gaussian process regression, signal denoise, temporal extrapolation.

Downloads: 449
2886 Zero Inflated Strict Arcsine Regression Model

Authors: Y. N. Phang, E. F. Loh

Abstract:

The zero inflated strict arcsine model is a newly developed model that has been found appropriate for modeling overdispersed count data. In this study, we extend the zero inflated strict arcsine model to a zero inflated strict arcsine regression model by taking into consideration the extra variability caused by extra zeros and covariates in count data. The maximum likelihood estimation method is used to estimate the parameters of this zero inflated strict arcsine regression model.

Keywords: Overdispersed count data, maximum likelihood estimation, simulated annealing.

Downloads: 1708
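As a rough illustration of the zero-inflation idea in the abstract above (entry 2886), the sketch below mixes a point mass at zero with an ordinary count distribution. It uses a Poisson base distribution as a stand-in, not the strict arcsine distribution of the paper, and the names `zero_inflated_pmf`, `pi_zero` and `lam` are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Probability of observing count k under a Poisson(lam) distribution."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zero_inflated_pmf(k, pi_zero, lam):
    """Zero-inflated pmf.

    With probability pi_zero the observation is a structural zero; otherwise
    it is drawn from the base count distribution. The base is Poisson here
    (an assumption for illustration; the paper uses strict arcsine).
    """
    base = (1.0 - pi_zero) * poisson_pmf(k, lam)
    return pi_zero + base if k == 0 else base


# Zeros become more likely than under the plain base distribution.
p_zero_inflated = zero_inflated_pmf(0, pi_zero=0.3, lam=2.0)
p_zero_plain = poisson_pmf(0, lam=2.0)
```

In a regression version, `pi_zero` and the base-distribution mean would each be linked to covariates and fitted by maximum likelihood, as the abstract describes.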
2885 Comparison of Polynomial and Radial Basis Kernel Functions based SVR and MLR in Modeling Mass Transfer by Vertical and Inclined Multiple Plunging Jets

Authors: S. Deswal, M. Pal

Abstract:

Presently, various computational techniques are used in modeling and analyzing environmental engineering data. In the present study, an intra-comparison of polynomial and radial basis kernel functions based Support Vector Regression and, in turn, an inter-comparison with Multi Linear Regression has been attempted in modeling mass transfer capacity of vertical (θ = 90°) and inclined (θ multiple plunging jets (varying in number from 1 to 16). The data set used in this study consists of four input parameters with a total of eighty-eight cases, forty-four each for vertical and inclined multiple plunging jets. For testing, tenfold cross validation was used. Correlation coefficient values of 0.971 and 0.981 along with corresponding root mean square error values of 0.0025 and 0.0020 were achieved by using polynomial and radial basis kernel functions based Support Vector Regression respectively. An intra-comparison suggests improved performance by the radial basis function kernel in comparison to the polynomial kernel based Support Vector Regression. Further, an inter-comparison with Multi Linear Regression (correlation coefficient = 0.973 and root mean square error = 0.0024) reveals that radial basis kernel function based Support Vector Regression performs better in modeling and estimating mass transfer by multiple plunging jets.

Keywords: Mass transfer, multiple plunging jets, polynomial and radial basis kernel functions, Support Vector Regression.

Downloads: 1381
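The abstract above (entry 2885) ranks models by correlation coefficient and root mean square error. As a minimal sketch of how those two scores are computed from measured and predicted values, the functions below use the standard definitions; the function names and the sample numbers are illustrative and are not taken from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up measured and predicted values, for illustration only.
measured = [0.010, 0.014, 0.021, 0.030]
predicted = [0.011, 0.013, 0.022, 0.029]
score_rmse = rmse(measured, predicted)
score_r = pearson_r(measured, predicted)
```

A lower RMSE and a correlation closer to 1 both indicate a better fit, which is the basis on which the abstract prefers the radial basis kernel.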
2884 Analyzing and Comparing the Hot-spot Thermal Models of HV/LV Prefabricated and Outdoor Oil-Immersed Power Transformers

Authors: Ali Mamizadeh, Ires Iskender

Abstract:

The most important parameter in transformer life expectancy is the hot-spot temperature level, which accelerates the rate of aging of the insulation. The aim of this paper is to present thermal models for transformers loaded at prefabricated MV/LV transformer substations and in outdoor situations. The hot-spot temperature of transformers is studied using their top-oil temperature rise models. The thermal models proposed for hot-spot and top-oil temperatures in different operating situations are compared. Since the thermal transfer is different for indoor and outdoor transformers given their operating conditions, their hot-spot thermal models differ from each other. The proposed thermal models are verified by the results obtained from experiments carried out on a typical 1600 kVA, 30/0.4 kV, ONAN transformer for both indoor and outdoor situations.

Keywords: Hot-spot Temperature, Dynamic Thermal Model, MV/LV Prefabricated, Oil Immersed Transformers

Downloads: 1473
2883 Solving Single Machine Total Weighted Tardiness Problem Using Gaussian Process Regression

Authors: Wanatchapong Kongkaew

Abstract:

This paper proposes an application of a probabilistic technique, namely Gaussian process regression, for estimating an optimal sequence of the single machine total weighted tardiness (SMTWT) scheduling problem. In this work, the Gaussian process regression (GPR) model is utilized to predict an optimal sequence of the SMTWT problem, and its solution is improved by using an iterated local search based on a simulated annealing scheme, called the GPRISA algorithm. The results show that the proposed GPRISA method achieves very good performance and a reasonable trade-off between solution quality and time consumption. Moreover, in the comparison of deviation from the best-known solution, the proposed mechanism noticeably outperforms recent existing approaches.

Keywords: Gaussian process regression, iterated local search, simulated annealing, single machine total weighted tardiness.

Downloads: 2196
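For readers unfamiliar with the SMTWT problem named in the abstract above (entry 2883), the sketch below evaluates its objective for a given job sequence: each job's tardiness is its completion time minus its due date, floored at zero, and the tardiness values are summed with job weights. This covers only the objective function, not the GPRISA algorithm; the function name and the three-job instance are hypothetical.

```python
def total_weighted_tardiness(sequence, processing, due, weight):
    """Sum of weight[j] * max(0, completion[j] - due[j]) over the sequence.

    sequence:   order in which job indices are processed on the single machine
    processing: processing time of each job
    due:        due date of each job
    weight:     tardiness penalty weight of each job
    """
    clock = 0
    total = 0
    for job in sequence:
        clock += processing[job]  # completion time of this job
        total += weight[job] * max(0, clock - due[job])
    return total


# Hypothetical three-job instance.
processing = [3, 2, 4]
due = [4, 2, 6]
weight = [1, 2, 1]
cost_a = total_weighted_tardiness([0, 1, 2], processing, due, weight)
cost_b = total_weighted_tardiness([1, 0, 2], processing, due, weight)
```

Swapping the first two jobs lowers the cost on this instance, which is the kind of move a local search over sequences would explore.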
2882 Predictability of the Two Commonly Used Models to Represent the Thin-layer Re-wetting Characteristics of Barley

Authors: M. A. Basunia

Abstract:

Thirty-three re-wetting tests were conducted with barley at different combinations of temperature (5.7–46.3 °C) and relative humidity (48.2–88.6%). The two most commonly used thin-layer drying and re-wetting models, i.e. Page and Diffusion, were compared for their ability to fit the experimental re-wetting data based on the standard error of estimate (SEE) of the measured and simulated moisture contents. The comparison shows that both the Page and Diffusion models fit the experimental re-wetting data of barley well. The average SEE values for the Page and Diffusion models were 0.176% d.b. and 0.199% d.b., respectively. The Page and Diffusion models were found to be suitable equations to describe the thin-layer re-wetting characteristics of barley over a typical five-day re-wetting period. These two models can be used for the simulation of deep-bed re-wetting of barley occurring during ventilated storage and deep-bed drying.

Keywords: Thin-layer, barley, re-wetting parameters, temperature, relative humidity.

Downloads: 1451
2881 Native Language Identification with Cross-Corpus Evaluation Using Social Media Data: 'Reddit'

Authors: Yasmeen Bassas, Sandra Kuebler, Allen Riddell

Abstract:

Native Language Identification is one of the growing subfields in Natural Language Processing (NLP). The task of Native Language Identification (NLI) is mainly concerned with predicting the native language of an author from writing in a second language. In this paper, we investigate the performance of two types of features, content-based features and content independent features, when they are evaluated on a different corpus (using social media data from “Reddit”). In this NLI task, the predefined models are trained on one corpus (TOEFL) and then the trained models are evaluated on different data from an external corpus (Reddit). Three classifiers are used in this task: the baseline, linear SVM, and Logistic Regression. Results show that content-based features are more accurate and robust than content independent ones when tested within corpus and across corpus.

Keywords: NLI, NLP, content-based features, content independent features, social media corpus, ML.

Downloads: 324
2880 Influence of Parameters of Modeling and Data Distribution for Optimal Condition on Locally Weighted Projection Regression Method

Authors: Farhad Asadi, Mohammad Javad Mollakazemi, Aref Ghafouri

Abstract:

Recent research in neural network science and neuroscience on modeling complex time series data and on statistical learning has focused mostly on learning from high-dimensional input spaces and signals. Local linear models are a strong choice for modeling local nonlinearity in data series. Locally weighted projection regression is a flexible and powerful algorithm for nonlinear approximation in high dimensional signal spaces. In this paper, different learning scenarios for one- and two-dimensional data series with different distributions are investigated in simulation, and noise is added to the data to produce different disordered distributions in the time series and to evaluate how well the algorithm predicts local nonlinearity. The performance of the algorithm is then simulated, and its sensitivity to the data distribution, both when the data are widely spread and when few data points are available, is explained together with the influence of the important local validity parameter under different data distributions.

Keywords: Local nonlinear estimation, LWPR algorithm, Online training method.

Downloads: 1559
2879 Comparative Studies of Support Vector Regression between Reproducing Kernel and Gaussian Kernel

Authors: Wei Zhang, Su-Yan Tang, Yi-Fan Zhu, Wei-Ping Wang

Abstract:

Support vector regression (SVR) has been regarded as a state-of-the-art method for approximation and regression. The importance of the kernel function, the so-called admissible support vector kernel (SV kernel) in SVR, has motivated many studies on its composition. The Gaussian kernel (RBF) is regarded as the “best” choice of SV kernel for non-experts in SVR, although there is no evidence, except for its superior performance in some practical applications, to prove the statement. It is well known that the reproducing kernel (R.K) is also an SV kernel which possesses many important properties, e.g. positive definiteness, the reproducing property and the ability to compose complex R.Ks from simpler ones. However, there are a limited number of R.Ks with explicit forms and consequently few quantitative comparison studies in practice. In this paper, two R.Ks, i.e. SV kernels, composed by the sum and product of a translation invariant kernel in a Sobolev space are proposed. An exploratory study on the performance of SVR based on a general R.K is presented through a systematic comparison to that of RBF using multiple criteria and synthetic problems. The results show that the R.K is an equivalent or even better SV kernel than RBF for problems with more input variables (more than 5, especially more than 10) and higher nonlinearity.

Keywords: admissible support vector kernel, reproducing kernel, reproducing kernel Hilbert space, support vector regression.

Downloads: 1538
2878 An Automated Stock Investment System Using Machine Learning Techniques: An Application in Australia

Authors: Carol Anne Hargreaves

Abstract:

A key issue in stock investment is how to select representative features for stock selection. The first objective of this paper is to determine whether an automated stock investment system, using machine learning techniques, may be used to identify a portfolio of growth stocks that are highly likely to provide returns better than the stock market index. The second objective is to identify the technical features that best characterize whether a stock’s price is likely to go up and to identify the most important factors and their contribution to predicting the likelihood of the stock price going up. Unsupervised machine learning techniques, such as cluster analysis, were applied to the stock data to identify a cluster of stocks that was likely to go up in price (portfolio 1). Next, the principal component analysis technique was used to select stocks that were rated high on component one and component two (portfolio 2). Thirdly, a supervised machine learning technique, the logistic regression method, was used to select stocks with a high probability of their price going up (portfolio 3). The predictive models were validated with metrics such as sensitivity (recall), specificity and overall accuracy for all models. All accuracy measures were above 70%. All portfolios outperformed the market by more than eight times. The top three stocks were selected for each of the three stock portfolios and traded in the market for one month. After one month the return for each stock portfolio was computed and compared with the stock market index returns. The returns were 23.87% for the principal component analysis stock portfolio, 11.65% for the logistic regression portfolio and 8.88% for the K-means cluster portfolio, while the stock market performance was 0.38%. This study confirms that an automated stock investment system using machine learning techniques can identify top performing stock portfolios that outperform the stock market.

Keywords: Machine learning, stock market trading, logistic principal component analysis, automated stock investment system.

Downloads: 1031
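The abstract above (entry 2878) validates its classifiers with sensitivity, specificity and overall accuracy. The sketch below computes the three from binary labels using the standard confusion-matrix definitions; the function name and the sample labels are illustrative, and it assumes both classes occur in `y_true`.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity (recall), specificity and accuracy for 0/1 labels.

    Assumes y_true contains at least one positive and one negative label,
    otherwise one of the ratios would divide by zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Made-up labels: 1 means the stock price went up, 0 means it did not.
metrics = classification_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
```

Reporting all three matters for this use case because a model can reach high accuracy while missing most of the rising stocks if the classes are unbalanced.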
2877 Review of Models of Consumer Behaviour and Influence of Emotions in the Decision Making

Authors: Mikel Alonso López

Abstract:

In order to begin the process of studying the task of making consumer decisions, the main decision models must be analyzed. The objective of this task is to see if there is a presence of emotions in those models, and analyze how authors that have created them consider their impact in consumer choices. In this paper, the most important models of consumer behavior are analysed. This review is useful to consider an unproblematic background knowledge in the literature. The order that has been established for this study is chronological.

Keywords: Consumer behaviour, emotions, decision making, consumer psychology.

Downloads: 2812
2876 Prediction of Air-Water Two-Phase Frictional Pressure Drop Using Artificial Neural Network

Authors: H. B. Mehta, Vipul M. Patel, Jyotirmay Banerjee

Abstract:

The present paper discusses the prediction of gas-liquid two-phase frictional pressure drop in a 2.12 mm horizontal circular minichannel using Artificial Neural Network (ANN). The experimental results are obtained with air as the gas phase and water as the liquid phase. The superficial gas velocity is kept in the range of 0.0236 m/s to 0.4722 m/s, while the values of 0.0944 m/s, 0.1416 m/s and 0.1889 m/s are considered for the superficial liquid velocity. The experimental results are predicted using different ANN models. The networks used for prediction are radial basis, generalised regression, linear layer, cascade forward back propagation, feed forward back propagation, feed forward distributed time delay, layer recurrent, and Elman back propagation. The transfer functions used for the networks are linear (PURELIN), logistic sigmoid (LOGSIG), tangent sigmoid (TANSIG) and Gaussian RBF. Combinations of networks and transfer functions give different possible neural network models. These models are compared in terms of Mean Absolute Relative Deviation (MARD) and Mean Relative Deviation (MRD) to identify the best predictive ANN model.

Keywords: Minichannel, Two-Phase Flow, Frictional Pressure Drop, ANN, MARD, MRD.

Downloads: 1365
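The abstract above (entry 2876) compares models by MARD and MRD but does not state the formulas. The sketch below assumes the commonly used definitions, MARD as the mean of |predicted − experimental| / experimental and MRD as the mean of the signed ratio; these definitions, the function names and the sample values are assumptions rather than details confirmed by the paper.

```python
def relative_deviations(experimental, predicted):
    """Signed relative deviation of each prediction from its measurement."""
    return [(p - e) / e for e, p in zip(experimental, predicted)]


def mard(experimental, predicted):
    """Mean Absolute Relative Deviation (assumed definition), as a fraction."""
    devs = relative_deviations(experimental, predicted)
    return sum(abs(d) for d in devs) / len(devs)


def mrd(experimental, predicted):
    """Mean Relative Deviation (assumed definition), as a fraction."""
    devs = relative_deviations(experimental, predicted)
    return sum(devs) / len(devs)


# Made-up pressure-drop values, for illustration only.
experimental = [10.0, 20.0]
predicted = [11.0, 18.0]
score_mard = mard(experimental, predicted)
score_mrd = mrd(experimental, predicted)
```

Under these definitions MARD measures the typical size of the error, while MRD shows systematic over- or under-prediction, since positive and negative deviations cancel.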
2875 The Profit Trend of Cosmetics Products Using Bootstrap Edgeworth Approximation

Authors: Edlira Donefski, Lorenc Ekonomi, Tina Donefski

Abstract:

Edgeworth approximation is one of the most important statistical methods and makes a considerable contribution to reducing the sum of the standard deviations of the independent variables’ coefficients in a Quantile Regression Model. This model estimates the conditional median or other quantiles. In this paper, we have applied approximating statistical methods to an economic problem. We have built a quantile regression model to see how the profit gained is connected with the realized sales of cosmetic products in real data taken from a local business. The Linear Regression of the generated profit on the realized sales was not free of autocorrelation and heteroscedasticity, which is why we have used this model instead of Linear Regression. Our aim is to analyze in more detail the relation between the variables under study, the profit and the realized sales, and how to minimize the standard errors of the independent variable involved in this study, the level of realized sales. The statistical methods that we have applied in our work are the Edgeworth Approximation for Independent and Identically Distributed (IID) cases, the Bootstrap version of the Model and the Edgeworth approximation for the Bootstrap Quantile Regression Model. The graphics and the results that we have presented here identify the best approximating model of our study.

Keywords: Bootstrap, Edgeworth approximation, independent and Identical distributed, quantile.

Downloads: 376
2874 Methods for Data Selection in Medical Databases: The Binary Logistic Regression -Relations with the Calculated Risks

Authors: Cristina G. Dascalu, Elena Mihaela Carausu, Daniela Manuc

Abstract:

Medical studies often require different methods for parameter selection as a second step of processing, after the database has been designed and filled with information. One common task is the selection of fields that act as risk factors using well-known methods, in order to find the most relevant risk factors and to establish a possible hierarchy among them. Different methods are available for this purpose, one of the best known being binary logistic regression. We will present the mathematical principles of this method and a practical example of using it in the analysis of the influence of 10 different psychiatric diagnostics on 4 different types of offences (in a database of 289 psychiatric patients involved in different types of offences). Finally, we will make some observations about the relation between the risk factor hierarchy established through binary logistic regression and the individual risks, as well as the results of the Chi-squared test. We will show that the hierarchy built using binary logistic regression does not agree with the direct order of risk factors, even though it would be natural to assume that it always does.

Keywords: Databases, risk factors, binary logistic regression, hierarchy.

Downloads: 1289
2873 Second Order Admissibilities in Multi-parameter Logistic Regression Model

Authors: Chie Obayashi, Hidekazu Tanaka, Yoshiji Takagi

Abstract:

In multi-parameter family of distributions, conditions for a modified maximum likelihood estimator to be second order admissible are given. Applying these results to the multi-parameter logistic regression model, it is shown that the maximum likelihood estimator is always second order inadmissible. Also, conditions for the Berkson estimator to be second order admissible are given.

Keywords: Berkson estimator, modified maximum likelihood estimator, Multi-parameter logistic regression model, second order admissibility.

Downloads: 1578
2872 Innovative Methods of Improving Train Formation in Freight Transport

Authors: Jaroslav Masek, Juraj Camaj, Eva Nedeliakova

Abstract:

The paper focuses on an operational model for transporting single wagon consignments on a railway network using two different models of train formation. The paper gives an overview of possibilities for improving the quality of transport services. It deals with two models used in the train formation problem: time-continuous and time-discrete. By applying these models in practice, the transport company can guarantee a higher quality of service and expect an increase in transport performance. The models are also applicable to other transport networks. The models supplement the theoretical problem of train formation with new ways of looking at how to influence the organization of wagon flows.

Keywords: Train formation, wagon flows, marshalling yard, railway technology.

Downloads: 1971
2871 Analyzing the Factors Influencing Exclusive Breastfeeding Using the Generalized Poisson Regression Model

Authors: Cheika Jahangeer, Naushad Mamode Khan, Maleika Heenaye-Mamode Khan

Abstract:

Exclusive breastfeeding is the feeding of a baby on no other milk apart from breast milk. Exclusive breastfeeding during the first 6 months of life is of fundamental importance because it supports optimal growth and development during infancy and reduces the risk of obliterating diseases and problems. Moreover, in developed countries, exclusive breastfeeding has decreased the incidence and/or severity of diarrhea, lower respiratory infection and urinary tract infection. In this paper, we study the factors that influence exclusive breastfeeding and use the Generalized Poisson regression model to analyze the practices of exclusive breastfeeding in Mauritius. We develop two sets of quasi-likelihood equations (QLE) to estimate the parameters.

Keywords: Exclusive breastfeeding, Regression model, Quasi-likelihood.

Downloads: 1743
2870 Predictive Analysis for Big Data: Extension of Classification and Regression Trees Algorithm

Authors: Ameur Abdelkader, Abed Bouarfa Hafida

Abstract:

Since its inception, predictive analysis has revolutionized the IT industry through its robustness and decision-making facilities. It involves the application of a set of data processing techniques and algorithms in order to create predictive models. Its principle is based on finding relationships between explanatory variables and the predicted variables. Past occurrences are exploited to predict and to derive the unknown outcome. With the advent of big data, many studies have suggested the use of predictive analytics in order to process and analyze big data. Nevertheless, they have been curbed by the limits of classical methods of predictive analysis in case of a large amount of data. In fact, because of their volumes, their nature (semi or unstructured) and their variety, it is impossible to analyze efficiently big data via classical methods of predictive analysis. The authors attribute this weakness to the fact that predictive analysis algorithms do not allow the parallelization and distribution of calculation. In this paper, we propose to extend the predictive analysis algorithm, Classification And Regression Trees (CART), in order to adapt it for big data analysis. The major changes of this algorithm are presented and then a version of the extended algorithm is defined in order to make it applicable for a huge quantity of data.

Keywords: Predictive analysis, big data, predictive analysis algorithms, CART algorithm.

Downloads: 1016
2869 Energy Loss at Drops using Neuro Solutions

Authors: Farzin Salmasi

Abstract:

Energy dissipation in drops has been investigated using physical models. After the parameters that affect the phenomenon were determined, three drops with different heights were constructed from Plexiglas. They were installed in two existing flumes in the hydraulic laboratory. Several runs of the physical models were undertaken to measure the parameters required to determine the energy dissipation. Results showed that the energy dissipation in drops depends on the drop height and discharge. Predicted relative energy dissipations varied from 10.0% to 94.3%. This work has also indicated that the energy loss at a drop is mainly due to the mixing of the jet with the pool behind the jet, which causes air bubble entrainment in the flow. A statistical model developed to predict the energy dissipation in vertical drops indicates a nonlinear correlation between the effective parameters. Further, an artificial neural network (ANN) approach was used in this paper to develop an explicit procedure for calculating energy loss at drops using NeuroSolutions. The trained network was able to predict the response with R² and RMSE of 0.977 and 0.0085, respectively. The ANN was found to perform effectively when compared to regression equations in predicting the energy loss.

Keywords: Air bubble, drop, energy loss, hydraulic jump, NeuroSolutions

Downloads: 1595
2868 Speaker Independent Quranic Recognizer Based on Maximum Likelihood Linear Regression

Authors: Ehab Mourtaga, Ahmad Sharieh, Mousa Abdallah

Abstract:

An automatic speech recognition system for the formal Arabic language is needed. The Quran is the most formal spoken book in Arabic, and it is spoken all over the world. In this research, a Quran-based, speaker-independent automatic speech recognizer was developed and tested. The system was developed based on the tri-phone Hidden Markov Model and Maximum Likelihood Linear Regression (MLLR). The MLLR computes a set of transformations which reduces the mismatch between an initial model set and the adaptation data. It uses a regression class tree and estimates a set of linear transformations for the mean and variance parameters of a Gaussian mixture HMM system. The 30th Chapter of the Quran, with five of the most famous readers of the Quran, was used for the training and testing of the data. The chapter includes about 2000 distinct words. The advantages of using the Quranic verses as the database in this developed recognizer are the uniqueness of the words and the high level of orderliness between verses. The level of accuracy on the tested data ranged from 68% to 85%.

Keywords: Hidden Markov Model (HMM), Maximum Likelihood Linear Regression (MLLR), Quran, Regression Class Tree, Speech Recognition, Speaker-independent.

Downloads: 1873
2867 Prediction on Housing Price Based on Deep Learning

Authors: Li Yu, Chenlu Jiao, Hongrun Xin, Yan Wang, Kaiyang Wang

Abstract:

In order to study the impact of various factors on the housing price, we propose to build different prediction models based on deep learning that use the existing real estate data to predict more accurately the housing price or its future trend. Considering that the factors which affect the housing price vary widely, the proposed prediction models fall into two categories. The first is based on multiple characteristic factors of the real estate. We built a Convolutional Neural Network (CNN) prediction model and a Long Short-Term Memory (LSTM) neural network prediction model based on deep learning, and a logical regression model was implemented so that the three models could be compared. The second category is the time series model. Based on deep learning, we propose an LSTM-1 model that relies purely on the time series, and then implement and compare the LSTM model and the Auto-Regressive and Moving Average (ARMA) model. In this paper, a comprehensive study of second-hand housing prices in Beijing has been conducted in three stages: crawling and analyzing the data, predicting the housing price, and comparing the results. Ultimately the best model program was produced, which is of great significance for the evaluation and prediction of housing prices in the real estate industry.

Keywords: Deep learning, convolutional neural network, LSTM, housing prediction.

Downloads: 4889
2866 Data Annotation Models and Annotation Query Language

Authors: Neerja Bhatnagar, Benjoe A. Juliano, Renee S. Renner

Abstract:

This paper presents data annotation models at five levels of granularity (database, relation, column, tuple, and cell) of relational data, addressing the fact that most relational databases are poorly suited to expressing annotations. These models require no structural or schema changes to the underlying database. They are also flexible, extensible, customizable, database-neutral, and platform-independent. This paper also presents an SQL-like query language, named Annotation Query Language (AnQL), for querying annotation documents. AnQL is simple to understand and builds on the widespread existing knowledge of SQL.

Keywords: annotation query language, data annotations, data annotation models, semantic data annotations

Downloads 2313
2865 Multi-Linear Regression Based Prediction of Mass Transfer by Multiple Plunging Jets

Authors: S. Deswal, M. Pal

Abstract:

The paper compares the performance of vertical and inclined multiple plunging jets and models and predicts their mass transfer capacity using a multi-linear regression approach. The multiple vertical plunging jets have a jet impact angle of θ = 90°, whereas the multiple inclined plunging jets have a jet impact angle of θ = 60°. The results suggest that mass transfer is higher for multiple jets, and that inclined multiple plunging jets achieve up to 1.6 times higher mass transfer than vertical multiple plunging jets under similar conditions. The relationship derived by multi-linear regression successfully predicted the volumetric mass transfer coefficient (KLa) from the operational parameters of multiple plunging jets, with a correlation coefficient of 0.973, a root mean square error of 0.002, and a coefficient of determination of 0.946. The predicted overall mass transfer coefficient is thus in good agreement with the experimental values, which suggests that the derived relationship can be successfully employed in modeling mass transfer by multiple plunging jets.
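
The three goodness-of-fit figures quoted above can be illustrated with a small, standard-library-only sketch of multi-linear regression. Everything here is an assumption made for illustration: the data are synthetic, the two predictors are placeholders rather than the paper's operational parameters, and the function names (`fit_mlr`, `predict`, `metrics`) are hypothetical.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(xs, ys):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(x) for x in xs]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)


def predict(beta, xs):
    """Evaluate the fitted linear model at each predictor tuple."""
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], x)) for x in xs]


def metrics(ys, preds):
    """Return (correlation coefficient, RMSE, coefficient of determination)."""
    n = len(ys)
    mean_y, mean_p = sum(ys) / n, sum(preds) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_pred = sum((p - mean_p) ** 2 for p in preds)
    cov = sum((y - mean_y) * (p - mean_p) for y, p in zip(ys, preds))
    r = cov / math.sqrt(ss_tot * ss_pred)
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    return r, rmse, r2


# Synthetic data only: two placeholder predictors and a noisy linear response.
random.seed(0)
xs = [(random.uniform(0, 1), random.uniform(0, 1)) for _ in range(50)]
ys = [0.5 + 2.0 * x1 - 1.0 * x2 + random.gauss(0, 0.1) for x1, x2 in xs]

beta = fit_mlr(xs, ys)
r, rmse, r2 = metrics(ys, predict(beta, xs))
```

For a least-squares fit with an intercept, evaluated on the data it was fitted to, the squared correlation between predicted and observed values equals the coefficient of determination. The reported pair is consistent with this to within rounding, since 0.973² ≈ 0.947.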

Keywords: Mass transfer, multiple plunging jets, multi-linear regression.

Downloads 2151
2864 Ordinal Regression with Fenton-Wilkinson Order Statistics: A Case Study of an Orienteering Race

Authors: Joonas Pääkkönen

Abstract:

In sports, individuals and teams are typically interested in final rankings. Final results, such as times or distances, dictate these rankings, also known as places. Places can be further associated with ordered random variables, commonly referred to as order statistics. In this work, we introduce a simple, yet accurate order statistical ordinal regression function that predicts relay race places with changeover-times. We call this function the Fenton-Wilkinson Order Statistics model. This model is built on the following educated assumption: individual leg-times follow log-normal distributions. Moreover, our key idea is to utilize Fenton-Wilkinson approximations of changeover-times alongside an estimator for the total number of teams as in the notorious German tank problem. This original place regression function is sigmoidal and thus correctly predicts the existence of a small number of elite teams that significantly outperform the rest of the teams. Our model also describes how place increases linearly with changeover-time at the inflection point of the log-normal distribution function. With real-world data from Jukola 2019, a massive orienteering relay race, the model is shown to be highly accurate even when the size of the training set is only 5% of the whole data set. Numerical results also show that our model exhibits smaller place prediction root-mean-square-errors than linear regression, mord regression and Gaussian process regression.
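
The two named ingredients can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's model: leg-times are taken to be independent, the Fenton-Wilkinson step is the usual moment-matching of a log-normal to a sum of log-normals, the team-count estimator is the classical minimum-variance unbiased German tank estimator, and `predicted_place` (team count times the log-normal distribution function) is only one plausible reading of the sigmoidal place function described above. All function names and numbers are hypothetical.

```python
import math


def fenton_wilkinson(mus, sigmas):
    """Match a single log-normal to a sum of independent log-normals.

    Returns (mu, sigma) of the log-normal whose mean and variance equal
    those of the sum.
    """
    mean = sum(math.exp(m + s * s / 2) for m, s in zip(mus, sigmas))
    var = sum((math.exp(s * s) - 1) * math.exp(2 * m + s * s)
              for m, s in zip(mus, sigmas))
    sigma2 = math.log(1 + var / mean ** 2)
    mu = math.log(mean) - sigma2 / 2
    return mu, math.sqrt(sigma2)


def lognormal_cdf(t, mu, sigma):
    """Distribution function of a log-normal with log-scale parameters mu, sigma."""
    if t <= 0:
        return 0.0
    return 0.5 * (1 + math.erf((math.log(t) - mu) / (sigma * math.sqrt(2))))


def german_tank_estimate(max_seen, sample_size):
    """Classical estimate of a population size from the largest sampled rank."""
    return max_seen * (1 + 1 / sample_size) - 1


def predicted_place(t, mu, sigma, n_teams):
    """Assumed place function: expected number of teams with changeover-time <= t."""
    return n_teams * lognormal_cdf(t, mu, sigma)


# Hypothetical three-leg example (times in minutes, parameters on the log scale).
leg_mus = [3.7, 3.8, 3.9]
leg_sigmas = [0.2, 0.2, 0.25]
mu, sigma = fenton_wilkinson(leg_mus, leg_sigmas)
n_teams = german_tank_estimate(max_seen=1500, sample_size=75)
place = predicted_place(130.0, mu, sigma, n_teams)
```

Under this reading the place curve inherits the S-shape of the log-normal distribution function, which matches the abstract's description of a few elite teams separated from the bulk of the field.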

Keywords: Fenton-Wilkinson approximation, German tank problem, log-normal distribution, order statistics, ordinal regression, orienteering, sports analytics, sports modeling.

Downloads 759
2863 Improve Safety Performance of Un-Signalized Intersections in Oman

Authors: Siham G. Farag

Abstract:

The main objective of this paper is to provide a new methodology for road safety assessment in Oman through the development of suitable accident prediction models. The generalized linear model (GLM) technique with Poisson or negative binomial regression (NBR), implemented in the SAS package, was used to develop these models. The paper used accident data from 31 un-signalized T-intersections over three years. Five goodness-of-fit measures were used to assess the overall quality of the developed models. Two types of models were developed separately: flow-based models that include only traffic exposure functions, and full models that contain both exposure functions and other significant geometry and traffic variables. The results show that traffic exposure functions produced a much better fit to the accident data. The most effective geometric variables were major-road mean speed, minor-road 85th percentile speed, major-road lane width, distance to the nearest junction, and right-turn curb radius. The developed models can be used for intersection treatment or upgrading and to specify appropriate design parameters for T-intersections. Finally, the models presented in this work reflect the intersection conditions in Oman and could represent typical conditions in several countries in the Middle East, especially the Gulf countries.
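
As background on the choice between the two count models, the general GLM forms are given below. The first three lines are standard definitions. The last line is a functional form commonly used for flow-based intersection models and is an assumption here, since the abstract does not state the exposure function; the flow symbols Q_major and Q_minor are likewise introduced for illustration.

```latex
% Log-link GLM for the expected accident count at intersection i
\ln \mu_i = \beta_0 + \sum_{j} \beta_j x_{ij}

% Poisson: variance equals the mean
\operatorname{Var}(Y_i) = \mu_i

% Negative binomial: extra dispersion parameter \alpha \ge 0
\operatorname{Var}(Y_i) = \mu_i + \alpha \mu_i^{2}

% Assumed flow-based form (exposure only)
\mu_i = e^{\beta_0} \, Q_{\text{major},i}^{\beta_1} \, Q_{\text{minor},i}^{\beta_2}
```

The negative binomial model reduces to the Poisson model as α approaches 0, so it is the usual choice when accident counts are overdispersed.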

Keywords: Accidents Prediction Models (APMs), Generalized Linear Model (GLM), T-intersections, Oman.

Downloads 2013
2862 Drainage Prediction for Dam using Fuzzy Support Vector Regression

Authors: S. Wiriyarattanakun, A. Ruengsiriwatanakun, S. Noimanee

Abstract:

Drainage estimation is an important factor in dam management. In this paper, we use fuzzy support vector regression (FSVR) to predict the drainage of the Sirikrit Dam in Uttaradit province, Thailand. The results show that FSVR is a suitable method for drainage estimation.

Keywords: Drainage Estimation, Prediction.

Downloads 1221
2861 Sensitivity Analysis for Determining Priority of Factors Controlling SOC Content in Semiarid Condition of West of Iran

Authors: Y. Parvizi, M. Gorji, M.H. Mahdian, M. Omid

Abstract:

Soil organic carbon (SOC) plays a key role in soil fertility, hydrology, and contaminant control, and acts as a sink or source of terrestrial carbon that can affect the concentration of atmospheric CO2. SOC supports the sustainability and quality of ecosystems, especially in semi-arid regions. This study was conducted to determine the relative importance of 13 different exploratory climatic, soil, and geometric factors for SOC content in one of the semiarid watershed zones in Iran. Two methods, canonical discriminant analysis (CDA) and feed-forward back-propagation neural networks, were used to predict SOC. Stepwise regression and sensitivity analysis were performed to identify the relative importance of the exploratory variables. Results from the sensitivity analysis showed that the 7-2-1 neural network and the 5-input CDA model have the highest predictive ability, explaining 70% and 65% of SOC variability, respectively. Since the neural network models outperformed the CDA model, they should be preferred for estimating SOC.

Keywords: Soil organic carbon, modeling, neural networks, CDA.

Downloads 1390