Search results for: empirical bayesian kriging regression prediction
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7550

7160 Using High Performance Computing for Online Flood Monitoring and Prediction

Authors: Stepan Kuchar, Martin Golasowski, Radim Vavrik, Michal Podhoranyi, Boris Sir, Jan Martinovic

Abstract:

The main goal of this article is to describe the online flood monitoring and prediction system Floreon+ primarily developed for the Moravian-Silesian region in the Czech Republic and the basic process it uses for running automatic rainfall-runoff and hydrodynamic simulations along with their calibration and uncertainty modeling. It takes a long time to execute such process sequentially, which is not acceptable in the online scenario, so the use of high-performance computing environment is proposed for all parts of the process to shorten their duration. Finally, a case study on the Ostravice river catchment is presented that shows actual durations and their gain from the parallel implementation.

Keywords: flood prediction process, high performance computing, online flood prediction system, parallelization

Procedia PDF Downloads 487
7159 A Stochastic Model to Predict Earthquake Ground Motion Duration Recorded in Soft Soils Based on Nonlinear Regression

Authors: Issam Aouari, Abdelmalek Abdelhamid

Abstract:

For seismologists, the characterization of seismic demand should include the amplitude and duration of strong shaking in the system. The duration of ground shaking is one of the key parameters in earthquake-resistant design of structures. This paper proposes a nonlinear statistical model to estimate earthquake ground motion duration in soft soils using multiple seismicity indicators. Three definitions of ground motion duration proposed in the literature have been applied. Through a comparative study, we select the most significant definition to use for predicting the duration. A stochastic model is presented for the McCann and Shah method using nonlinear regression analysis based on a data set for moment magnitude, source-to-site distance and site conditions. The data set is taken from the PEER strong motion databank and contains shallow earthquakes from different regions of the world: America, Turkey, London, China, Italy, Chile, Mexico, etc. The main emphasis is placed on the soft site condition. The predictive relationship has been developed based on 600 records and three input indicators. Results have been compared with other published models. It has been found that the proposed model can predict earthquake ground motion duration in soft soils for different regions and site conditions.

Keywords: duration, earthquake, prediction, regression, soft soil

Procedia PDF Downloads 149
7158 Effect of Leadership Style on Organizational Performance

Authors: Khadija Mushtaq, Mian Saqib Mehmood

Abstract:

This paper attempts to determine the impact of leadership style and learning orientation on organizational performance in Pakistan. A sample of 158 middle managers was selected from sports and surgical factories in Sialkot. The empirical estimation is based on a multiple linear regression analysis of the relationship between leadership style, learning orientation and organizational performance. Leadership style is measured through transformational leadership and transactional leadership. Transformational leadership has an insignificant impact on organizational performance. Transactional leadership has a positive and significant relation with organizational performance. Learning orientation also has a positive and significant relation with organizational performance. Linear regression was used to estimate the relation between the dependent and independent variables. This study suggests that top managers should prefer a continuous process of improvement for any change in the system rather than radical change.

Keywords: transformational leadership, transactional leadership, learning orientation, organizational performance, Pakistan

Procedia PDF Downloads 400
7157 An Epsilon Hierarchical Fuzzy Twin Support Vector Regression

Authors: Arindam Chaudhuri

Abstract:

The research presents epsilon-hierarchical fuzzy twin support vector regression (epsilon-HFTSVR) based on epsilon-fuzzy twin support vector regression (epsilon-FTSVR) and epsilon-twin support vector regression (epsilon-TSVR). Epsilon-FTSVR is achieved by incorporating trapezoidal fuzzy numbers into epsilon-TSVR, which takes care of the uncertainty existing in forecasting problems. Epsilon-FTSVR determines a pair of epsilon-insensitive proximal functions by solving two related quadratic programming problems. The structural risk minimization principle is implemented by introducing a regularization term in the primal problems of epsilon-FTSVR. This yields stable, positive definite dual problems, which improves regression performance. Epsilon-FTSVR is then reformulated as epsilon-HFTSVR, consisting of a set of hierarchical layers each containing epsilon-FTSVR. Experimental results on both synthetic and real datasets reveal that epsilon-HFTSVR has remarkable generalization performance with minimum training time.

Keywords: regression, epsilon-TSVR, epsilon-FTSVR, epsilon-HFTSVR

Procedia PDF Downloads 367
7156 Data Refinement Enhances The Accuracy of Short-Term Traffic Latency Prediction

Authors: Man Fung Ho, Lap So, Jiaqi Zhang, Yuheng Zhao, Huiyang Lu, Tat Shing Choi, K. Y. Michael Wong

Abstract:

Nowadays, a tremendous amount of data is available in the transportation system, enabling the development of various machine learning approaches to make short-term latency predictions. A natural question is then the choice of relevant information to enable accurate predictions. Using traffic data collected from the Taiwan Freeway System, we consider the prediction of short-term latency of a freeway segment with a length of 17 km covering 5 measurement points, each collecting vehicle-by-vehicle data through the electronic toll collection system. The processed data include the past latencies of the freeway segment with different time lags, the traffic conditions of the individual segments (the accumulations, the traffic fluxes, the entrance and exit rates), the total accumulations, and the weekday latency profiles obtained by Gaussian process regression of past data. We arrive at several important conclusions about how data should be refined to obtain accurate predictions, which have implications for future system-wide latency predictions. (1) We find that the prediction of median latency is much more accurate and meaningful than the prediction of average latency, as the latter is plagued by outliers. This is verified by machine-learning prediction using XGBoost that yields a 35% improvement in the mean square error of the 5-minute averaged latencies. (2) We find that the median latency of the segment 15 minutes ago is a very good baseline for performance comparison, and we have evidence that further improvement is achieved by machine learning approaches such as XGBoost and Long Short-Term Memory (LSTM). (3) By analyzing the feature importance score in XGBoost and calculating the mutual information between the inputs and the latencies to be predicted, we identify a sequence of inputs ranked in importance. 
This ranking confirms that the past latencies are most informative of the predicted latencies, followed by the total accumulation, whereas inputs such as the entrance and exit rates are uninformative. It also confirms that the inputs are much less informative of the average latencies than of the median latencies. (4) For predicting the latencies of segments composed of two or three sub-segments, summing up the predicted latencies of each sub-segment is more accurate than the one-step prediction of the whole segment, especially when the latency prediction of the downstream sub-segments is trained to anticipate latencies several minutes ahead. The duration of the anticipation time is an increasing function of the traveling time of the upstream segment. The above findings have important implications for predicting the full set of latencies among the various locations in the freeway system.
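
Findings (1) and (2) can be illustrated with a minimal sketch. All numbers below are hypothetical and are not taken from the Taiwan Freeway System data; the helper name `persistence_forecast` is also an assumption made for illustration. The sketch shows why a single slow vehicle distorts the average latency of a window far more than the median, and what a "latency 15 minutes ago" baseline looks like when windows are 5 minutes long.

```python
from statistics import mean, median

# Hypothetical per-vehicle travel times (seconds) in one 5-minute window.
# The last value is an outlier, e.g. a vehicle that stopped on the way.
latencies = [610, 598, 605, 620, 615, 2400]

window_mean = mean(latencies)      # pulled upward by the single outlier
window_median = median(latencies)  # stays close to the typical vehicle


def persistence_forecast(history, lag_steps):
    """Baseline forecast for the next, not yet observed window.

    Returns the value observed `lag_steps` windows before that window.
    """
    return history[-lag_steps]


# Hypothetical median latencies of consecutive 5-minute windows, so a
# 15-minute lag corresponds to 3 windows.
median_history = [600, 604, 611, 618, 625]
baseline = persistence_forecast(median_history, lag_steps=3)

print(window_mean, window_median, baseline)
```

Any learned model, such as the XGBoost or LSTM predictors named in the abstract, would then be judged by how much it improves on this baseline.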

Keywords: data refinement, machine learning, mutual information, short-term latency prediction

Procedia PDF Downloads 166
7155 A Case Study of Control of Blast-Induced Ground Vibration on Adjacent Structures

Authors: H. Mahdavinezhad, M. Labbaf, H. R. Tavakoli

Abstract:

In recent decades, the study and control of the destructive effects of blast-induced vibration in construction projects have received more attention, and several experimental equations for vibration prediction, as well as allowable vibration limits for various structures, have been presented. Researchers have developed a number of experimental equations to estimate the peak particle velocity (PPV), in which the experimental constants must be obtained at the site of the explosion by fitting the data from experimental explosions. In this study, the most important of these equations were evaluated for strong massive conglomerates around Dez Dam by collecting data on explosions, including 30 particle velocities, 27 displacements, 27 vibration frequencies and 27 accelerations of ground vibration at different distances; they were recorded for two types of detonation systems, NUNEL and electric. Analysis showed that the data from the explosions had the best correlation with the cube root of the explosive charge, R²=0.8636, but overall the correlation coefficients are not much different. To estimate the vibration in this project, data regression was performed in the other formats, which resulted in a new equation with a correlation coefficient of R²=0.904. Finally, given the importance of the studied structures and in order to ensure that adjacent structures are not damaged, a range of application was defined for each diagram: for distances of 0 to 70 m from the blast site, an exponent of n=0.33 was suggested, and for distances of more than 70 m, n=0.66.

Keywords: blasting, blast-induced vibration, empirical equations, PPV, tunnel

Procedia PDF Downloads 126
7154 Life Prediction of Condenser Tubes Applying Fuzzy Logic and Neural Network Algorithms

Authors: A. Majidian

Abstract:

The life prediction of thermal power plant components is necessary to prevent unexpected outages, optimize maintenance tasks in periodic overhauls and plan inspection tasks with their schedules. One of the most critical components in a power plant is the condenser, because its failure can affect many other components positioned downstream of it. This paper deals with the factors affecting the life of the condenser. The dependency of failure rates on these factors has been investigated using Artificial Neural Network (ANN) and fuzzy logic algorithms. These algorithms have shown their capabilities as dynamic tools for the life prediction of power plant equipment.

Keywords: life prediction, condenser tube, neural network, fuzzy logic

Procedia PDF Downloads 347
7153 Statistical Model of Water Quality in Estero El Macho, Machala-El Oro

Authors: Rafael Zhindon Almeida

Abstract:

Surface water quality is an important concern for the evaluation and prediction of water quality conditions. The objective of this study is to develop a statistical model that can accurately predict the water quality of the El Macho estuary in the city of Machala, El Oro province. The methodology employed in this study is of a basic type that involves a thorough search for theoretical foundations to improve the understanding of statistical modeling for water quality analysis. The research design is correlational, using a multivariate statistical model involving multiple linear regression and principal component analysis. The results indicate that water quality parameters such as fecal coliforms, biochemical oxygen demand, chemical oxygen demand, iron and dissolved oxygen exceed the allowable limits. The water of the El Macho estuary is determined to be below the required water quality criteria. The multiple linear regression model, based on chemical oxygen demand and total dissolved solids, explains 99.9% of the variance of the dependent variable. In addition, principal component analysis shows that the model has an explanatory power of 86.242%. The study successfully developed a statistical model to evaluate the water quality of the El Macho estuary. The estuary did not meet the water quality criteria, with several parameters exceeding the allowable limits. The multiple linear regression model and principal component analysis provide valuable information on the relationship between the various water quality parameters. The findings of the study emphasize the need for immediate action to improve the water quality of the El Macho estuary to ensure the preservation and protection of this valuable natural resource.

Keywords: statistical modeling, water quality, multiple linear regression, principal components, statistical models

Procedia PDF Downloads 92
7152 Monitoring Large-Coverage Forest Canopy Height by Integrating LiDAR and Sentinel-2 Images

Authors: Xiaobo Liu, Rakesh Mishra, Yun Zhang

Abstract:

Continuous monitoring of forest canopy height with large coverage is essential for obtaining forest carbon stocks and emissions, quantifying biomass estimation, analyzing vegetation coverage, and determining biodiversity. LiDAR can be used to collect accurate woody vegetation structure such as canopy height. However, LiDAR’s coverage is usually limited because of its high cost and limited maneuverability, which constrains its use for dynamic and large area forest canopy monitoring. On the other hand, optical satellite images, like Sentinel-2, have the ability to cover large forest areas with a high repeat rate, but they do not have height information. Hence, exploring the solution of integrating LiDAR data and Sentinel-2 images to enlarge the coverage of forest canopy height prediction and increase the prediction repeat rate has been an active research topic in the environmental remote sensing community. In this study, we explore the potential of training a Random Forest Regression (RFR) model and a Convolutional Neural Network (CNN) model, respectively, to develop two predictive models for predicting and validating the forest canopy height of the Acadia Forest in New Brunswick, Canada, with a 10m ground sampling distance (GSD), for the year 2018 and 2021. Two 10m airborne LiDAR-derived canopy height models, one for 2018 and one for 2021, are used as ground truth to train and validate the RFR and CNN predictive models. To evaluate the prediction performance of the trained RFR and CNN models, two new predicted canopy height maps (CHMs), one for 2018 and one for 2021, are generated using the trained RFR and CNN models and 10m Sentinel-2 images of 2018 and 2021, respectively. The two 10m predicted CHMs from Sentinel-2 images are then compared with the two 10m airborne LiDAR-derived canopy height models for accuracy assessment. 
The validation results show that the mean absolute error (MAE) for year 2018 of the RFR model is 2.93m, CNN model is 1.71m; while the MAE for year 2021 of the RFR model is 3.35m, and the CNN model is 3.78m. These demonstrate the feasibility of using the RFR and CNN models developed in this research for predicting large-coverage forest canopy height at 10m spatial resolution and a high revisit rate.

Keywords: remote sensing, forest canopy height, LiDAR, Sentinel-2, artificial intelligence, random forest regression, convolutional neural network

Procedia PDF Downloads 86
7151 Electrical Machine Winding Temperature Estimation Using Stateful Long Short-Term Memory Networks (LSTM) and Truncated Backpropagation Through Time (TBPTT)

Authors: Yujiang Wu

Abstract:

As electrical machine (e-machine) power density requirements become more stringent in vehicle electrification, mounting a temperature sensor for e-machine stator windings becomes increasingly difficult. This can lead to higher manufacturing costs, complicated harnesses, and reduced reliability. In this paper, we propose a deep-learning method for predicting electric machine winding temperature, which can either replace the sensor entirely or serve as a backup to the existing sensor. We compare the performance of our method, the stateful long short-term memory networks (LSTM) with truncated backpropagation through time (TBPTT), with that of linear regression, as well as stateless LSTM with and without residual connection. Our results demonstrate the strength of combining stateful LSTM and TBPTT in tackling nonlinear time series prediction problems with long sequence lengths. Additionally, in industrial applications, high-temperature region prediction accuracy is more important because winding temperature sensing is typically used for derating machine power when the temperature is high. To evaluate the performance of our algorithm, we developed a temperature-stratified MSE. We propose a simple but effective data preprocessing trick to improve the high-temperature region prediction accuracy. Our experimental results demonstrate the effectiveness of our proposed method in accurately predicting winding temperature, particularly in high-temperature regions, while also reducing manufacturing costs and improving reliability.
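
The abstract does not define the temperature-stratified MSE, so the following is only one plausible reading, sketched under stated assumptions: samples are binned by their true temperature, the MSE is computed per bin, and the bins are averaged with equal weight so that rare high-temperature samples are not swamped by the many low-temperature ones. The function name, the bin edge and the data are all hypothetical.

```python
def stratified_mse(y_true, y_pred, bin_edges):
    """Equal-weight average of per-bin MSE, with bins defined on y_true.

    `bin_edges` are ascending inner boundaries; a value falls into bin i,
    where i is the number of edges that are <= the value.
    Assumption: this is an illustrative definition, not the paper's.
    """
    if not y_true:
        raise ValueError("y_true must not be empty")
    sums = [0.0] * (len(bin_edges) + 1)
    counts = [0] * (len(bin_edges) + 1)
    for t, p in zip(y_true, y_pred):
        i = sum(1 for edge in bin_edges if t >= edge)
        sums[i] += (t - p) ** 2
        counts[i] += 1
    # Skip empty bins so they do not distort the average.
    per_bin = [s / c for s, c in zip(sums, counts) if c > 0]
    return sum(per_bin) / len(per_bin)


# Hypothetical winding temperatures (deg C): three cool samples predicted
# well, one hot sample predicted poorly.
y_true = [40, 50, 60, 150]
y_pred = [41, 49, 61, 140]

plain = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
stratified = stratified_mse(y_true, y_pred, bin_edges=[100])
print(plain, stratified)
```

In this toy case the stratified value is roughly twice the plain MSE, because the single hot sample carries half of the total weight instead of a quarter.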

Keywords: deep learning, electrical machine, functional safety, long short-term memory networks (LSTM), thermal management, time series prediction

Procedia PDF Downloads 95
7150 Foreign Direct Investment on Economic Growth by Industries in Central and Eastern European Countries

Authors: Shorena Pharjiani

Abstract:

The present empirical paper investigates the relationship between FDI and economic growth by 10 selected industries in 10 Central and Eastern European countries over the period 1995 to 2012. Different estimation approaches were used to explore the connection between FDI and economic growth, for example OLS, RE, and FE with and without time dummies. The empirical results lead to several main conclusions. First, the Central and East European countries (CEEC) attracted foreign direct investment, which raised the productivity of the industries it entered. The linkage between FDI and output growth by industries is positive and significant enough to suggest that foreign firms' participation enhanced the productivity of the industries they occupied. There was an endogeneity problem in the regression, and a fixed effects estimation approach was used, which partially corrected the regression analysis and made the results less biased. Second, the results show that time plays an important role in making FDI operational for enhancing output growth by industries via total factor productivity. Third, R&D positively affected economic growth, although it takes some time for research and development to influence economic growth. Fourth, the general trends masked crucial differences at the country level: over the last 20 years, the analysis of the tables and figures at the country level shows that the main recipients of FDI among the 11 Central and Eastern European countries were Hungary, Poland and the Czech Republic. The main reason was that these countries had more open-door policies for attracting FDI. Fifth, according to the graphical analysis, while Hungary had the highest FDI inflow in this region, this was not reflected in GDP growth as much as in other Central and Eastern European countries.

Keywords: central and East European countries (CEEC), economic growth, FDI, panel data

Procedia PDF Downloads 232
7149 Rd-PLS Regression: From the Analysis of Two Blocks of Variables to Path Modeling

Authors: E. Tchandao Mangamana, V. Cariou, E. Vigneau, R. Glele Kakai, E. M. Qannari

Abstract:

A new definition of a latent variable associated with a dataset makes it possible to propose variants of the PLS2 regression and the multi-block PLS (MB-PLS). We shall refer to these variants as Rd-PLS regression and Rd-MB-PLS respectively because they are inspired by both Redundancy analysis and PLS regression. Usually, a latent variable t associated with a dataset Z is defined as a linear combination of the variables of Z with the constraint that the length of the loading weights vector equals 1. Formally, t=Zw with ‖w‖=1. Denoting by Z' the transpose of Z, we define herein, a latent variable by t=ZZ’q with the constraint that the auxiliary variable q has a norm equal to 1. This new definition of a latent variable entails that, as previously, t is a linear combination of the variables in Z and, in addition, the loading vector w=Z’q is constrained to be a linear combination of the rows of Z. More importantly, t could be interpreted as a kind of projection of the auxiliary variable q onto the space generated by the variables in Z, since it is collinear to the first PLS1 component of q onto Z. Consider the situation in which we aim to predict a dataset Y from another dataset X. These two datasets relate to the same individuals and are assumed to be centered. Let us consider a latent variable u=YY’q to which we associate the variable t= XX’YY’q. Rd-PLS consists in seeking q (and therefore u and t) so that the covariance between t and u is maximum. The solution to this problem is straightforward and consists in setting q to the eigenvector of YY’XX’YY’ associated with the largest eigenvalue. For the determination of higher order components, we deflate X and Y with respect to the latent variable t. Extending Rd-PLS to the context of multi-block data is relatively easy. Starting from a latent variable u=YY’q, we consider its ‘projection’ on the space generated by the variables of each block Xk (k=1, ..., K) namely, tk= XkXk'YY’q. 
Thereafter, Rd-MB-PLS seeks q in order to maximize the average of the covariances of u with tk (k=1, ..., K). The solution to this problem is given by q, eigenvector of YY’XX’YY’, where X is the dataset obtained by horizontally merging datasets Xk (k=1, ..., K). For the determination of latent variables of order higher than 1, we use a deflation of Y and Xk with respect to the variable t= XX’YY’q. In the same vein, extending Rd-MB-PLS to the path modeling setting is straightforward. Methods are illustrated on the basis of case studies and performance of Rd-PLS and Rd-MB-PLS in terms of prediction is compared to that of PLS2 and MB-PLS.
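
The step from "maximize the covariance between t and u" to "take the leading eigenvector" can be written out as follows. This is a sketch of the standard argument, not a quotation from the paper; it writes the transpose that the abstract denotes Z' as a superscript ⊤ and assumes n individuals with centered X and Y, so that the covariance is proportional to the inner product of t and u.

```latex
\begin{aligned}
u &= YY^{\top}q, \qquad t = XX^{\top}YY^{\top}q, \qquad \|q\| = 1,\\[4pt]
\operatorname{cov}(t,u) &\propto t^{\top}u
  = \bigl(XX^{\top}YY^{\top}q\bigr)^{\top}\,YY^{\top}q
  = q^{\top}\,YY^{\top}XX^{\top}YY^{\top}\,q,\\[4pt]
q^{\star} &= \arg\max_{\|q\|=1}\; q^{\top} M q,
  \qquad M = YY^{\top}XX^{\top}YY^{\top}.
\end{aligned}
```

Since M is symmetric and positive semi-definite, the maximizer of this Rayleigh quotient is the eigenvector of M associated with its largest eigenvalue, which matches the solution stated in the abstract.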

Keywords: multiblock data analysis, partial least squares regression, path modeling, redundancy analysis

Procedia PDF Downloads 146
7148 Wind Speed Prediction Using Passive Aggregation Artificial Intelligence Model

Authors: Tarek Aboueldahab, Amin Mohamed Nassar

Abstract:

Unlike conventional power plants, wind energy is a fluctuating energy source; thus, it is necessary to accurately predict short-term wind speed in order to integrate wind energy into the electricity supply structure. To do so, we present a hybrid artificial intelligence model for short-term wind speed prediction based on passive aggregation of particle swarm optimization and neural networks. As a result, a clear improvement in prediction accuracy is obtained compared to the standard artificial intelligence method.

Keywords: artificial intelligence, neural networks, particle swarm optimization, passive aggregation, wind speed prediction

Procedia PDF Downloads 444
7147 Allometric Models for Biomass Estimation in Savanna Woodland Area, Niger State, Nigeria

Authors: Abdullahi Jibrin, Aishetu Abdulkadir

Abstract:

The development of allometric models is crucial to accurate forest biomass/carbon stock assessment. The aim of this study was to develop a set of biomass prediction models that will enable the determination of total tree aboveground biomass for the savannah woodland area in Niger State, Nigeria. Based on the data collected through biometric measurements of 1816 trees and destructive sampling of 36 trees, five species-specific models and one site-specific model were developed. The sample size was distributed equally between the five most dominant species in the study site (Vitellaria paradoxa, Irvingia gabonensis, Parkia biglobosa, Anogeissus leiocarpus, Pterocarpus erinaceus). Firstly, the equations were developed for the five individual species. Secondly, these five species were pooled and used to develop an allometric equation for mixed species. Overall, there was a strong positive relationship between total tree biomass and stem diameter. Coefficients of determination (R² values) ranging from 0.93 to 0.99 (P < 0.001) were obtained for the models, with considerably low standard errors of the estimates (SEE), which confirms that total tree aboveground biomass has a significant relationship with the dbh. The F-test values for the biomass prediction models were also significant at p < 0.001, which indicates that the biomass prediction models are valid. This study recommends that, for improved biomass estimates in the study site, the site-specific biomass models should preferably be used instead of generic models.

Keywords: allometry, biomass, carbon stock, model, regression equation, woodland, inventory

Procedia PDF Downloads 442
7146 SNR Classification Using Multiple CNNs

Authors: Thinh Ngo, Paul Rad, Brian Kelley

Abstract:

Noise estimation is essential in today's wireless systems for power control, adaptive modulation, interference suppression and quality of service. Deep learning (DL) has already been applied in the physical layer for modulation and signal classification. An unacceptably low accuracy of less than 50% is found to undermine the traditional application of DL classification to SNR prediction. In this paper, we use a divide-and-conquer algorithm and a classifier fusion method to simplify SNR classification and therefore enhance DL learning and prediction. Specifically, multiple CNNs are used for classification rather than a single CNN. Each CNN performs a binary classification of a single SNR with two labels: "less than" and "greater than or equal". Together, the multiple CNNs are combined to effectively classify over the range −20 ≤ SNR ≤ 32 dB. We use pre-trained CNNs to predict SNR over a wide range of joint channel parameters, including multiple Doppler shifts (0, 60, 120 Hz), power-delay profiles, and signal-modulation types (QPSK, 16-QAM, 64-QAM). The approach achieves an individual SNR prediction accuracy of 92%, a composite accuracy of 70%, and prediction convergence one order of magnitude faster than that of traditional estimation.
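
The abstract does not spell out its fusion method, so the sketch below shows only one simple way that a bank of binary "SNR is at least this threshold" decisions could be combined into an SNR interval. The function name, the threshold grid and the counting rule are assumptions made for illustration; the CNNs themselves are replaced by hard-coded decisions.

```python
def fuse_binary_decisions(thresholds_db, is_at_least):
    """Combine per-threshold binary decisions into an SNR interval.

    thresholds_db: ascending SNR thresholds, one per binary classifier.
    is_at_least:   is_at_least[i] is True when classifier i decided
                   "SNR >= thresholds_db[i]".
    Returns (low, high), meaning low <= SNR < high; None marks an open end.
    Counting the positive votes tolerates an isolated inconsistent
    classifier. This rule is hypothetical, not the paper's method.
    """
    if len(thresholds_db) != len(is_at_least):
        raise ValueError("one decision per threshold is required")
    k = sum(1 for decision in is_at_least if decision)
    low = thresholds_db[k - 1] if k > 0 else None
    high = thresholds_db[k] if k < len(thresholds_db) else None
    return low, high


# Hypothetical threshold grid (dB) and classifier outputs.
thresholds = [-20, -10, 0, 10, 20]
interval = fuse_binary_decisions(thresholds, [True, True, True, False, False])
print(interval)
```

Splitting the task this way means each network only has to learn a two-class boundary, which is the divide-and-conquer idea the abstract describes.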

Keywords: classification, CNN, deep learning, prediction, SNR

Procedia PDF Downloads 131
7145 Winter Wheat Yield Forecasting Using Sentinel-2 Imagery at the Early Stages

Authors: Chunhua Liao, Jinfei Wang, Bo Shan, Yang Song, Yongjun He, Taifeng Dong

Abstract:

Winter wheat is one of the main crops in Canada. Forecasting of within-field variability of yield in winter wheat at the early stages is essential for precision farming. However, the crop yield modelling based on high spatial resolution satellite data is generally affected by the lack of continuous satellite observations, resulting in reducing the generalization ability of the models and increasing the difficulty of crop yield forecasting at the early stages. In this study, the correlations between Sentinel-2 data (vegetation indices and reflectance) and yield data collected by combine harvester were investigated and a generalized multivariate linear regression (MLR) model was built and tested with data acquired in different years. It was found that the four-band reflectance (blue, green, red, near-infrared) performed better than their vegetation indices (NDVI, EVI, WDRVI and OSAVI) in wheat yield prediction. The optimum phenological stage for wheat yield prediction with highest accuracy was at the growing stages from the end of the flowering to the beginning of the filling stage. The best MLR model was therefore built to predict wheat yield before harvest using Sentinel-2 data acquired at the end of the flowering stage. Further, to improve the ability of the yield prediction at the early stages, three simple unsupervised domain adaptation (DA) methods were adopted to transform the reflectance data at the early stages to the optimum phenological stage. The winter wheat yield prediction using multiple vegetation indices showed higher accuracy than using single vegetation index. The optimum stage for winter wheat yield forecasting varied with different fields when using vegetation indices, while it was consistent when using multispectral reflectance and the optimum stage for winter wheat yield prediction was at the end of flowering stage. The average testing RMSE of the MLR model at the end of the flowering stage was 604.48 kg/ha. 
Near the booting stage, the average testing RMSE of yield prediction using the best MLR was reduced to 799.18 kg/ha when applying the mean matching domain adaptation approach to transform the data to the target domain (at the end of the flowering) compared to that using the original data based on the models developed at the booting stage directly (“MLR at the early stage”) (RMSE =1140.64 kg/ha). This study demonstrated that the simple mean matching (MM) performed better than other DA methods and it was found that “DA then MLR at the optimum stage” performed better than “MLR directly at the early stages” for winter wheat yield forecasting at the early stages. The results indicated that the DA had a great potential in near real-time crop yield forecasting at the early stages. This study indicated that the simple domain adaptation methods had a great potential in crop yield prediction at the early stages using remote sensing data.
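
A minimal sketch of mean matching, assuming its simplest form: each reflectance band of the early-stage (source) data is shifted so that its mean equals the mean of the same band at the optimum stage (target). The function name and the reflectance values are hypothetical, and the paper's exact variant may differ.

```python
def mean_match(source_rows, target_rows):
    """Shift each source feature so its mean equals the target mean.

    Rows are samples and columns are features (e.g. band reflectances).
    Assumption: plain per-feature mean shifting without variance scaling.
    """
    n_features = len(source_rows[0])
    src_mean = [sum(r[j] for r in source_rows) / len(source_rows)
                for j in range(n_features)]
    tgt_mean = [sum(r[j] for r in target_rows) / len(target_rows)
                for j in range(n_features)]
    return [[r[j] - src_mean[j] + tgt_mean[j] for j in range(n_features)]
            for r in source_rows]


# Hypothetical two-band reflectances: early stage (source) and
# end of flowering (target).
early_stage = [[0.10, 0.30], [0.20, 0.50]]
optimum_stage = [[0.30, 0.60], [0.50, 0.80]]

adapted = mean_match(early_stage, optimum_stage)
print(adapted)
```

The adapted early-stage samples can then be passed to a regression model that was trained on optimum-stage data, which corresponds to the "DA then MLR at the optimum stage" workflow described in the abstract.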

Keywords: wheat yield prediction, domain adaptation, Sentinel-2, within-field scale

Procedia PDF Downloads 63
7144 Nonparametric Truncated Spline Regression Model on the Data of Human Development Index in Indonesia

Authors: Kornelius Ronald Demu, Dewi Retno Sari Saputro, Purnami Widyaningsih

Abstract:

The Human Development Index (HDI) is a standard measurement of a country's human development. Several factors may influence it, such as life expectancy, gross domestic product (GDP) based on the province's annual expenditure, the number of poor people, and the percentage of illiterate people. The scatter plots between HDI and the influencing factors do not follow a specific pattern or form. Therefore, the HDI data in Indonesia can be modeled with a nonparametric regression model. The estimation of the regression curve in the nonparametric regression model is flexible because it follows the shape of the data pattern. One nonparametric regression method is the truncated spline, which is a modification of segmented polynomial functions. The estimator of a truncated spline regression model is affected by the selection of the optimal knot points. Knot points are the focus points of truncated spline functions. The optimal knot points were determined by the minimum value of generalized cross validation (GCV). In this article, a truncated spline nonparametric regression model was applied to the Human Development Index data. The result of this research is the best truncated spline regression model for the HDI data in Indonesia, with the combination of optimal knot points 5-5-5-4. Life expectancy and the percentage of illiterate people were the factors that significantly affect the HDI in Indonesia. The coefficient of determination is 94.54%. This means the regression model is good enough to be applied to the HDI data in Indonesia.
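
For reference, the truncated spline and the GCV criterion are commonly written as below. This is the textbook single-predictor form, given here as an assumption about the notation since the abstract states no formulas: p is the polynomial degree, κ_1, ..., κ_K are the knot points, and A(κ) is the hat matrix that maps the observed responses to the fitted values for a given choice of knots.

```latex
\begin{aligned}
f(x) &= \sum_{j=0}^{p} \beta_j x^{j} + \sum_{k=1}^{K} \gamma_k \,(x-\kappa_k)_{+}^{p},
\qquad
(x-\kappa_k)_{+}^{p} =
\begin{cases}
(x-\kappa_k)^{p}, & x \ge \kappa_k,\\
0, & x < \kappa_k,
\end{cases}\\[6pt]
\operatorname{GCV}(\kappa) &=
\frac{n^{-1}\sum_{i=1}^{n}\bigl(y_i-\hat{y}_i\bigr)^{2}}
     {\bigl(n^{-1}\operatorname{tr}\bigl[I-A(\kappa)\bigr]\bigr)^{2}} .
\end{aligned}
```

The knot combination with the smallest GCV value is selected; with several predictors, one such truncated spline term is typically used per predictor.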

Keywords: generalized cross validation (GCV), Human Development Index (HDI), knots point, nonparametric regression, truncated spline

Procedia PDF Downloads 334
7143 Fuzzy Logic Classification Approach for Exponential Data Set in Health Care System for Predication of Future Data

Authors: Manish Pandey, Gurinderjit Kaur, Meenu Talwar, Sachin Chauhan, Jagbir Gill

Abstract:

Health-care management systems are of great relevance because they provide simple and fast management of all aspects relating to a patient, not necessarily medical. Moreover, there are more and more cases of pathologies in which diagnosis and treatment can only be carried out by using medical imaging techniques. With ever-increasing prevalence, medical images are directly acquired in or converted into digital form for their storage as well as subsequent retrieval and processing. Data mining is the process of extracting information from large data sets using algorithms and techniques drawn from the fields of statistics, machine learning, and database management systems. Forecasting is a prediction of what will occur in the future, and it is an uncertain process. Because of the uncertainty, the accuracy of a forecast is as important as the outcome predicted by forecasting the independent variables. A forecast control must be used to establish whether the accuracy of the forecast is within satisfactory limits. Fuzzy regression methods have commonly been used to develop consumer preference models that correlate the engineering characteristics with consumer preferences regarding a new product; the consumer preference models provide a platform whereby product developers can decide the engineering characteristics so as to satisfy consumer preferences before developing the product. Recent research shows that these fuzzy regression methods are commonly used to model customer preferences. We propose testing the strength of an exponential regression model against a regression toward the mean model.

Keywords: health-care management systems, fuzzy regression, data mining, forecasting, fuzzy membership function

Procedia PDF Downloads 274
7142 Uplink Throughput Prediction in Cellular Mobile Networks

Authors: Engin Eyceyurt, Josko Zec

Abstract:

Current and future cellular mobile communication networks generate enormous amounts of data. Networks have become extremely complex, with an extensive space of parameters, features, and counters. These networks are unmanageable with legacy methods, and an enhanced design and optimization approach that increasingly relies on machine learning is necessary. This paper proposes machine learning as a viable approach for uplink throughput prediction. LTE radio metrics, such as Reference Signal Received Power (RSRP), Reference Signal Received Quality (RSRQ), and Signal to Noise Ratio (SNR), are used to train models to estimate the expected uplink throughput. A high prediction accuracy, with a determination coefficient of 91.2%, is obtained from measurements collected with a simple smartphone application.

Keywords: drive test, LTE, machine learning, uplink throughput prediction

Procedia PDF Downloads 152
7141 Heart Ailment Prediction Using Machine Learning Methods

Authors: Abhigyan Hedau, Priya Shelke, Riddhi Mirajkar, Shreyash Chaple, Mrunali Gadekar, Himanshu Akula

Abstract:

The heart is the coordinating centre of the major endocrine glandular structure of the body, which produces hormones that profoundly affect the operations of the body, and diagnosing cardiovascular disease is a difficult but critical task. By extracting knowledge and information about the disease from patient data, data mining is a more practical technique to help doctors detect disorders. We use a variety of machine learning methods here, including logistic regression and support vector classifiers (SVC), K-nearest neighbours Classifiers (KNN), Decision Tree Classifiers, Random Forest classifiers and Gradient Boosting classifiers. These algorithms are applied to patient data containing 13 different factors to build a system that predicts heart disease in less time with more accuracy.

Keywords: logistic regression, support vector classifier, k-nearest neighbour, decision tree, random forest and gradient boosting

Procedia PDF Downloads 44
7140 Study on the Model Predicting Post-Construction Settlement of Soft Ground

Authors: Pingshan Chen, Zhiliang Dong

Abstract:

In order to estimate the post-construction settlement more objectively, a power-polynomial model is proposed, which can reflect the trend of settlement development based on the observed settlement data. The model was demonstrated on an actual case history of an embankment. In the prediction, compared with the other three prediction models, the power-polynomial model estimated the post-construction settlement more accurately and with simpler calculation.

Keywords: prediction, model, post-construction settlement, soft ground

Procedia PDF Downloads 421
7139 Chemometric Regression Analysis of Radical Scavenging Ability of Kombucha Fermented Kefir-Like Products

Authors: Strahinja Kovacevic, Milica Karadzic Banjac, Jasmina Vitas, Stefan Vukmanovic, Radomir Malbasa, Lidija Jevric, Sanja Podunavac-Kuzmanovic

Abstract:

The present study deals with chemometric regression analysis of quality parameters and the radical scavenging ability of kombucha fermented kefir-like products obtained with winter savory (WS), peppermint (P), stinging nettle (SN) and wild thyme tea (WT) kombucha inoculums. Each analyzed sample was described by milk fat content (MF, %), total unsaturated fatty acids content (TUFA, %), monounsaturated fatty acids content (MUFA, %), polyunsaturated fatty acids content (PUFA, %), the ability of free radicals scavenging (RSA DPPH, % and RSA ·OH, %) and pH values measured after each hour from the start until the end of fermentation. The aim of the conducted regression analysis was to establish chemometric models which can predict the radical scavenging ability (RSA DPPH, % and RSA ·OH, %) of the samples by correlating it with the MF, TUFA, MUFA, PUFA and the pH value at the beginning, in the middle and at the end of the fermentation process, which lasted between 11 and 17 hours, until a pH value of 4.5 was reached. The analysis was carried out by applying univariate linear regression (ULR) and multiple linear regression (MLR) methods to the raw data and to the data standardized by the min-max normalization method. The obtained models were characterized by very limited prediction power (poor cross-validation parameters) and weak statistical characteristics. Based on the conducted analysis, it can be concluded that the resulting radical scavenging ability cannot be precisely predicted only on the basis of MF, TUFA, MUFA, PUFA content, and pH values; other quality parameters should be considered and included in further modeling. This study is based upon work from project: Kombucha beverages production using alternative substrates from the territory of the Autonomous Province of Vojvodina, 142-451-2400/2019-03, supported by Provincial Secretariat for Higher Education and Scientific Research of AP Vojvodina.

Keywords: chemometrics, regression analysis, kombucha, quality control

Procedia PDF Downloads 137
7138 Model Averaging in a Multiplicative Heteroscedastic Model

Authors: Alan Wan

Abstract:

In recent years, the body of literature on frequentist model averaging in statistics has grown significantly. Most of this work focuses on models with different mean structures but leaves out the variance consideration. In this paper, we consider a regression model with multiplicative heteroscedasticity and develop a model averaging method that combines maximum likelihood estimators of unknown parameters in both the mean and variance functions of the model. Our weight choice criterion is based on a minimisation of a plug-in estimator of the model average estimator's squared prediction risk. We prove that the new estimator possesses an asymptotic optimality property. Our investigation of finite-sample performance by simulations demonstrates that the new estimator frequently exhibits very favourable properties compared to some existing heteroscedasticity-robust model average estimators. The model averaging method hedges against the selection of very bad models and serves as a remedy to variance function misspecification, which often discourages practitioners from modeling heteroscedasticity altogether. The proposed model average estimator is applied to the analysis of two real data sets.

Keywords: heteroscedasticity-robust, model averaging, multiplicative heteroscedasticity, plug-in, squared prediction risk

Procedia PDF Downloads 378
7137 Spatio-temporal Distribution of the Groundwater Quality in the El Milia Plain, Kebir Rhumel Basin, Algeria

Authors: Lazhar Belkhiri, Ammar Tiri, Lotfi Mouni

Abstract:

In this research, we analyzed the groundwater quality index in the El Milia plain, Kebir Rhumel Basin, Algeria. Thirty-three groundwater samples were collected from wells in the El Milia plain during April 2015. In this study, pH and electrical conductivity (EC) were measured at each sampling well. Eight hydrochemical parameters, namely calcium (Ca), magnesium (Mg), sodium (Na), potassium (K), chloride (Cl), sulfate (SO4), bicarbonate (HCO3), and nitrate (NO3), were analysed. The entropy water quality index (EWQI) method was employed to evaluate the groundwater quality in the study area. Moran’s I and the ordinary kriging (OK) interpolation technique were used to examine the spatial distribution pattern of the hydrochemical parameters in the groundwater. It was found that the hydrochemical parameters Ca, Cl, and HCO3 showed strong spatial autocorrelation in the El Milia plain, indicating a spatial dependence and clustering of these parameters in the groundwater. The results showed that approximately 86% of the total groundwater samples in the study area fall within the moderate groundwater quality category. The spatial map of the EWQI values indicated an increasing trend from the south-west to the north-east, following the direction of groundwater flow. The highest EWQI values were observed near El Milia city in the center of the plain. This spatial pattern suggests variations in groundwater quality across the study area, with potentially higher risks near the city center. Therefore, the results obtained in this research provide very useful information to decision-makers.
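For readers unfamiliar with Moran’s I, the sketch below shows how the global statistic can be computed. It is a hedged illustration only: the binary adjacency weights, the four-site layout, and the sample values are assumptions made for the example and are not taken from the study, and `morans_i` is a hypothetical helper name.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I: (n / S0) * (z' W z) / (z' z), with z the mean-centred values."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()
    return (x.size / w.sum()) * (z @ w @ z) / (z @ z)

# Four hypothetical sites on a line; each site neighbours the adjacent ones.
w = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

clustered = morans_i([1.0, 1.0, 5.0, 5.0], w)    # similar values sit together
alternating = morans_i([1.0, 5.0, 1.0, 5.0], w)  # neighbours always differ
```

Positive values indicate that similar concentrations cluster in space, while negative values indicate that neighbouring sites tend to differ.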

Keywords: entropy water quality index (EWQI), Moran’s I, ordinary kriging interpolation, El Milia plain

Procedia PDF Downloads 53
7136 An Auxiliary Technique for Coronary Heart Disease Prediction by Analyzing Electrocardiogram Based on ResNet and Bi-Long Short-Term Memory

Authors: Yang Zhang, Jian He

Abstract:

Heart disease is one of the leading causes of death in the world, and coronary heart disease (CHD) is one of the major heart diseases. Electrocardiogram (ECG) is widely used in the detection of heart diseases, but the traditional manual method for CHD prediction by analyzing ECG requires lots of professional knowledge for doctors. This paper introduces sliding window and continuous wavelet transform (CWT) to transform ECG signals into images, and then ResNet and Bi-LSTM are introduced to build the ECG feature extraction network (namely ECGNet). At last, an auxiliary system for coronary heart disease prediction was developed based on modified ResNet18 and Bi-LSTM, and the public ECG dataset of CHD from MIMIC-3 was used to train and test the system. The experimental results show that the accuracy of the method is 83%, and the F1-score is 83%. Compared with the available methods for CHD prediction based on ECG, such as kNN, decision tree, VGGNet, etc., this method not only improves the prediction accuracy but also could avoid the degradation phenomenon of the deep learning network.

Keywords: Bi-LSTM, CHD, ECG, ResNet, sliding window

Procedia PDF Downloads 84
7135 Institutional Quality and Tax Compliance: A Cross-Country Regression Evidence

Authors: Debi Konukcu Onal, Tarkan Cavusoglu

Abstract:

In modern societies, the costs of public goods and services are shared through taxes paid by citizens. However, taxation has always been a frictional issue, as tax obligations are perceived to be a financial burden for taxpayers rather than a merit that fulfills the redistribution, regulation and stabilization functions of the welfare state. The tax compliance literature has evolved into discussing why people still pay taxes in systems with low costs of legal enforcement. Related empirical and theoretical works show that a wide range of socially oriented behavioral factors can stimulate voluntary compliance and subversive effects as well. These behavioral motivations are argued to be driven by self-enforcing rules of informal institutions, either independently or through interactions with legal orders set by formal institutions. The main focus of this study is to investigate empirically whether institutional particularities have a significant role in explaining the cross-country differences in the tax noncompliance levels. A part of the controversy about the driving forces behind tax noncompliance may be attributed to the lack of empirical evidence. Thus, this study aims to fill this gap through regression estimates, which help to trace the link between institutional quality and noncompliance on a cross-country basis. Tax evasion estimates of Buehn and Schneider are used as the proxy measure for the tax noncompliance levels. Institutional quality is quantified by three different indicators (percentile ranks of Worldwide Governance Indicators, ratings of the International Country Risk Guide, and the country ratings of the Freedom in the World). Robust Least Squares and Threshold Regression estimates based on the sample of the Organization for Economic Co-operation and Development (OECD) countries imply that tax compliance increases with institutional quality. Moreover, a threshold-based asymmetry is detected in the effect of institutional quality on tax noncompliance. That is, the negative effects of tax burdens on compliance are found to be more pronounced in countries with institutional quality below a certain threshold. These findings are robust to all alternative indicators of institutional quality, supporting the significant interaction of societal values with the individual taxpayer decisions.

Keywords: institutional quality, OECD economies, tax compliance, tax evasion

Procedia PDF Downloads 127
7134 Applicability of Cameriere’s Age Estimation Method in a Sample of Turkish Adults

Authors: Hatice Boyacioglu, Nursel Akkaya, Humeyra Ozge Yilanci, Hilmi Kansu, Nihal Avcu

Abstract:

The strong relationship between the reduction in the size of the pulp cavity and increasing age has been reported in the literature. This relationship can be utilized to estimate the age of an individual by measuring the pulp cavity size using dental radiographs as a non-destructive method. The purpose of this study is to develop a population-specific regression model for age estimation in a sample of Turkish adults by applying Cameriere’s method on panoramic radiographs. The sample consisted of 100 panoramic radiographs of Turkish patients (40 men, 60 women) aged between 20 and 70 years. Pulp and tooth area ratios (AR) of the maxillary canines were measured by two maxillofacial radiologists, and the results were then subjected to regression analysis. There were no statistically significant intra-observer and inter-observer differences. The correlation coefficient between age and the AR of the maxillary canines was -0.71, and the following regression equation was derived: Estimated Age = 77.365 – (351.193 × AR). The mean prediction error was 4 years, which is within acceptable error limits for age estimation. This shows that the pulp/tooth area ratio is a useful variable for assessing age with reasonable accuracy. Based on the results of this research, it was concluded that Cameriere’s method is suitable for dental age estimation and that it can be used for forensic procedures in Turkish adults.
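The reported regression equation can be applied directly, as in the short sketch below. It takes the coefficients as the decimal values 77.365 and 351.193; the function name `estimated_age` and the example area ratio of 0.10 are illustrative assumptions, not values from the study.

```python
def estimated_age(area_ratio: float) -> float:
    """Estimated age in years from the pulp/tooth area ratio (AR) of a maxillary canine."""
    return 77.365 - 351.193 * area_ratio

# Hypothetical measurement: the pulp occupies 10% of the tooth area.
example_age = estimated_age(0.10)
```

Because the slope is negative, a smaller pulp/tooth area ratio yields a higher estimated age, consistent with the reported correlation of -0.71.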

Keywords: age estimation by teeth, forensic dentistry, panoramic radiograph, Cameriere's method

Procedia PDF Downloads 446
7133 Modeling of System Availability and Bayesian Analysis of Bivariate Distribution

Authors: Muhammad Farooq, Ahtasham Gul

Abstract:

To meet the desired standard, it is important to monitor and analyze different engineering processes to get the desired output. Bivariate distributions have received a lot of attention in recent years for describing the randomness of natural as well as artificial mechanisms. In this article, a bivariate model is constructed using two independent models developed by the nesting approach to study the effect of each component on reliability for better understanding. Further, the Bayes analysis of system availability is studied by considering prior parametric variations in the failure time and repair time distributions. Basic statistical characteristics of the marginal distribution, such as the mean, median, and quantile function, are discussed. We use an inverse gamma prior and study its frequentist properties by conducting a Monte Carlo Markov Chain (MCMC) sampling scheme.

Keywords: reliability, system availability, Weibull, inverse Lomax, Monte Carlo Markov Chain, Bayesian

Procedia PDF Downloads 69
7132 Comparing Machine Learning Estimation of Fuel Consumption of Heavy-Duty Vehicles

Authors: Victor Bodell, Lukas Ekstrom, Somayeh Aghanavesi

Abstract:

Fuel consumption (FC) is one of the key factors in determining the expenses of operating a heavy-duty vehicle. A customer may therefore request an estimate of the FC of a desired vehicle. The modular design of heavy-duty vehicles allows their construction by specifying the building blocks, such as gear box, engine and chassis type. If the combination of building blocks is unprecedented, it is unfeasible to measure the FC, since this would first require the construction of the vehicle. This paper proposes a machine learning approach to predict FC. This study uses vehicle-specific and operational environmental condition information, such as road slopes and driver profiles, from around 40,000 vehicles. All vehicles have diesel engines and a mileage of more than 20,000 km. The data is used to investigate the accuracy of the machine learning algorithms linear regression (LR), K-nearest neighbor (KNN) and artificial neural networks (ANN) in predicting fuel consumption for heavy-duty vehicles. Performance of the algorithms is evaluated by reporting the prediction error on both simulated data and operational measurements. The performance of the algorithms is compared using nested cross-validation and statistical hypothesis testing. The statistical evaluation procedure finds that ANNs have the lowest prediction error compared to LR and KNN in estimating fuel consumption on both simulated and operational data. The models have a mean relative prediction error of 0.3% on simulated data, and 4.2% on operational data.

Keywords: artificial neural networks, fuel consumption, Friedman test, machine learning, statistical hypothesis testing

Procedia PDF Downloads 177
7131 A Reasoning Method of Cyber-Attack Attribution Based on Threat Intelligence

Authors: Li Qiang, Yang Ze-Ming, Liu Bao-Xu, Jiang Zheng-Wei

Abstract:

With the increasing complexity of cyberspace security, cyber-attack attribution has become an important challenge for security protection systems. The main difficulties of cyber-attack attribution are focused on the problems of handling huge volumes of data and missing key data. In response to this situation, this paper presents a reasoning method for cyber-attack attribution based on threat intelligence. The method utilizes the intrusion kill chain model and a Bayesian network to build the attack chain and evidence chain of a cyber-attack on a threat intelligence platform through data calculation, analysis and reasoning. We then used a number of cyber-attack events that we have observed and analyzed to test the reasoning method and demo system. The test results indicate that the reasoning method can provide some help in cyber-attack attribution.

Keywords: reasoning, Bayesian networks, cyber-attack attribution, Kill Chain, threat intelligence

Procedia PDF Downloads 446