Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 7400

Search results for: empirical bayesian kriging regression prediction

6950 Modern Machine Learning Conniptions for Automatic Speech Recognition

Abstract:

This paper presents an overview of recent machine learning practices as employed in current automatic speech recognition systems and as relevant to prospective ones. The aim is to promote further cross-pollination between the machine learning and automatic speech recognition communities than has occurred in the past. The manuscript is structured according to the chief machine learning archetypes that are either already popular or have the potential to make significant contributions to automatic speech recognition technology. The archetypes presented and elaborated in this article include adaptive and multi-task learning, active learning, Bayesian learning, discriminative learning, generative learning, and supervised and unsupervised learning. These learning archetypes are motivated and discussed in the context of automatic speech recognition tools and applications. The manuscript also presents and surveys recent advances in deep learning and learning with sparse representations, with further emphasis on their continuing significance in the evolution of automatic speech recognition.

Keywords: automatic speech recognition, deep learning methods, machine learning archetypes, Bayesian learning, supervised and unsupervised learning

Procedia PDF Downloads 421

6949 Intelligent Earthquake Prediction System Based On Neural Network

Authors: Emad Amar, Tawfik Khattab, Fatma Zada

Abstract:

Predicting earthquakes is an important issue in the study of geography. Accurate prediction of earthquakes can help people take effective measures to minimize personal and economic losses, such as large casualties, destruction of buildings and disruption of traffic, which occur within a few seconds. The United States Geological Survey (USGS) provides reliable scientific information on earthquakes throughout history, and the preliminary database from the National Earthquake Information Center (NEIC) shows some useful factors for predicting an earthquake in a seismic area such as the Aleutian Arc in the U.S. state of Alaska. The main advantage of this prediction method is that it does not require any assumption; it makes predictions according to the future evolution of the object's time series. The article compares simulated results from trained BP and RBF neural networks with the actual output from the system calculations. Therefore, this article focuses on the analysis of data relating to real earthquakes. Evaluation results show better accuracy and higher speed when using a radial basis function (RBF) neural network.

Keywords: BP neural network, prediction, RBF neural network, earthquake

Procedia PDF Downloads 472

6948 Hybrid Wavelet-Adaptive Neuro-Fuzzy Inference System Model for a Greenhouse Energy Demand Prediction

Authors: Azzedine Hamza, Chouaib Chakour, Messaoud Ramdani

Abstract:

Energy demand prediction plays a crucial role in achieving next-generation power systems for agricultural greenhouses. As a result, high prediction quality is required for efficient smart grid management and therefore low-cost energy consumption. The aim of this paper is to investigate the effectiveness of a hybrid data-driven model in day-ahead energy demand prediction. The proposed model consists of a Discrete Wavelet Transform (DWT) and an Adaptive Neuro-Fuzzy Inference System (ANFIS). The DWT is employed to decompose the original signal into a set of subseries, and then an ANFIS is used to generate the forecast for each subseries. The proposed hybrid method (DWT-ANFIS) was evaluated using greenhouse energy demand data for a week and compared with ANFIS. The performances of the different models were evaluated by comparing the corresponding values of Mean Absolute Percentage Error (MAPE). It was demonstrated that the discrete wavelet transform can improve agricultural greenhouse energy demand modeling.
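The decomposition and the error measure named in this abstract can be illustrated with a minimal sketch. It assumes a one-level Haar wavelet (the abstract does not say which wavelet family or decomposition depth was used), uses made-up demand values, and leaves out the ANFIS forecaster entirely; the function names are hypothetical.

```python
import math


def haar_dwt(signal):
    """One-level Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the signal from both coefficient sets."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent; actual values must be non-zero."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Made-up hourly demand values, for illustration only.
demand = [12.0, 14.0, 13.0, 15.0, 18.0, 16.0, 17.0, 19.0]
approx, detail = haar_dwt(demand)
reconstructed = haar_idwt(approx, detail)
```

In a hybrid scheme of this kind, each subseries (`approx`, `detail`) would be forecast separately and the forecasts recombined with the inverse transform before computing MAPE against the observed demand.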

Keywords: wavelet transform, ANFIS, energy consumption prediction, greenhouse

Procedia PDF Downloads 64

6947 Bayesian Parameter Inference for Continuous Time Markov Chains with Intractable Likelihood

Authors: Randa Alharbi, Vladislav Vyshemirsky

Abstract:

Systems biology is an important field in science which focuses on studying the behaviour of biological systems. Modelling is required to produce a detailed description of the elements of a biological system, their function, and their interactions. A well-designed model requires selecting a suitable mechanism which can capture the main features of the system, define the essential components of the system and represent an appropriate law that can define the interactions between its components. Complex biological systems exhibit stochastic behaviour. Thus, probabilistic models are suitable for describing and analysing biological systems. The Continuous-Time Markov Chain (CTMC) is one of the probabilistic models that describe the system as a set of discrete states with continuous-time transitions between them. The system is then characterised by a set of probability distributions that describe the transition from one state to another at a given time. The evolution of these probabilities through time can be obtained from the chemical master equation, which is analytically intractable but can be simulated. Uncertain parameters of such a model can be inferred using methods of Bayesian inference. Yet, inference in such a complex system is challenging as it requires the evaluation of the likelihood, which is intractable in most cases. There are different statistical methods that allow simulating from the model despite the intractability of the likelihood. Approximate Bayesian computation is a common approach for tackling inference which relies on simulation of the model to approximate the intractable likelihood. Particle Markov chain Monte Carlo (PMCMC) is another approach which is based on using sequential Monte Carlo to estimate the intractable likelihood. However, both methods are computationally expensive. In this paper we discuss the efficiency and possible practical issues for each method, taking into account the computational time for these methods.
We demonstrate likelihood-free inference by analysing a model of the Repressilator using both methods. A detailed investigation is performed to quantify the difference between these methods in terms of efficiency and computational cost.
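The idea of approximating an intractable likelihood by simulation can be sketched with ABC rejection sampling. The example below is not the Repressilator model: it assumes a toy CTMC (a constant-rate birth process), a uniform prior, the final event count as the summary statistic, and an arbitrary tolerance. All names and values are hypothetical.

```python
import random


def simulate_birth_count(rate, t_end, rng):
    """Simulate a constant-rate birth process with exponential waiting times;
    return the number of events that occur before t_end."""
    t, count = 0.0, 0
    while True:
        t += rng.expovariate(rate)
        if t > t_end:
            return count
        count += 1


def abc_rejection(observed, t_end, n_draws, tolerance, rng):
    """Keep prior draws whose simulated count lies within `tolerance` of the data."""
    accepted = []
    for _ in range(n_draws):
        rate = rng.uniform(0.1, 10.0)  # assumed uniform prior on the rate
        simulated = simulate_birth_count(rate, t_end, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(rate)
    return accepted


rng = random.Random(0)  # fixed seed so the run is reproducible
posterior_sample = abc_rejection(
    observed=30, t_end=10.0, n_draws=5000, tolerance=2, rng=rng
)
posterior_mean = sum(posterior_sample) / len(posterior_sample)
```

The accepted rates form an approximate posterior sample; shrinking `tolerance` improves the approximation at the cost of fewer accepted draws, which is one face of the computational expense the abstract mentions.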

Keywords: Approximate Bayesian computation (ABC), Continuous-Time Markov Chains, Sequential Monte Carlo, Particle Markov chain Monte Carlo (PMCMC)

Procedia PDF Downloads 186

6946 Exploring the Applications of Neural Networks in the Adaptive Learning Environment

Authors: Baladitya Swaika, Rahul Khatry

Abstract:

Computer Adaptive Tests (CATs) are one of the most efficient ways of testing the cognitive abilities of students. CATs are based on Item Response Theory (IRT), which relies on item selection and ability estimation using statistical methods: maximum information selection or selection from the posterior for the former, and maximum-likelihood (ML) or maximum a posteriori (MAP) estimators for the latter. This study aims at combining both classical and Bayesian approaches to IRT to create a dataset which is then fed to a neural network that automates the process of ability estimation, and then comparing it to traditional CAT models designed using IRT. This study uses Python as the base coding language, pymc for statistical modelling of the IRT and scikit-learn for neural network implementations. On creation of the model and on comparison, it is found that the neural network based model performs 7-10% worse than the IRT model for score estimations. Although it performs poorly compared to the IRT model, the neural network model can be beneficially used in back-ends for reducing time complexity, as the IRT model would have to re-calculate the ability every time it gets a request, whereas the prediction from a neural network could be done in a single step for an existing trained regressor. This study also proposes a new kind of framework whereby the neural network model could be used to incorporate feature sets other than the normal IRT feature set, and use a neural network's capacity for learning unknown functions to give rise to better CAT models. Categorical features such as test type could be learnt and incorporated in IRT functions with the help of techniques like logistic regression, and can be used to learn functions, expressed as models, that may not be trivial to express via equations. This kind of framework, when implemented, would be highly advantageous in psychometrics and cognitive assessments.
This study gives a brief overview of how neural networks can be used in adaptive testing, not only by reducing time complexity but also by being able to incorporate newer and better datasets, which would eventually lead to higher quality testing.

Keywords: computer adaptive tests, item response theory, machine learning, neural networks

Procedia PDF Downloads 158

6945 Comparison of Various Classification Techniques Using WEKA for Colon Cancer Detection

Authors: Beema Akbar, Varun P. Gopi, V. Suresh Babu

Abstract:

Colon cancer causes the deaths of about half a million people every year. The common method of its detection is histopathological tissue analysis, which is tiring and adds to the workload of the pathologist. A novel method is proposed that combines both structural and statistical pattern recognition for the detection of colon cancer. This paper presents a comparison among different classifiers such as Multilayer Perceptron (MLP), Sequential Minimal Optimization (SMO), Bayesian Logistic Regression (BLR) and k-star, using classification accuracy and error rate based on the percentage split method. The results show that the best algorithm in WEKA is the MLP classifier, with an accuracy of 83.333% and a kappa statistic of 0.625. The MLP classifier, which has a lower error rate, is preferred for its more powerful classification capability.

Keywords: colon cancer, histopathological image, structural and statistical pattern recognition, multilayer perceptron

Procedia PDF Downloads 555

6944 Classifying and Predicting Efficiencies Using Interval DEA Grid Setting

Authors: Yiannis G. Smirlis

Abstract:

The classification and the prediction of efficiencies in Data Envelopment Analysis (DEA) is an important issue, especially in large scale problems or when new units frequently enter the under-assessment set. In this paper, we contribute to the subject by proposing a grid structure based on interval segmentations of the range of values for the inputs and outputs. Such intervals combined, define hyper-rectangles that partition the space of the problem. This structure, exploited by Interval DEA models and a dominance relation, acts as a DEA pre-processor, enabling the classification and prediction of efficiency scores, without applying any DEA models.

Keywords: data envelopment analysis, interval DEA, efficiency classification, efficiency prediction

Procedia PDF Downloads 154

6943 Stock Market Developments, Income Inequality, Wealth Inequality

Authors: Quang Dong Dang

Abstract:

This paper examines the possible effects of stock market developments by channels on income and wealth inequality. We use the Bayesian Multilevel Model with the explanatory variables of the market’s channels, such as accessibility, efficiency, and market health in six selected countries: the US, UK, Japan, Vietnam, Thailand, and Malaysia. We found that generally, the improvements in the stock market alleviate income inequality. However, stock market expansions in higher-income countries are likely to trigger income inequality. We also found that while enhancing the quality of channels of the stock market has counter-effects on wealth equality distributions, open accessibilities help reduce wealth inequality distributions within the scope of the study. In addition, the inverted U-shaped hypothesis seems not to be valid in six selected countries between the period from 2006 to 2020.

Keywords: Bayesian multilevel model, income inequality, inverted u-shaped hypothesis, stock market development, wealth inequality

Procedia PDF Downloads 87

6942 Learning Dynamic Representations of Nodes in Temporally Variant Graphs

Authors: Sandra Mitrovic, Gaurav Singh

Abstract:

In many industries, including telecommunications, churn prediction has been a topic of active research. A lot of attention has been devoted to devising the most informative features, and this area of research has gained even more focus with the spread of (social) network analytics. Call detail records (CDRs) have been used to construct customer networks and extract potentially useful features. However, to the best of our knowledge, no studies including network features have yet proposed a generic way of representing network information. Instead, ad-hoc and dataset-dependent solutions have been suggested. In this work, we build upon a recently presented method (node2vec) to obtain representations for nodes in the observed network. The proposed approach is generic and applicable to any network and domain. Unlike node2vec, which assumes a static network, we consider a dynamic and time-evolving network. To account for this, we propose an approach that constructs the feature representation of each node by generating its node2vec representations at different timestamps, concatenating them and finally compressing them using an auto-encoder-like method in order to retain reasonably long and informative feature vectors. We test the proposed method on a churn prediction task in the telco domain. To predict churners at timestamp ts+1, we construct training and testing datasets consisting of feature vectors from time intervals [t1, ts-1] and [t2, ts] respectively, and use traditional supervised classification models like SVM and Logistic Regression. Observed results show the effectiveness of the proposed approach as compared to ad-hoc feature selection based approaches and static node2vec.

Keywords: churn prediction, dynamic networks, node2vec, auto-encoders

Procedia PDF Downloads 298

6941 A Simple and Empirical Refraction Correction Method for UAV-Based Shallow-Water Photogrammetry

Authors: I GD Yudha Partama, A. Kanno, Y. Akamatsu, R. Inui, M. Goto, M. Sekine

Abstract:

The aerial photogrammetry of shallow water bottoms has the potential to be an efficient high-resolution survey technique for shallow water topography, thanks to the advent of convenient UAVs and automatic image processing techniques (Structure-from-Motion (SfM) and Multi-View Stereo (MVS)). However, it suffers from the systematic overestimation of the bottom elevation, due to the light refraction at the air-water interface. In this study, we present an empirical method to correct for the effect of refraction after the usual SfM-MVS processing, using common software. The presented method utilizes the empirical relation between the measured true depth and the estimated apparent depth to generate an empirical correction factor. This correction factor is then used to convert the apparent water depth into a refraction-corrected (real-scale) water depth. To examine its effectiveness, we applied the method to two river sites, and compared the RMS errors in the corrected bottom elevations with those obtained by three existing methods. The results show that the presented method is more effective than two of the existing methods: the method that applies no correction factor and the method that uses the refractive index of water (1.34) as the correction factor. In comparison with the remaining existing method, which adds an additive term (offset) after calculating the correction factor, the presented method performs well in Site 2 and worse in Site 1. However, we found this linear regression method to be unstable when the training data used for calibration are limited. It also suffers from a large negative bias in the correction factor when the estimated apparent water depth is affected by noise, according to our numerical experiment. Overall, the accuracy of a refraction correction method depends on various factors such as the location, image acquisition, and GPS measurement conditions. The most effective method can be selected by using statistical selection (e.g. leave-one-out cross validation).
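A minimal sketch of how an empirical correction factor and the leave-one-out selection step might look is given below. It assumes the factor is estimated by least squares through the origin (the abstract does not state the estimator), implements the offset variant as ordinary linear regression, and uses made-up depth pairs; all function names are hypothetical.

```python
import math


def fit_factor(apparent, true):
    """Least-squares slope through the origin: true ~= k * apparent."""
    return sum(a * t for a, t in zip(apparent, true)) / sum(a * a for a in apparent)


def fit_factor_offset(apparent, true):
    """Ordinary least squares with an intercept: true ~= k * apparent + b."""
    n = len(apparent)
    mean_a, mean_t = sum(apparent) / n, sum(true) / n
    sxy = sum((a - mean_a) * (t - mean_t) for a, t in zip(apparent, true))
    sxx = sum((a - mean_a) ** 2 for a in apparent)
    k = sxy / sxx
    return k, mean_t - k * mean_a


def loocv_rmse(apparent, true, with_offset):
    """Leave-one-out cross-validated RMSE of the corrected depth."""
    squared_error = 0.0
    for i in range(len(apparent)):
        a_train = apparent[:i] + apparent[i + 1:]
        t_train = true[:i] + true[i + 1:]
        if with_offset:
            k, b = fit_factor_offset(a_train, t_train)
        else:
            k, b = fit_factor(a_train, t_train), 0.0
        squared_error += (k * apparent[i] + b - true[i]) ** 2
    return math.sqrt(squared_error / len(apparent))


# Made-up calibration pairs in metres, constructed as true = 1.4 x apparent.
apparent_depth = [0.20, 0.35, 0.50, 0.65, 0.80]
true_depth = [0.28, 0.49, 0.70, 0.91, 1.12]

k = fit_factor(apparent_depth, true_depth)
rmse_factor_only = loocv_rmse(apparent_depth, true_depth, with_offset=False)
rmse_with_offset = loocv_rmse(apparent_depth, true_depth, with_offset=True)
```

With real, noisy calibration data the two cross-validated RMSE values would differ, and the variant with the smaller value would be the one selected for a given site.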

Keywords: bottom elevation, MVS, river, SfM

Procedia PDF Downloads 287

6940 CD133 and CD44 - Stem Cell Markers for Prediction of Clinically Aggressive Form of Colorectal Cancer

Authors: Ognen Kostovski, Svetozar Antovic, Rubens Jovanovic, Irena Kostovska, Nikola Jankulovski

Abstract:

Introduction: Colorectal carcinoma (CRC) is one of the most common malignancies in the world. Cancer stem cell (CSC) markers are associated with aggressive cancer types and poor prognosis. The aim of the study was to determine whether the expression of the colorectal cancer stem cell markers CD133 and CD44 could be significant in the prediction of a clinically aggressive form of CRC. Materials and methods: Our study included ninety patients (n=90) with CRC. Patients were divided into two subgroups: metastatic CRC and non-metastatic CRC. Tumor samples were analyzed with standard histopathological methods, and then immunohistochemical analysis was performed with monoclonal antibodies against the CD133 and CD44 stem cell markers. Results: High coexpression of CD133 and CD44 was observed in 71.4% of patients with metastatic disease, compared to 37.9% of patients without metastases. Discordant expression of both markers was found in 8% of the subgroup with metastatic CRC, and in 13.4% of the subgroup without metastatic CRC. Statistical analyses showed a significant association of increased expression of CD133 and CD44 with the disease stage, T category and N nodal status. With multiple regression analysis, the stage of disease was designated as the factor with the greatest statistically significant influence on expression of CD133 (p <0.0001) and CD44 (p <0.0001). Conclusion: Our results suggest that the coexpression of CD133 and CD44 has an important role in the prediction of a clinically aggressive form of CRC. Both stem cell markers can be routinely implemented in standard pathohistological diagnostics and can be useful markers for pre-therapeutic oncology screening.

Keywords: colorectal carcinoma, stem cells, CD133+, CD44+

Procedia PDF Downloads 125

6939 Nonlinear Estimation Model for Rail Track Deterioration

Authors: M. Karimpour, L. Hitihamillage, N. Elkhoury, S. Moridpour, R. Hesami

Abstract:

Rail transport authorities around the world have long faced a significant challenge when predicting rail infrastructure maintenance work. Generally, maintenance monitoring and prediction is conducted manually. With restrictions in the economy, the rail transport authorities are in pursuit of improved modern methods which can provide precise prediction of rail maintenance time and location. The expectation from such a method is to develop models that minimize the human error that is strongly related to manual prediction. Such models will help them in understanding how track degradation occurs over time under changing conditions (e.g. rail load, rail type, rail profile). They need a well-structured technique to identify the precise time that rail tracks fail in order to minimize the maintenance cost/time and secure the vehicles. The rail track characteristics that have been collected over the years will be used in developing rail track degradation prediction models. Since these data have been collected in large volumes, and the data collection is done both electronically and manually, it is possible to have some errors. Sometimes these errors make it impossible to use the data in prediction model development. This is one of the major drawbacks in rail track degradation prediction. An accurate model can play a key role in the estimation of the long-term behavior of rail tracks. Accurate models increase track safety and decrease the cost of maintenance in the long term. In this research, a short review of rail track degradation prediction models is presented before estimating rail track degradation for the curve sections of the Melbourne tram track system using an Adaptive Network-based Fuzzy Inference System (ANFIS) model.

Keywords: ANFIS, MGT, prediction modeling, rail track degradation

Procedia PDF Downloads 296

6938 Mathematical Modeling for Diabetes Prediction: A Neuro-Fuzzy Approach

Authors: Vijay Kr. Yadav, Nilam Rathi

Abstract:

Accurate prediction of glucose level in diabetes mellitus is required to avoid affecting the functioning of the major organs of the human body. This study describes the fundamental assumptions and two different methodologies of blood glucose prediction. The first is based on the back-propagation algorithm of an Artificial Neural Network (ANN), and the second is based on the Neuro-Fuzzy technique, called Fuzzy Inference System (FIS). Errors between the proposed methods are further discussed through various statistical measures such as mean square error (MSE) and normalised mean absolute error (NMAE). The main objective of the present study is to develop a mathematical model for blood glucose prediction 12 hours in advance using a data set of three patients for 60 days. Comparative studies of the accuracy with other existing models are also made with the same data set.

Keywords: back-propagation, diabetes mellitus, fuzzy inference system, neuro-fuzzy

Procedia PDF Downloads 231

6937 Prediction of Gully Erosion with Stochastic Modeling by using Geographic Information System and Remote Sensing Data in North of Iran

Authors: Reza Zakerinejad

Abstract:

Gully erosion is a serious problem that threatens the sustainability of agricultural areas, rangeland and water in a large part of Iran. This type of water erosion is the main source of sedimentation in many catchment areas in the north of Iran. Since many national assessment approaches have applied only qualitative models, the aim of this study is to predict the spatial distribution of gully erosion processes by means of detailed terrain analysis and GIS-based logistic regression in the loess deposits of a case study area in Golestan Province. In this study, a DEM with 25-meter resolution derived from ASTER data has been used. Landsat ETM data have been used for land use mapping. The TreeNet model, a stochastic modeling approach, was applied to predict the areas susceptible to gully erosion. In this model ROC we have set 20 % of data as learning and 20 % as learning data. GIS and satellite image analysis techniques have been used to derive the input information for these stochastic models. The result of this study is a highly accurate map of gully erosion potential.

Keywords: TreeNet model, terrain analysis, Golestan Province, Iran

Procedia PDF Downloads 515

6936 Evaluation of Groundwater Quality and Contamination Sources Using Geostatistical Methods and GIS in Miryang City, Korea

Authors: H. E. Elzain, S. Y. Chung, V. Senapathi, Kye-Hun Park

Abstract:

Groundwater is considered a significant source for drinking and irrigation purposes in Miryang city, which is attributed to a limited number of surface water reservoirs and high seasonal variations in precipitation. Population growth, in addition to the expansion of agricultural land uses and industrial development, may affect the quality and management of groundwater. This research utilized multidisciplinary geostatistical approaches such as multivariate statistics, factor analysis, cluster analysis and kriging in order to identify the hydrogeochemical processes and characterize the factors controlling the distribution of groundwater geochemistry for developing risk maps, exploiting data obtained from the chemical investigation of groundwater samples in the study area. A total of 79 samples were collected and analyzed using an atomic absorption spectrometer (AAS) for major and trace elements. Chemical maps of groundwater produced with a 2-D spatial Geographic Information System (GIS) provided a powerful tool for detecting the potential groundwater sites that are under threat of contamination. The GIS-based maps showed that a higher rate of contamination was observed in the central and southern areas, with relatively less in the northern and southwestern parts. This could be attributed to the effect of irrigation, residual saline water, municipal sewage and livestock wastes. At well elevations above 85 m, the scatter diagram shows that the groundwater of the research area was mainly influenced by saline water and NO3. pH measurements revealed slightly acidic conditions due to dissolved atmospheric CO2 in the soil, while the saline water had a major impact on the higher values of TDS and EC.
Based on the cluster analysis results, the groundwater has been categorized into three groups: the CaHCO3 type of fresh water, the NaHCO3 type slightly influenced by sea water, and the Ca-Cl and Na-Cl types, which are heavily affected by saline water. The most predominant water type in the study area was CaHCO3. Contamination sources and chemical characteristics were identified from the interrelationships in the factor analysis and from the cluster analysis. The chemical elements that belong to factor 1 were related to the effect of sea water, while the elements of factor 2 were associated with agricultural fertilizers. The degree, distribution, and location of groundwater contamination were mapped using kriging methods. Thus, the geostatistical model provided more accurate results for identifying the source of contamination and evaluating the groundwater quality. GIS was also a creative tool for visualizing and analyzing the issues affecting water quality in Miryang city.

Keywords: groundwater characteristics, GIS chemical maps, factor analysis, cluster analysis, Kriging techniques

Procedia PDF Downloads 148

6935 Clinical Feature Analysis and Prediction on Recurrence in Cervical Cancer

Authors: Ravinder Bahl, Jamini Sharma

Abstract:

The paper demonstrates an analysis of cervical cancer based on a probabilistic model. It involves a technique for classification and prediction that recognizes the typical and diagnostically most important test features relating to cervical cancer. The main contributions of the research include predicting the probability of recurrence in no-recurrence (first time detection) cases. A combination of conventional statistical and machine learning tools is applied for the analysis. An experimental study with real data demonstrates the feasibility and potential of the proposed approach for this purpose.

Keywords: cervical cancer, recurrence, no recurrence, probabilistic, classification, prediction, machine learning

Procedia PDF Downloads 341

6934 Dynamic vs. Static Bankruptcy Prediction Models: A Dynamic Performance Evaluation Framework

Authors: Mohammad Mahdi Mousavi

Abstract:

Bankruptcy prediction models have been implemented for continuous evaluation and monitoring of firms. With the huge number of bankruptcy models, an extensive number of studies have focused on answering the question of which of these models is superior in performance. In practice, one of the drawbacks of existing comparative studies is that the relative assessment of alternative bankruptcy models remains an exercise that is mono-criterion in nature. Further, a very restricted number of criteria and measures have been applied to compare the performance of competing bankruptcy prediction models. In this research, we overcome these methodological gaps by implementing an extensive range of criteria and measures for comparison between dynamic and static bankruptcy models, and by proposing a multi-criteria framework to compare the relative performance of bankruptcy models in forecasting firm distress for UK firms.

Keywords: bankruptcy prediction, data envelopment analysis, performance criteria, performance measures

Procedia PDF Downloads 227

6933 Prediction of Extreme Precipitation in East Asia Using Complex Network

Authors: Feng Guolin, Gong Zhiqiang

Abstract:

In order to study the spatial structure and dynamical mechanism of extreme precipitation in East Asia, a corresponding climate network is constructed by employing the method of event synchronization. It is found that the area of East Asian summer extreme precipitation can be separated into two regions: one with high area-weighted connectivity receiving heavy precipitation mostly during the active phase of the East Asian Summer Monsoon (EASM), and another one with low area-weighted connectivity receiving heavy precipitation during both the active and the retreat phase of the EASM. Besides, a way of predicting extreme precipitation is also developed by constructing a directed climate network. The simulation accuracy in East Asia is 58% with a 0-day lead, and the prediction accuracy is 21% and on average 12% with a 1-day and an n-day (2≤n≤10) lead, respectively. Compared to the normal EASM year, the prediction accuracy is lower in a weak year and higher in a strong year, which is relevant to the differences in correlations and extreme precipitation rates in different EASM situations. Recognizing and identifying these effects is useful for understanding and predicting extreme precipitation in East Asia.

Keywords: synchronization, climate network, prediction, rainfall

Procedia PDF Downloads 422

6932 MapReduce Logistic Regression Algorithms with RHadoop

Authors: Byung Ho Jung, Dong Hoon Lim

Abstract:

Logistic regression is a statistical method for analyzing a dataset in which there are one or more independent variables that determine an outcome. Logistic regression is used extensively in numerous disciplines, including the medical and social science fields. In this paper, we address the problem of estimating parameters in logistic regression based on the MapReduce framework with RHadoop, which integrates the R and Hadoop environments and is applicable to large-scale data. There exist three learning algorithms for logistic regression, namely the gradient descent method, the cost minimization method and the Newton-Raphson method. The Newton-Raphson method does not require a learning rate, while the gradient descent and cost minimization methods need a manually picked learning rate. The experimental results demonstrated that our learning algorithms using RHadoop can scale well and efficiently process large data sets on commodity hardware. We also compared the performance of our Newton-Raphson method with the gradient descent and cost minimization methods. The results showed that our Newton-Raphson method appeared to be the most robust across all data tested.
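The contrast between a method that needs a learning rate and one that does not can be sketched on a single machine. The example below is not the RHadoop implementation: it assumes one predictor plus an intercept, made-up data, and an arbitrary learning rate and iteration count, and it omits the cost minimization variant. All names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gradient(b0, b1, xs, ys):
    """Gradient of the mean negative log-likelihood with respect to (b0, b1)."""
    n = len(xs)
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        residual = sigmoid(b0 + b1 * x) - y
        g0 += residual / n
        g1 += residual * x / n
    return g0, g1


def fit_gradient_descent(xs, ys, learning_rate, n_iter):
    """Plain gradient descent; the learning rate must be chosen by hand."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0, g1 = gradient(b0, b1, xs, ys)
        b0 -= learning_rate * g0
        b1 -= learning_rate * g1
    return b0, b1


def fit_newton_raphson(xs, ys, n_iter):
    """Newton-Raphson; the step is set by the inverse 2x2 Hessian, not a rate."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(n_iter):
        g0, g1 = gradient(b0, b1, xs, ys)
        h00 = h01 = h11 = 0.0
        for x in xs:
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p) / n
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 -= (h11 * g0 - h01 * g1) / det
        b1 -= (h00 * g1 - h01 * g0) / det
    return b0, b1


# Made-up, non-separable data so that the maximum-likelihood estimate is finite.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

newton_fit = fit_newton_raphson(xs, ys, n_iter=25)
gd_fit = fit_gradient_descent(xs, ys, learning_rate=0.1, n_iter=20000)
```

Both the gradient and the Hessian are sums over records, which is presumably what makes such updates amenable to a map step (per-record terms) followed by a reduce step (summation), although the abstract does not describe the decomposition used.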

Keywords: big data, logistic regression, MapReduce, RHadoop

Procedia PDF Downloads 254

6931 The Persistence of Abnormal Return on Assets: An Exploratory Analysis of the Differences between Industries and Differences between Firms by Country and Sector

Authors: José Luis Gallizo, Pilar Gargallo, Ramon Saladrigues, Manuel Salvador

Abstract:

This study offers an exploratory statistical analysis of the persistence of annual profits across a sample of firms from different European Union (EU) countries. To this end, a hierarchical Bayesian dynamic model has been used which enables the annual behaviour of those profits to be broken down into a permanent structural and a transitory component, while also distinguishing between general effects affecting the industry as a whole to which each firm belongs and specific effects affecting each firm in particular. This breakdown enables the relative importance of those fundamental components to be more accurately evaluated by country and sector. Furthermore, Bayesian approach allows for testing different hypotheses about the homogeneity of the behaviour of the above components with respect to the sector and the country where the firm develops its activity. The data analysed come from a sample of 23,293 firms in EU countries selected from the AMADEUS data-base. The period analysed ran from 1999 to 2007 and 21 sectors were analysed, chosen in such a way that there was a sufficiently large number of firms in each country sector combination for the industry effects to be estimated accurately enough for meaningful comparisons to be made by sector and country. The analysis has been conducted by sector and by country from a Bayesian perspective, thus making the study more flexible and realistic since the estimates obtained do not depend on asymptotic results. In general terms, the study finds that, although the industry effects are significant, more important are the firm specific effects. That importance varies depending on the sector or the country in which the firm carries out its activity. The influence of firm effects accounts for around 81% of total variation and display a significantly lower degree of persistence, with adjustment speeds oscillating around 34%. However, this pattern is not homogeneous but depends on the sector and country analysed. 
Industry effects, which also depend on the sector and country analysed, have a more marginal importance and are significantly more persistent, with adjustment speeds oscillating around 7-8%; this degree of persistence is very similar for most of the sectors and countries analysed.

Keywords: dynamic models, Bayesian inference, MCMC, abnormal returns, persistence of profits, return on assets

Procedia PDF Downloads 381

6930 Machine Learning Assisted Prediction of Sintered Density of Binary W(Mo) Alloys

Authors: Hexiong Liu

Abstract:

Powder metallurgy is the optimal method for the consolidation and preparation of W(Mo) alloys, which exhibit excellent application prospects at high temperatures. The properties of W(Mo) alloys are closely related to the sintered density. However, controlling the sintered density and porosity of these alloys is still challenging. In the past, the regulation methods mainly relied on time-consuming and costly trial-and-error experiments. In this study, the sintering data for more than a dozen W(Mo) alloys constituted a small-scale dataset, including both solid-phase and liquid-phase sintering. Furthermore, simple descriptors were used to predict the sintered density of W(Mo) alloys based on a descriptor selection strategy and machine learning (ML) methods, where the ML algorithms included least absolute shrinkage and selection operator (Lasso) regression, k-nearest neighbor (k-NN), random forest (RF), and multi-layer perceptron (MLP). The results showed that the interpretable descriptors extracted by our proposed selection strategy and the MLP neural network achieved a high prediction accuracy (R>0.950). By further predicting the sintered density of W(Mo) alloys using different sintering processes, the error between the predicted and experimental values was less than 0.063, confirming the application potential of the model.

Keywords: sintered density, machine learning, interpretable descriptors, W(Mo) alloy

Procedia PDF Downloads 57

6929 Multicollinearity and MRA in Sustainability: Application of the Raise Regression

Authors: Claudia García-García, Catalina B. García-García, Román Salmerón-Gómez

Abstract:

Much economic-environmental research includes the analysis of possible interactions by using Moderated Regression Analysis (MRA), which is a specific application of multiple linear regression analysis. This methodology allows analyzing how the effect of one of the independent variables is moderated by a second independent variable by adding a cross-product term between them as an additional explanatory variable. Due to the very specification of the methodology, the moderated factor is often highly correlated with the constitutive terms. Thus, great multicollinearity problems arise. The appearance of strong multicollinearity in a model has important consequences. Inflated variances of the estimators may appear, regressors that are probably relevant tend to be judged non-significant even though the coefficient of determination is very high, coefficients may take incorrect signs, and the results become highly sensitive to small changes in the dataset. Finally, the high relationship among explanatory variables implies difficulties in isolating the individual effect of each one on the model under study. Carried over to the moderated analysis, these consequences may imply that it is not worth including an interaction term that may be distorting the model. Thus, it is important to manage the problem with some methodology that allows for obtaining reliable results. After a review of those works that applied MRA among the ten top journals of the field, it is clear that multicollinearity is mostly disregarded. Less than 15% of the reviewed works take into account potential multicollinearity problems. To overcome the issue, this work studies the possible application of recent methodologies to MRA. In particular, raise regression is analyzed. 
This methodology mitigates collinearity from a geometrical point of view: the collinearity problem arises because the variables under study are very close geometrically, so by separating both variables, the problem can be mitigated. Raise regression maintains the available information and modifies the problematic variables instead of, for example, deleting variables. Furthermore, the global characteristics of the initial model are also maintained (sum of squared residuals, estimated variance, coefficient of determination, global significance test and prediction). The proposal is applied to data from countries of the European Union for the last year available, covering greenhouse gas emissions, per capita GDP and a dummy variable that represents the topography of the country. The use of a dummy variable as the moderator is a special variant of MRA, sometimes called “subgroup regression analysis.” The main conclusion of this work is that applying new techniques to the field can substantially improve the results of the analysis. In particular, the use of raise regression mitigates great multicollinearity problems, so the researcher is able to rely on the interaction term when interpreting the results of a particular study.
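
The geometric idea in this abstract can be illustrated with a minimal Python sketch. It assumes the usual formulation of raise regression, in which the problematic variable is replaced by itself plus λ times the residual of its regression on the other regressors. The abstract does not state this formula, so the formulation, the simplification to two regressors, the data values, and the names `raise_variable` and `vif_two_regressors` are illustrative assumptions, not the authors' implementation.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

def raise_variable(x_other, x_target, lam):
    """Return x_target + lam * e, where e is the residual of the simple
    regression (with intercept) of x_target on x_other (assumed formulation)."""
    mo = sum(x_other) / len(x_other)
    mt = sum(x_target) / len(x_target)
    sxx = sum((a - mo) ** 2 for a in x_other)
    sxt = sum((a - mo) * (b - mt) for a, b in zip(x_other, x_target))
    slope = sxt / sxx
    intercept = mt - slope * mo
    residuals = [b - (intercept + slope * a) for a, b in zip(x_other, x_target)]
    return [b + lam * e for b, e in zip(x_target, residuals)]

def vif_two_regressors(x1, x2):
    """Variance inflation factor when the model has exactly two regressors."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical, nearly collinear regressors (not data from the study).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1, 6.2, 6.8, 8.1]

lam = 4.0  # hypothetical raising factor
x2_raised = raise_variable(x1, x2, lam)

vif_before = vif_two_regressors(x1, x2)
vif_after = vif_two_regressors(x1, x2_raised)
print(f"VIF before raising: {vif_before:.1f}")
print(f"VIF after raising:  {vif_after:.1f}")
```

Under this formulation the raised variable keeps its component along the other regressor and only stretches the orthogonal residual, so the two variables move apart geometrically and the variance inflation factor falls as λ grows.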

Keywords: multicollinearity, MRA, interaction, raise

Procedia PDF Downloads 80

6928 Representation Data without Lost Compression Properties in Time Series: A Review

Authors: Nabilah Filzah Mohd Radzuan, Zalinda Othman, Azuraliza Abu Bakar, Abdul Razak Hamdan

Abstract:

Uncertain data is believed to be an important issue in building a prediction model. The main objective in time series uncertainty analysis is to formulate uncertain data in order to gain knowledge and fit a low-dimensional model prior to a prediction task. This paper discusses the performance of a number of techniques for dealing with uncertain data, specifically those which handle uncertain data by minimizing the loss of compression properties.

Keywords: compression properties, uncertainty, uncertain time series, mining technique, weather prediction

Procedia PDF Downloads 410

6927 Prediction of Trailing-Edge Noise under Adverse-Pressure Gradient Effect

Authors: Li Chen

Abstract:

For an aerofoil or hydrofoil in high Reynolds number flows, broadband noise is generated efficiently as the result of the turbulence convecting over the trailing edge. This noise can be related to the surface pressure fluctuations, which can be predicted by either CFD or empirical models. However, in reality, the aerofoil or hydrofoil often operates at an angle of attack. In this situation, the flow is subjected to an Adverse-Pressure-Gradient (APG), and as a result, flow separation may occur. This study assesses trailing-edge noise models for such flows. In the present work, the trailing-edge noise from a 2D airfoil at a 6-degree angle of attack is investigated. Under this condition, the flow experiences a strong APG, and flow separation occurs. The flow over the airfoil with a chord of 300 mm, corresponding to a Reynolds number of 4×10⁵, is simulated using RANS with the SST k-ɛ turbulence model. The predicted surface pressure fluctuations are compared with the published experimental data and empirical models, and show good agreement with the experimental data. The effect of the APG on the trailing-edge noise is discussed, and the associated trailing-edge noise is calculated.

Keywords: aero-acoustics, adverse-pressure gradient, computational fluid dynamics, trailing-edge noise

Procedia PDF Downloads 314

6926 Churn Prediction for Telecommunication Industry Using Artificial Neural Networks

Authors: Ulas Vural, M. Ergun Okay, E. Mesut Yildiz

Abstract:

Telecommunication service providers demand accurate and precise prediction of customer churn probabilities to increase the effectiveness of their customer relation services. The large amount of customer data owned by the service providers is suitable for analysis by machine learning methods. In this study, expenditure data of customers are analyzed by using an artificial neural network (ANN). The ANN model is applied to the data of customers with different billing durations. The proposed model successfully predicts the churn probabilities at 83% accuracy from only three months of expenditure data, and the prediction accuracy increases up to 89% when nine months of data are used. The experiments also show that the accuracy of the ANN model increases on an extended feature set that includes information on the changes in the bill amounts.

Keywords: customer relationship management, churn prediction, telecom industry, deep learning, artificial neural networks

Procedia PDF Downloads 124

6925 The Role of Financial Development and Institutional Quality in Promoting Sustainable Development through Tourism Management

Authors: Hashim Zameer

Abstract:

Effective tourism management plays a vital role in promoting sustainability and supporting ecosystems. A common principle that has been in practice over the years is “first pollute and then clean,” indicating that countries need financial resources to promote sustainability. Financial development and tourism management both seem very important to promoting sustainable development. However, without institutional support, it is very difficult to succeed. In this context, it is important to explore how institutional quality, tourism development, and financial development could promote sustainable development. In the past, no research explored the role of tourism development in sustainable development. Moreover, the role of financial development, natural resources, and institutional quality in sustainable development has also been ignored. In this regard, this paper aims to investigate the role of tourism development, natural resources, financial development, and institutional quality in sustainable development in China. The study used time-series data from 2000–2021 and employed the Bayesian linear regression model because it is suitable for small data sets. The robustness of the findings was checked using a quantile regression approach. The results reveal that an increase in tourism expenditures stimulates the economy, creates jobs, encourages cultural exchange, and supports sustainability initiatives. Moreover, financial development and institutional quality have a positive effect on sustainable development. However, reliance on natural resources can result in negative economic, social, and environmental outcomes, highlighting the need for resource diversification and management to reinforce sustainable development. These results highlight the significance of financial development, strong institutions, sustainable tourism, and careful utilization of natural resources for long-term sustainability. 
The study holds vital insights for policy formulation to promote sustainable tourism.

Keywords: sustainability, tourism development, financial development, institutional quality

Procedia PDF Downloads 58

6924 Interference among Lambsquarters and Oil Rapeseed Cultivars

Authors: Reza Siyami, Bahram Mirshekari

Abstract:

Seed and oil yields of rapeseed are considerably affected by interference from weeds, including mustard (Sinapis arvensis L.), lambsquarters (Chenopodium album L.) and redroot pigweed (Amaranthus retroflexus L.), throughout the East Azerbaijan province in Iran. To formulate the relationship between the four independent growth variables measured in our experiment and a dependent variable, multiple regression analysis was carried out with the number of weed leaves per plant (X1), green cover percentage (X2), LAI (X3) and leaf area per plant (X4) as independent variables and rapeseed oil yield as the dependent variable. The multiple regression equation is as follows: Seed essential oil yield (kg/ha) = 0.156 + 0.0325 (X1) + 0.0489 (X2) + 0.0415 (X3) + 0.133 (X4). Furthermore, stepwise regression analysis was also carried out on the data obtained to test the significance of the independent variables affecting the oil yield as the dependent variable. The resulting stepwise regression equation is as follows: Oil yield = 4.42 + 0.0841 (X2) + 0.0801 (X3); R² = 81.5. The stepwise regression analysis verified that the green cover percentage and LAI of the weed had a marked increasing effect on the oil yield of rapeseed.
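
As a small worked illustration, the stepwise equation reported in this abstract can be evaluated directly. The sketch below only encodes the published coefficients (4.42, 0.0841, 0.0801); the function name `stepwise_oil_yield` and the input values are hypothetical and are not measurements from the study.

```python
def stepwise_oil_yield(green_cover_pct, lai):
    """Evaluate the stepwise equation from the abstract:
    Oil yield = 4.42 + 0.0841 * X2 + 0.0801 * X3,
    where X2 is green cover percentage and X3 is LAI."""
    return 4.42 + 0.0841 * green_cover_pct + 0.0801 * lai

# Hypothetical inputs chosen only to show the arithmetic.
example_yield = stepwise_oil_yield(green_cover_pct=50.0, lai=2.0)
print(round(example_yield, 4))  # 4.42 + 4.205 + 0.1602 = 8.7852
```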

Keywords: green cover percentage, independent variable, interference, regression

Procedia PDF Downloads 392

6923 Recent Developments in the Application of Deep Learning to Stock Market Prediction

Authors: Shraddha Jain Sharma, Ratnalata Gupta

Abstract:

Predicting stock movements in the financial market is both difficult and rewarding. Analysts and academics are increasingly using advanced approaches such as machine learning techniques to anticipate stock price patterns, thanks to the expanding capacity of computing and the recent advent of graphics processing units and tensor processing units. Stock market prediction is a type of time series prediction that is incredibly difficult to do since stock prices are influenced by a variety of financial, socioeconomic, and political factors. Furthermore, even minor mistakes in stock market price forecasts can result in significant losses for companies that employ the findings of stock market price prediction for financial analysis and investment. Soft computing techniques are increasingly being employed for stock market prediction due to their better accuracy than traditional statistical methodologies. The proposed research looks at the need for soft computing techniques in stock market prediction, the numerous soft computing approaches that are important to the field, past work in the area with their prominent features, and the significant problems or issue domain that the area involves. For constructing a predictive model, the major focus is on neural networks and fuzzy logic. The stock market is extremely unpredictable, and it is unquestionably tough to correctly predict based on certain characteristics. This study provides a complete overview of the numerous strategies investigated for high accuracy prediction, with a focus on the most important characteristics.

Keywords: stock market prediction, artificial intelligence, artificial neural networks, fuzzy logic, accuracy, deep learning, machine learning, stock price, trading volume

Procedia PDF Downloads 63

6922 A Comparison between Empirical and Theoretical OC Curves Related to Acceptance Sampling for Attributes

Authors: Encarnacion Alvarez, Noemí Hidalgo-Rebollo, Juan F. Munoz, Francisco J. Blanco-Encomienda

Abstract:

Many companies use the technique known as acceptance sampling, which consists of inspecting products and making decisions about them. According to the results derived from this method, the company decides whether to accept or reject a product. Acceptance sampling can be applied to technology management, since it can be seen as a tool to improve the design, planning, operation and control of technological products. The theoretical operating characteristic (OC) curves are widely used when dealing with acceptance sampling. In this paper, we carry out Monte Carlo simulation studies to compare numerically the empirical OC curves derived from the empirical results with the customary theoretical OC curves. We analyze various possible scenarios in such a way that the differences between the empirical and theoretical curves can be observed under different situations.
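
The comparison described in this abstract can be sketched in Python for a single-sampling plan. The sketch assumes a binomial model for the number of defectives in a sample of size n with acceptance number c; the abstract does not give the plans, lot quality levels or simulation sizes, so the values and the names `theoretical_pa` and `empirical_pa` below are hypothetical.

```python
import math
import random

def theoretical_pa(n, c, p):
    """Binomial probability of accepting a lot: P(defectives <= c) in a sample of n."""
    return sum(math.comb(n, d) * p**d * (1 - p) ** (n - d) for d in range(c + 1))

def empirical_pa(n, c, p, lots=5000, seed=0):
    """Monte Carlo estimate of the acceptance probability over simulated lots."""
    rng = random.Random(seed)
    accepted = 0
    for _ in range(lots):
        defectives = sum(rng.random() < p for _ in range(n))
        if defectives <= c:
            accepted += 1
    return accepted / lots

# Hypothetical plan: sample 50 items, accept the lot if at most 2 are defective.
n, c = 50, 2
oc_points = []
for p in (0.01, 0.05, 0.10):
    oc_points.append((p, theoretical_pa(n, c, p), empirical_pa(n, c, p)))

for p, theo, emp in oc_points:
    print(f"p={p:.2f}  theoretical={theo:.4f}  empirical={emp:.4f}")
```

Plotting the acceptance probability against the fraction defective p for both functions gives the theoretical and empirical OC curves; the gap between them shrinks as the number of simulated lots increases.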

Keywords: single-sampling plan, lot, Monte Carlo simulation, quality control

Procedia PDF Downloads 445

6921 Don't Just Guess and Slip: Estimating Bayesian Knowledge Tracing Parameters When Observations Are Scant

Authors: Michael Smalenberger

Abstract:

Intelligent tutoring systems (ITS) are computer-based platforms which can incorporate artificial intelligence to provide step-by-step guidance as students practice problem-solving skills. ITS can replicate and even exceed some benefits of one-on-one tutoring, foster transactivity in collaborative environments, and lead to substantial learning gains when used to supplement the instruction of a teacher or when used as the sole method of instruction. A common facet of many ITS is their use of Bayesian Knowledge Tracing (BKT) to estimate parameters necessary for the implementation of the artificial intelligence component, and for the probability of mastery of a knowledge component relevant to the ITS. While various techniques exist to estimate these parameters and probability of mastery, none directly and reliably ask the user to self-assess these. In this study, 111 undergraduate students used an ITS in a college-level introductory statistics course for which detailed transaction-level observations were recorded, and users were also routinely asked direct questions that would lead to such a self-assessment. Comparisons were made between these self-assessed values and those obtained using commonly used estimation techniques. Our findings show that such self-assessments are particularly relevant at the early stages of ITS usage while transaction level data are scant. Once a user’s transaction level data become available after sufficient ITS usage, these can replace the self-assessments in order to eliminate the identifiability problem in BKT. We discuss how these findings are relevant to the number of exercises necessary to lead to mastery of a knowledge component, the associated implications on learning curves, and its relevance to instruction time.
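
For readers unfamiliar with BKT, the standard update of the mastery probability after one observed answer can be sketched as follows. This is the textbook four-parameter formulation (initial knowledge, learn, guess, slip), not code from the study; the parameter values and the name `bkt_update` are illustrative assumptions.

```python
def bkt_update(p_known, correct, p_guess, p_slip, p_learn):
    """One step of standard Bayesian Knowledge Tracing.

    First condition the mastery probability on the observed answer (Bayes' rule),
    then account for the chance of learning before the next opportunity.
    """
    if correct:
        numerator = p_known * (1 - p_slip)
        denominator = numerator + (1 - p_known) * p_guess
    else:
        numerator = p_known * p_slip
        denominator = numerator + (1 - p_known) * (1 - p_guess)
    posterior = numerator / denominator
    return posterior + (1 - posterior) * p_learn

# Hypothetical parameters: prior mastery 0.3, guess 0.2, slip 0.1, learn 0.15.
after_correct = bkt_update(0.3, True, p_guess=0.2, p_slip=0.1, p_learn=0.15)
after_incorrect = bkt_update(0.3, False, p_guess=0.2, p_slip=0.1, p_learn=0.15)
print(round(after_correct, 4), round(after_incorrect, 4))
```

With few observations, different combinations of guess, slip and learn values can explain the same answers equally well, which is the identifiability problem the abstract refers to.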

Keywords: Bayesian Knowledge Tracing, Intelligent Tutoring System, in vivo study, parameter estimation

Procedia PDF Downloads 151