Search results for: multivariate statistics
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2373

Search results for: multivariate statistics

2373 Regression for Doubly Inflated Multivariate Poisson Distributions

Authors: Ishapathik Das, Sumen Sen, N. Rao Chaganty, Pooja Sengupta

Abstract:

Dependent multivariate count data occur in several research studies. These data can be modeled by a multivariate Poisson or Negative binomial distribution constructed using copulas. However, when some of the counts are inflated, that is, the number of observations in some cells are much larger than other cells, then the copula based multivariate Poisson (or Negative binomial) distribution may not fit well and it is not an appropriate statistical model for the data. There is a need to modify or adjust the multivariate distribution to account for the inflated frequencies. In this article, we consider the situation where the frequencies of two cells are higher compared to the other cells, and develop a doubly inflated multivariate Poisson distribution function using multivariate Gaussian copula. We also discuss procedures for regression on covariates for the doubly inflated multivariate count data. For illustrating the proposed methodologies, we present a real data containing bivariate count observations with inflations in two cells. Several models and linear predictors with log link functions are considered, and we discuss maximum likelihood estimation to estimate unknown parameters of the models.

Keywords: copula, Gaussian copula, multivariate distributions, inflated distributios

Procedia PDF Downloads 128
2372 The Moment of the Optimal Average Length of the Multivariate Exponentially Weighted Moving Average Control Chart for Equally Correlated Variables

Authors: Edokpa Idemudia Waziri, Salisu S. Umar

Abstract:

The Hotellng’s T^2 is a well-known statistic for detecting a shift in the mean vector of a multivariate normal distribution. Control charts based on T have been widely used in statistical process control for monitoring a multivariate process. Although it is a powerful tool, the T statistic is deficient when the shift to be detected in the mean vector of a multivariate process is small and consistent. The Multivariate Exponentially Weighted Moving Average (MEWMA) control chart is one of the control statistics used to overcome the drawback of the Hotellng’s T statistic. In this paper, the probability distribution of the Average Run Length (ARL) of the MEWMA control chart when the quality characteristics exhibit substantial cross correlation and when the process is in-control and out-of-control was derived using the Markov Chain algorithm. The derivation of the probability functions and the moments of the run length distribution were also obtained and they were consistent with some existing results for the in-control and out-of-control situation. By simulation process, the procedure identified a class of ARL for the MEWMA control when the process is in-control and out-of-control. From our study, it was observed that the MEWMA scheme is quite adequate for detecting a small shift and a good way to improve the quality of goods and services in a multivariate situation. It was also observed that as the in-control average run length ARL0¬ or the number of variables (p) increases, the optimum value of the ARL0pt increases asymptotically and as the magnitude of the shift σ increases, the optimal ARLopt decreases. Finally, we use the example from the literature to illustrate our method and demonstrate its efficiency.

Keywords: average run length, markov chain, multivariate exponentially weighted moving average, optimal smoothing parameter

Procedia PDF Downloads 389
2371 A Data-Driven Monitoring Technique Using Combined Anomaly Detectors

Authors: Fouzi Harrou, Ying Sun, Sofiane Khadraoui

Abstract:

Anomaly detection based on Principal Component Analysis (PCA) was studied intensively and largely applied to multivariate processes with highly cross-correlated process variables. Monitoring metrics such as the Hotelling's T2 and the Q statistics are usually used in PCA-based monitoring to elucidate the pattern variations in the principal and residual subspaces, respectively. However, these metrics are ill suited to detect small faults. In this paper, the Exponentially Weighted Moving Average (EWMA) based on the Q and T statistics, T2-EWMA and Q-EWMA, were developed for detecting faults in the process mean. The performance of the proposed methods was compared with that of the conventional PCA-based fault detection method using synthetic data. The results clearly show the benefit and the effectiveness of the proposed methods over the conventional PCA method, especially for detecting small faults in highly correlated multivariate data.

Keywords: data-driven method, process control, anomaly detection, dimensionality reduction

Procedia PDF Downloads 264
2370 Multivariate Rainfall Disaggregation Using MuDRain Model: Malaysia Experience

Authors: Ibrahim Suliman Hanaish

Abstract:

Disaggregation daily rainfall using stochastic models formulated based on multivariate approach (MuDRain) is discussed in this paper. Seven rain gauge stations are considered in this study for different distances from the referred station starting from 4 km to 160 km in Peninsular Malaysia. The hourly rainfall data used are covered the period from 1973 to 2008 and July and November months are considered as an example of dry and wet periods. The cross-correlation among the rain gauges is considered for the available hourly rainfall information at the neighboring stations or not. This paper discussed the applicability of the MuDRain model for disaggregation daily rainfall to hourly rainfall for both sources of cross-correlation. The goodness of fit of the model was based on the reproduction of fitting statistics like the means, variances, coefficients of skewness, lag zero cross-correlation of coefficients and the lag one auto correlation of coefficients. It is found the correlation coefficients based on extracted correlations that was based on daily are slightly higher than correlations based on available hourly rainfall especially for neighboring stations not more than 28 km. The results showed also the MuDRain model did not reproduce statistics very well. In addition, a bad reproduction of the actual hyetographs comparing to the synthetic hourly rainfall data. Mean while, it is showed a good fit between the distribution function of the historical and synthetic hourly rainfall. These discrepancies are unavoidable because of the lowest cross correlation of hourly rainfall. The overall performance indicated that the MuDRain model would not be appropriate choice for disaggregation daily rainfall.

Keywords: rainfall disaggregation, multivariate disaggregation rainfall model, correlation, stochastic model

Procedia PDF Downloads 462
2369 The Modality of Multivariate Skew Normal Mixture

Authors: Bader Alruwaili, Surajit Ray

Abstract:

Finite mixtures are a flexible and powerful tool that can be used for univariate and multivariate distributions, and a wide range of research analysis has been conducted based on the multivariate normal mixture and multivariate of a t-mixture. Determining the number of modes is an important activity that, in turn, allows one to determine the number of homogeneous groups in a population. Our work currently being carried out relates to the study of the modality of the skew normal distribution in the univariate and multivariate cases. For the skew normal distribution, the aims are associated with studying the modality of the skew normal distribution and providing the ridgeline, the ridgeline elevation function, the $\Pi$ function, and the curvature function, and this will be conducive to an exploration of the number and location of mode when mixing the two components of skew normal distribution. The subsequent objective is to apply these results to the application of real world data sets, such as flow cytometry data.

Keywords: mode, modality, multivariate skew normal, finite mixture, number of mode

Procedia PDF Downloads 459
2368 Basketball Game-Related Statistics Discriminating Teams Competing in Basketball Africa League and Euroleague: Comparative Analysis

Authors: Ng'etich K. Stephen

Abstract:

Abstract—Globally analytics in basketball has advanced tremendously in the last decade. Organizations are leveraging the insights to improve team and player performance and, in the long run, generate revenue out of it. Due to limited basketball game-related statistics in African competitions, teams are unaware of how they compete with other continental basketball teams. The purpose of this study is to evaluate the regional difference in basketball game-related statistics between African teams that played in the newly formed league, the basketball African league and the European league. The basketball African league, a competition created through the partnership between NBA and FIBA, offers a good starting point since it has valuable basketball metrics to analyze. This study sought to use multivariate linear discriminant analysis to identify the game-related statistics that discriminate the teams in Euro league and the basketball African league.

Keywords: basketball africa league, basketball, euroleague, fiba, africa

Procedia PDF Downloads 56
2367 Multivariate Analysis of Student’s Performance in Statistic Courses in Humanities Sciences

Authors: Carla Silva

Abstract:

The aim of this research is to study the relationship between the performance of humanities students in different statistics classes and their performance in their specific courses. Several factors are been studied, such as gender and final grades in statistics and math. Participants of this study comprised a sample of students at a Lisbon University during their academic year. A significant relationship tends to appear between these factors and the performance of these students. However this relationship tends to be stronger with students who had previous studied calculus and math.

Keywords: education, performance, statistic, humanities

Procedia PDF Downloads 296
2366 Undergraduate Students' Attitude towards the Statistics Course

Authors: Somruay Apichatibutarapong

Abstract:

The purpose of this study was to address and comparison of the attitudes towards the statistics course for undergraduate students. Data were collected from 120 students in Faculty of Sciences and Technology, Suan Sunandha Rajabhat University who enrolled in the statistics course. The quantitative approach was used to investigate the assessment and comparison of attitudes towards statistics course. It was revealed that the overall attitudes somewhat agree both in pre-test and post-test. In addition, the comparison of students’ attitudes towards the statistic course (Form A) has no difference in the overall attitudes. However, there is statistical significance in all dimensions and overall attitudes towards the statistics course (Form B).

Keywords: statistics attitude, student’s attitude, statistics, attitude test

Procedia PDF Downloads 421
2365 Multivariate Genome-Wide Association Studies for Identifying Additional Loci for Myopia

Authors: Qiao Fan, Xiaobo Guo, Junxian Zhu, Xiaohu Ding, Ching-Yu Cheng, Tien-Yin Wong, Mingguang He, Heping Zhang, Xueqin Wang

Abstract:

A systematic, simultaneous analysis of multiple phenotypes in genome-wide association studies (GWASs) draws a great attention to integrate the signals from single phenotypes with increased power. However, lacking an interpretable and efficient multivariate GWAS analysis impede the application of such approach. In this study, we propose to decompose the multivariate model into a series of simple univariate models. This transformation illuminates what exactly the individual trait contributes to the significant signals from the multivariate analyses. By employing our approach in the analysis of three myopia-related endophenotypes from the Singapore Malay Eye Study (SIMES), we identify novel candidate loci which were successfully validated in an independent Guangzhou Twin Eye Study (GTES).

Keywords: GWAS multivariate, multiple traits, myopia, association

Procedia PDF Downloads 192
2364 Ranking of Provinces in Iran for Capital Formation in Spatial Planning with Numerical Taxonomy Technique (An Improvement) Case Study: Agriculture Sector

Authors: Farhad Nouparast

Abstract:

For more production we need more capital formation. Capital formation in each country should be based on comparative advantages in different economic sectors due to the different production possibility curves. In regional planning, recognizing the relative advantages and consequently investing in more production requires identifying areas with the necessary capabilities and location of each region compared to other regions. In this article, ranking of Iran's provinces is done according to the specific and given variables as the best investment position in agricultural activity. So we can provide the necessary background for investment analysis in different regions of the country to formulate national and regional planning and execute investment projects. It is used factor analysis technique and numerical taxonomy analysis to do this in thisarticle. At first, the provinces are homogenized and graded according to the variables using cross-sectional data obtained from the agricultural census and population and housing census of Iran as data matrix. The results show that which provinces have the most potential for capital formation in agronomy sub-sector. Taxonomy classifies organisms based on similar genetic traits in biology and botany. Numerical taxonomy using quantitative methods controls large amounts of information and get the number of samples and categories and take them based on inherent characteristics and differences indirectly accommodates. Numerical taxonomy is related to multivariate statistics.

Keywords: Capital Formation, Factor Analysis, Multivariate statistics, Numerical Taxonomy Analysis, Production, Ranking, Spatial Planning

Procedia PDF Downloads 108
2363 An AK-Chart for the Non-Normal Data

Authors: Chia-Hau Liu, Tai-Yue Wang

Abstract:

Traditional multivariate control charts assume that measurement from manufacturing processes follows a multivariate normal distribution. However, this assumption may not hold or may be difficult to verify because not all the measurement from manufacturing processes are normal distributed in practice. This study develops a new multivariate control chart for monitoring the processes with non-normal data. We propose a mechanism based on integrating the one-class classification method and the adaptive technique. The adaptive technique is used to improve the sensitivity to small shift on one-class classification in statistical process control. In addition, this design provides an easy way to allocate the value of type I error so it is easier to be implemented. Finally, the simulation study and the real data from industry are used to demonstrate the effectiveness of the propose control charts.

Keywords: multivariate control chart, statistical process control, one-class classification method, non-normal data

Procedia PDF Downloads 391
2362 The Factors Predicting Credibility of News in Social Media in Thailand

Authors: Ekapon Thienthaworn

Abstract:

This research aims to study the reliability of the forecasting factor in social media by using survey research methods with questionnaires. The sampling is the group of undergraduate students in Bangkok. A multiple-step random number of 400 persons, data analysis are descriptive statistics with multivariate regression analysis. The research found the average of the overall trust at the intermediate level for reading the news in social media and the results of the multivariate regression analysis to find out the factors that forecast credibility of the media found the only content that has the power to forecast reliability of undergraduate students in Bangkok to reading the news on social media at the significance level.at 0.05.These can be factors with forecasts reliability of news in social media by a variable that has the highest influence factor of the media content and the speed is also important for reliability of the news.

Keywords: credibility of news, behaviors and attitudes, social media, web board

Procedia PDF Downloads 441
2361 Multivariate Analysis of Spectroscopic Data for Agriculture Applications

Authors: Asmaa M. Hussein, Amr Wassal, Ahmed Farouk Al-Sadek, A. F. Abd El-Rahman

Abstract:

In this study, a multivariate analysis of potato spectroscopic data was presented to detect the presence of brown rot disease or not. Near-Infrared (NIR) spectroscopy (1,350-2,500 nm) combined with multivariate analysis was used as a rapid, non-destructive technique for the detection of brown rot disease in potatoes. Spectral measurements were performed in 565 samples, which were chosen randomly at the infection place in the potato slice. In this study, 254 infected and 311 uninfected (brown rot-free) samples were analyzed using different advanced statistical analysis techniques. The discrimination performance of different multivariate analysis techniques, including classification, pre-processing, and dimension reduction, were compared. Applying a random forest algorithm classifier with different pre-processing techniques to raw spectra had the best performance as the total classification accuracy of 98.7% was achieved in discriminating infected potatoes from control.

Keywords: Brown rot disease, NIR spectroscopy, potato, random forest

Procedia PDF Downloads 154
2360 Applying Multivariate and Univariate Analysis of Variance on Socioeconomic, Health, and Security Variables in Jordan

Authors: Faisal G. Khamis, Ghaleb A. El-Refae

Abstract:

Many researchers have studied socioeconomic, health, and security variables in the developed countries; however, very few studies used multivariate analysis in developing countries. The current study contributes to the scarce literature about the determinants of the variance in socioeconomic, health, and security factors. Questions raised were whether the independent variables (IVs) of governorate and year impact the socioeconomic, health, and security dependent variables (DVs) in Jordan, whether the marginal mean of each DV in each governorate and in each year is significant, which governorates are similar in difference means of each DV, and whether these DVs vary. The main objectives were to determine the source of variances in DVs, collectively and separately, testing which governorates are similar and which diverge for each DV. The research design was time series and cross-sectional analysis. The main hypotheses are that IVs affect DVs collectively and separately. Multivariate and univariate analyses of variance were carried out to test these hypotheses. The population of 12 governorates in Jordan and the available data of 15 years (2000–2015) accrued from several Jordanian statistical yearbooks. We investigated the effect of two factors of governorate and year on the four DVs of divorce rate, mortality rate, unemployment percentage, and crime rate. All DVs were transformed to multivariate normal distribution. We calculated descriptive statistics for each DV. Based on the multivariate analysis of variance, we found a significant effect in IVs on DVs with p < .001. Based on the univariate analysis, we found a significant effect of IVs on each DV with p < .001, except the effect of the year factor on unemployment was not significant with p = .642. The grand and marginal means of each DV in each governorate and each year were significant based on a 95% confidence interval. Most governorates are not similar in DVs with p < .001. We concluded that the two factors produce significant effects on DVs, collectively and separately. Based on these findings, the government can distribute its financial and physical resources to governorates more efficiently. By identifying the sources of variance that contribute to the variation in DVs, insights can help inform focused variation prevention efforts.

Keywords: ANOVA, crime, divorce, governorate, hypothesis test, Jordan, MANOVA, means, mortality, unemployment, year

Procedia PDF Downloads 237
2359 A Cohort and Empirical Based Multivariate Mortality Model

Authors: Jeffrey Tzu-Hao Tsai, Yi-Shan Wong

Abstract:

This article proposes a cohort-age-period (CAP) model to characterize multi-population mortality processes using cohort, age, and period variables. Distinct from the factor-based Lee-Carter-type decomposition mortality model, this approach is empirically based and includes the age, period, and cohort variables into the equation system. The model not only provides a fruitful intuition for explaining multivariate mortality change rates but also has a better performance in forecasting future patterns. Using the US and the UK mortality data and performing ten-year out-of-sample tests, our approach shows smaller mean square errors in both countries compared to the models in the literature.

Keywords: longevity risk, stochastic mortality model, multivariate mortality rate, risk management

Procedia PDF Downloads 20
2358 Personality Traits of NEO Five Factors and Statistics Anxiety among Social Sciences University Students

Authors: Oluyinka Ojedokun, S. E. Idemudia

Abstract:

In Nigeria, statistics is a compulsory course required from all social sciences students as part of their academic training. However, a rising number of social sciences undergraduates usually express statistics anxiety. The prevalence of statistics anxiety among undergraduates in social sciences has created a growing concern for educators and researchers in the higher education institutions, mainly because this statistics anxiety adversely affects their performance in statistics and research methods courses. From a societal perspective it is important to reverse this trend. Although scholars and researchers have highlighted some psychosocial factors that influence statistics anxiety in students but few empirical studies exist on the association between personality traits of NEO five factors and statistics anxiety. It is in the light of this situation that this study was designed to assess the extent to which the personality traits of NEO five factors influence statistics anxiety of students in social sciences courses. The participants were 282 undergraduates in the faculty of social sciences at a state owned public university in Nigeria. The findings demonstrate that the personality traits contributing to statistics anxiety include openness to experience, conscientious, extraversion, and neuroticism. These results imply that statistics anxiety is related to individual differences in personality traits and suggest that certain aspects of statistics anxiety may be relatively stable and resistant to change. An effective and simple method to reduce statistics anxiety among social sciences students is to create awareness of the statistical and methodological requirements of the social sciences courses before commencement of their programmes.

Keywords: personality traits, statistics anxiety, social sciences, students

Procedia PDF Downloads 504
2357 Introduction of Robust Multivariate Process Capability Indices

Authors: Behrooz Khalilloo, Hamid Shahriari, Emad Roghanian

Abstract:

Process capability indices (PCIs) are important concepts of statistical quality control and measure the capability of processes and how much processes are meeting certain specifications. An important issue in statistical quality control is parameter estimation. Under the assumption of multivariate normality, the distribution parameters, mean vector and variance-covariance matrix must be estimated, when they are unknown. Classic estimation methods like method of moment estimation (MME) or maximum likelihood estimation (MLE) makes good estimation of the population parameters when data are not contaminated. But when outliers exist in the data, MME and MLE make weak estimators of the population parameters. So we need some estimators which have good estimation in the presence of outliers. In this work robust M-estimators for estimating these parameters are used and based on robust parameter estimators, robust process capability indices are introduced. The performances of these robust estimators in the presence of outliers and their effects on process capability indices are evaluated by real and simulated multivariate data. The results indicate that the proposed robust capability indices perform much better than the existing process capability indices.

Keywords: multivariate process capability indices, robust M-estimator, outlier, multivariate quality control, statistical quality control

Procedia PDF Downloads 249
2356 A Non-parametric Clustering Approach for Multivariate Geostatistical Data

Authors: Francky Fouedjio

Abstract:

Multivariate geostatistical data have become omnipresent in the geosciences and pose substantial analysis challenges. One of them is the grouping of data locations into spatially contiguous clusters so that data locations within the same cluster are more similar while clusters are different from each other, in some sense. Spatially contiguous clusters can significantly improve the interpretation that turns the resulting clusters into meaningful geographical subregions. In this paper, we develop an agglomerative hierarchical clustering approach that takes into account the spatial dependency between observations. It relies on a dissimilarity matrix built from a non-parametric kernel estimator of the spatial dependence structure of data. It integrates existing methods to find the optimal cluster number and to evaluate the contribution of variables to the clustering. The capability of the proposed approach to provide spatially compact, connected and meaningful clusters is assessed using bivariate synthetic dataset and multivariate geochemical dataset. The proposed clustering method gives satisfactory results compared to other similar geostatistical clustering methods.

Keywords: clustering, geostatistics, multivariate data, non-parametric

Procedia PDF Downloads 449
2355 On the Bootstrap P-Value Method in Identifying out of Control Signals in Multivariate Control Chart

Authors: O. Ikpotokin

Abstract:

In any production process, every product is aimed to attain a certain standard, but the presence of assignable cause of variability affects our process, thereby leading to low quality of product. The ability to identify and remove this type of variability reduces its overall effect, thereby improving the quality of the product. In case of a univariate control chart signal, it is easy to detect the problem and give a solution since it is related to a single quality characteristic. However, the problems involved in the use of multivariate control chart are the violation of multivariate normal assumption and the difficulty in identifying the quality characteristic(s) that resulted in the out of control signals. The purpose of this paper is to examine the use of non-parametric control chart (the bootstrap approach) for obtaining control limit to overcome the problem of multivariate distributional assumption and the p-value method for detecting out of control signals. Results from a performance study show that the proposed bootstrap method enables the setting of control limit that can enhance the detection of out of control signals when compared, while the p-value method also enhanced in identifying out of control variables.

Keywords: bootstrap control limit, p-value method, out-of-control signals, p-value, quality characteristics

Procedia PDF Downloads 319
2354 The Use of Boosted Multivariate Trees in Medical Decision-Making for Repeated Measurements

Authors: Ebru Turgal, Beyza Doganay Erdogan

Abstract:

Machine learning aims to model the relationship between the response and features. Medical decision-making researchers would like to make decisions about patients’ course and treatment, by examining the repeated measurements over time. Boosting approach is now being used in machine learning area for these aims as an influential tool. The aim of this study is to show the usage of multivariate tree boosting in this field. The main reason for utilizing this approach in the field of decision-making is the ease solutions of complex relationships. To show how multivariate tree boosting method can be used to identify important features and feature-time interaction, we used the data, which was collected retrospectively from Ankara University Chest Diseases Department records. Dataset includes repeated PF ratio measurements. The follow-up time is planned for 120 hours. A set of different models is tested. In conclusion, main idea of classification with weighed combination of classifiers is a reliable method which was shown with simulations several times. Furthermore, time varying variables will be taken into consideration within this concept and it could be possible to make accurate decisions about regression and survival problems.

Keywords: boosted multivariate trees, longitudinal data, multivariate regression tree, panel data

Procedia PDF Downloads 168
2353 Factors Influencing the Enjoyment and Performance of Students in Statistics Service Courses: A Mixed-Method Study

Authors: Wilma Coetzee

Abstract:

Statistics lecturers experience that many students who are taking a service course in statistics do not like statistics. Students in these courses tend to struggle and do not perform well. This research takes a look at the student’s perspective, with the aim to determine how to change the teaching of statistics so that students will enjoy it more and perform better. Questionnaires were used to determine the perspectives of first year service statistics students at a South African university. Factors addressed included motivation to study, attitude toward statistics, statistical anxiety, mathematical abilities and tendency to procrastinate. Logistic regression was used to determine what contributes to students performing badly in statistics. The results show that the factors that contribute the most to students performing badly are: statistical anxiety, not being motivated and having had mathematical literacy instead of mathematics in secondary school. Two open ended questions were included in the questionnaire: 'I will enjoy statistics more if…' and 'I will perform better in statistics if…'. The answers to these questions were analyzed using qualitative methods. Frequent themes were identified for each of the questions. A simulation study incorporating bootstrapping was done to determine the saturation of the themes. The majority of the students indicated that they would perform better in statistics if they studied more, managed their time better, had a flare for mathematics and if the lecturer was able to explain difficult concepts better. They also want more active learning. To ensure that students enjoy statistics more, they want an active learning experience. They want fun activities, more interaction with the lecturer and with one another, more computer based problems, and more challenges. They want a better understanding of the subject, want to understand the relevance of statistics to their future career and want excellent lecturers. These findings can be used to direct the improvement of the tuition of statistics.

Keywords: active learning, performance in statistics, statistical anxiety, statistics education

Procedia PDF Downloads 121
2352 Analyzing the Influence of Hydrometeorlogical Extremes, Geological Setting, and Social Demographic on Public Health

Authors: Irfan Ahmad Afip

Abstract:

This main research objective is to accurately identify the possibility for a Leptospirosis outbreak severity of a certain area based on its input features into a multivariate regression model. The research question is the possibility of an outbreak in a specific area being influenced by this feature, such as social demographics and hydrometeorological extremes. If the occurrence of an outbreak is being subjected to these features, then the epidemic severity for an area will be different depending on its environmental setting because the features will influence the possibility and severity of an outbreak. Specifically, this research objective was three-fold, namely: (a) to identify the relevant multivariate features and visualize the patterns data, (b) to develop a multivariate regression model based from the selected features and determine the possibility for Leptospirosis outbreak in an area, and (c) to compare the predictive ability of multivariate regression model and machine learning algorithms. Several secondary data features were collected locations in the state of Negeri Sembilan, Malaysia, based on the possibility it would be relevant to determine the outbreak severity in the area. The relevant features then will become an input in a multivariate regression model; a linear regression model is a simple and quick solution for creating prognostic capabilities. A multivariate regression model has proven more precise prognostic capabilities than univariate models. The expected outcome from this research is to establish a correlation between the features of social demographic and hydrometeorological with Leptospirosis bacteria; it will also become a contributor for understanding the underlying relationship between the pathogen and the ecosystem. The relationship established can be beneficial for the health department or urban planner to inspect and prepare for future outcomes in event detection and system health monitoring.

Keywords: geographical information system, hydrometeorological, leptospirosis, multivariate regression

Procedia PDF Downloads 80
2351 Using Discriminant Analysis to Forecast Crime Rate in Nigeria

Authors: O. P. Popoola, O. A. Alawode, M. O. Olayiwola, A. M. Oladele

Abstract:

This research work is based on using discriminant analysis to forecast crime rate in Nigeria between 1996 and 2008. The work is interested in how gender (male and female) relates to offences committed against the government, against other properties, disturbance in public places, murder/robbery offences and other offences. The data used was collected from the National Bureau of Statistics (NBS). SPSS, the statistical package was used to analyse the data. Time plot was plotted on all the 29 offences gotten from the raw data. Eigenvalues and Multivariate tests, Wilks’ Lambda, standardized canonical discriminant function coefficients and the predicted classifications were estimated. The research shows that the distribution of the scores from each function is standardized to have a mean O and a standard deviation of 1. The magnitudes of the coefficients indicate how strongly the discriminating variable affects the score. In the predicted group membership, 172 cases that were predicted to commit crime against Government group, 66 were correctly predicted and 106 were incorrectly predicted. After going through the predicted classifications, we found out that most groups numbers that were correctly predicted were less than those that were incorrectly predicted.

Keywords: discriminant analysis, DA, multivariate analysis of variance, MANOVA, canonical correlation, and Wilks’ Lambda

Procedia PDF Downloads 433
2350 Classification of Generative Adversarial Network Generated Multivariate Time Series Data Featuring Transformer-Based Deep Learning Architecture

Authors: Thrivikraman Aswathi, S. Advaith

Abstract:

As there can be cases where the use of real data is somehow limited, such as when it is hard to get access to a large volume of real data, we need to go for synthetic data generation. This produces high-quality synthetic data while maintaining the statistical properties of a specific dataset. In the present work, a generative adversarial network (GAN) is trained to produce multivariate time series (MTS) data since the MTS is now being gathered more often in various real-world systems. Furthermore, the GAN-generated MTS data is fed into a transformer-based deep learning architecture that carries out the data categorization into predefined classes. Further, the model is evaluated across various distinct domains by generating corresponding MTS data.

Keywords: GAN, transformer, classification, multivariate time series

Procedia PDF Downloads 86
2349 Co-Integration Model for Predicting Inflation Movement in Nigeria

Authors: Salako Rotimi, Oshungade Stephen, Ojewoye Opeyemi

Abstract:

The maintenance of price stability is one of the macroeconomic challenges facing Nigeria as a nation. This paper attempts to build a co-integration multivariate time series model for inflation movement in Nigeria using data extracted from the abstract of statistics of the Central Bank of Nigeria (CBN) from 2008 to 2017. The Johansen cointegration test suggests at least one co-integration vector describing the long run relationship between Consumer Price Index (CPI), Food Price Index (FPI) and Non-Food Price Index (NFPI). All three series show increasing pattern, which indicates a sign of non-stationary in each of the series. Furthermore, model predictability was established with root-mean-square-error, mean absolute error, mean average percentage error, and Theil’s unbiased statistics for n-step forecasting. The result depicts that the long run coefficient of a consumer price index (CPI) has a positive long-run relationship with the food price index (FPI) and non-food price index (NFPI).

Keywords: economic, inflation, model, series

Procedia PDF Downloads 213
2348 Multivariate Dependent Frequency-Severity Modeling of Insurance Claims: A Vine Copula Approach

Authors: Islem Kedidi, Rihab Bedoui Bensalem, Faysal Manssouri

Abstract:

In traditional models of insurance data, the number and size of claims are assumed to be independent. Relaxing the independence assumption, this article explores the Vine copula to model dependence structure between multivariate frequency and average severity of insurance claim. To illustrate this approach, we use the Wisconsin local government property insurance fund which offers several insurance protections for motor vehicles, property and contractor’s equipment claims. Results show that the C-vine copula can better characterize the multivariate dependence structure between frequency and severity. Furthermore, we find significant dependencies especially between frequency and average severity among different coverage types.

Keywords: dependency modeling, government insurance, insurance claims, vine copula

Procedia PDF Downloads 171
2347 On the Impact of Oil Price Fluctuations on Stock Markets: A Multivariate Long-Memory GARCH Framework

Authors: Manel Youssef, Lotfi Belkacem

Abstract:

This paper employs multivariate long memory GARCH models to simultaneously estimate mean and conditional variance spillover effects between oil prices and different financial markets. Since different financial assets are traded based on these market sector returns, it’s important for financial market participants to understand the volatility transmission mechanism over time and across these series in order to make optimal portfolio allocation decisions. We examine weekly returns from January 1, 2003 to November 30, 2012 and find evidence of significant transmission of shocks and volatilities between oil prices and some of the examined financial markets. The findings support the idea of cross-market hedging and sharing of common information by investors.

Keywords: oil prices, stock indices returns, oil volatility, contagion, DCC-multivariate (FI) GARCH

Procedia PDF Downloads 497
2346 A Multivariate 4/2 Stochastic Covariance Model: Properties and Applications to Portfolio Decisions

Authors: Yuyang Cheng, Marcos Escobar-Anel

Abstract:

This paper introduces a multivariate 4/2 stochastic covariance process generalizing the one-dimensional counterparts presented in Grasselli (2017). Our construction permits stochastic correlation not only among stocks but also among volatilities, also known as co-volatility movements, both driven by more convenient 4/2 stochastic structures. The parametrization is flexible enough to separate these types of correlation, permitting their individual study. Conditions for proper changes of measure and closed-form characteristic functions under risk-neutral and historical measures are provided, allowing for applications of the model to risk management and derivative pricing. We apply the model to an expected utility theory problem in incomplete markets. Our analysis leads to closed-form solutions for the optimal allocation and value function. Conditions are provided for well-defined solutions together with a verification theorem. Our numerical analysis highlights and separates the impact of key statistics on equity portfolio decisions, in particular, volatility, correlation, and co-volatility movements, with the latter being the least important in an incomplete market.

Keywords: stochastic covariance process, 4/2 stochastic volatility model, stochastic co-volatility movements, characteristic function, expected utility theory, veri cation theorem

Procedia PDF Downloads 121
2345 Alternative Computational Arrangements on g-Group (g > 2) Profile Analysis

Authors: Emmanuel U. Ohaegbulem, Felix N. Nwobi

Abstract:

Alternative and simple computational arrangements in carrying out multivariate profile analysis when more than two groups (populations) are involved are presented. These arrangements have been demonstrated to not only yield equivalent results for the test statistics (the Wilks lambdas), but they have less computational efforts relative to other arrangements so far presented in the literature; in addition to being quite simple and easy to apply.

Keywords: coincident profiles, g-group profile analysis, level profiles, parallel profiles, repeated measures MANOVA

Procedia PDF Downloads 413
2344 Applying Semi-Automatic Digital Aerial Survey Technology and Canopy Characters Classification for Surface Vegetation Interpretation of Archaeological Sites

Authors: Yung-Chung Chuang

Abstract:

The cultural layers of archaeological sites are mainly affected by surface land use, land cover, and root system of surface vegetation. For this reason, continuous monitoring of land use and land cover change is important for archaeological sites protection and management. However, in actual operation, on-site investigation and orthogonal photograph interpretation require a lot of time and manpower. For this reason, it is necessary to perform a good alternative for surface vegetation survey in an automated or semi-automated manner. In this study, we applied semi-automatic digital aerial survey technology and canopy characters classification with very high-resolution aerial photographs for surface vegetation interpretation of archaeological sites. The main idea is based on different landscape or forest type can easily be distinguished with canopy characters (e.g., specific texture distribution, shadow effects and gap characters) extracted by semi-automatic image classification. A novel methodology to classify the shape of canopy characters using landscape indices and multivariate statistics was also proposed. Non-hierarchical cluster analysis was used to assess the optimal number of canopy character clusters and canonical discriminant analysis was used to generate the discriminant functions for canopy character classification (seven categories). Therefore, people could easily predict the forest type and vegetation land cover by corresponding to the specific canopy character category. The results showed that the semi-automatic classification could effectively extract the canopy characters of forest and vegetation land cover. As for forest type and vegetation type prediction, the average prediction accuracy reached 80.3%~91.7% with different sizes of test frame. It represented this technology is useful for archaeological site survey, and can improve the classification efficiency and data update rate.

Keywords: digital aerial survey, canopy characters classification, archaeological sites, multivariate statistics

Procedia PDF Downloads 108