Search results for: bivariate statistical techniques
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 9914

Search results for: bivariate statistical techniques

9914 Bivariate Time-to-Event Analysis with Copula-Based Cox Regression

Authors: Duhania O. Mahara, Santi W. Purnami, Aulia N. Fitria, Merissa N. Z. Wirontono, Revina Musfiroh, Shofi Andari, Sagiran Sagiran, Estiana Khoirunnisa, Wahyudi Widada

Abstract:

For assessing interventions in numerous disease areas, the use of multiple time-to-event outcomes is common. An individual might experience two different events called bivariate time-to-event data, the events may be correlated because it come from the same subject and also influenced by individual characteristics. The bivariate time-to-event case can be applied by copula-based bivariate Cox survival model, using the Clayton and Frank copulas to analyze the dependence structure of each event and also the covariates effect. By applying this method to modeling the recurrent event infection of hemodialysis insertion on chronic kidney disease (CKD) patients, from the AIC and BIC values we find that the Clayton copula model was the best model with Kendall’s Tau is (τ=0,02).

Keywords: bivariate cox, bivariate event, copula function, survival copula

Procedia PDF Downloads 44
9913 A Bivariate Inverse Generalized Exponential Distribution and Its Applications in Dependent Competing Risks Model

Authors: Fatemah A. Alqallaf, Debasis Kundu

Abstract:

The aim of this paper is to introduce a bivariate inverse generalized exponential distribution which has a singular component. The proposed bivariate distribution can be used when the marginals have heavy-tailed distributions, and they have non-monotone hazard functions. Due to the presence of the singular component, it can be used quite effectively when there are ties in the data. Since it has four parameters, it is a very flexible bivariate distribution, and it can be used quite effectively for analyzing various bivariate data sets. Several dependency properties and dependency measures have been obtained. The maximum likelihood estimators cannot be obtained in closed form, and it involves solving a four-dimensional optimization problem. To avoid that, we have proposed to use an EM algorithm, and it involves solving only one non-linear equation at each `E'-step. Hence, the implementation of the proposed EM algorithm is very straight forward in practice. Extensive simulation experiments and the analysis of one data set have been performed. We have observed that the proposed bivariate inverse generalized exponential distribution can be used for modeling dependent competing risks data. One data set has been analyzed to show the effectiveness of the proposed model.

Keywords: Block and Basu bivariate distributions, competing risks, EM algorithm, Marshall-Olkin bivariate exponential distribution, maximum likelihood estimators

Procedia PDF Downloads 105
9912 Modeling of System Availability and Bayesian Analysis of Bivariate Distribution

Authors: Muhammad Farooq, Ahtasham Gul

Abstract:

To meet the desired standard, it is important to monitor and analyze different engineering processes to get desired output. The bivariate distributions got a lot of attention in recent years to describe the randomness of natural as well as artificial mechanisms. In this article, a bivariate model is constructed using two independent models developed by the nesting approach to study the effect of each component on reliability for better understanding. Further, the Bayes analysis of system availability is studied by considering prior parametric variations in the failure time and repair time distributions. Basic statistical characteristics of marginal distribution, like mean median and quantile function, are discussed. We use inverse Gamma prior to study its frequentist properties by conducting Monte Carlo Markov Chain (MCMC) sampling scheme.

Keywords: reliability, system availability Weibull, inverse Lomax, Monte Carlo Markov Chain, Bayesian

Procedia PDF Downloads 45
9911 Statistical Convergence of the Szasz-Mirakjan-Kantorovich-Type Operators

Authors: Rishikesh Yadav, Ramakanta Meher, Vishnu Narayan Mishra

Abstract:

The main aim of this article is to investigate the statistical convergence of the summation of integral type operators and to obtain the weighted statistical convergence. The rate of statistical convergence by means of modulus of continuity and function belonging to the Lipschitz class are also studied. We discuss the convergence of the defined operators by graphical representation and put a better rate of convergence than the Szasz-Mirakjan-Kantorovich operators. In the last section, we extend said operators into bivariate operators to study about the rate of convergence in sense of modulus of continuity and by means of Lipschitz class by using function of two variables.

Keywords: The Szasz-Mirakjan-Kantorovich operators, statistical convergence, modulus of continuity, Peeters K-functional, weighted modulus of continuity

Procedia PDF Downloads 166
9910 Bivariate Generalization of q-α-Bernstein Polynomials

Authors: Tarul Garg, P. N. Agrawal

Abstract:

We propose to define the q-analogue of the α-Bernstein Kantorovich operators and then introduce the q-bivariate generalization of these operators to study the approximation of functions of two variables. We obtain the rate of convergence of these bivariate operators by means of the total modulus of continuity, partial modulus of continuity and the Peetre’s K-functional for continuous functions. Further, in order to study the approximation of functions of two variables in a space bigger than the space of continuous functions, i.e. Bögel space; the GBS (Generalized Boolean Sum) of the q-bivariate operators is considered and degree of approximation is discussed for the Bögel continuous and Bögel differentiable functions with the aid of the Lipschitz class and the mixed modulus of smoothness.

Keywords: Bögel continuous, Bögel differentiable, generalized Boolean sum, K-functional, mixed modulus of smoothness

Procedia PDF Downloads 349
9909 A Comparative Assessment of Information Value, Fuzzy Expert System Models for Landslide Susceptibility Mapping of Dharamshala and Surrounding, Himachal Pradesh, India

Authors: Kumari Sweta, Ajanta Goswami, Abhilasha Dixit

Abstract:

Landslide is a geomorphic process that plays an essential role in the evolution of the hill-slope and long-term landscape evolution. But its abrupt nature and the associated catastrophic forces of the process can have undesirable socio-economic impacts, like substantial economic losses, fatalities, ecosystem, geomorphologic and infrastructure disturbances. The estimated fatality rate is approximately 1person /100 sq. Km and the average economic loss is more than 550 crores/year in the Himalayan belt due to landslides. This study presents a comparative performance of a statistical bivariate method and a machine learning technique for landslide susceptibility mapping in and around Dharamshala, Himachal Pradesh. The final produced landslide susceptibility maps (LSMs) with better accuracy could be used for land-use planning to prevent future losses. Dharamshala, a part of North-western Himalaya, is one of the fastest-growing tourism hubs with a total population of 30,764 according to the 2011 census and is amongst one of the hundred Indian cities to be developed as a smart city under PM’s Smart Cities Mission. A total of 209 landslide locations were identified in using high-resolution linear imaging self-scanning (LISS IV) data. The thematic maps of parameters influencing landslide occurrence were generated using remote sensing and other ancillary data in the GIS environment. The landslide causative parameters used in the study are slope angle, slope aspect, elevation, curvature, topographic wetness index, relative relief, distance from lineaments, land use land cover, and geology. LSMs were prepared using information value (Info Val), and Fuzzy Expert System (FES) models. Info Val is a statistical bivariate method, in which information values were calculated as the ratio of the landslide pixels per factor class (Si/Ni) to the total landslide pixel per parameter (S/N). Using this information values all parameters were reclassified and then summed in GIS to obtain the landslide susceptibility index (LSI) map. The FES method is a machine learning technique based on ‘mean and neighbour’ strategy for the construction of fuzzifier (input) and defuzzifier (output) membership function (MF) structure, and the FR method is used for formulating if-then rules. Two types of membership structures were utilized for membership function Bell-Gaussian (BG) and Trapezoidal-Triangular (TT). LSI for BG and TT were obtained applying membership function and if-then rules in MATLAB. The final LSMs were spatially and statistically validated. The validation results showed that in terms of accuracy, Info Val (83.4%) is better than BG (83.0%) and TT (82.6%), whereas, in terms of spatial distribution, BG is best. Hence, considering both statistical and spatial accuracy, BG is the most accurate one.

Keywords: bivariate statistical techniques, BG and TT membership structure, fuzzy expert system, information value method, machine learning technique

Procedia PDF Downloads 99
9908 Podemos Party Origin: From Social Protest to Spanish Parliament

Authors: Víctor Manuel Muñoz-Sánchez, Antonio Manuel Pérez-Flores

Abstract:

This paper analyzes the institutionalization of social protest in Spain. In the current crisis Podemos party seems to represent the political positions of the most affected citizens by the economic situation. It studies using quantitative techniques (statistical bivariate analysis), focusing on the exploitation of several bases of statistics data from the Center for Sociological and Research of Spanish Government, 15M movement characterization to its institutionalization in the Podemos party. Making a comparison between the participant's profile by the 15M and the social bases of Podemos votes. Data on the transformation of the socio-demographic profile of the fans, connoisseurs and 15M participants and voters are given.

Keywords: collective action, emerging parties, political parties, social protest

Procedia PDF Downloads 348
9907 Transformations between Bivariate Polynomial Bases

Authors: Dimitris Varsamis, Nicholas Karampetakis

Abstract:

It is well known that any interpolating polynomial P(x,y) on the vector space Pn,m of two-variable polynomials with degree less than n in terms of x and less than m in terms of y has various representations that depends on the basis of Pn,m that we select i.e. monomial, Newton and Lagrange basis etc. The aim of this paper is twofold: a) to present transformations between the coordinates of the polynomial P(x,y) in the aforementioned basis and b) to present transformations between these bases.

Keywords: bivariate interpolation polynomial, polynomial basis, transformations, interpolating polynomial

Procedia PDF Downloads 369
9906 Polynomially Adjusted Bivariate Density Estimates Based on the Saddlepoint Approximation

Authors: S. B. Provost, Susan Sheng

Abstract:

An alternative bivariate density estimation methodology is introduced in this presentation. The proposed approach involves estimating the density function associated with the marginal distribution of each of the two variables by means of the saddlepoint approximation technique and applying a bivariate polynomial adjustment to the product of these density estimates. Since the saddlepoint approximation is utilized in the context of density estimation, such estimates are determined from empirical cumulant-generating functions. In the univariate case, the saddlepoint density estimate is itself adjusted by a polynomial. Given a set of observations, the coefficients of the polynomial adjustments are obtained from the sample moments. Several illustrative applications of the proposed methodology shall be presented. Since this approach relies essentially on a determinate number of sample moments, it is particularly well suited for modeling massive data sets.

Keywords: density estimation, empirical cumulant-generating function, moments, saddlepoint approximation

Procedia PDF Downloads 246
9905 Monte Carlo Methods and Statistical Inference of Multitype Branching Processes

Authors: Ana Staneva, Vessela Stoimenova

Abstract:

A parametric estimation of the MBP with Power Series offspring distribution family is considered in this paper. The MLE for the parameters is obtained in the case when the observable data are incomplete and consist only with the generation sizes of the family tree of MBP. The parameter estimation is calculated by using the Monte Carlo EM algorithm. The estimation for the posterior distribution and for the offspring distribution parameters are calculated by using the Bayesian approach and the Gibbs sampler. The article proposes various examples with bivariate branching processes together with computational results, simulation and an implementation using R.

Keywords: Bayesian, branching processes, EM algorithm, Gibbs sampler, Monte Carlo methods, statistical estimation

Procedia PDF Downloads 387
9904 A Comparative Study on Automatic Feature Classification Methods of Remote Sensing Images

Authors: Lee Jeong Min, Lee Mi Hee, Eo Yang Dam

Abstract:

Geospatial feature extraction is a very important issue in the remote sensing research. In the meantime, the image classification based on statistical techniques, but, in recent years, data mining and machine learning techniques for automated image processing technology is being applied to remote sensing it has focused on improved results generated possibility. In this study, artificial neural network and decision tree technique is applied to classify the high-resolution satellite images, as compared to the MLC processing result is a statistical technique and an analysis of the pros and cons between each of the techniques.

Keywords: remote sensing, artificial neural network, decision tree, maximum likelihood classification

Procedia PDF Downloads 317
9903 Failure Inference and Optimization for Step Stress Model Based on Bivariate Wiener Model

Authors: Soudabeh Shemehsavar

Abstract:

In this paper, we consider the situation under a life test, in which the failure time of the test units are not related deterministically to an observable stochastic time varying covariate. In such a case, the joint distribution of failure time and a marker value would be useful for modeling the step stress life test. The problem of accelerating such an experiment is considered as the main aim of this paper. We present a step stress accelerated model based on a bivariate Wiener process with one component as the latent (unobservable) degradation process, which determines the failure times and the other as a marker process, the degradation values of which are recorded at times of failure. Parametric inference based on the proposed model is discussed and the optimization procedure for obtaining the optimal time for changing the stress level is presented. The optimization criterion is to minimize the approximate variance of the maximum likelihood estimator of a percentile of the products’ lifetime distribution.

Keywords: bivariate normal, Fisher information matrix, inverse Gaussian distribution, Wiener process

Procedia PDF Downloads 293
9902 Transforming Data into Knowledge: Mathematical and Statistical Innovations in Data Analytics

Authors: Zahid Ullah, Atlas Khan

Abstract:

The rapid growth of data in various domains has created a pressing need for effective methods to transform this data into meaningful knowledge. In this era of big data, mathematical and statistical innovations play a crucial role in unlocking insights and facilitating informed decision-making in data analytics. This abstract aims to explore the transformative potential of these innovations and their impact on converting raw data into actionable knowledge. Drawing upon a comprehensive review of existing literature, this research investigates the cutting-edge mathematical and statistical techniques that enable the conversion of data into knowledge. By evaluating their underlying principles, strengths, and limitations, we aim to identify the most promising innovations in data analytics. To demonstrate the practical applications of these innovations, real-world datasets will be utilized through case studies or simulations. This empirical approach will showcase how mathematical and statistical innovations can extract patterns, trends, and insights from complex data, enabling evidence-based decision-making across diverse domains. Furthermore, a comparative analysis will be conducted to assess the performance, scalability, interpretability, and adaptability of different innovations. By benchmarking against established techniques, we aim to validate the effectiveness and superiority of the proposed mathematical and statistical innovations in data analytics. Ethical considerations surrounding data analytics, such as privacy, security, bias, and fairness, will be addressed throughout the research. Guidelines and best practices will be developed to ensure the responsible and ethical use of mathematical and statistical innovations in data analytics. The expected contributions of this research include advancements in mathematical and statistical sciences, improved data analysis techniques, enhanced decision-making processes, and practical implications for industries and policymakers. The outcomes will guide the adoption and implementation of mathematical and statistical innovations, empowering stakeholders to transform data into actionable knowledge and drive meaningful outcomes.

Keywords: data analytics, mathematical innovations, knowledge extraction, decision-making

Procedia PDF Downloads 41
9901 Predicting Recessions with Bivariate Dynamic Probit Model: The Czech and German Case

Authors: Lukas Reznak, Maria Reznakova

Abstract:

Recession of an economy has a profound negative effect on all involved stakeholders. It follows that timely prediction of recessions has been of utmost interest both in the theoretical research and in practical macroeconomic modelling. Current mainstream of recession prediction is based on standard OLS models of continuous GDP using macroeconomic data. This approach is not suitable for two reasons: the standard continuous models are proving to be obsolete and the macroeconomic data are unreliable, often revised many years retroactively. The aim of the paper is to explore a different branch of recession forecasting research theory and verify the findings on real data of the Czech Republic and Germany. In the paper, the authors present a family of discrete choice probit models with parameters estimated by the method of maximum likelihood. In the basic form, the probits model a univariate series of recessions and expansions in the economic cycle for a given country. The majority of the paper deals with more complex model structures, namely dynamic and bivariate extensions. The dynamic structure models the autoregressive nature of recessions, taking into consideration previous economic activity to predict the development in subsequent periods. Bivariate extensions utilize information from a foreign economy by incorporating correlation of error terms and thus modelling the dependencies of the two countries. Bivariate models predict a bivariate time series of economic states in both economies and thus enhance the predictive performance. A vital enabler of timely and successful recession forecasting are reliable and readily available data. Leading indicators, namely the yield curve and the stock market indices, represent an ideal data base, as the pieces of information is available in advance and do not undergo any retroactive revisions. As importantly, the combination of yield curve and stock market indices reflect a range of macroeconomic and financial market investors’ trends which influence the economic cycle. These theoretical approaches are applied on real data of Czech Republic and Germany. Two models for each country were identified – each for in-sample and out-of-sample predictive purposes. All four followed a bivariate structure, while three contained a dynamic component.

Keywords: bivariate probit, leading indicators, recession forecasting, Czech Republic, Germany

Procedia PDF Downloads 225
9900 Relationship between Exercise Activity with Incidence of Overweight-Obesity in Medical Students

Authors: Randy M. Fitratullah, Afriwardi, Nurhayati

Abstract:

Overweight-obesity caused by exercise. The objective of this research is to analyze the relation between exercise with the incidence of overweight-obesity of medical students of medical faculty of Andalas Univesity batch 2013. This is an analytical observational research with case-control method. This research conducted in FK Unand on September-October 2015. The population of this research is medical students batch 2013. 26 samples (13 samples were case, 13 samples were control) were taken by purposive sampling technique and analysed using statistical univariate and bivariate analysis. Exercise questionnaire was used as research instruments. Based on the interview with questionnaire, anaerobic exercise was majority in case group and aerobic exercise was majority in control group. The case and control group have a rare category in exercise. Less category was majority in exercise duration of case and enough category was majority in control group. Bivariate analysis is using chi-square test with cell combining to 2x2 table, obtained p-value=0.097 in sort of exercise, p-value=1,000 in the frequency of exercise, and p-value=0,112 in duration of exercise, which means statistically unsignificant. There is no relation between exercise with the incidence of overweight-obesity of medical students of FK Unand batch 2013. For medical students suffers overweight-obesity is suggested for increase the frequency of exercise.

Keywords: overweight-obesity, exercise, aerobic, anaerobic, frequency, duration

Procedia PDF Downloads 226
9899 Lambda-Levelwise Statistical Convergence of a Sequence of Fuzzy Numbers

Authors: F. Berna Benli, Özgür Keskin

Abstract:

Lately, many mathematicians have been studied the statistical convergence of a sequence of fuzzy numbers. We know that Lambda-statistically convergence is a kind of convergence between ordinary convergence and statistical convergence. In this paper, we will introduce the new kind of convergence such as λ-levelwise statistical convergence. Then, we will define the concept of the λ-levelwise statistical cluster and limit points of a sequence of fuzzy numbers. Also, we will discuss the relations between the sets of λ-levelwise statistical cluster points and λ-levelwise statistical limit points of sequences of fuzzy numbers. This work has been extended in this paper, where some relations have been considered such that when lambda-statistical limit inferior and lambda-statistical limit superior for lambda-statistically convergent sequences of fuzzy numbers are equal. Furthermore, lambda-statistical boundedness condition for different sequences of fuzzy numbers has been studied.

Keywords: fuzzy number, λ-levelwise statistical cluster points, λ-levelwise statistical convergence, λ-levelwise statistical limit points, λ-statistical cluster points, λ-statistical convergence, λ-statistical limit points

Procedia PDF Downloads 433
9898 Analysis of Factors Affecting the Number of Infant and Maternal Mortality in East Java with Geographically Weighted Bivariate Generalized Poisson Regression Method

Authors: Luh Eka Suryani, Purhadi

Abstract:

Poisson regression is a non-linear regression model with response variable in the form of count data that follows Poisson distribution. Modeling for a pair of count data that show high correlation can be analyzed by Poisson Bivariate Regression. Data, the number of infant mortality and maternal mortality, are count data that can be analyzed by Poisson Bivariate Regression. The Poisson regression assumption is an equidispersion where the mean and variance values are equal. However, the actual count data has a variance value which can be greater or less than the mean value (overdispersion and underdispersion). Violations of this assumption can be overcome by applying Generalized Poisson Regression. Characteristics of each regency can affect the number of cases occurred. This issue can be overcome by spatial analysis called geographically weighted regression. This study analyzes the number of infant mortality and maternal mortality based on conditions in East Java in 2016 using Geographically Weighted Bivariate Generalized Poisson Regression (GWBGPR) method. Modeling is done with adaptive bisquare Kernel weighting which produces 3 regency groups based on infant mortality rate and 5 regency groups based on maternal mortality rate. Variables that significantly influence the number of infant and maternal mortality are the percentages of pregnant women visit health workers at least 4 times during pregnancy, pregnant women get Fe3 tablets, obstetric complication handled, clean household and healthy behavior, and married women with the first marriage age under 18 years.

Keywords: adaptive bisquare kernel, GWBGPR, infant mortality, maternal mortality, overdispersion

Procedia PDF Downloads 129
9897 Modeling of Daily Global Solar Radiation Using Ann Techniques: A Case of Study

Authors: Said Benkaciali, Mourad Haddadi, Abdallah Khellaf, Kacem Gairaa, Mawloud Guermoui

Abstract:

In this study, many experiments were carried out to assess the influence of the input parameters on the performance of multilayer perceptron which is one the configuration of the artificial neural networks. To estimate the daily global solar radiation on the horizontal surface, we have developed some models by using seven combinations of twelve meteorological and geographical input parameters collected from a radiometric station installed at Ghardaïa city (southern of Algeria). For selecting of best combination which provides a good accuracy, six statistical formulas (or statistical indicators) have been evaluated, such as the root mean square errors, mean absolute errors, correlation coefficient, and determination coefficient. We noted that multilayer perceptron techniques have the best performance, except when the sunshine duration parameter is not included in the input variables. The maximum of determination coefficient and correlation coefficient are equal to 98.20 and 99.11%. On the other hand, some empirical models were developed to compare their performances with those of multilayer perceptron neural networks. Results obtained show that the neural networks techniques give the best performance compared to the empirical models.

Keywords: empirical models, multilayer perceptron neural network, solar radiation, statistical formulas

Procedia PDF Downloads 311
9896 Exploratory Study of the Influencing Factors for Hotels' Competitors

Authors: Asma Ameur, Dhafer Malouche

Abstract:

Hotel competitiveness research is an essential phase of the marketing strategy for any hotel. Certainly, knowing the hotels' competitors helps the hotelier to grasp its position in the market and the citizen to make the right choice in picking a hotel. Thus, competitiveness is an important indicator that can be influenced by various factors. In fact, the issue of competitiveness, this ability to cope with competition, remains a difficult and complex concept to define and to exploit. Therefore, the purpose of this article is to make an exploratory study to calculate a competitiveness indicator for hotels. Further on, this paper makes it possible to determine the criteria of direct or indirect effect on the image and the perception of a hotel. The actual research is used to look into the right model for hotel ‘competitiveness. For this reason, we exploit different theoretical contributions in the field of machine learning. Thus, we use some statistical techniques such as the Principal Component Analysis (PCA) to reduce the dimensions, as well as other techniques of statistical modeling. This paper presents a survey covering of the techniques and methods in hotel competitiveness research. Furthermore, this study allows us to deduct the significant variables that influence the determination of hotel’s competitors. Lastly, the discussed experiences in this article found that the hotel competitors are influenced by several factors with different rates.

Keywords: competitiveness, e-reputation, hotels' competitors, online hotel’ review, principal component analysis, statistical modeling

Procedia PDF Downloads 83
9895 Acoustic Emission Techniques in Monitoring Low-Speed Bearing Conditions

Authors: Faisal AlShammari, Abdulmajid Addali, Mosab Alrashed

Abstract:

It is widely acknowledged that bearing failures are the primary reason for breakdowns in rotating machinery. These failures are extremely costly, particularly in terms of lost production. Roller bearings are widely used in industrial machinery and need to be maintained in good condition to ensure the continuing efficiency, effectiveness, and profitability of the production process. The research presented here is an investigation of the use of acoustic emission (AE) to monitor bearing conditions at low speeds. Many machines, particularly large, expensive machines operate at speeds below 100 rpm, and such machines are important to the industry. However, the overwhelming proportion of studies have investigated the use of AE techniques for condition monitoring of higher-speed machines (typically several hundred rpm, or even higher). Few researchers have investigated the application of these techniques to low-speed machines ( < 100 rpm). This paper addressed this omission and has established which, of the available, AE techniques are suitable for the detection of incipient faults and measurement of fault growth in low-speed bearings. The first objective of this paper program was to assess the applicability of AE techniques to monitor low-speed bearings. It was found that the measured statistical parameters successfully monitored bearing conditions at low speeds (10-100 rpm). The second objective was to identify which commonly used statistical parameters derived from the AE signal (RMS, kurtosis, amplitude and counts) could identify the onset of a fault in the out race. It was found that these parameters effectually identify the presence of a small fault seeded into the outer races. Also, it is concluded that rotational speed has a strong influence on the measured AE parameters but that they are entirely independent of the load under such load and speed conditions.

Keywords: acoustic emission, condition monitoring, NDT, statistical analysis

Procedia PDF Downloads 220
9894 Different Data-Driven Bivariate Statistical Approaches to Landslide Susceptibility Mapping (Uzundere, Erzurum, Turkey)

Authors: Azimollah Aleshzadeh, Enver Vural Yavuz

Abstract:

The main goal of this study is to produce landslide susceptibility maps using different data-driven bivariate statistical approaches; namely, entropy weight method (EWM), evidence belief function (EBF), and information content model (ICM), at Uzundere county, Erzurum province, in the north-eastern part of Turkey. Past landslide occurrences were identified and mapped from an interpretation of high-resolution satellite images, and earlier reports as well as by carrying out field surveys. In total, 42 landslide incidence polygons were mapped using ArcGIS 10.4.1 software and randomly split into a construction dataset 70 % (30 landslide incidences) for building the EWM, EBF, and ICM models and the remaining 30 % (12 landslides incidences) were used for verification purposes. Twelve layers of landslide-predisposing parameters were prepared, including total surface radiation, maximum relief, soil groups, standard curvature, distance to stream/river sites, distance to the road network, surface roughness, land use pattern, engineering geological rock group, topographical elevation, the orientation of slope, and terrain slope gradient. The relationships between the landslide-predisposing parameters and the landslide inventory map were determined using different statistical models (EWM, EBF, and ICM). The model results were validated with landslide incidences, which were not used during the model construction. In addition, receiver operating characteristic curves were applied, and the area under the curve (AUC) was determined for the different susceptibility maps using the success (construction data) and prediction (verification data) rate curves. The results revealed that the AUC for success rates are 0.7055, 0.7221, and 0.7368, while the prediction rates are 0.6811, 0.6997, and 0.7105 for EWM, EBF, and ICM models, respectively. Consequently, landslide susceptibility maps were classified into five susceptibility classes, including very low, low, moderate, high, and very high. Additionally, the portion of construction and verification landslides incidences in high and very high landslide susceptibility classes in each map was determined. The results showed that the EWM, EBF, and ICM models produced satisfactory accuracy. The obtained landslide susceptibility maps may be useful for future natural hazard mitigation studies and planning purposes for environmental protection.

Keywords: entropy weight method, evidence belief function, information content model, landslide susceptibility mapping

Procedia PDF Downloads 101
9893 Summarizing Data Sets for Data Mining by Using Statistical Methods in Coastal Engineering

Authors: Yunus Doğan, Ahmet Durap

Abstract:

Coastal regions are the one of the most commonly used places by the natural balance and the growing population. In coastal engineering, the most valuable data is wave behaviors. The amount of this data becomes very big because of observations that take place for periods of hours, days and months. In this study, some statistical methods such as the wave spectrum analysis methods and the standard statistical methods have been used. The goal of this study is the discovery profiles of the different coast areas by using these statistical methods, and thus, obtaining an instance based data set from the big data to analysis by using data mining algorithms. In the experimental studies, the six sample data sets about the wave behaviors obtained by 20 minutes of observations from Mersin Bay in Turkey and converted to an instance based form, while different clustering techniques in data mining algorithms were used to discover similar coastal places. Moreover, this study discusses that this summarization approach can be used in other branches collecting big data such as medicine.

Keywords: clustering algorithms, coastal engineering, data mining, data summarization, statistical methods

Procedia PDF Downloads 334
9892 Quantum Statistical Machine Learning and Quantum Time Series

Authors: Omar Alzeley, Sergey Utev

Abstract:

Minimizing a constrained multivariate function is the fundamental of Machine learning, and these algorithms are at the core of data mining and data visualization techniques. The decision function that maps input points to output points is based on the result of optimization. This optimization is the central of learning theory. One approach to complex systems where the dynamics of the system is inferred by a statistical analysis of the fluctuations in time of some associated observable is time series analysis. The purpose of this paper is a mathematical transition from the autoregressive model of classical time series to the matrix formalization of quantum theory. Firstly, we have proposed a quantum time series model (QTS). Although Hamiltonian technique becomes an established tool to detect a deterministic chaos, other approaches emerge. The quantum probabilistic technique is used to motivate the construction of our QTS model. The QTS model resembles the quantum dynamic model which was applied to financial data. Secondly, various statistical methods, including machine learning algorithms such as the Kalman filter algorithm, are applied to estimate and analyses the unknown parameters of the model. Finally, simulation techniques such as Markov chain Monte Carlo have been used to support our investigations. The proposed model has been examined by using real and simulated data. We establish the relation between quantum statistical machine and quantum time series via random matrix theory. It is interesting to note that the primary focus of the application of QTS in the field of quantum chaos was to find a model that explain chaotic behaviour. Maybe this model will reveal another insight into quantum chaos.

Keywords: machine learning, simulation techniques, quantum probability, tensor product, time series

Procedia PDF Downloads 428
9891 Statistical Tools for SFRA Diagnosis in Power Transformers

Authors: Rahul Srivastava, Priti Pundir, Y. R. Sood, Rajnish Shrivastava

Abstract:

For the interpretation of the signatures of sweep frequency response analysis(SFRA) of transformer different types of statistical techniques serves as an effective tool for doing either phase to phase comparison or sister unit comparison. In this paper with the discussion on SFRA several statistics techniques like cross correlation coefficient (CCF), root square error (RSQ), comparative standard deviation (CSD), Absolute difference, mean square error(MSE),Min-Max ratio(MM) are presented through several case studies. These methods require sample data size and spot frequencies of SFRA signatures that are being compared. The techniques used are based on power signal processing tools that can simplify result and limits can be created for the severity of the fault occurring in the transformer due to several short circuit forces or due to ageing. The advantages of using statistics techniques for analyzing of SFRA result are being indicated through several case studies and hence the results are obtained which determines the state of the transformer.

Keywords: absolute difference (DABS), cross correlation coefficient (CCF), mean square error (MSE), min-max ratio (MM-ratio), root square error (RSQ), standard deviation (CSD), sweep frequency response analysis (SFRA)

Procedia PDF Downloads 664
9890 Use of Multivariate Statistical Techniques for Water Quality Monitoring Network Assessment, Case of Study: Jequetepeque River Basin

Authors: Jose Flores, Nadia Gamboa

Abstract:

A proper water quality management requires the establishment of a monitoring network. Therefore, evaluation of the efficiency of water quality monitoring networks is needed to ensure high-quality data collection of critical quality chemical parameters. Unfortunately, in some Latin American countries water quality monitoring programs are not sustainable in terms of recording historical data or environmentally representative sites wasting time, money and valuable information. In this study, multivariate statistical techniques, such as principal components analysis (PCA) and hierarchical cluster analysis (HCA), are applied for identifying the most significant monitoring sites as well as critical water quality parameters in the monitoring network of the Jequetepeque River basin, in northern Peru. The Jequetepeque River basin, like others in Peru, shows socio-environmental conflicts due to economical activities developed in this area. Water pollution by trace elements in the upper part of the basin is mainly related with mining activity, and agricultural land lost due to salinization is caused by the extensive use of groundwater in the lower part of the basin. Since the 1980s, the water quality in the basin has been non-continuously assessed by public and private organizations, and recently the National Water Authority had established permanent water quality networks in 45 basins in Peru. Despite many countries use multivariate statistical techniques for assessing water quality monitoring networks, those instruments have never been applied for that purpose in Peru. For this reason, the main contribution of this study is to demonstrate that application of the multivariate statistical techniques could serve as an instrument that allows the optimization of monitoring networks using least number of monitoring sites as well as the most significant water quality parameters, which would reduce costs concerns and improve the water quality management in Peru. Main socio-economical activities developed and the principal stakeholders related to the water management in the basin are also identified. Finally, water quality management programs will also be discussed in terms of their efficiency and sustainability.

Keywords: PCA, HCA, Jequetepeque, multivariate statistical

Procedia PDF Downloads 325
9889 Remittances and Water Access: A Cross-Sectional Study of Sub Saharan Africa Countries

Authors: Narges Ebadi, Davod Ahmadi, Hiliary Monteith, Hugo Melgar-Quinonez

Abstract:

Migration cannot necessarily relieve pressure on water resources in origin communities, and male out-migration can increase the water management burden of women. However, inflows of financial remittances seem to offer possibilities of investing in improving drinking-water access. Therefore, remittances may be an important pathway for migrants to support water security. This paper explores the association between water access and the receipt of remittances in households in sub-Saharan Africa. Data from round 6 of the 'Afrobarometer' surveys in 2016 were used (n= 49,137). Descriptive, bivariate and multivariate statistical analyses were carried out in this study. Regardless of country, findings from descriptive analyses showed that approximately 80% of the respondents never received remittance, and 52% had enough clean water. Only one-fifth of the respondents had piped water supply inside the house (19.9%), and approximately 25% had access to a toilet inside the house. Bivariate analyses revealed that even though receiving remittances was significantly associated with water supply, the strength of association was very weak. However, other factors such as the area of residence (rural vs. urban), cash income frequencies, electricity access, and asset ownership were strongly associated with water access. Results from unadjusted multinomial logistic regression revealed that the probability of having no access to piped water increased among remittance recipients who received financial support at least once a month (OR=1.324) (p < 0.001). In contrast, those not receiving remittances were more likely to regularly have a water access concern (OR=1.294) (p < 0.001), and not have access to a latrine (OR=1.665) (p < 0.001). In conclusion, receiving remittances is significantly related to water access as the strength of odds ratios for socio-demographic factors was stronger.

Keywords: remittances, water access, SSA, migration

Procedia PDF Downloads 136
9888 On Modeling Data Sets by Means of a Modified Saddlepoint Approximation

Authors: Serge B. Provost, Yishan Zhang

Abstract:

A moment-based adjustment to the saddlepoint approximation is introduced in the context of density estimation. First applied to univariate distributions, this methodology is extended to the bivariate case. It then entails estimating the density function associated with each marginal distribution by means of the saddlepoint approximation and applying a bivariate adjustment to the product of the resulting density estimates. The connection to the distribution of empirical copulas will be pointed out. As well, a novel approach is proposed for estimating the support of distribution. As these results solely rely on sample moments and empirical cumulant-generating functions, they are particularly well suited for modeling massive data sets. Several illustrative applications will be presented.

Keywords: empirical cumulant-generating function, endpoints identification, saddlepoint approximation, sample moments, density estimation

Procedia PDF Downloads 124
9887 Examining Statistical Monitoring Approach against Traditional Monitoring Techniques in Detecting Data Anomalies during Conduct of Clinical Trials

Authors: Sheikh Omar Sillah

Abstract:

Introduction: Monitoring is an important means of ensuring the smooth implementation and quality of clinical trials. For many years, traditional site monitoring approaches have been critical in detecting data errors but not optimal in identifying fabricated and implanted data as well as non-random data distributions that may significantly invalidate study results. The objective of this paper was to provide recommendations based on best statistical monitoring practices for detecting data-integrity issues suggestive of fabrication and implantation early in the study conduct to allow implementation of meaningful corrective and preventive actions. Methodology: Electronic bibliographic databases (Medline, Embase, PubMed, Scopus, and Web of Science) were used for the literature search, and both qualitative and quantitative studies were sought. Search results were uploaded into Eppi-Reviewer Software, and only publications written in the English language from 2012 were included in the review. Gray literature not considered to present reproducible methods was excluded. Results: A total of 18 peer-reviewed publications were included in the review. The publications demonstrated that traditional site monitoring techniques are not efficient in detecting data anomalies. By specifying project-specific parameters such as laboratory reference range values, visit schedules, etc., with appropriate interactive data monitoring, statistical monitoring can offer early signals of data anomalies to study teams. The review further revealed that statistical monitoring is useful to identify unusual data patterns that might be revealing issues that could impact data integrity or may potentially impact study participants' safety. However, subjective measures may not be good candidates for statistical monitoring. Conclusion: The statistical monitoring approach requires a combination of education, training, and experience sufficient to implement its principles in detecting data anomalies for the statistical aspects of a clinical trial.

Keywords: statistical monitoring, data anomalies, clinical trials, traditional monitoring

Procedia PDF Downloads 40
9886 Investigating the Effects of Data Transformations on a Bi-Dimensional Chi-Square Test

Authors: Alexandru George Vaduva, Adriana Vlad, Bogdan Badea

Abstract:

In this research, we conduct a Monte Carlo analysis on a two-dimensional χ2 test, which is used to determine the minimum distance required for independent sampling in the context of chaotic signals. We investigate the impact of transforming initial data sets from any probability distribution to new signals with a uniform distribution using the Spearman rank correlation on the χ2 test. This transformation removes the randomness of the data pairs, and as a result, the observed distribution of χ2 test values differs from the expected distribution. We propose a solution to this problem and evaluate it using another chaotic signal.

Keywords: chaotic signals, logistic map, Pearson’s test, Chi Square test, bivariate distribution, statistical independence

Procedia PDF Downloads 50
9885 Influence of Atmospheric Pollutants on Child Respiratory Disease in Cartagena De Indias, Colombia

Authors: Jose A. Alvarez Aldegunde, Adrian Fernandez Sanchez, Matthew D. Menden, Bernardo Vila Rodriguez

Abstract:

Up to five statistical pre-processings have been carried out considering the pollutant records of the stations present in Cartagena de Indias, Colombia, also taking into account the childhood asthma incidence surveys conducted in hospitals in the city by the Health Ministry of Colombia for this study. These pre-processings have consisted of different techniques such as the determination of the quality of data collection, determination of the quality of the registration network, identification and debugging of errors in data collection, completion of missing data and purified data, as well as the improvement of the time scale of records. The characterization of the quality of the data has been conducted by means of density analysis of the pollutant registration stations using ArcGis Software and through mass balance techniques, making it possible to determine inconsistencies in the records relating the registration data between stations following the linear regression. The results obtained in this process have highlighted the positive quality in the pollutant registration process. Consequently, debugging of errors has allowed us to identify certain data as statistically non-significant in the incidence and series of contamination. This data, together with certain missing records in the series recorded by the measuring stations, have been completed by statistical imputation equations. Following the application of these prior processes, the basic series of incidence data for respiratory disease and pollutant records have allowed the characterization of the influence of pollutants on respiratory diseases such as, for example, childhood asthma. This characterization has been carried out using statistical correlation methods, including visual correlation, simple linear regression correlation and spectral analysis with PAST Software which identifies maximum periodicity cycles and minimums under the formula of the Lomb periodgram. In relation to part of the results obtained, up to eleven maximums and minimums considered contemporary between the incidence records and the particles have been identified taking into account the visual comparison. The spectral analyses that have been performed on the incidence and the PM2.5 have returned a series of similar maximum periods in both registers, which are at a maximum during a period of one year and another every 25 days (0.9 and 0.07 years). The bivariate analysis has managed to characterize the variable "Daily Vehicular Flow" in the ninth position of importance of a total of 55 variables. However, the statistical correlation has not obtained a favorable result, having obtained a low value of the R2 coefficient. The series of analyses conducted has demonstrated the importance of the influence of pollutants such as PM2.5 in the development of childhood asthma in Cartagena. The quantification of the influence of the variables has been able to determine that there is a 56% probability of dependence between PM2.5 and childhood respiratory asthma in Cartagena. Considering this justification, the study could be completed through the application of the BenMap Software, throwing a series of spatial results of interpolated values of the pollutant contamination records that exceeded the established legal limits (represented by homogeneous units up to the neighborhood level) and results of the impact on the exacerbation of pediatric asthma. As a final result, an economic estimate (in Colombian Pesos) of the monthly and individual savings derived from the percentage reduction of the influence of pollutants in relation to visits to the Hospital Emergency Room due to asthma exacerbation in pediatric patients has been granted.

Keywords: Asthma Incidence, BenMap, PM2.5, Statistical Analysis

Procedia PDF Downloads 83