Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 19878

Search results for: statistical model

19818 Statistic Regression and Open Data Approach for Identifying Economic Indicators That Influence e-Commerce

Authors: Apollinaire Barme, Simon Tamayo, Arthur Gaudron

Abstract:

This paper presents a statistical approach to identify explanatory variables linearly related to e-commerce sales. The proposed methodology allows specifying a regression model in order to quantify the relevance between openly available data (economic and demographic) and national e-commerce sales. The proposed methodology consists in collecting data, preselecting input variables, performing regressions for choosing variables and models, testing and validating. The usefulness of the proposed approach is twofold: on the one hand, it allows identifying the variables that influence e- commerce sales with an accessible approach. And on the other hand, it can be used to model future sales from the input variables. Results show that e-commerce is linearly dependent on 11 economic and demographic indicators.

Keywords: e-commerce, statistical modeling, regression, empirical research

Procedia PDF Downloads 227

19817 On Hyperbolic Gompertz Growth Model (HGGM)

Authors: S. O. Oyamakin, A. U. Chukwu,

Abstract:

We proposed a Hyperbolic Gompertz Growth Model (HGGM), which was developed by introducing a stabilizing parameter called θ using hyperbolic sine function into the classical gompertz growth equation. The resulting integral solution obtained deterministically was reprogrammed into a statistical model and used in modeling the height and diameter of Pines (Pinus caribaea). Its ability in model prediction was compared with the classical gompertz growth model, an approach which mimicked the natural variability of height/diameter increment with respect to age and therefore provides a more realistic height/diameter predictions using goodness of fit tests and model selection criteria. The Kolmogorov-Smirnov test and Shapiro-Wilk test was also used to test the compliance of the error term to normality assumptions while using testing the independence of the error term using the runs test. The mean function of top height/Dbh over age using the two models under study predicted closely the observed values of top height/Dbh in the hyperbolic gompertz growth models better than the source model (classical gompertz growth model) while the results of R2, Adj. R2, MSE, and AIC confirmed the predictive power of the Hyperbolic Monomolecular growth models over its source model.

Keywords: height, Dbh, forest, Pinus caribaea, hyperbolic, gompertz

Procedia PDF Downloads 443

19816 Prosody Generation in Neutral Speech Storytelling Application Using Tilt Model

Authors: Manjare Chandraprabha A., S. D. Shirbahadurkar, Manjare Anil S., Paithne Ajay N.

Abstract:

This paper proposes Intonation Modeling for Prosody generation in Neutral speech for Marathi (language spoken in Maharashtra, India) story telling applications. Nowadays audio story telling devices are very eminent for children. In this paper, we proposed tilt model for stressed words in Marathi for speech modification. Tilt model predicts modification in tone of neutral speech. GMM is used to identify stressed words for modification.

Keywords: tilt model, fundamental frequency, statistical parametric speech synthesis, GMM

Procedia PDF Downloads 393

19815 Estimating Knowledge Flow Patterns of Business Method Patents with a Hidden Markov Model

Authors: Yoonjung An, Yongtae Park

Abstract:

Knowledge flows are a critical source of faster technological progress and stouter economic growth. Knowledge flows have been accelerated dramatically with the establishment of a patent system in which each patent is required by law to disclose sufficient technical information for the invention to be recreated. Patent analysis, thus, has been widely used to help investigate technological knowledge flows. However, the existing research is limited in terms of both subject and approach. Particularly, in most of the previous studies, business method (BM) patents were not covered although they are important drivers of knowledge flows as other patents. In addition, these studies usually focus on the static analysis of knowledge flows. Some use approaches that incorporate the time dimension, yet they still fail to trace a true dynamic process of knowledge flows. Therefore, we investigate dynamic patterns of knowledge flows driven by BM patents using a Hidden Markov Model (HMM). An HMM is a popular statistical tool for modeling a wide range of time series data, with no general theoretical limit in regard to statistical pattern classification. Accordingly, it enables characterizing knowledge patterns that may differ by patent, sector, country and so on. We run the model in sets of backward citations and forward citations to compare the patterns of knowledge utilization and knowledge dissemination.

Keywords: business method patents, dynamic pattern, Hidden-Markov Model, knowledge flow

Procedia PDF Downloads 328

19814 Predicting National Football League (NFL) Match with Score-Based System

Authors: Marcho Setiawan Handok, Samuel S. Lemma, Abdoulaye Fofana, Naseef Mansoor

Abstract:

This paper is proposing a method to predict the outcome of the National Football League match with data from 2019 to 2022 and compare it with other popular models. The model uses open-source statistical data of each team, such as passing yards, rushing yards, fumbles lost, and scoring. Each statistical data has offensive and defensive. For instance, a data set of anticipated values for a specific matchup is created by comparing the offensive passing yards obtained by one team to the defensive passing yards given by the opposition. We evaluated the model’s performance by contrasting its result with those of established prediction algorithms. This research is using a neural network to predict the score of a National Football League match and then predict the winner of the game.

Keywords: game prediction, NFL, football, artificial neural network

Procedia PDF Downloads 84

19813 Group Sequential Covariate-Adjusted Response Adaptive Designs for Survival Outcomes

Authors: Yaxian Chen, Yeonhee Park

Abstract:

Driven by evolving FDA recommendations, modern clinical trials demand innovative designs that strike a balance between statistical rigor and ethical considerations. Covariate-adjusted response-adaptive (CARA) designs bridge this gap by utilizing patient attributes and responses to skew treatment allocation in favor of the treatment that is best for an individual patient’s profile. However, existing CARA designs for survival outcomes often hinge on specific parametric models, constraining their applicability in clinical practice. In this article, we address this limitation by introducing a CARA design for survival outcomes (CARAS) based on the Cox model and a variance estimator. This method addresses issues of model misspecification and enhances the flexibility of the design. We also propose a group sequential overlapweighted log-rank test to preserve type I error rate in the context of group sequential trials using extensive simulation studies to demonstrate the clinical benefit, statistical efficiency, and robustness to model misspecification of the proposed method compared to traditional randomized controlled trial designs and response-adaptive randomization designs.

Keywords: cox model, log-rank test, optimal allocation ratio, overlap weight, survival outcome

Procedia PDF Downloads 65

19812 Analysis of the Statistical Characterization of Significant Wave Data Exceedances for Designing Offshore Structures

Authors: Rui Teixeira, Alan O’Connor, Maria Nogal

Abstract:

The statistical theory of extreme events is progressively a topic of growing interest in all the fields of science and engineering. The changes currently experienced by the world, economic and environmental, emphasized the importance of dealing with extreme occurrences with improved accuracy. When it comes to the design of offshore structures, particularly offshore wind turbines, the importance of efficiently characterizing extreme events is of major relevance. Extreme events are commonly characterized by extreme values theory. As an alternative, the accurate modeling of the tails of statistical distributions and the characterization of the low occurrence events can be achieved with the application of the Peak-Over-Threshold (POT) methodology. The POT methodology allows for a more refined fit of the statistical distribution by truncating the data with a minimum value of a predefined threshold u. For mathematically approximating the tail of the empirical statistical distribution the Generalised Pareto is widely used. Although, in the case of the exceedances of significant wave data (H_s) the 2 parameters Weibull and the Exponential distribution, which is a specific case of the Generalised Pareto distribution, are frequently used as an alternative. The Generalized Pareto, despite the existence of practical cases where it is applied, is not completely recognized as the adequate solution to model exceedances over a certain threshold u. References that set the Generalised Pareto distribution as a secondary solution in the case of significant wave data can be identified in the literature. In this framework, the current study intends to tackle the discussion of the application of statistical models to characterize exceedances of wave data. Comparison of the application of the Generalised Pareto, the 2 parameters Weibull and the Exponential distribution are presented for different values of the threshold u. Real wave data obtained in four buoys along the Irish coast was used in the comparative analysis. Results show that the application of the statistical distributions to characterize significant wave data needs to be addressed carefully and in each particular case one of the statistical models mentioned fits better the data than the others. Depending on the value of the threshold u different results are obtained. Other variables of the fit, as the number of points and the estimation of the model parameters, are analyzed and the respective conclusions were drawn. Some guidelines on the application of the POT method are presented. Modeling the tail of the distributions shows to be, for the present case, a highly non-linear task and, due to its growing importance, should be addressed carefully for an efficient estimation of very low occurrence events.

Keywords: extreme events, offshore structures, peak-over-threshold, significant wave data

Procedia PDF Downloads 274

19811 Modelling Sudden Deaths from Myocardial Infarction and Stroke

Authors: Y. S. Yusoff, G. Streftaris, H. R Waters

Abstract:

Death within 30 days is an important factor to be looked into, as there is a significant risk of deaths immediately following or soon after, Myocardial Infarction (MI) or stroke. In this paper, we will model the deaths within 30 days following a Myocardial Infarction (MI) or stroke in the UK. We will see how the probabilities of sudden deaths from MI or stroke have changed over the period 1981-2000. We will model the sudden deaths using a Generalized Linear Model (GLM), fitted using the R statistical package, under a Binomial distribution for the number of sudden deaths. We parameterize our model using the extensive and detailed data from the Framingham Heart Study, adjusted to match UK rates. The results show that there is a reduction for the sudden deaths following a MI over time but no significant improvement for sudden deaths following a stroke.

Keywords: sudden deaths, myocardial infarction, stroke, ischemic heart disease

Procedia PDF Downloads 289

19810 Statistical and Land Planning Study of Tourist Arrivals in Greece during 2005-2016

Authors: Dimitra Alexiou

Abstract:

During the last 10 years, in spite of the economic crisis, the number of tourists arriving in Greece has increased, particularly during the tourist season from April to October. In this paper, the number of annual tourist arrivals is studied to explore their preferences with regard to the month of travel, the selected destinations, as well the amount of money spent. The collected data are processed with statistical methods, yielding numerical and graphical results. From the computation of statistical parameters and the forecasting with exponential smoothing, useful conclusions are arrived at that can be used by the Greek tourism authorities, as well as by tourist organizations, for planning purposes for the coming years. The results of this paper and the computed forecast can also be used for decision making by private tourist enterprises that are investing in Greece. With regard to the statistical methods, the method of Simple Exponential Smoothing of time series of data is employed. The search for a best forecast for 2017 and 2018 provides the value of the smoothing coefficient. For all statistical computations and graphics Microsoft Excel is used.

Keywords: tourism, statistical methods, exponential smoothing, land spatial planning, economy

Procedia PDF Downloads 265

19809 A Novel Method for Face Detection

Authors: H. Abas Nejad, A. R. Teymoori

Abstract:

Facial expression recognition is one of the open problems in computer vision. Robust neutral face recognition in real time is a major challenge for various supervised learning based facial expression recognition methods. This is due to the fact that supervised methods cannot accommodate all appearance variability across the faces with respect to race, pose, lighting, facial biases, etc. in the limited amount of training data. Moreover, processing each and every frame to classify emotions is not required, as the user stays neutral for the majority of the time in usual applications like video chat or photo album/web browsing. Detecting neutral state at an early stage, thereby bypassing those frames from emotion classification would save the computational power. In this work, we propose a light-weight neutral vs. emotion classification engine, which acts as a preprocessor to the traditional supervised emotion classification approaches. It dynamically learns neutral appearance at Key Emotion (KE) points using a textural statistical model, constructed by a set of reference neutral frames for each user. The proposed method is made robust to various types of user head motions by accounting for affine distortions based on a textural statistical model. Robustness to dynamic shift of KE points is achieved by evaluating the similarities on a subset of neighborhood patches around each KE point using the prior information regarding the directionality of specific facial action units acting on the respective KE point. The proposed method, as a result, improves ER accuracy and simultaneously reduces the computational complexity of ER system, as validated on multiple databases.

Keywords: neutral vs. emotion classification, Constrained Local Model, procrustes analysis, Local Binary Pattern Histogram, statistical model

Procedia PDF Downloads 340

19808 Immobilization of Lipase Enzyme by Low Cost Material: A Statistical Approach

Authors: Md. Z. Alam, Devi R. Asih, Md. N. Salleh

Abstract:

Immobilization of lipase enzyme produced from palm oil mill effluent (POME) by the activated carbon (AC) among the low cost support materials was optimized. The results indicated that immobilization of 94% was achieved by AC as the most suitable support material. A sequential optimization strategy based on a statistical experimental design, including one-factor-at-a-time (OFAT) method was used to determine the equilibrium time. Three components influencing lipase immobilization were optimized by the response surface methodology (RSM) based on the face-centered central composite design (FCCCD). On the statistical analysis of the results, the optimum enzyme concentration loading, agitation rate and carbon active dosage were found to be 30 U/ml, 300 rpm and 8 g/L respectively, with a maximum immobilization activity of 3732.9 U/g-AC after 2 hrs of immobilization. Analysis of variance (ANOVA) showed a high regression coefficient (R2) of 0.999, which indicated a satisfactory fit of the model with the experimental data. The parameters were statistically significant at p<0.05.

Keywords: activated carbon, POME based lipase, immobilization, adsorption

Procedia PDF Downloads 245

19807 Confidence Intervals for Process Capability Indices for Autocorrelated Data

Authors: Jane A. Luke

Abstract:

Persistent pressure passed on to manufacturers from escalating consumer expectations and the ever growing global competitiveness have produced a rapidly increasing interest in the development of various manufacturing strategy models. Academic and industrial circles are taking keen interest in the field of manufacturing strategy. Many manufacturing strategies are currently centered on the traditional concepts of focused manufacturing capabilities such as quality, cost, dependability and innovation. Process capability indices was conducted assuming that the process under study is in statistical control and independent observations are generated over time. However, in practice, it is very common to come across processes which, due to their inherent natures, generate autocorrelated observations. The degree of autocorrelation affects the behavior of patterns on control charts. Even, small levels of autocorrelation between successive observations can have considerable effects on the statistical properties of conventional control charts. When observations are autocorrelated the classical control charts exhibit nonrandom patterns and lack of control. Many authors have considered the effect of autocorrelation on the performance of statistical process control charts. In this paper, the effect of autocorrelation on confidence intervals for different PCIs was included. Stationary Gaussian processes is explained. Effect of autocorrelation on PCIs is described in detail. Confidence intervals for Cp and Cpk are constructed for PCIs when data are both independent and autocorrelated. Confidence intervals for Cp and Cpk are computed. Approximate lower confidence limits for various Cpk are computed assuming AR(1) model for the data. Simulation studies and industrial examples are considered to demonstrate the results.

Keywords: autocorrelation, AR(1) model, Bissell’s approximation, confidence intervals, statistical process control, specification limits, stationary Gaussian processes

Procedia PDF Downloads 391

19806 A Development of Creative Instruction Model through Digital Media

Authors: Kathaleeya Chanda, Panupong Chanplin, Suppara Charoenpoom

Abstract:

This purposes of the development of creative instruction model through digital media are to: 1) enable learners to learn from instruction media application; 2) help learners implementing instruction media correctly and appropriately; and 3) facilitate learners to apply technology for searching information and practicing skills to implement technology creatively. The sample group consists of 130 cases of secondary students studying in Bo Kluea School, Bo Kluea Nuea Sub-district, Bo Kluea District, Nan Province. The probability sampling was selected through the simple random sampling and the statistics used in this research are percentage, mean, standard deviation and one group pretest – posttest design. The findings are summarized as follows: The congruence index of instruction media for occupation and technology subjects is appropriate. By comparing between learning achievements before implementing the instruction media and learning achievements after implementing the instruction media, it is found that the posttest achievements are higher than the pretest achievements with statistical significance at the level of .05. For the learning achievements from instruction media implementation, pretest mean is 16.24 while posttest mean is 26.28. Besides, pretest and posttest results are compared and differences of mean are tested, the test results show that the posttest achievements are higher than the pretest achievements with statistical significance at the level of .05. This can be interpreted that the learners achieve better learning progress.

Keywords: teaching learning model, digital media, creative instruction model, Bo Kluea school

Procedia PDF Downloads 143

19805 A Cognitive Schema of Architectural Designing Activity

Authors: Abdelmalek Arrouf

Abstract:

This article sets up a cognitive schema of the architectural designing activity. It begins by outlining, theoretically, an a priori model of its general cognitive mechanisms. The obtained theoretical framework represents the designing activity as a complex system composed of three interrelated subsystems of cognitive actions: a subsystem of meaning production, one of morphology production and finally a subsystem of navigation between the two formers. A protocol analysis that uses statistical and informational tools is then used to measure the validity of the built schema. The model thus achieved shows that the designer begins by conceiving abstract meanings, which he then translates into shapes. That’s why we call it a semio-morphic model of the designing activity.

Keywords: designing actions, model of the design process, morphosis, protocol analysis, semiosis

Procedia PDF Downloads 172

19804 Spatially Downscaling Land Surface Temperature with a Non-Linear Model

Authors: Kai Liu

Abstract:

Remote sensing-derived land surface temperature (LST) can provide an indication of the temporal and spatial patterns of surface evapotranspiration (ET). However, the spatial resolution achieved by existing commonly satellite products is ~1 km, which remains too coarse for ET estimations. This paper proposed a model that can disaggregate coarse resolution MODIS LST at 1 km scale to fine spatial resolutions at the scale of 250 m. Our approach attempted to weaken the impacts of soil moisture and growing statues on LST variations. The proposed model spatially disaggregates the coarse thermal data by using a non-linear model involving Bowen ratio, normalized difference vegetation index (NDVI) and photochemical reflectance index (PRI). This LST disaggregation model was tested on two heterogeneous landscapes in central Iowa, USA and Heihe River, China, during the growing seasons. Statistical results demonstrated that our model achieved better than the two classical methods (DisTrad and TsHARP). Furthermore, using the surface energy balance model, it was observed that the estimated ETs using the disaggregated LST from our model were more accurate than those using the disaggregated LST from DisTrad and TsHARP.

Keywords: Bowen ration, downscaling, evapotranspiration, land surface temperature

Procedia PDF Downloads 330

19803 Economic Design of a Quality Control Chart for the Proportion of Defective Items

Authors: Encarnación Álvarez-Verdejo, Raúl Amor-Pulido, Pablo J. Moya-Fernández, Juan F. Muñoz-Rosas, Francisco J. Blanco-Encomienda

Abstract:

Many companies use the statistical tool named as statistical quality control, and which can have a high cost for the companies interested on these statistical tools. The evaluation of the quality of products and services is an important topic, but the reduction of the cost of the implantation of the statistical quality control also has important benefits for the companies. For this reason, it is important to implement a economic design for the various steps included into the statistical quality control. In this paper, we describe some relevant aspects related to the economic design of a quality control chart for the proportion of defective items. They are very important because the suggested issues can reduce the cost of implementing a quality control chart for the proportion of defective items. Note that the main purpose of this chart is to evaluate and control the proportion of defective items of a production process.

Keywords: proportion, type I error, economic plan, distribution function

Procedia PDF Downloads 444

19802 Implementation of Statistical Parameters to Form an Entropic Mathematical Models

Authors: Gurcharan Singh Buttar

Abstract:

It has been discovered that although these two areas, statistics, and information theory, are independent in their nature, they can be combined to create applications in multidisciplinary mathematics. This is due to the fact that where in the field of statistics, statistical parameters (measures) play an essential role in reference to the population (distribution) under investigation. Information measure is crucial in the study of ambiguity, assortment, and unpredictability present in an array of phenomena. The following communication is a link between the two, and it has been demonstrated that the well-known conventional statistical measures can be used as a measure of information.

Keywords: probability distribution, entropy, concavity, symmetry, variance, central tendency

Procedia PDF Downloads 157

19801 Quantum Statistical Mechanical Formulations of Three-Body Problems via Non-Local Potentials

Authors: A. Maghari, V. M. Maleki

Abstract:

In this paper, we present a quantum statistical mechanical formulation from our recently analytical expressions for partial-wave transition matrix of a three-particle system. We report the quantum reactive cross sections for three-body scattering processes 1 + (2,3)-> 1 + (2,3) as well as recombination 1 + (2,3) -> 2 + (3,1) between one atom and a weakly-bound dimer. The analytical expressions of three-particle transition matrices and their corresponding cross-sections were obtained from the three-dimensional Faddeev equations subjected to the rank-two non-local separable potentials of the generalized Yamaguchi form. The equilibrium quantum statistical mechanical properties such partition function and equation of state as well as non-equilibrium quantum statistical properties such as transport cross-sections and their corresponding transport collision integrals were formulated analytically. This leads to obtain the transport properties, such as viscosity and diffusion coefficient of a moderate dense gas.

Keywords: statistical mechanics, nonlocal separable potential, three-body interaction, faddeev equations

Procedia PDF Downloads 401

19800 Statistical and Analytical Comparison of GIS Overlay Modelings: An Appraisal on Groundwater Prospecting in Precambrian Metamorphics

Authors: Tapas Acharya, Monalisa Mitra

Abstract:

Overlay modeling is the most widely used conventional analysis for spatial decision support system. Overlay modeling requires a set of themes with different weightage computed in varied manners, which gives a resultant input for further integrated analysis. In spite of the popularity and most widely used technique; it gives inconsistent and erroneous results for similar inputs while processed in various GIS overlay techniques. This study is an attempt to compare and analyse the differences in the outputs of different overlay methods using GIS platform with same set of themes of the Precambrian metamorphic to obtain groundwater prospecting in Precambrian metamorphic rocks. The objective of the study is to emphasize the most suitable overlay method for groundwater prospecting in older Precambrian metamorphics. Seven input thematic layers like slope, Digital Elevation Model (DEM), soil thickness, lineament intersection density, average groundwater table fluctuation, stream density and lithology have been used in the spatial overlay models of fuzzy overlay, weighted overlay and weighted sum overlay methods to yield the suitable groundwater prospective zones. Spatial concurrence analysis with high yielding wells of the study area and the statistical comparative studies among the outputs of various overlay models using RStudio reveal that the Weighted Overlay model is the most efficient GIS overlay model to delineate the groundwater prospecting zones in the Precambrian metamorphic rocks.

Keywords: fuzzy overlay, GIS overlay model, groundwater prospecting, Precambrian metamorphics, weighted overlay, weighted sum overlay

Procedia PDF Downloads 128

19799 The Development of Statistical Analysis in Agriculture Experimental Design Using R

Authors: Somruay Apichatibutarapong, Chookiat Pudprommart

Abstract:

The purpose of this study was to develop of statistical analysis by using R programming via internet applied for agriculture experimental design. Data were collected from 65 items in completely randomized design, randomized block design, Latin square design, split plot design, factorial design and nested design. The quantitative approach was used to investigate the quality of learning media on statistical analysis by using R programming via Internet by six experts and the opinions of 100 students who interested in experimental design and applied statistics. It was revealed that the experts’ opinions were good in all contents except a usage of web board and the students’ opinions were good in overall and all items.

Keywords: experimental design, r programming, applied statistics, statistical analysis

Procedia PDF Downloads 369

19798 Prediction of Marijuana Use among Iranian Early Youth: an Application of Integrative Model of Behavioral Prediction

Authors: Mehdi Mirzaei Alavijeh, Farzad Jalilian

Abstract:

Background: Marijuana is the most widely used illicit drug worldwide, especially among adolescents and young adults, which can cause numerous complications. The aim of this study was to determine the pattern, motivation use, and factors related to marijuana use among Iranian youths based on the integrative model of behavioral prediction Methods: A cross-sectional study was conducted among 174 youths marijuana user in Kermanshah County and Isfahan County, during summer 2014 which was selected with the convenience sampling for participation in this study. A self-reporting questionnaire was applied for collecting data. Data were analyzed by SPSS version 21 using bivariate correlations and linear regression statistical tests. Results: The mean marijuana use of respondents was 4.60 times at during week [95% CI: 4.06, 5.15]. Linear regression statistical showed, the structures of integrative model of behavioral prediction accounted for 36% of the variation in the outcome measure of the marijuana use at during week (R2 = 36% & P < 0.001); and among them attitude, marijuana refuse, and subjective norms were a stronger predictors. Conclusion: Comprehensive health education and prevention programs need to emphasize on cognitive factors that predict youth’s health-related behaviors. Based on our findings it seems, designing educational and behavioral intervention for reducing positive belief about marijuana, marijuana self-efficacy refuse promotion and reduce subjective norms encourage marijuana use has an effective potential to protect youths marijuana use.

Keywords: marijuana, youth, integrative model of behavioral prediction, Iran

Procedia PDF Downloads 554

19797 A Framework for Auditing Multilevel Models Using Explainability Methods

Authors: Debarati Bhaumik, Diptish Dey

Abstract:

Multilevel models, increasingly deployed in industries such as insurance, food production, and entertainment within functions such as marketing and supply chain management, need to be transparent and ethical. Applications usually result in binary classification within groups or hierarchies based on a set of input features. Using open-source datasets, we demonstrate that popular explainability methods, such as SHAP and LIME, consistently underperform inaccuracy when interpreting these models. They fail to predict the order of feature importance, the magnitudes, and occasionally even the nature of the feature contribution (negative versus positive contribution to the outcome). Besides accuracy, the computational intractability of SHAP for binomial classification is a cause of concern. For transparent and ethical applications of these hierarchical statistical models, sound audit frameworks need to be developed. In this paper, we propose an audit framework for technical assessment of multilevel regression models focusing on three aspects: (i) model assumptions & statistical properties, (ii) model transparency using different explainability methods, and (iii) discrimination assessment. To this end, we undertake a quantitative approach and compare intrinsic model methods with SHAP and LIME. The framework comprises a shortlist of KPIs, such as PoCE (Percentage of Correct Explanations) and MDG (Mean Discriminatory Gap) per feature, for each of these three aspects. A traffic light risk assessment method is furthermore coupled to these KPIs. The audit framework will assist regulatory bodies in performing conformity assessments of AI systems using multilevel binomial classification models at businesses. It will also benefit businesses deploying multilevel models to be future-proof and aligned with the European Commission’s proposed Regulation on Artificial Intelligence.

Keywords: audit, multilevel model, model transparency, model explainability, discrimination, ethics

Procedia PDF Downloads 95

19796 Methodology for Obtaining Static Alignment Model

Authors: Lely A. Luengas, Pedro R. Vizcaya, Giovanni Sánchez

Abstract:

In this paper, a methodology is presented to obtain the Static Alignment Model for any transtibial amputee person. The proposed methodology starts from experimental data collected on the Hospital Militar Central, Bogotá, Colombia. The effects of transtibial prosthesis malalignment on amputees were measured in terms of joint angles, center of pressure (COP) and weight distribution. Some statistical tools are used to obtain the model parameters. Mathematical predictive models of prosthetic alignment were created. The proposed models are validated in amputees and finding promising results for the prosthesis Static Alignment. Static alignment process is unique to each subject; nevertheless the proposed methodology can be used in each transtibial amputee.

Keywords: information theory, prediction model, prosthetic alignment, transtibial prosthesis

Procedia PDF Downloads 257

19795 Economic Valuation of Forest Landscape Function Using a Conditional Logit Model

Authors: A. J. Julius, E. Imoagene, O. A. Ganiyu

Abstract:

The purpose of this study is to estimate the economic value of the services and functions rendered by the forest landscape using a conditional logit model. For this study, attributes and levels of forest landscape were chosen; specifically, attributes include topographical forest type, forest type, forest density, recreational factor (side trip, accessibility of valley), and willingness to participate (WTP). Based on these factors, 48 choices sets with balanced and orthogonal form using statistical analysis system (SAS) 9.1 was adopted. The efficiency of the questionnaire was 6.02 (D-Error. 0.1), and choice set and socio-economic variables were analyzed. To reduce the cognitive load of respondents, the 48 choice sets were divided into 4 types in the questionnaire, so that respondents could respond to 12 choice sets, respectively. The study populations were citizens from seven metropolitan cities including Ibadan, Ilorin, Osogbo, etc. and annual WTP per household was asked by using the interview questionnaire, a total of 267 copies were recovered. As a result, Oshogbo had 0.45, and the statistical similarities could not be found except for urban forests, forest density, recreational factor, and level of WTP. Average annual WTP per household for forest landscape was 104,758 Naira (Nigerian currency) based on the outcome from this model, total economic value of the services and functions enjoyed from Nigerian forest landscape has reached approximately 1.6 trillion Naira.

Keywords: economic valuation, urban cities, services, forest landscape, logit model, nigeria

Procedia PDF Downloads 133

19794 Transportation Accidents Mortality Modeling in Thailand

Authors: W. Sriwattanapongse, S. Prasitwattanaseree, S. Wongtrangan

Abstract:

The transportation accidents mortality is a major problem that leads to loss of human lives, and economic. The objective was to identify patterns of statistical modeling for estimating mortality rates due to transportation accidents in Thailand by using data from 2000 to 2009. The data was taken from the death certificate, vital registration database. The number of deaths and mortality rates were computed classifying by gender, age, year and region. There were 114,790 cases of transportation accidents deaths. The highest average age-specific transport accident mortality rate is 3.11 per 100,000 per year in males, Southern region and the lowest average age-specific transport accident mortality rate is 1.79 per 100,000 per year in females, North-East region. Linear, poisson and negative binomial models were chosen for fitting statistical model. Among the models fitted, the best was chosen based on the analysis of deviance and AIC. The negative binomial model was clearly appropriate fitted.

Keywords: transportation accidents, mortality, modeling, analysis of deviance

Procedia PDF Downloads 245

19793 Metamorphic Computer Virus Classification Using Hidden Markov Model

Authors: Babak Bashari Rad

Abstract:

A metamorphic computer virus uses different code transformation techniques to mutate its body in duplicated instances. Characteristics and function of new instances are mostly similar to their parents, but they cannot be easily detected by the majority of antivirus in market, as they depend on string signature-based detection techniques. The purpose of this research is to propose a Hidden Markov Model for classification of metamorphic viruses in executable files. In the proposed solution, portable executable files are inspected to extract the instructions opcodes needed for the examination of code. A Hidden Markov Model trained on portable executable files is employed to classify the metamorphic viruses of the same family. The proposed model is able to generate and recognize common statistical features of mutated code. The model has been evaluated by examining the model on a test data set. The performance of the model has been practically tested and evaluated based on False Positive Rate, Detection Rate and Overall Accuracy. The result showed an acceptable performance with high average of 99.7% Detection Rate.

Keywords: malware classification, computer virus classification, metamorphic virus, metamorphic malware, Hidden Markov Model

Procedia PDF Downloads 315

19792 Summarizing Data Sets for Data Mining by Using Statistical Methods in Coastal Engineering

Authors: Yunus Doğan, Ahmet Durap

Abstract:

Coastal regions are the one of the most commonly used places by the natural balance and the growing population. In coastal engineering, the most valuable data is wave behaviors. The amount of this data becomes very big because of observations that take place for periods of hours, days and months. In this study, some statistical methods such as the wave spectrum analysis methods and the standard statistical methods have been used. The goal of this study is the discovery profiles of the different coast areas by using these statistical methods, and thus, obtaining an instance based data set from the big data to analysis by using data mining algorithms. In the experimental studies, the six sample data sets about the wave behaviors obtained by 20 minutes of observations from Mersin Bay in Turkey and converted to an instance based form, while different clustering techniques in data mining algorithms were used to discover similar coastal places. Moreover, this study discusses that this summarization approach can be used in other branches collecting big data such as medicine.

Keywords: clustering algorithms, coastal engineering, data mining, data summarization, statistical methods

Procedia PDF Downloads 361

19791 Content-Based Color Image Retrieval Based on the 2-D Histogram and Statistical Moments

Authors: El Asnaoui Khalid, Aksasse Brahim, Ouanan Mohammed

Abstract:

In this paper, we are interested in the problem of finding similar images in a large database. For this purpose we propose a new algorithm based on a combination of the 2-D histogram intersection in the HSV space and statistical moments. The proposed histogram is based on a 3x3 window and not only on the intensity of the pixel. This approach can overcome the drawback of the conventional 1-D histogram which is ignoring the spatial distribution of pixels in the image, while the statistical moments are used to escape the effects of the discretisation of the color space which is intrinsic to the use of histograms. We compare the performance of our new algorithm to various methods of the state of the art and we show that it has several advantages. It is fast, consumes little memory and requires no learning. To validate our results, we apply this algorithm to search for similar images in different image databases.

Keywords: 2-D histogram, statistical moments, indexing, similarity distance, histograms intersection

Procedia PDF Downloads 457

19790 Detection of Change Points in Earthquakes Data: A Bayesian Approach

Authors: F. A. Al-Awadhi, D. Al-Hulail

Abstract:

In this study, we applied the Bayesian hierarchical model to detect single and multiple change points for daily earthquake body wave magnitude. The change point analysis is used in both backward (off-line) and forward (on-line) statistical research. In this study, it is used with the backward approach. Different types of change parameters are considered (mean, variance or both). The posterior model and the conditional distributions for single and multiple change points are derived and implemented using BUGS software. The model is applicable for any set of data. The sensitivity of the model is tested using different prior and likelihood functions. Using Mb data, we concluded that during January 2002 and December 2003, three changes occurred in the mean magnitude of Mb in Kuwait and its vicinity.

Keywords: multiple change points, Markov Chain Monte Carlo, earthquake magnitude, hierarchical Bayesian mode

Procedia PDF Downloads 458

19789 Evaluation of the Weight-Based and Fat-Based Indices in Relation to Basal Metabolic Rate-to-Weight Ratio

Authors: Orkide Donma, Mustafa M. Donma

Abstract:

Basal metabolic rate is questioned as a risk factor for weight gain. The relations between basal metabolic rate and body composition have not been cleared yet. The impact of fat mass on basal metabolic rate is also uncertain. Within this context, indices based upon total body mass as well as total body fat mass are available. In this study, the aim is to investigate the potential clinical utility of these indices in the adult population. 287 individuals, aged from 18 to 79 years, were included into the scope of the study. Based upon body mass index values, 10 underweight, 88 normal, 88 overweight, 81 obese, and 20 morbid obese individuals participated. Anthropometric measurements including height (m), and weight (kg) were performed. Body mass index, diagnostic obesity notation model assessment index I, diagnostic obesity notation model assessment index II, basal metabolic rate-to-weight ratio were calculated. Total body fat mass (kg), fat percent (%), basal metabolic rate, metabolic age, visceral adiposity, fat mass of upper as well as lower extremities and trunk, obesity degree were measured by TANITA body composition monitor using bioelectrical impedance analysis technology. Statistical evaluations were performed by statistical package (SPSS) for Windows Version 16.0. Scatterplots of individual measurements for the parameters concerning correlations were drawn. Linear regression lines were displayed. The statistical significance degree was accepted as p < 0.05. The strong correlations between body mass index and diagnostic obesity notation model assessment index I as well as diagnostic obesity notation model assessment index II were obtained (p < 0.001). A much stronger correlation was detected between basal metabolic rate and diagnostic obesity notation model assessment index I in comparison with that calculated for basal metabolic rate and body mass index (p < 0.001). Upon consideration of the associations between basal metabolic rate-to-weight ratio and these three indices, the best association was observed between basal metabolic rate-to-weight and diagnostic obesity notation model assessment index II. In a similar manner, this index was highly correlated with fat percent (p < 0.001). Being independent of the indices, a strong correlation was found between fat percent and basal metabolic rate-to-weight ratio (p < 0.001). Visceral adiposity was much strongly correlated with metabolic age when compared to that with chronological age (p < 0.001). In conclusion, all three indices were associated with metabolic age, but not with chronological age. Diagnostic obesity notation model assessment index II values were highly correlated with body mass index values throughout all ranges starting with underweight going towards morbid obesity. This index is the best in terms of its association with basal metabolic rate-to-weight ratio, which can be interpreted as basal metabolic rate unit.

Keywords: basal metabolic rate, body mass index, children, diagnostic obesity notation model assessment index, obesity

Procedia PDF Downloads 150