Search results for: statistical regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 6455

6215 From Theory to Practice: Harnessing Mathematical and Statistical Sciences in Data Analytics

Authors: Zahid Ullah, Atlas Khan

Abstract:

The rapid growth of data in diverse domains has created an urgent need for effective utilization of mathematical and statistical sciences in data analytics. This abstract explores the journey from theory to practice, emphasizing the importance of harnessing mathematical and statistical innovations to unlock the full potential of data analytics. Drawing on a comprehensive review of existing literature and research, this study investigates the fundamental theories and principles underpinning mathematical and statistical sciences in the context of data analytics. It delves into key mathematical concepts such as optimization, probability theory, statistical modeling, and machine learning algorithms, highlighting their significance in analyzing and extracting insights from complex datasets. Moreover, this abstract sheds light on the practical applications of mathematical and statistical sciences in real-world data analytics scenarios. Through case studies and examples, it showcases how mathematical and statistical innovations are being applied to tackle challenges in various fields such as finance, healthcare, marketing, and social sciences. These applications demonstrate the transformative power of mathematical and statistical sciences in data-driven decision-making. The abstract also emphasizes the importance of interdisciplinary collaboration, as it recognizes the synergy between mathematical and statistical sciences and other domains such as computer science, information technology, and domain-specific knowledge. Collaborative efforts enable the development of innovative methodologies and tools that bridge the gap between theory and practice, ultimately enhancing the effectiveness of data analytics. Furthermore, ethical considerations surrounding data analytics, including privacy, bias, and fairness, are addressed within the abstract. 
It underscores the need for responsible and transparent practices in data analytics, and highlights the role of mathematical and statistical sciences in ensuring ethical data handling and analysis. In conclusion, this abstract highlights the journey from theory to practice in harnessing mathematical and statistical sciences in data analytics. It showcases the practical applications of these sciences, the importance of interdisciplinary collaboration, and the need for ethical considerations. By bridging the gap between theory and practice, mathematical and statistical sciences contribute to unlocking the full potential of data analytics, empowering organizations and decision-makers with valuable insights for informed decision-making.

Keywords: data analytics, mathematical sciences, optimization, machine learning, interdisciplinary collaboration, practical applications

Procedia PDF Downloads 65
6214 Robustified Asymmetric Logistic Regression Model for Global Fish Stock Assessment

Authors: Osamu Komori, Shinto Eguchi, Hiroshi Okamura, Momoko Ichinokawa

Abstract:

Long time-series data on population assessments are essential for global ecosystem assessment because the temporal change of biomass in such a database properly reflects the status of the global ecosystem. However, the available assessment data usually have limited sample sizes, and the ratio of populations with low abundance of biomass (collapsed) to those with high abundance (non-collapsed) is highly imbalanced. To allow for the imbalance and uncertainty involved in the ecological data, we propose a binary regression model with mixed effects for inferring ecosystem status through an asymmetric logistic model. In the estimating equation, we observe that the weights for the non-collapsed populations are relatively reduced, which in turn puts more importance on the small number of observations of collapsed populations. Moreover, we extend the asymmetric logistic regression model using the propensity score to allow for the sample biases observed in the labeled and unlabeled datasets. This extension robustified the estimation procedure and improved the model fit.

Keywords: double robust estimation, ecological binary data, mixed effect logistic regression model, propensity score

Procedia PDF Downloads 237
6213 Comparing Groundwater Fluoride Level with WHO Guidelines and Classifying At-Risk Age Groups; Based on Health Risk Assessment

Authors: Samaneh Abolli, Kamyar Yaghmaeian, Ali Arab Aradani, Mahmood Alimohammadi

Abstract:

The main route of fluoride uptake is drinking water. Fluoride intake in the acceptable range (0.5-1.5 mg L⁻¹) is beneficial for the body, but excessive consumption can have irreversible health effects. To compare fluoride concentrations with the WHO guidelines, 112 water samples were taken from groundwater aquifers in 22 villages of Garmsar County, in the central part of Iran, during 2018 and 2019. Fluoride concentration was measured by the SPADNS method, and its non-carcinogenic impacts were calculated using the estimated daily intake (EDI) and the hazard quotient (HQ). The statistical population was divided into four categories: infants, children, teenagers, and adults. Linear regression and the Spearman rank correlation coefficient were used to investigate the relationship between well depth and fluoride concentration in the water samples. The annual mean concentrations of fluoride in 2018 and 2019 were 0.75 and 0.64 mg L⁻¹, and the mean fluoride concentrations in the samples from the cold and hot seasons of the studied years were 0.709 and 0.689 mg L⁻¹, respectively. The amount of fluoride in 27% of the samples in both years was less than the acceptable minimum (0.5 mg L⁻¹). Also, 11% of the samples in 2018 (6 samples) had fluoride levels higher than 1.5 mg L⁻¹. The HQ showed that children were the most vulnerable group, followed by teenagers and then adults. Statistical tests showed an inverse and significant correlation (R² = 0.02, p < 0.0001) between well depth and fluoride content. The border between the usefulness and harmfulness of fluoride is very narrow and requires extensive study.

Keywords: fluoride, groundwater, health risk assessment, hazard quotient, Garmsar

Procedia PDF Downloads 44
6212 Regional Hydrological Extremes Frequency Analysis Based on Statistical and Hydrological Models

Authors: Hadush Kidane Meresa

Abstract:

Hydrological extremes frequency analysis is the foundation for hydraulic engineering design, flood protection, drought management, and water resources management and planning, so that the available water resource can be used to meet the objectives of different organizations and sectors in a country. The spatial variation of the statistical characteristics of extreme flood and drought events is key to regional flood and drought analysis and mitigation management. In regions with differing hydro-climates, where data sets are short, scarce, of poor quality, or insufficient, regionalization methods are applied to transfer at-site data to a region. This study aims at regional high- and low-flow frequency analysis for river basins in Poland. The frequent occurrence of hydrological extremes in the region and rapid water resources development in these basins have caused serious concern over the magnitude and frequency of floods and droughts of rivers in Poland. The magnitude and frequency of high and low flows in the basins are needed for flood and drought planning, management, and protection at present and in the future. Hydrologically homogeneous high- and low-flow regions are formed by cluster analysis of site characteristics, using hierarchical and C-means clustering and the PCA method. Regional homogeneity is tested statistically with discordancy and heterogeneity measures. In accordance with the results of these tests, the river basin region has been divided into ten homogeneous regions. In this study, frequency analysis of high and low flows, using annual maximum (AM) series for high flow and 7-day minimum series for low flow, is conducted with six statistical distributions. 
The L-moment and LL-moment methods indicated a homogeneous region over the entire province, with the Generalized logistic (GLOG), Generalized extreme value (GEV), Pearson type III (P-III), Generalized Pareto (GPAR), Weibull (WEI), and Power (PR) distributions as the regional drought and flood frequency distributions. The 95th percentile and flow duration curves of 1, 7, 10, and 30 days have been plotted for 10 stations. However, the cluster analysis produced two regions, in the west and east of the province, where the L-moment and LL-moment methods demonstrated the homogeneity of the regions, with the GLOG and P-III distributions as the regional frequency distributions for each region, respectively. The spatial variation and regional frequency distribution of flood and drought characteristics were examined for the 10 best catchments selected from the whole region. Besides the main variable (streamflow: high and low), we used variables more closely related to physiographic and drainage characteristics to identify and delineate homogeneous pools and to derive the best regression models for ungauged sites. These are mean annual rainfall, seasonal flow, average slope, NDVI, aspect, flow length, flow direction, maximum soil moisture, elevation, and drainage order. The regional relationship between a streamflow characteristic (AM or 7-day mean annual low flow) and selected basin characteristics is developed using a Generalized Linear Mixed Model (GLMM) and a Generalized Least Squares (GLS) regression model, providing a simple and effective method for estimating floods and droughts of desired return periods for ungauged catchments.
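As an illustration of the L-moment approach named in this abstract, the sketch below computes the first sample L-moments from unbiased probability-weighted moments. It is a generic textbook computation, not the authors' procedure, and the flow values are hypothetical.

```python
def sample_l_moments(values):
    """Return (l1, l2, t3): L-location, L-scale and L-skewness of a sample.

    Uses the unbiased probability-weighted moments b0, b1, b2 of the
    ordered sample, with l1 = b0, l2 = 2*b1 - b0, l3 = 6*b2 - 6*b1 + b0.
    """
    x = sorted(values)
    n = len(x)
    if n < 3:
        raise ValueError("at least three observations are required")
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    if l2 == 0:
        raise ValueError("L-scale is zero; all observations are identical")
    return l1, l2, l3 / l2


if __name__ == "__main__":
    # Hypothetical annual maximum flows (m^3/s) at a single gauging station.
    annual_maxima = [312.0, 280.5, 455.2, 390.1, 298.7, 610.3, 340.8, 275.4]
    l1, l2, t3 = sample_l_moments(annual_maxima)
    print(f"L-location={l1:.1f}, L-scale={l2:.1f}, L-skewness={t3:.3f}")
```

In regional analysis, ratios such as L-skewness are typically compared across sites, which is the kind of input that discordancy and heterogeneity measures work with.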

Keywords: flood, drought, frequency, magnitude, regionalization, stochastic, ungauged, Poland

Procedia PDF Downloads 566
6211 A Cross-Gender Statistical Analysis of Tuvinian Intonation Features in Comparison With Uzbek and Azerbaijani

Authors: Daria Beziakina, Elena Bulgakova

Abstract:

The paper deals with cross-gender and cross-linguistic comparison of pitch characteristics for Tuvinian with two other Turkic languages - Uzbek and Azerbaijani, based on the results of statistical analysis of pitch parameter values and intonation patterns used by male and female speakers. The main goal of our work is to obtain the ranges of pitch parameter values typical for Tuvinian speakers for the purpose of automatic language identification. We also propose a cross-gender analysis of declarative intonation in the poorly studied Tuvinian language. The ranges of pitch parameter values were obtained by means of specially developed software that deals with the distribution of pitch values and allows us to obtain statistical language-specific pitch intervals.

Keywords: speech analysis, statistical analysis, speaker recognition, identification of person

Procedia PDF Downloads 320
6210 Urban-Rural Inequality in Mexico after Nafta: A Quantile Regression Analysis

Authors: Rene Valdiviezo-Issa

Abstract:

In this paper, we use Mexico’s Households Income and Expenditures (ENIGH) survey to explain how the urban-rural expenditure gap has behaved since Mexico joined the North American Free Trade Agreement (NAFTA) in 1994, and we compare it with the latest available survey, which took place in 2014. We use real trimestral expenditure per capita (RTEPC) as the measure of welfare. We use quantile regressions and a quantile regression decomposition to describe the gap between the urban and rural distributions of log RTEPC. We find that the decrease in the difference between the urban and rural distributions of log RTEPC, or inequality, is driven by a deterioration of the urban areas in very specific characteristics, rather than by an improvement of the rural areas. Using the decomposition, we observe that the gap is primarily brought about by differences in returns to covariates between the urban and rural areas.

Keywords: quantile regression, urban-rural inequality, inequality in Mexico, income decomposition

Procedia PDF Downloads 256
6209 A Comparative Study of Cognitive Factors Affecting Social Distancing among Vaccinated and Unvaccinated Filipinos

Authors: Emmanuel Carlo Belara, Albert John Dela Merced, Mark Anthony Dominguez, Diomari Erasga, Jerome Ferrer, Bernard Ombrog

Abstract:

Social distancing errors are common among both vaccinated and unvaccinated members of the Filipino community. This study aims to identify the cognitive factors behind these errors and relate them to how they affect our daily lives. Observed factors include memory, attention, anxiety, decision-making, and stress. After applying ergonomic tools and statistical treatments such as the t-test and multiple linear regression, stress and attention turned out to have the greatest impact on social distancing errors.

Keywords: vaccinated, unvaccinated, social distancing, Filipinos

Procedia PDF Downloads 165
6208 Statistical Analysis of Parameters Effects on Maximum Strain and Torsion Angle of FRP Honeycomb Sandwich Panels Subjected to Torsion

Authors: Mehdi Modabberifar, Milad Roodi, Ehsan Souri

Abstract:

In recent years, honeycomb fiber reinforced plastic (FRP) sandwich panels have been increasingly used in various industries. Low weight, low price, and high mechanical strength are the benefits of these structures. However, their mechanical properties and behavior have not been fully explored. The objective of this study is to conduct a combined numerical-statistical investigation of honeycomb FRP sandwich beams subjected to torsion load. In this paper, the effect of the geometric parameters of the sandwich panel on the maximum shear strain in both face and core and on the angle of torsion of a honeycomb FRP sandwich structure in torsion is investigated. The effects of parameters including core thickness, face skin thickness, cell shape, cell size, and cell thickness on the mechanical behavior of the structure were numerically investigated. The main effects of the factors were considered, and regression equations were derived. The Taguchi method was employed for the experimental design, and an optimum parameter combination for maximum structural stiffness was obtained. The results showed that cell size and face skin thickness have the most significant impacts on the torsion angle and on the maximum shear strain in the face and core.

Keywords: finite element, honeycomb FRP sandwich panel, torsion, civil engineering

Procedia PDF Downloads 388
6207 Comparative Study to Evaluate Chronological Age and Dental Age in North Indian Population Using Cameriere Method

Authors: Ranjitkumar Patil

Abstract:

Age estimation has its importance in forensic dentistry. Dental age estimation has emerged as an alternative to skeletal age determination. The methods based on stages of tooth formation, as appreciated on radiographs, seem to be more appropriate in the assessment of age than those based on skeletal development. The study was done to evaluate dental age in the north Indian population using Cameriere’s method. Aims/Objectives: The study was conducted to assess the dental age of north Indian children using Cameriere’s method and to compare chronological age and dental age in order to validate Cameriere’s method in the north Indian population. A comparative study of two years' duration was done on the OPG (using PLANMECA Promax 3D) data of 497 individuals aged 5 to 15 years, selected by simple random sampling; ethical approval was obtained from the institutional ethical committee. The data, obtained based on inclusion and exclusion criteria, were analyzed by software for dental age estimation. Statistical analysis: Student’s t-test was used to compare the morphological variables of males with those of females and to compare observed age with estimated age. A regression formula was also calculated. Results: The present study was a comparative study of 497 subjects, distributed between males and females, whose dental age was assessed on panoramic radiographs following the widely accepted method described by Cameriere. Statistical analysis in our study indicated that gender does not have a significant influence on age estimation (R² = 0.787). Conclusion: This suggests that Cameriere’s method can be effectively applied in the north Indian population.

Keywords: Forensic, Chronological Age, Dental Age, Skeletal Age

Procedia PDF Downloads 64
6206 Factor Affecting Decision Making for Tourism in Thailand by ASEAN Tourists

Authors: Sakul Jariyachansit

Abstract:

The purposes of this research were to investigate the factors affecting the decision of ASEAN tourists to travel in Thailand and to compare these factors among ASEAN community tourists. The sample comprised 400 ASEAN community tourists travelling in Thailand, surveyed at Suvarnabhumi Airport during November 2016 - February 2016. The sample size was determined using Taro Yamane's formula at a 95% confidence level with a tolerance of 0.05. An English questionnaire, the research instrument, was distributed by convenience sampling to gather data. Descriptive statistics (percentages, means, and standard deviations) were used to analyze the data, and multiple regression analysis was employed to test the relationship hypotheses at the 0.01 significance level. The results showed that, for the majority of respondents, the factors affecting the decision to travel in Thailand had a moderate effect overall, and the mean of each factor was moderate. Transportation was the most influential factor for tourism in Thailand. Therefore, the mode of transport, information, infrastructure, and personnel are very important factors affecting the decision making of ASEAN tourists for tourism in Thailand. From the hypothesis testing, the decision to choose tourism in Thailand can be predicted with R2 = 0.449. The predictive equation is: decision for choosing tourism in Thailand = 1.195 (constant value) + 0.425 (tourist attraction) + 0.217 (information received), with the transportation, tourist attraction, information, human resource, and infrastructure factors significant at the 0.01 level.
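The sample-size step in this abstract refers to Taro Yamane's formula, n = N / (1 + N·e²). A minimal sketch follows; the population figures passed in are hypothetical, since the abstract does not report the population size used.

```python
import math


def yamane_sample_size(population, margin_of_error=0.05):
    """Taro Yamane's formula, n = N / (1 + N * e^2), rounded up."""
    if population <= 0:
        raise ValueError("population must be positive")
    return math.ceil(population / (1 + population * margin_of_error ** 2))


if __name__ == "__main__":
    # Hypothetical population sizes; e = 0.05 as stated in the abstract.
    for population in (1_000, 100_000, 1_000_000):
        print(population, yamane_sample_size(population))
```

As N grows, the formula approaches 1/e², which is 400 for e = 0.05. This is consistent with the 400 respondents reported in the abstract.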

Keywords: factor, decision making, ASEAN tourists, tourism in Thailand

Procedia PDF Downloads 179
6205 Development of Sleep Quality Index Using Heart Rate

Authors: Dongjoo Kim, Chang-Sik Son, Won-Seok Kang

Abstract:

Adequate sleep affects various parts of one’s overall physical and mental life. As one of the methods in determining the appropriate amount of sleep, this research presents a heart rate based sleep quality index. In order to evaluate sleep quality using the heart rate, sleep data from 280 subjects taken over one month are used. Their sleep data are categorized by a three-part heart rate range. After categorizing, some features are extracted, and the statistical significances are verified for these features. The results show that some features of this sleep quality index model have statistical significance. Thus, this heart rate based sleep quality index may be a useful discriminator of sleep.

Keywords: sleep, sleep quality, heart rate, statistical analysis

Procedia PDF Downloads 311
6204 Antibacterial Evaluation, in Silico ADME and QSAR Studies of Some Benzimidazole Derivatives

Authors: Strahinja Kovačević, Lidija Jevrić, Miloš Kuzmanović, Sanja Podunavac-Kuzmanović

Abstract:

In this paper, various derivatives of benzimidazole have been evaluated against the Gram-negative bacterium Escherichia coli. For all investigated compounds, the minimum inhibitory concentration (MIC) was determined. Quantitative structure-activity relationship (QSAR) analysis attempts to find consistent relationships between the variations in the values of molecular properties and the biological activity for a series of compounds so that these rules can be used to evaluate new chemical entities. The correlation between MIC and some absorption, distribution, metabolism and excretion (ADME) parameters was investigated, and mathematical models for predicting the antibacterial activity of this class of compounds were developed. The quality of the multiple linear regression (MLR) models was validated by the leave-one-out (LOO) technique, as well as by the calculation of statistical parameters for the developed models, and the results are discussed on the basis of the statistical data. The results of this study indicate that ADME parameters have a significant effect on the antibacterial activity of this class of compounds. Principal component analysis (PCA) and agglomerative hierarchical clustering algorithms (HCA) confirmed that the investigated molecules can be classified into groups on the basis of the ADME parameters: Madin-Darby Canine Kidney cell permeability (MDCK), plasma protein binding (PPB%), human intestinal absorption (HIA%) and human colon carcinoma cell permeability (Caco-2).
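The leave-one-out (LOO) validation mentioned here can be sketched as follows. For brevity the example fits a regression on a single descriptor rather than the multiple linear regression used in the paper, and the descriptor and activity values are hypothetical. The cross-validated coefficient is computed as Q² = 1 − PRESS / SS_tot.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(xs, ys):
    """Leave-one-out Q^2: refit without each compound, then predict it."""
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


if __name__ == "__main__":
    # Hypothetical data: one ADME descriptor against log(1/MIC).
    descriptor = [12.1, 15.4, 18.9, 22.3, 25.0, 30.2]
    activity = [3.9, 4.2, 4.4, 4.9, 5.0, 5.6]
    print(f"Q2(LOO) = {loo_q2(descriptor, activity):.3f}")
```

A Q² well below the fitted R² is usually taken as a sign of overfitting, which is the reason LOO is reported alongside the ordinary fit statistics.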

Keywords: benzimidazoles, QSAR, ADME, in silico

Procedia PDF Downloads 348
6203 Developing Variable Repetitive Group Sampling Control Chart Using Regression Estimator

Authors: Liaquat Ahmad, Muhammad Aslam, Muhammad Azam

Abstract:

In this article, we propose a control chart for the location parameter based on a repetitive group sampling scheme. This charting scheme is based on the regression estimator, an estimator that capitalizes on the relationship between the variables of interest to provide more sensitive control than the commonly used individual variables. The control limit coefficients have been estimated for different sample sizes for weakly and highly correlated variables. The monitoring of the production process is constructed by adopting the procedure of Shewhart’s x-bar control chart. Its performance is verified by average run length calculations when a shift occurs in the average value of the estimator. It has been observed that the weakly correlated variables give a higher false alarm rate.
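A rough sketch of the two ingredients described in this abstract is given below. It assumes the textbook regression estimator, ȳ + b(μx − x̄), and a generic repetitive group sampling rule with outer and inner limit coefficients. The names `k_outer` and `k_inner` and all numbers are hypothetical, and the authors' actual limits may be constructed differently.

```python
def regression_estimate(y, x, mu_x):
    """Regression estimator of the mean of y using auxiliary variable x.

    mu_x is the known process mean of the auxiliary variable.
    """
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar + slope * (mu_x - x_bar)


def rgs_decision(statistic, center, sigma, k_outer, k_inner):
    """Repetitive group sampling rule with two pairs of limits (k_outer > k_inner)."""
    deviation = abs(statistic - center)
    if deviation > k_outer * sigma:
        return "out-of-control"
    if deviation <= k_inner * sigma:
        return "in-control"
    return "resample"  # between the inner and outer limits: draw a new subgroup


if __name__ == "__main__":
    # Hypothetical subgroup of paired measurements.
    y = [10.2, 9.8, 10.5, 10.1, 9.9]
    x = [5.1, 4.9, 5.3, 5.0, 4.8]
    stat = regression_estimate(y, x, mu_x=5.0)
    print(stat, rgs_decision(stat, center=10.0, sigma=0.1, k_outer=3.0, k_inner=1.0))
```

The "resample" outcome is what distinguishes repetitive group sampling from a plain Shewhart chart: a borderline subgroup triggers another draw instead of an immediate signal.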

Keywords: average run length, control charts, process shift, regression estimators, repetitive group sampling

Procedia PDF Downloads 536
6202 Statistical Characteristics of Code Formula for Design of Concrete Structures

Authors: Inyeol Paik, Ah-Ryang Kim

Abstract:

In this research, a statistical analysis is carried out to examine the statistical properties of the formulas given in design codes for concrete structures. The design formulas of the Korea highway bridge design code - the limit state design method (KHBDC), which is the current national bridge design code, and of the design code for concrete structures by the Korea Concrete Institute (KCI) are applied in the analysis. The safety levels provided by the strength formulas of the design codes are defined based on probabilistic and statistical theory. KHBDC is a reliability-based design code. The load and resistance factors of this code were calibrated to attain the target reliability index. It is essential to define the statistical properties of the design formulas in this calibration process. In general, the statistical characteristics of a member strength are due to the following three factors. The first is the difference between the material strength of the actual construction and that used in the design calculation. The second is the difference between the actual dimensions of the constructed sections and those used in the design calculation. The third is the difference between the strength of the actual member and the formula simplified for the design calculation. In this paper, the statistical study is focused on the third difference. The formulas for calculating the shear strength of concrete members are presented in different ways in KHBDC and KCI. In this study, the statistical properties of the design formulas were obtained through comparison with a database comprising experimental results from the reference publications. The test specimens were either reinforced with shear stirrups or not. For the applied database, the bias factor was about 1.12 and the coefficient of variation was about 0.18. 
By applying the statistical properties of the design formula to the reliability analysis, it is shown that the resistance factors of the current design codes satisfy the target reliability indexes of both codes. Also, the minimum resistance factors of the KHBDC, which is written in the material resistance factor format, and of the KCI code, which is in the member resistance factor format, are obtained and the results are presented. Further research is underway to calibrate the resistance factors of the high-strength and high-performance concrete design guide.
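The bias factor and coefficient of variation quoted in this abstract are commonly obtained from the ratio of tested strength to code-predicted strength. The sketch below assumes that definition; the specimen values are hypothetical and are not taken from the authors' database.

```python
import statistics


def bias_and_cov(tested, predicted):
    """Bias factor (mean test/prediction ratio) and its coefficient of variation."""
    ratios = [t / p for t, p in zip(tested, predicted)]
    bias = statistics.fmean(ratios)
    cov = statistics.stdev(ratios) / bias
    return bias, cov


if __name__ == "__main__":
    # Hypothetical shear strengths in kN: measured in tests vs. code formula.
    tested = [182.0, 240.5, 310.2, 150.8, 275.0]
    predicted = [160.0, 215.0, 290.0, 140.0, 230.0]
    bias, cov = bias_and_cov(tested, predicted)
    print(f"bias factor = {bias:.2f}, COV = {cov:.2f}")
```

A bias factor above 1 means the formula is conservative on average, while the coefficient of variation captures how scattered that conservatism is across specimens.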

Keywords: concrete design code, reliability analysis, resistance factor, shear strength, statistical property

Procedia PDF Downloads 289
6201 Factors Affecting Expectations and Intentions of University Students’ Mobile Phone Use in Educational Contexts

Authors: Davut Disci

Abstract:

Objective: To measure the factors affecting university students' expectations and intentions of using mobile phones in educational contexts, using advanced equations and modeling techniques. Design and Methodology: According to the literature, Mobile Addiction, Parental Surveillance-Safety/Security, Social Relations, and Mobile Behavior are the terms most often used to define people's mobile use. Therefore, these variables are measured to estimate their effects on expectations and intentions of using mobile phones in educational contexts. A total of 421 university students participated in this study: 229 female and 192 male. To examine mobile behavior and educational expectations and intentions, a questionnaire was prepared and administered online to the participants, who had to answer all the questions. Responses to closed-ended questions are analyzed using the Statistical Package for the Social Sciences (SPSS) software, reliabilities are measured by Cronbach’s alpha, hypotheses are examined using multiple regression and linear regression analysis, and the model is tested with the Structural Equation Modeling (SEM) technique, which is important for testing the model scientifically. Besides these responses, open-ended questions are taken into consideration. Results: Analysis of the data gathered from closed-ended questions shows that Mobile Addiction, Parental Surveillance, Social Relations, and Frequency of Using Mobile Phone Applications affect the mobile behavior of the participants at different levels, helping them to use mobile phones in educational contexts. Moreover, in the open-ended questions, participants stated that they use many mobile applications in their learning environment for contacting friends, watching educational videos, and finding course material via the internet. They also agree that mobile phones bring greater flexibility to their lives. 
According to the SEM results, the model is not evaluated, and it may be improved so that it can be shown in SEM as well as in multiple regression. Conclusion: This study shows that the specified model can be used by educationalists and school authorities to improve their learning environments.

Keywords: education, mobile behavior, mobile learning, technology, Turkey

Procedia PDF Downloads 395
6200 Factors Affecting Expectations and Intentions of University Students in Educational Context

Authors: Davut Disci

Abstract:

Objective: To measure the factors affecting university students' expectations and intentions of using mobile phones in educational contexts, using advanced equations and modeling techniques. Design and Methodology: According to the literature, Mobile Addiction, Parental Surveillance-Safety/Security, Social Relations, and Mobile Behavior are the terms most often used to define people's mobile use. Therefore, these variables are measured to estimate their effects on expectations and intentions of using mobile phones in educational contexts. A total of 421 university students participated in this study: 229 female and 192 male. To examine mobile behavior and educational expectations and intentions, a questionnaire was prepared and administered online to the participants, who had to answer all the questions. Responses to closed-ended questions are analyzed using the Statistical Package for the Social Sciences (SPSS) software, reliabilities are measured by Cronbach’s alpha, hypotheses are examined using multiple regression and linear regression analysis, and the model is tested with the Structural Equation Modeling (SEM) technique, which is important for testing the model scientifically. Besides these responses, open-ended questions are taken into consideration. Results: Analysis of the data gathered from closed-ended questions shows that Mobile Addiction, Parental Surveillance, Social Relations, and Frequency of Using Mobile Phone Applications affect the mobile behavior of the participants at different levels, helping them to use mobile phones in educational contexts. Moreover, in the open-ended questions, participants stated that they use many mobile applications in their learning environment for contacting friends, watching educational videos, and finding course material via the internet. They also agree that mobile phones bring greater flexibility to their lives. 
According to the SEM results, the model is not evaluated, and it may be improved so that it can be shown in SEM as well as in multiple regression. Conclusion: This study shows that the specified model can be used by educationalists and school authorities to improve their learning environments.

Keywords: learning technology, instructional technology, mobile learning, technology

Procedia PDF Downloads 432
6199 The Relationship Between Hourly Compensation and Unemployment Rate Using the Panel Data Regression Analysis

Authors: S. K. Ashiquer Rahman

Abstract:

This paper concentrates on the importance of hourly compensation and its significance for the unemployment rate. Two of the most important economic indicators of a nation are its unemployment rate and its hourly compensation. These are not merely statistics; they have profound effects on individuals, families, and the economy. They are inversely related to one another: the unemployment rate will probably decline as hourly compensation in manufacturing rises, and reduced unemployment rates and better job prospects could in turn result from higher compensation. The increased hourly compensation in the manufacturing sector could therefore have a favorable effect on job-changing issues. Moreover, the relationship between hourly compensation and unemployment is complex and influenced by broader economic factors. In this paper, we use panel data regression models to evaluate the expected link between hourly compensation and the unemployment rate in order to determine the effect of hourly compensation on the unemployment rate. We estimate the fixed effects model (FEM), evaluate the error components model (ECM), and determine which model is better by pooling all 60 observations. We then analyze and review the data by comparing three countries (the United States, Canada, and the United Kingdom) using panel data regression models. Finally, we provide results, analysis, and a summary of the research on how hourly compensation affects the unemployment rate. Additionally, this paper offers relevant and useful information to help the government and the academic community use an econometric and social approach to lessen the effect of hourly compensation on the unemployment rate.
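The fixed effects model mentioned in this abstract can be illustrated with the within (demeaning) estimator for a single regressor. This is a minimal sketch under that simplification, not the author's specification, and the country panel below is hypothetical.

```python
from collections import defaultdict


def within_estimator(entities, x, y):
    """Fixed-effects slope for one regressor via entity demeaning.

    Subtracting each entity's own means removes the entity-specific
    intercepts, so the slope is identified from variation within entities.
    """
    groups = defaultdict(list)
    for entity, xi, yi in zip(entities, x, y):
        groups[entity].append((xi, yi))
    sxx = 0.0
    sxy = 0.0
    for observations in groups.values():
        mean_x = sum(o[0] for o in observations) / len(observations)
        mean_y = sum(o[1] for o in observations) / len(observations)
        for xi, yi in observations:
            sxx += (xi - mean_x) ** 2
            sxy += (xi - mean_x) * (yi - mean_y)
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical panel: hourly compensation index (x), unemployment rate in % (y).
    country = ["US", "US", "US", "CA", "CA", "CA", "UK", "UK", "UK"]
    compensation = [100.0, 104.0, 109.0, 95.0, 99.0, 103.0, 90.0, 96.0, 101.0]
    unemployment = [6.1, 5.6, 5.0, 7.4, 7.0, 6.5, 8.0, 7.1, 6.6]
    print(f"within slope = {within_estimator(country, compensation, unemployment):.4f}")
```

The same slope would be obtained by regressing y on x plus one dummy variable per country, which is the dummy-variable formulation listed in the keywords.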

Keywords: hourly compensation, Unemployment rate, panel data regression models, dummy variables, random effects model, fixed effects model, the linear regression model

Procedia PDF Downloads 42
6198 Performance Comparison of Different Regression Methods for a Polymerization Process with Adaptive Sampling

Authors: Florin Leon, Silvia Curteanu

Abstract:

Developing complete mechanistic models for polymerization reactors is not easy, because complex reactions occur simultaneously, a large number of kinetic parameters are involved, and the chemical and physical phenomena for mixtures involving polymers are sometimes poorly understood. To overcome these difficulties, empirical models based on sampled data can be used instead, namely regression methods typical of the machine learning field. They can learn the trends of a process without any knowledge of its particular physical and chemical laws. Therefore, they are useful for modeling complex processes, such as the free radical polymerization of methyl methacrylate carried out in a batch bulk process. The goal is to generate accurate predictions of monomer conversion, numerical average molecular weight and gravimetrical average molecular weight. This process is associated with non-linear gel and glass effects. For this purpose, an adaptive sampling technique is presented, which selects more samples around the regions where the values have a higher variation. Several machine learning methods are used for the modeling and their performance is compared: support vector machines, k-nearest neighbor and random forest, as well as an original algorithm, large margin nearest neighbor regression. The suggested method provides very good results compared to the other well-known regression algorithms.
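The idea of placing more samples where the values vary most can be sketched as follows. This is only an illustrative greedy refinement under stated assumptions, not the authors' adaptive sampling algorithm: the function `adaptive_sample`, the steep logistic test curve, and all parameter values are hypothetical.

```python
import math


def adaptive_sample(f, lo, hi, n_initial=5, n_extra=10):
    """Start from a uniform grid, then repeatedly bisect the interval
    whose endpoint values differ the most."""
    xs = [lo + (hi - lo) * i / (n_initial - 1) for i in range(n_initial)]
    ys = [f(x) for x in xs]
    for _ in range(n_extra):
        # Index of the interval with the largest absolute change in f.
        k = max(range(len(xs) - 1), key=lambda i: abs(ys[i + 1] - ys[i]))
        x_mid = 0.5 * (xs[k] + xs[k + 1])
        xs.insert(k + 1, x_mid)
        ys.insert(k + 1, f(x_mid))
    return xs, ys


def steep_curve(x):
    """Hypothetical response with a sharp transition near x = 0.5."""
    return 1.0 / (1.0 + math.exp(-50.0 * (x - 0.5)))


xs, ys = adaptive_sample(steep_curve, 0.0, 1.0)
inside = sum(1 for x in xs if 0.25 <= x <= 0.75)
print(len(xs), inside)
```

On this test curve almost all of the extra points land in the transition region around 0.5, while the flat tails keep only their initial grid points.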

Keywords: batch bulk methyl methacrylate polymerization, adaptive sampling, machine learning, large margin nearest neighbor regression

Procedia PDF Downloads 270
6197 A Comparative Study to Evaluate Chronological Age and Dental Age in the North Indian Population Using Cameriere's Method

Authors: Ranjitkumar Patil

Abstract:

Age estimation is important in forensic dentistry. Dental age estimation has emerged as an alternative to skeletal age determination. Methods based on stages of tooth formation, as seen on radiographs, appear to be more appropriate for assessing age than those based on skeletal development. The study was done to evaluate dental age in the north Indian population using Cameriere’s method. Aims/Objectives: The study was conducted to assess the dental age of north Indian children using Cameriere’s method and to compare chronological age and dental age in order to validate Cameriere’s method in the north Indian population. A comparative study of two years' duration was carried out on the OPG data (acquired using PLANMECA Promax 3D) of 497 individuals aged 5 to 15 years, selected by simple random sampling; ethical approval was obtained from the institutional ethical committee. The data were obtained based on inclusion and exclusion criteria and were analyzed by software for dental age estimation. Statistical analysis: Student’s t-test was used to compare the morphological variables of males with those of females and to compare observed age with estimated age. The regression formula was also calculated. Results: The study included 497 subjects, distributed between males and females, whose dental age was assessed on panoramic radiographs following the widely accepted method described by Cameriere. Statistical analysis indicated that gender does not have a significant influence on age estimation (R2 = 0.787). Conclusion: Cameriere’s method can be effectively applied to the north Indian population.

Keywords: forensic, dental age, skeletal age, chronological age, Cameriere’s method

Procedia PDF Downloads 93
6196 Teachers’ Intention to Leave: Educational Policies as External Stress Factor

Authors: A. Myrzabekova, D. Nurmukhamed, K. Nurumov, A. Zhulbarissova

Abstract:

It is widely believed that stress can affect teachers’ intention to change their workplace. While existing research primarily focuses on the intrinsic sources of stress stemming from the school climate, the current study analyzes educational policies as one of the determinants of teachers’ intention to leave schools. In this respect, Kazakhstan presents a unique case, since the country endorsed several educational policies which directly impacted teaching and administrative practices within schools. Using Teaching and Learning International Survey (TALIS) 2018 data with the country-specific questionnaire, we construct a statistical measure of stress caused by the implementation of educational policies and test its impact on teachers’ intention to leave through logistic regression. In addition, we control for sociodemographic, professional, and student-related covariates while considering the intrinsic dimension of stress stemming from the school climate. Overall, our results suggest that stress caused by educational policies has a statistically significant positive effect on teachers’ intentions to transfer between schools. Both policy makers and educational scholars could find these results beneficial. For the former, careful planning and addressing the negative effects of educational policies are critical for the sustainability of the educational process. For the latter, accounting for exogenous sources of stress can lead to a more complete understanding of why teachers decide to change their schools.

Keywords: educational policies, Kazakhstani teachers, logistic regression factor analysis, sustainability education TALIS, teacher turnover intention, work stress

Procedia PDF Downloads 79
6195 Effects of Process Parameters on the Yield of Oil from Coconut Fruit

Authors: Ndidi F. Amulu, Godian O. Mbah, Maxwel I. Onyiah, Callistus N. Ude

Abstract:

The properties of coconut (Cocos nucifera) and its oil were evaluated in this work using standard analytical techniques. The analyses carried out include proximate composition of the fruit, extraction of oil from the fruit using different process parameters, and physicochemical analysis of the extracted oil. The results showed the percentage (%) moisture, crude lipid, crude protein, ash, and carbohydrate content of the coconut as 7.59, 55.15, 5.65, 7.35, and 19.51, respectively. The oil from the coconut fruit was an odourless, yellowish liquid at room temperature (30 °C). The treatment combinations used (leaching time, leaching temperature and solute:solvent ratio) showed significant differences (P < 0.05) in the yield of oil from coconut flour. The oil yield ranged from 36.25% to 49.83%. Lipid indices of the coconut oil indicated the acid value (AV) as 10.05 NaOH/g of oil, free fatty acid (FFA) as 5.03%, saponification value (SV) as 183.26 mg KOH/g of oil, iodine value (IV) as 81.00 I2/g of oil, peroxide value (PV) as 5.00 ml/g of oil and viscosity (V) as 0.002. The statistical package Minitab version 16.0 was used for the regression analysis and analysis of variance (ANOVA). The same software was also used to generate plots such as single effect plots, interaction effect plots and contour plots. The response (yield of oil from the coconut flour) was used to develop a mathematical model that correlates the yield to the process variables studied. The optimum conditions, which gave the highest yield of coconut oil, were a leaching time of 2 hrs, a leaching temperature of 50 °C and a solute/solvent ratio of 0.05 g/ml.

Keywords: coconut, oil-extraction, optimization, physicochemical, proximate

Procedia PDF Downloads 322
6194 Predictive Analysis of the Stock Price Market Trends with Deep Learning

Authors: Suraj Mehrotra

Abstract:

The stock market is a volatile, bustling marketplace that is a cornerstone of economics. It determines whether companies are succeeding or in a downward spiral. A thorough understanding of it is important - many companies have whole divisions dedicated to the analysis of both their own stock and that of rival companies. Linking the world of finance, especially the stock market, with artificial intelligence (AI) has been a relatively recent development. Predicting how stocks will perform, considering all external factors and previous data, has always been a human task. With the help of AI, however, machine learning models can help us make more complete predictions of financial trends. For the stock market specifically, predicting the open, closing, high, and low prices for the next day is very hard to do. Machine learning makes this task a lot easier. A model that builds upon itself and takes in external factors as weights can predict trends far into the future. When used effectively, it can open new doors in the business and finance world, and companies can make better and more complete decisions. This paper explores the various techniques used in the prediction of stock prices, from traditional statistical methods to deep learning and neural-network-based approaches, among other methods. It provides a detailed analysis of the techniques and also explores the challenges in predictive analysis. Four different models (linear regression, neural network, decision tree, and naïve Bayes) were evaluated on the stocks of Apple, Google, Tesla, Amazon, United Healthcare, Exxon Mobil, J.P. Morgan & Chase, and Johnson & Johnson. On the testing set, the naïve Bayes model had the highest accuracy along with the linear regression model, followed by the neural network model and then the decision tree model. 
The training set gave similar results, except that the decision tree model achieved perfect accuracy in its predictions. This suggests that the decision tree model overfitted the training set, which is consistent with its weaker performance on the testing set.

Keywords: machine learning, testing set, artificial intelligence, stock analysis

Procedia PDF Downloads 61
6193 Assessment of Personal Level Exposures to Particulate Matter among Children in Rural Preliminary Schools as an Indoor Air Pollution Monitoring

Authors: Seyedtaghi Mirmohammadi, J. Yazdani, S. M. Asadi, M. Rokni, A. Toosi

Abstract:

There are many indoor air quality studies with an emphasis on monitoring indoor particulate matter (PM2.5). However, there is a lack of data on indoor PM2.5 concentrations in rural area schools (especially in classrooms), even though preliminary school children are assumed to be more vulnerable to health hazards and spend a large part of their time in classrooms. The objective of this study was to assess indoor PM2.5 concentrations. Fifteen preliminary schools in the rural district of Sari city, Iran, were selected by time-series sampling to evaluate indoor air quality. Indoor air climate parameters (temperature, relative humidity and wind speed) were measured with a hygrometer and thermometer. Particulate matter (PM2.5) was collected and assessed with a real-time dust monitor (MicroDust Pro, Casella, UK). The mean indoor PM2.5 concentration in the studied classrooms was 135 µg/m3. Multiple linear regression revealed a correlation between PM2.5 concentration and relative humidity, distance from the city center, and classroom size. Classroom size showed a reasonable negative relationship. The PM2.5 concentration ranged from 65 to 540 µg/m3 and was statistically significant at the 0.05 level; the relative humidity (70 to 85%) and dry bulb temperature (28 to 29 °C) were statistically significant at the 0.035 and 0.05 levels, respectively. A statistical predictive model for PM2.5 and indoor psychrometric parameters was obtained from multiple regression modeling.

Keywords: particulate matters, classrooms, regression, concentration, humidity

Procedia PDF Downloads 289
6192 Modeling Karachi Dengue Outbreak and Exploration of Climate Structure

Authors: Syed Afrozuddin Ahmed, Junaid Saghir Siddiqi, Sabah Quaiser

Abstract:

Various studies have reported that global warming causes an unstable climate and has many serious impacts on the physical environment and public health. The increasing incidence of dengue is now a priority health issue and has become a health burden for Pakistan. This study investigates whether the spatial pattern of the environment causes the emergence or increasing rate of dengue fever incidence that affects the population and its health. The climatic (environmental structure) data and the dengue fever (DF) data were processed by coding, editing, tabulating, recoding, and restructuring (re-tabulating), and different statistical methods, techniques, and procedures were then applied for the evaluation. The five climatic variables studied are precipitation (P), maximum temperature (Mx), minimum temperature (Mn), humidity (H), and wind speed (W), collected from 1980 to 2012. Dengue cases in Karachi from 2010 to 2012 are reported on a weekly basis. Principal component analysis is applied to explore the climatic variables and/or the climatic structure that may influence the increase or decrease in the number of dengue fever cases in Karachi. PC1 for the whole period represents the general atmospheric condition. PC2 for the dengue period is a contrast between precipitation and wind speed. PC3 is the weighted difference between maximum temperature and wind speed. PC4 for the dengue period is a contrast between maximum temperature and wind speed. Negative binomial and Poisson regression models are used to relate dengue fever incidence to the climatic variables and principal component scores. Relative humidity is estimated to increase the chances of dengue occurrence by 1.71%. Maximum temperature increases the chances of dengue occurrence by 19.48%, and minimum temperature increases them by 11.51%. Wind speed decreases the weekly occurrence of dengue fever by 7.41%.
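Percentage effects such as those quoted above are usually obtained from the coefficients of a log-link count model (Poisson or negative binomial), where a coefficient beta corresponds to a multiplicative change of exp(beta) in the expected count per unit of the covariate. The sketch below shows that conversion. It assumes the reported percentages are changes in the incidence rate ratio, which the abstract does not state explicitly; the function names are hypothetical and the coefficients are back-calculated for illustration only.

```python
import math


def percent_change(beta):
    """Percent change in the expected count per one-unit increase of a
    covariate in a log-link count regression: (exp(beta) - 1) * 100."""
    return (math.exp(beta) - 1.0) * 100.0


def coefficient_from_percent(pct):
    """Inverse mapping: the log-scale coefficient implied by a percent change."""
    return math.log(1.0 + pct / 100.0)


# Coefficients implied by the percentages quoted in the abstract (back-calculated,
# not taken from the paper's regression output).
beta_max_temp = coefficient_from_percent(19.48)
beta_wind = coefficient_from_percent(-7.41)

print(round(beta_max_temp, 4), round(percent_change(beta_max_temp), 2))
print(round(beta_wind, 4), round(percent_change(beta_wind), 2))
```

A positive coefficient maps to an increase and a negative one to a decrease, which matches the signs described for temperature and wind speed.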

Keywords: principal component analysis, dengue fever, negative binomial regression model, poisson regression model

Procedia PDF Downloads 409
6191 Transforming Data into Knowledge: Mathematical and Statistical Innovations in Data Analytics

Authors: Zahid Ullah, Atlas Khan

Abstract:

The rapid growth of data in various domains has created a pressing need for effective methods to transform this data into meaningful knowledge. In this era of big data, mathematical and statistical innovations play a crucial role in unlocking insights and facilitating informed decision-making in data analytics. This abstract aims to explore the transformative potential of these innovations and their impact on converting raw data into actionable knowledge. Drawing upon a comprehensive review of existing literature, this research investigates the cutting-edge mathematical and statistical techniques that enable the conversion of data into knowledge. By evaluating their underlying principles, strengths, and limitations, we aim to identify the most promising innovations in data analytics. To demonstrate the practical applications of these innovations, real-world datasets will be utilized through case studies or simulations. This empirical approach will showcase how mathematical and statistical innovations can extract patterns, trends, and insights from complex data, enabling evidence-based decision-making across diverse domains. Furthermore, a comparative analysis will be conducted to assess the performance, scalability, interpretability, and adaptability of different innovations. By benchmarking against established techniques, we aim to validate the effectiveness and superiority of the proposed mathematical and statistical innovations in data analytics. Ethical considerations surrounding data analytics, such as privacy, security, bias, and fairness, will be addressed throughout the research. Guidelines and best practices will be developed to ensure the responsible and ethical use of mathematical and statistical innovations in data analytics. 
The expected contributions of this research include advancements in mathematical and statistical sciences, improved data analysis techniques, enhanced decision-making processes, and practical implications for industries and policymakers. The outcomes will guide the adoption and implementation of mathematical and statistical innovations, empowering stakeholders to transform data into actionable knowledge and drive meaningful outcomes.

Keywords: data analytics, mathematical innovations, knowledge extraction, decision-making

Procedia PDF Downloads 45
6190 An Application of Quantile Regression to Large-Scale Disaster Research

Authors: Katarzyna Wyka, Dana Sylvan, JoAnn Difede

Abstract:

Background and significance: Following a disaster, population-based screening programs are routinely established to assess the physical and psychological consequences of exposure. These data sets are highly skewed, as only a small percentage of trauma-exposed individuals develop health issues. Commonly used statistical methodology in post-disaster mental health generally involves population-averaged models. Such models aim to capture the overall response to the disaster and its aftermath; however, they may not be sensitive enough to accommodate population heterogeneity in symptomatology, such as post-traumatic stress or depressive symptoms. Methods: We use an archival longitudinal data set from the Weill-Cornell 9/11 Mental Health Screening Program, established following the World Trade Center (WTC) terrorist attacks in New York in 2001. Participants are rescue and recovery workers who took part in the site cleanup and restoration (n=2960). The main outcome is the post-traumatic stress disorder (PTSD) symptom severity score assessed via clinician interviews (CAPS). For a detailed understanding of the response to the disaster and its aftermath, we adapt quantile regression methodology with a particular focus on predictors of extreme distress and resilience to trauma. Results: The response variable was defined as the quantile of the CAPS score for each individual under two different scenarios specifying the unconditional quantiles based on: 1) clinically meaningful CAPS cutoff values and 2) the CAPS distribution in the population. We present graphical summaries of the differential effects. For instance, we found that the WTC exposures, namely seeing bodies and feeling that life was in danger during rescue/recovery work, were associated with very high PTSD symptoms. A similar effect was apparent in individuals with a prior psychiatric history. Differential effects were also present for the age and education level of the individuals. 
Conclusion: We evaluate the utility of quantile regression in disaster research in contrast to the commonly used population-averaged models. We focused on assessing the distribution of risk factors for post-traumatic stress symptoms across quantiles. This innovative approach provides a comprehensive understanding of the relationship between dependent and independent variables and could be used for developing tailored training programs and response plans for different vulnerability groups.
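As background on why quantile methods can target the extremes of a skewed score distribution, the sketch below shows the standard check (pinball) loss, whose minimizer is the tau-th quantile rather than the mean. It is an intercept-only toy under stated assumptions, not the unconditional quantile approach used in the study; the function `pinball_loss` and the severity scores are hypothetical.

```python
def pinball_loss(q, tau, ys):
    """Average check loss of predicting the constant q for the tau-th quantile."""
    total = 0.0
    for y in ys:
        if y >= q:
            total += tau * (y - q)          # under-prediction weighted by tau
        else:
            total += (1.0 - tau) * (q - y)  # over-prediction weighted by 1 - tau
    return total / len(ys)


# Hypothetical, strongly skewed severity scores (not CAPS data).
scores = [2, 5, 7, 11, 40, 85, 90, 95, 100, 120]

# Search over the observed values for the minimizer at two quantile levels.
upper = min(scores, key=lambda q: pinball_loss(q, 0.85, scores))
lower = min(scores, key=lambda q: pinball_loss(q, 0.25, scores))
mean_score = sum(scores) / len(scores)

print(lower, upper, mean_score)
```

The two minimizers sit in the lower and upper parts of the distribution, while the mean falls between them and describes neither group well.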

Keywords: disaster workers, post traumatic stress, PTSD, quantile regression

Procedia PDF Downloads 259
6189 Statistical Time-Series and Neural Architecture of Malaria Patients Records in Lagos, Nigeria

Authors: Akinbo Razak Yinka, Adesanya Kehinde Kazeem, Oladokun Oluwagbenga Peter

Abstract:

Time series data are sequences of observations collected over a period of time. Such data can be used to predict health outcomes, such as disease progression, mortality, and hospitalization. The statistical approach is based on mathematical models that capture the patterns and trends of the data, such as autocorrelation, seasonality, and noise, while neural methods are based on artificial neural networks, which are computational models that mimic the structure and function of biological neurons. This paper compares parametric and non-parametric time series models of patients treated for malaria in Maternal and Child Health Centres in Lagos State, Nigeria. The forecast methods considered were linear regression, Integrated Moving Average, ARIMA and SARIMA modeling for the parametric approach, while Multilayer Perceptron (MLP) and Long Short-Term Memory (LSTM) networks were used for the non-parametric approach. The performance of each method is evaluated using the Mean Absolute Error (MAE), R-squared (R2) and Root Mean Square Error (RMSE) as criteria to determine the accuracy of each model. The study revealed that the best performance in terms of error was achieved by MLP, followed by the LSTM and ARIMA models. In addition, the Bootstrap Aggregating technique was used to make robust forecasts when there are uncertainties in the data.
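The three accuracy criteria named in this abstract can be computed as in the sketch below. The helper names and the toy actual/forecast values are hypothetical and unrelated to the malaria records used in the study.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average of |actual - forecast|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical actual values and forecasts.
actual = [1.0, 2.0, 3.0, 4.0]
forecast = [1.0, 2.0, 3.0, 5.0]

print(mae(actual, forecast), rmse(actual, forecast), r_squared(actual, forecast))
```

Lower MAE and RMSE indicate a better forecast, whereas a higher R2 does; RMSE penalizes the single large miss in the toy data more heavily than MAE.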

Keywords: ARIMA, bootstrap aggregation, MLP, LSTM, SARIMA, time-series analysis

Procedia PDF Downloads 35
6188 The Relationship between Self-Injurious Behavior and Manner of Death

Authors: Sait Ozsoy, Hacer Yasar Teke, Mustafa Dalgic, Cetin Ketenci, Ertugrul Gok, Kenan Karbeyaz, Azem Irez, Mesut Akyol

Abstract:

Self-mutilating behavior or self-injurious behavior (SIB) is defined as “intentional harm to one’s body without intent to commit suicide”. SIB cases are commonly seen in psychiatry and forensic medicine practice. Despite the variety of SIB methods, cuts in the skin are the most common (70-97%) injury in this group of patients. Subjects with SIB have one or more comorbidities, which include depression, anxiety, depersonalization, feelings of worthlessness, borderline personality disorder, antisocial behaviors, and histrionic personality. These individuals feel a high level of hostility towards themselves and their surroundings. Research has also revealed a strong relationship between antisocial personality disorder, criminal behavior, and SIB. This study retrospectively evaluated 6,599 autopsy cases performed at the forensic medicine institutes of six major cities in Turkey (Ankara, Izmir, Diyarbakir, Erzurum, Trabzon, Eskisehir) in 2013. The study group consisted of all cases with SIB findings (psychopathic cuts, cigarette burns, scars, etc.). The control group was created from subjects without signs of SIB. The relationship between causes of death in the study group (SIB subjects) and the control group was investigated. The Mann-Whitney U test was used for age variables and the Chi-square test for categorical variables. Multinomial logistic regression analysis was used to analyze group differences with respect to manner of death (natural, accident, homicide, suicide), and the risk factors associated with each group were determined by binomial logistic regression analysis. SPSS Statistics 15.0 was used for all statistical analyses. The threshold for statistical significance was p < 0.05. There was no significant difference between accidental and natural death among the groups (p=0.737). 
In addition, each unit increase in the number of cuts in the psychopathic group was associated with a decrease in accidental death by a factor of 0.967 (95% CI: 0.941-0.993; p=0.015). In contrast, there was a significant difference between suicidal and natural death (p<0.001), and also between homicidal and natural death (p=0.025). SIB is often seen with borderline and antisocial personality disorder but may be associated with many psychiatric illnesses. Studies have shown a relationship between antisocial personality disorder and criminal behavior, and between SIB and suicidal behavior. In our study, the rates of suicide, murder and intoxication were higher in the study group than in the control group. It can be concluded that SIB may be used as a predictor of the possibility of a person harming themselves or other people.
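For readers who want to check how an effect estimate, its confidence interval, and its p-value fit together, the sketch below back-calculates a Wald z statistic and two-sided p-value from an odds ratio and its 95% interval. It assumes the reported 0.967 is an odds ratio from the logistic regression and that the interval is a symmetric Wald interval on the log scale; neither is stated in the abstract, and the function name is hypothetical.

```python
import math


def wald_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover the log-odds coefficient, its standard error, the Wald z
    statistic and the two-sided p-value from an odds ratio and its CI."""
    beta = math.log(odds_ratio)
    # On the log scale the CI spans beta +/- z_crit * se.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


beta, se, z, p = wald_from_or_ci(0.967, 0.941, 0.993)
print(round(beta, 4), round(se, 4), round(z, 3), round(p, 4))
```

Under these assumptions the recomputed p-value comes out near 0.014, close to the reported p=0.015; the small gap is what rounding of the published figures would produce.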

Keywords: autopsy, cause of death, forensic science, self-injury behaviour

Procedia PDF Downloads 484
6187 Statistical Analysis of Interferon-γ for the Effectiveness of an Anti-Tuberculous Treatment

Authors: Shishen Xie, Yingda L. Xie

Abstract:

Tuberculosis (TB) is a potentially serious infectious disease that remains a health concern. The Interferon Gamma Release Assay (IGRA) is a blood test used to determine whether an individual is tuberculosis positive or negative. This study applies statistical analysis to clinical data on the interferon-gamma levels of seventy-three subjects who were diagnosed with pulmonary TB and received an anti-tuberculous treatment. Data analysis is performed to determine whether there is a significant decline in the subjects' interferon-gamma levels over a period of six months, and to infer whether the anti-tuberculous treatment is effective.

Keywords: data analysis, interferon gamma release assay, statistical methods, tuberculosis infection

Procedia PDF Downloads 278
6186 The Relationship between Depression, HIV Stigma and Adherence to Antiretroviral Therapy among Adult Patients Living with HIV at a Tertiary Hospital in Durban, South Africa: The Mediating Roles of Self-Efficacy and Social Support

Authors: Muziwandile Luthuli

Abstract:

Although numerous factors predicting adherence to antiretroviral therapy (ART) among people living with HIV/AIDS (PLWHA) have been broadly studied at both regional and global levels, patient adherence to ART remains an overarching, dynamic and multifaceted problem that needs to be investigated over time and across various contexts. Empirical data on the interactive mechanisms by which psychosocial factors influence adherence to ART among PLWHA within the South African context are scarce in the literature. Therefore, this study was designed to investigate the relationship between depression, HIV stigma, and adherence to ART among adult patients living with HIV at a tertiary hospital in Durban, South Africa, and the mediating roles of self-efficacy and social support. The health locus of control theory and the social support theory were the underlying theoretical frameworks for this study. Using a cross-sectional research design, a total of 201 male and female adult patients aged 18 to 75 years receiving ART at a tertiary hospital in Durban, KwaZulu-Natal were sampled using time location sampling (TLS). A self-administered questionnaire was employed to collect the data. Data were analysed with SPSS version 27. Several statistical analyses were conducted, namely univariate statistical analysis, correlational analysis, Pearson’s chi-square analysis, cross-tabulation analysis, binary logistic regression analysis, and mediational analysis. Univariate analysis indicated that the sample mean age was 39.28 years (SD=12.115); 71.0% (n=142) of participants were female, 74.2% (n=147) had never married, 48.3% (n=97) were secondary school educated, and 65.7% (n=132) were unemployed. The prevalence of high adherence to ART was 53.7% (n=108), and 46.3% (n=93) of participants had low adherence to ART. 
Chi-square analysis revealed that employment status was the only socio-demographic factor with a statistically significant association with adherence to ART (χ2 (3) = 8.745; p < .033). There was a statistically significant association between depression and adherence to ART (χ2 (4) = 16.140; p < .003), whereas no statistically significant association was found between HIV stigma and adherence to ART (χ2 (1) = .323; p > .570). Binary logistic regression indicated that depression was significantly associated with adherence to ART (OR = .853; 95% CI, .789–.922; P < .001), and the association between self-efficacy and adherence to ART was statistically significant (OR = 1.04; 95% CI, 1.001–1.078; P < .045) after controlling for the effect of depression. However, the effect of depression on adherence to ART was not significantly mediated by self-efficacy (Sobel test for indirect effect, Z = 1.01, P > 0.31). Binary logistic regression showed that the effect of HIV stigma on adherence to ART was not statistically significant (OR = .980; 95% CI, .937–1.025; P > .374), but the effect of social support on adherence to ART was statistically significant once the effect of HIV stigma was controlled for (OR = 1.017; 95% CI, 1.000–1.035; P < .046). Depression is a significant predictor of adherence to ART. Thus, to alleviate the psychosocial impact of depression on adherence to ART, effective interventions must be devised, with special consideration of self-efficacy and social support. This study promotes behavioral and social change effected through evidence-based interventions by emphasizing the need for additional research into the interactive mechanisms by which psychosocial factors influence adherence to ART. Its findings are therefore helpful in informing and effecting change in health policy and healthcare services.
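The Sobel test mentioned in this abstract divides the indirect effect a*b by its approximate standard error, where a is the effect of the predictor on the mediator and b is the effect of the mediator on the outcome. The sketch below implements that formula and the two-sided normal p-value. The path coefficients passed to `sobel_z` are hypothetical, since the abstract reports only the resulting Z; only the conversion of Z = 1.01 to a p-value uses a figure from the source.

```python
import math


def sobel_z(a, se_a, b, se_b):
    """Sobel z statistic for the indirect effect a*b:
    z = a*b / sqrt(b^2 * se_a^2 + a^2 * se_b^2)."""
    return (a * b) / math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)


def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Reported statistic from the abstract: Z = 1.01.
p_reported = two_sided_p(1.01)

# Hypothetical path coefficients and standard errors, for illustration only.
z_demo = sobel_z(a=0.5, se_a=0.1, b=0.4, se_b=0.1)

print(round(p_reported, 4), round(z_demo, 3), round(two_sided_p(z_demo), 4))
```

A Z of 1.01 corresponds to a two-sided p-value of roughly 0.31, which agrees with the non-significant mediation result stated above.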

Keywords: ART adherence, depression, HIV/AIDS, PLWHA

Procedia PDF Downloads 160