Search results for: regression estimators
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3306

2916 Major Variables Influencing Marketed Surplus of Seed Cotton in District Khanewal, Pakistan

Authors: Manan Aslam, Shafqat Rasool

Abstract:

This paper examines the impact of major factors affecting the marketed surplus of seed cotton in district Khanewal (Punjab) using primary data. A representative sample of 40 cotton farmers was selected using a stratified random sampling technique. The impact of major factors on the marketed surplus of seed cotton growers was estimated with a double-log regression. The adjusted R² was 0.64 and the F-value was 10.81. The analysis revealed that farmers' experience, farmers' education, area under cotton crop and distance from the wholesale market were significant variables affecting the marketed surplus of cotton, whereas marketing cost and sale price showed an insignificant impact. The study suggests improving prevalent marketing practices to increase the volume of marketed surplus of cotton in district Khanewal.
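As a rough illustration of the double-log form mentioned in the abstract (not the authors' specification or data), the sketch below fits ln(y) = a + b·ln(x) by ordinary least squares for a single regressor. The function name `fit_double_log` and the synthetic numbers are assumptions made for this example; in a double-log model the slope b reads as an elasticity.

```python
import math


def fit_double_log(x, y):
    """Fit ln(y) = a + b*ln(x) by ordinary least squares.

    Returns (a, b); b is the elasticity of y with respect to x.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data, generated exactly from surplus = 2 * area**0.5,
# so the fitted elasticity should come out as 0.5.
area = [1.0, 2.0, 4.0, 8.0, 16.0]
surplus = [2.0 * v ** 0.5 for v in area]
intercept, elasticity = fit_double_log(area, surplus)
print(f"intercept={intercept:.4f}, elasticity={elasticity:.4f}")
```

An elasticity of 0.5 would mean that a 1% increase in the regressor is associated with roughly a 0.5% increase in the response. The study itself uses several regressors, which would need a multiple-regression solver rather than this one-variable formula.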

Keywords: seed cotton, marketed surplus, double log regression analysis

Procedia PDF Downloads 307
2915 Modelling of Factors Affecting Bond Strength of Fibre Reinforced Polymer Externally Bonded to Timber and Concrete

Authors: Abbas Vahedian, Rijun Shrestha, Keith Crews

Abstract:

In recent years, fibre reinforced polymers (FRP) used as strengthening materials have received significant attention from civil engineers and environmentalists because of their excellent characteristics. These composites have become a mainstream technology for strengthening infrastructure such as steel, concrete and, more recently, timber and masonry structures. However, debonding is identified as the main problem which limits the full utilisation of the FRP material. In this paper, a preliminary analysis of factors affecting the bond strength of FRP-to-concrete and FRP-to-timber bonded interfaces has been conducted. A novel theoretical method based on regression analysis has been established to evaluate these factors. Results of the proposed model are then assessed against results of pull-out tests, and satisfactory comparisons are achieved between measured failure loads (R² = 0.83, P < 0.0001) and the predicted loads (R² = 0.78, P < 0.0001).

Keywords: debonding, fibre reinforced polymers (FRP), pull-out test, stepwise regression analysis

Procedia PDF Downloads 248
2914 Modified Clusterwise Regression for Pavement Management

Authors: Mukesh Khadka, Alexander Paz, Hanns de la Fuente-Mella

Abstract:

Typically, pavement performance models are developed in two steps: (i) pavement segments with similar characteristics are grouped together to form a cluster, and (ii) the corresponding performance models are developed using statistical techniques. A challenge is to select the characteristics that define clusters and the segments associated with them. If inappropriate characteristics are used, clusters may include homogeneous segments with different performance behavior or heterogeneous segments with similar performance behavior. Prediction accuracy of performance models can be improved by grouping the pavement segments into more uniform clusters by including both characteristics and a performance measure. This grouping is not always possible due to limited information. It is impractical to include all the potentially significant factors because some of them are unobserved or difficult to measure. Historical performance of pavement segments could be used as a proxy to incorporate the effect of the missing potentially significant factors in the clustering process. The current state of the art proposes Clusterwise Linear Regression (CLR) to determine the pavement clusters and the associated performance models simultaneously. CLR incorporates the effect of significant factors as well as a performance measure. In this study, a mathematical program was formulated for CLR models including multiple explanatory variables. Pavement data collected recently over the entire state of Nevada were used. The International Roughness Index (IRI) was used as a pavement performance measure because it serves as a unified standard that is widely accepted for evaluating pavement performance, especially in terms of riding quality. Results illustrate the advantage of using CLR. Previous studies have used CLR along with experimental data. This study uses actual field data collected across a variety of environmental, traffic, design, and construction and maintenance conditions.

Keywords: clusterwise regression, pavement management system, performance model, optimization

Procedia PDF Downloads 251
2913 Detection of High Fructose Corn Syrup in Honey by Near Infrared Spectroscopy and Chemometrics

Authors: Mercedes Bertotto, Marcelo Bello, Hector Goicoechea, Veronica Fusca

Abstract:

The National Service of Agri-Food Health and Quality (SENASA), controls honey to detect contamination by synthetic or natural chemical substances and establishes and controls the traceability of the product. The utility of near-infrared spectroscopy for the detection of adulteration of honey with high fructose corn syrup (HFCS) was investigated. First of all, a mixture of different authentic artisanal Argentinian honey was prepared to cover as much heterogeneity as possible. Then, mixtures were prepared by adding different concentrations of high fructose corn syrup (HFCS) to samples of the honey pool. 237 samples were used, 108 of them were authentic honey and 129 samples corresponded to honey adulterated with HFCS between 1 and 10%. They were stored unrefrigerated from time of production until scanning and were not filtered after receipt in the laboratory. Immediately prior to spectral collection, honey was incubated at 40°C overnight to dissolve any crystalline material, manually stirred to achieve homogeneity and adjusted to a standard solids content (70° Brix) with distilled water. Adulterant solutions were also adjusted to 70° Brix. Samples were measured by NIR spectroscopy in the range of 650 to 7000 cm⁻¹. The technique of specular reflectance was used, with a lens aperture range of 150 mm. Pretreatment of the spectra was performed by Standard Normal Variate (SNV). The ant colony optimization genetic algorithm sample selection (ACOGASS) graphical interface was used, using MATLAB version 5.3, to select the variables with the greatest discriminating power. The data set was divided into a validation set and a calibration set, using the Kennard-Stone (KS) algorithm. A combined method of Potential Functions (PF) was chosen together with Partial Least Square Linear Discriminant Analysis (PLS-DA). 
Different estimators of the predictive capacity of the model were compared; they were obtained using a decreasing number of groups, which implies more demanding validation conditions. The optimal number of latent variables was selected as the number associated with the minimum error and the smallest number of unassigned samples. Once the optimal number of latent variables was defined, we proceeded to apply the model to the training samples. With the calibrated model for the training samples, we proceeded to study the validation samples. The calibrated model that combines the potential function method and PLS-DA can be considered reliable and stable since its performance in future samples is expected to be comparable to that achieved for the training samples. By use of Potential Functions (PF) and Partial Least Square Linear Discriminant Analysis (PLS-DA) classification, authentic honey and honey adulterated with HFCS could be identified with a correct classification rate of 97.9%. The results showed that NIR in combination with the PF and PLS-DA methods can be a simple, fast and low-cost technique for the detection of HFCS in honey with high sensitivity and power of discrimination.

Keywords: adulteration, multivariate analysis, potential functions, regression

Procedia PDF Downloads 125
2912 Lean Implementation Analysis on the Safety Performance of Construction Projects in the Philippines

Authors: Kim Lindsay F. Restua, Jeehan Kyra A. Rivero, Joneka Myles D. Taguba

Abstract:

Lean construction is defined as an approach to construction that aims to reduce waste in the process without compromising the value of the project. Numerous lean construction tools are applied in the construction process, which maximize the efficiency of work and the satisfaction of customers while minimizing waste. However, the complexity and differences of construction projects make it challenging to achieve the benefits lean construction can give, such as improved safety performance. The objective of this study is to determine the relationship between lean construction tools and their effects on safety performance. The relationship between lean construction tools applied in construction and safety performance was identified through logistic regression analysis, and correlation analysis was conducted thereafter. Based on the findings, almost 60% of the factors listed in the study, which are different tools and effects of lean construction, were determined to have a significant relationship with the level of safety in construction projects.

Keywords: correlation analysis, lean construction tools, lean construction, logistic regression analysis, risk management, safety

Procedia PDF Downloads 186
2911 Mutual Fund Anchoring Bias with its Parent Firm Performance: Evidence from Mutual Fund Industry of Pakistan

Authors: Muhammad Tahir

Abstract:

Purpose: The purpose of the study is to find anchoring bias behavior in mutual fund returns relative to parent firm performance in Pakistan. Research Methodology: The paper used monthly returns of equity funds whose parent firms existed from 2011 to 2021, along with parent firm returns. Proximity to the 52-week highest return was calculated by dividing the fund return by the parent firm's 52-week highest return. Control variables were also included, and a panel regression model was used to estimate the results. For robustness, we also used a feasible generalized least squares (FGLS) model. Findings: The results showed that anchoring bias exists in mutual fund returns with respect to parent firm performance. The FGLS results reaffirm those obtained from the panel regression. Proximity to 52-week highest Xc is significant in both models. Research Implication: Since most mutual funds have a parent firm, anchoring bias is found in mutual funds with respect to parent firm performance. Practical Implication: Mutual fund investors in Pakistan invest in equity funds in which behavioral bias exists, although there might be better opportunities in the market. Originality/Value Addition: Our research is a pioneering study investigating anchoring bias in mutual fund returns with respect to parent firm performance. Research Limitations: Our sample is limited to only 23 equity funds that have a parent firm and for which data were available from 2011 to 2021.

Keywords: mutual fund, anchoring bias, 52-week high return, proximity to 52-week high, parent firm performance, panel regression, FGLS

Procedia PDF Downloads 119
2910 Topical Nonsteroidal Anti-Inflammatory Eye Drops and Oral Acetazolamide for Macular Edema after Uncomplicated Phacoemulsification: Outcome and Predictors of Non-Response

Authors: Wissam Aljundi, Loay Daas, Yaser Abu Dail, Barbara Käsmann-Kellner, Berthold Seitz, Alaa Din Abdin

Abstract:

Purpose: To investigate the effectiveness of nonsteroidal anti-inflammatory eye drops (NSAIDs) combined with oral acetazolamide for postoperative macular edema (PME) after uncomplicated phacoemulsification (PE) and to identify predictors of non-response. Methods: We analyzed data of uncomplicated PE and identified eyes with PME. First-line therapy included topical NSAIDs combined with oral acetazolamide. In case of non-response, triamcinolone was administered subtenonally. Outcome measures included best-corrected visual acuity (BCVA) and central macular thickness (CMT). Results: 94 eyes out of 9750 uncomplicated PE developed PME, of which 60 eyes were included. Follow-ups occurred 6.4±1.8, 12.5±3.7, and 18.6±6.0 weeks after diagnosis. BCVA and CMT improved significantly in all follow-ups. 40 eyes showed response to first-line therapy at first follow-up (G1). The remaining 20 eyes showed no response and required subtenon triamcinolone (G2), of which 11 eyes showed complete regression at the second follow-up and 4 eyes at the third follow-up. 5 eyes showed no response and required intravitreal injection. Multivariate linear regression model showed that diabetes mellitus (DM) and increased cumulative dissipated energy (CDE) are predictors of non-response. Conclusion: Topical NSAIDs with acetazolamide resulted in complete regression of PME in 67% of all cases. DM and increased CDE might be considered as predictors of nonresponse to this treatment.

Keywords: postoperative macular edema, intravitreal injection, cumulative energy, Irvine-Gass syndrome, pseudophakia

Procedia PDF Downloads 117
2909 Functional Decomposition Based Effort Estimation Model for Software-Intensive Systems

Authors: Nermin Sökmen

Abstract:

An effort estimation model is needed for software-intensive projects that consist of hardware, embedded software or some combination of the two, as well as high-level software solutions. This paper first focuses on functional decomposition techniques to measure the functional complexity of a computer system and investigates its impact on system development effort. It then examines the effects of technical difficulty and design team capability factors in order to construct the best effort estimation model. Using a traditional regression analysis technique, the study develops a system development effort estimation model which takes functional complexity, technical difficulty and design team capability factors as input parameters. Finally, the assumptions of the model are tested.

Keywords: functional complexity, functional decomposition, development effort, technical difficulty, design team capability, regression analysis

Procedia PDF Downloads 293
2908 Heart Ailment Prediction Using Machine Learning Methods

Authors: Abhigyan Hedau, Priya Shelke, Riddhi Mirajkar, Shreyash Chaple, Mrunali Gadekar, Himanshu Akula

Abstract:

The heart is the coordinating centre of the major endocrine glandular structure of the body, which produces hormones that profoundly affect the operations of the body, and diagnosing cardiovascular disease is a difficult but critical task. By extracting knowledge and information about the disease from patient data, data mining is a more practical technique to help doctors detect disorders. We use a variety of machine learning methods here, including logistic regression and support vector classifiers (SVC), K-nearest neighbours Classifiers (KNN), Decision Tree Classifiers, Random Forest classifiers and Gradient Boosting classifiers. These algorithms are applied to patient data containing 13 different factors to build a system that predicts heart disease in less time with more accuracy.

Keywords: logistic regression, support vector classifier, k-nearest neighbour, decision tree, random forest and gradient boosting

Procedia PDF Downloads 50
2907 A Case Comparative Study of Infant Mortality Rate in North-West Nigeria

Authors: G. I. Onwuka, A. Danbaba, S. U. Gulumbe

Abstract:

This study investigated the infant mortality rate as observed at a general hospital in Kaduna-South, Kaduna State, North-West Nigeria. The causes of infant mortality were examined. The data used for this analysis were collected at the statistics unit of the hospital. The analysis was carried out using the multiple linear regression technique, and it showed a linear relationship between the dependent variable (death) and the independent variables (malaria, measles, anaemia, and coronary heart disease). The resultant model also revealed that a unit increment in each of these diseases would result in a unit increment in deaths recorded; 98.7% of the total variation in mortality is explained by the model. The highest mortality was recorded in July 2005 and the lowest in October 2009. Recommendations were made based on the results of the study.

Keywords: infant mortality rate, multiple linear regression, diseases, serial correlation

Procedia PDF Downloads 329
2906 Investigating the Effect of Study Plan and Homework on Student's Performance by Using Web Based Learning MyMathLab

Authors: Mohamed Chabi, Mahmoud I. Syam, Sarah Aw

Abstract:

In Summer 2012, the Foundation Program Unit of Qatar University started implementing new ways of teaching math by introducing MML (MyMathLab) as an innovative interactive tool to support standard teaching. In this paper, we focus on the effect of proper use of the Study Plan component of MML on students' performance. We investigated the results of students in the pre-calculus course during Fall 2013 in the Foundation Program at Qatar University. The results showed a strong correlation between study plan results and final exam results, and also a strong relation between homework results and final exam results. In addition, average attendance affected students' results in general. A multiple regression was fitted with passing rate as the dependent variable and study plan and homework as independent variables.

Keywords: MyMathLab, study plan, assessment, homework, attendance, correlation, regression

Procedia PDF Downloads 419
2905 Mediterranean Diet, Duration of Admission and Mortality in Elderly, Hospitalized Patients: A Cross-Sectional Study

Authors: Christos Lampropoulos, Maria Konsta, Ifigenia Apostolou, Vicky Dradaki, Tamta Sirbilatze, Irini Dri, Christina Kordali, Vaggelis Lambas, Kostas Argyros, Georgios Mavras

Abstract:

Objectives: Mediterranean diet has been associated with lower incidence of cardiovascular disease and cancer. The purpose of our study was to examine the hypothesis that Mediterranean diet may protect against mortality and reduce admission duration in elderly, hospitalized patients. Methods: Sample population included 150 patients (78 men, 72 women, mean age 80±8.2). The following data were taken into account in analysis: anthropometric and laboratory data, dietary habits (MedDiet score), patients’ nutritional status [Mini Nutritional Assessment (MNA) score], physical activity (International Physical Activity Questionnaires, IPAQ), smoking status, cause and duration of current admission, medical history (co-morbidities, previous admissions). Primary endpoints were mortality (from admission until 6 months afterwards) and duration of admission, compared to national guidelines for closed consolidated medical expenses. Logistic regression and linear regression analysis were performed in order to identify independent predictors for mortality and admission duration difference respectively. Results: According to MNA, nutrition was normal in 54/150 (36%) of patients, 46/150 (30.7%) of them were at risk of malnutrition and the rest 50/150 (33.3%) were malnourished. After performing multivariate logistic regression analysis we found that the odds of death decreased 30% per each unit increase of MedDiet score (OR=0.7, 95% CI:0.6-0.8, p < 0.0001). Patients with cancer-related admission were 37.7 times more likely to die, compared to those with infection (OR=37.7, 95% CI:4.4-325, p=0.001). According to multivariate linear regression analysis, admission duration was inversely related to Mediterranean diet, since it is decreased 0.18 days on average for each unit increase of MedDiet score (b:-0.18, 95% CI:-0.33 - -0.035, p=0.02). Additionally, the duration of current admission increased on average 0.83 days for each previous hospital admission (b:0.83, 95% CI:0.5-1.16, p<0.0001). 
The admission duration of patients with cancer was on average 4.5 days longer than that of patients admitted due to infection (b:4.5, 95% CI:0.9-8, p=0.015). Conclusion: Mediterranean diet adequately protects elderly, hospitalized patients against mortality and reduces the duration of hospitalization.
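The statement that the odds of death decrease by 30% per unit of MedDiet score follows directly from the reported odds ratio of 0.7. The short sketch below works through that arithmetic; the helper names are assumptions introduced for illustration, and only the value OR = 0.7 comes from the abstract.

```python
import math


def percent_change_in_odds(odds_ratio, units=1):
    """Percent change in the odds after `units` one-unit increases of a predictor."""
    return (odds_ratio ** units - 1.0) * 100.0


def coefficient_from_odds_ratio(odds_ratio):
    """Logistic-regression coefficient (log-odds scale) implied by an odds ratio."""
    return math.log(odds_ratio)


or_meddiet = 0.7  # odds ratio reported in the abstract
change_one = percent_change_in_odds(or_meddiet)      # one-unit increase
change_two = percent_change_in_odds(or_meddiet, 2)   # two-unit increase (0.7**2 = 0.49)
beta = coefficient_from_odds_ratio(or_meddiet)       # negative, since OR < 1

print(f"1 unit: {change_one:.1f}%  2 units: {change_two:.1f}%  beta: {beta:.4f}")
```

Note that the effect compounds multiplicatively: two units correspond to a 51% reduction in odds, not 60%. The change applies to odds, which is not the same as a change in the probability of death.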

Keywords: Mediterranean diet, malnutrition, nutritional status, prognostic factors for mortality

Procedia PDF Downloads 313
2904 Quasi-Photon Monte Carlo on Radiative Heat Transfer: An Importance Sampling and Learning Approach

Authors: Utkarsh A. Mishra, Ankit Bansal

Abstract:

At high temperature, radiative heat transfer is the dominant mode of heat transfer. It is governed by various phenomena such as photon emission, absorption, and scattering. The solution of the governing integrodifferential equation of radiative transfer is a complex process, more so when the effects of the participating medium and wavelength-dependent properties are taken into consideration. Although a generic formulation of such a radiative transport problem can be modeled for a wide variety of problems with non-gray, non-diffusive surfaces, there is always a trade-off between simplicity and accuracy. Recently, solutions of complicated mathematical problems with statistical methods based on randomization of naturally occurring phenomena have gained significant importance. Photon bundles with discrete energy can be replicated with random numbers describing the emission, absorption, and scattering processes. Photon Monte Carlo (PMC) is a simple yet powerful technique for solving radiative transfer problems in complicated geometries with an arbitrary participating medium. The method, on the one hand, increases the accuracy of estimation, and on the other hand, increases the computational cost. The participating media (generally a gas, such as CO₂, CO, and H₂O) present complex emission and absorption spectra. Modeling the emission/absorption accurately with random numbers requires weighted sampling, as different sections of the spectrum carry different importance. Importance sampling (IS) was implemented to sample random photons of arbitrary wavelength, and the sampled data provided unbiased training of MC estimators for better results. A better replacement for uniform random numbers is deterministic, quasi-random sequences. Halton, Sobol, and Faure low-discrepancy sequences are used in this study.
They possess better space-filling performance than the uniform random number generator and give rise to low-variance, stable Quasi-Monte Carlo (QMC) estimators with faster convergence. An optimal supervised learning scheme was further considered to reduce the computation costs of the PMC simulation. A one-dimensional plane-parallel slab problem with participating media was formulated. The history of some randomly sampled photon bundles is recorded to train an Artificial Neural Network (ANN) back-propagation model. The flux was calculated using the standard quasi-PMC and was considered to be the training target. Results obtained with the proposed model for the one-dimensional problem are compared with the exact analytical solution and the PMC model with the Line by Line (LBL) spectral model. The approximate variance obtained was around 3.14%. Results were analyzed with respect to time and the total flux in both cases. A significant reduction in variance as well as a faster rate of convergence was observed in the case of the QMC method over the standard PMC method. However, the results obtained with the ANN method showed greater variance (around 25-28%) as compared to the other cases. There is great scope for machine learning models to help further reduce computation cost once trained successfully. Multiple ways of selecting the input data as well as various architectures will be tried such that the concerned environment can be fully represented in the ANN model. Better results can be achieved in this unexplored domain.
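To make the quasi-random idea concrete, the sketch below generates a one-dimensional Halton sequence (the radical inverse in a given base) and uses it to estimate a simple integral alongside a pseudo-random estimate. This is a toy under stated assumptions, not the authors' photon-transport code: the integrand x², the sample size, and all function names are chosen here purely for illustration.

```python
import random


def halton(index, base):
    """Radical inverse of `index` in `base`: the index-th Halton point (index >= 1)."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def integrand(x):
    """Toy integrand; its exact integral over [0, 1] is 1/3."""
    return x * x


def estimate_mean(points, f):
    """Sample-mean estimator of the integral of f over [0, 1]."""
    return sum(f(p) for p in points) / len(points)


n = 1024
qmc_points = [halton(i, 2) for i in range(1, n + 1)]   # deterministic, low discrepancy
rng = random.Random(0)                                  # seeded pseudo-random generator
mc_points = [rng.random() for _ in range(n)]

qmc_error = abs(estimate_mean(qmc_points, integrand) - 1.0 / 3.0)
mc_error = abs(estimate_mean(mc_points, integrand) - 1.0 / 3.0)
print(f"QMC error: {qmc_error:.2e}, MC error: {mc_error:.2e}")
```

For smooth integrands the quasi-random estimate typically converges faster than the pseudo-random one, which is the behaviour the abstract attributes to its QMC estimators. Multi-dimensional problems would use a different prime base per dimension.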

Keywords: radiative heat transfer, Monte Carlo Method, pseudo-random numbers, low discrepancy sequences, artificial neural networks

Procedia PDF Downloads 223
2903 The Effect of User Comments on Traffic Application Usage

Authors: I. Gokasar, G. Bakioglu

Abstract:

With the unprecedented rate of technological improvement, people have started to solve their problems with the help of technological tools. According to application stores and websites in which people evaluate and comment on traffic apps, there are more than 100 traffic applications with different features depending on their purpose, ranging from apps for public transit modes to apps for private cars. This study focuses on the top 30 traffic applications, chosen by download count. All data about the traffic applications were obtained from related websites. The purpose of this study is to analyze traffic applications in terms of their categorical attributes by developing a regression model. The analysis results suggest that negative interpretations (e.g., being deficient) do not lead to lower star ratings of the applications. However, those negative interpretations result in a smaller increase in star rating. In addition, women give higher star ratings than men when evaluating traffic applications.

Keywords: traffic app, real–time information, traffic congestion, regression analysis, dummy variables

Procedia PDF Downloads 429
2902 Impact of Trade Cooperation of BRICS Countries on Economic Growth

Authors: Svetlana Gusarova

Abstract:

Developing countries, notably the BRICS countries (Brazil, Russia, India, China, South Africa), have played an essential role in the recent development of the world economy. Over the next 50 years the BRICS countries are expected to be the engines of global trade and economic growth. Trade cooperation among BRICS countries can enhance their economic development. BRICS countries were among the top 10 world exporters of office and telecom equipment, textiles, clothing, iron and steel, chemicals, agricultural products, automotive products, and fuel and mining products. China was one of the main trading partners of all BRICS countries, maintaining close relationships with them in the development of trade. The author analyzed the trade complementarity of BRICS countries and found a high level of complementarity in their trade flows, linked to specialization in different types of goods. The correlation and regression analysis of the relationship between intra-BRICS merchandise turnover and GDP (PPP) revealed a very strong impact on the development of their economies.

Keywords: BRICS countries, trade cooperation, complementarity, regression analysis

Procedia PDF Downloads 281
2901 Adaptive Neuro Fuzzy Inference System Model Based on Support Vector Regression for Stock Time Series Forecasting

Authors: Anita Setianingrum, Oki S. Jaya, Zuherman Rustam

Abstract:

Forecasting stock prices is a challenging task due to the complexity of the time series data. The complexity arises from the many variables that affect the stock market. Many time series models have been proposed before, but those models still have some problems: 1) the choice of technical indicators is subjective, and 2) they rely on assumptions about the variables, which limits their applicability across datasets. Therefore, this paper studied a novel Adaptive Neuro-Fuzzy Inference System (ANFIS) time series model based on Support Vector Regression (SVR) for forecasting the stock market. To evaluate the performance of the proposed model, stock market transaction data of TAIEX and HIS from January to December 2015 were collected as experimental datasets. The method outperformed its counterparts in terms of accuracy.

Keywords: ANFIS, fuzzy time series, stock forecasting, SVR

Procedia PDF Downloads 246
2900 An Inquiry of the Impact of Flood Risk on Housing Market with Enhanced Geographically Weighted Regression

Authors: Lin-Han Chiang Hsieh, Hsiao-Yi Lin

Abstract:

This study aims to determine the impact of the disclosure of a flood potential map on housing prices. The disclosure is supposed to mitigate market failure by reducing information asymmetry. On the other hand, opponents argue that the official disclosure of simulated results will only create unnecessary disturbances in the housing market. This study identifies the impact of the disclosure of the flood potential map by comparing the hedonic price of flood potential before and after the disclosure. The flood potential map used in this study was published by the Taipei municipal government in 2015 and is the result of a comprehensive simulation based on geographical, hydrological, and meteorological factors. The residential property sales data for 2013 to 2016 used in this study were collected from the actual sales price registration system of the Department of Land Administration (DLA). The result shows that the impact of flood potential on the residential real estate market is statistically significant both before and after the disclosure, but the trend is clearer after the disclosure, suggesting that the disclosure does have an impact on the market. The result also shows that the impact of flood potential differs by the severity and frequency of precipitation. The negative impact of a relatively mild, high-frequency flood potential is stronger than that of a heavy, low-probability flood potential. This indicates that home buyers are more concerned about the frequency than the intensity of flooding. Another contribution of this study is methodological. The classic hedonic price analysis with OLS regression suffers from two spatial problems: the endogeneity problem caused by omitted spatially related variables, and the heterogeneity problem arising from the presumption that regression coefficients are spatially constant. These two problems are seldom considered in a single model.
This study tries to deal with the endogeneity and heterogeneity problems together by combining the spatial fixed-effect model and geographically weighted regression (GWR). A body of literature applying GWR indicates that the hedonic price of certain environmental assets varies spatially. Since the endogeneity problem is usually not considered in typical GWR models, it is arguable that omitted spatially related variables might bias the results of GWR models. By combining the spatial fixed-effect model and GWR, this study concludes that the effect of the flood potential map is highly sensitive to location, even after controlling for spatial autocorrelation at the same time. The main policy implication of this result is that it is improper to determine the potential benefit of a flood prevention policy by simply multiplying the hedonic price of flood risk by the number of houses. The effect of flood prevention might vary dramatically by location.
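The core idea of geographically weighted regression, fitting a separate weighted least-squares model at each location so that coefficients may vary over space, can be sketched in a few lines. The example below is a deliberately simplified assumption-laden toy: locations are one-dimensional, there is a single regressor, the kernel is Gaussian with a hand-picked bandwidth, and the spatial fixed effects used in the study are omitted. All names and numbers are hypothetical.

```python
import math


def gwr_fit_at(target, locations, x, y, bandwidth):
    """Weighted least squares of y on x, with Gaussian kernel weights centred on `target`.

    Returns (intercept, slope) local to `target`.
    """
    w = [math.exp(-0.5 * ((s - target) / bandwidth) ** 2) for s in locations]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical data: two neighbourhoods with different price responses to flood potential.
locations = [0, 1, 2, 3, 4, 10, 11, 12, 13, 14]
flood = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
price = [100 - 2 * f for f in flood[:5]] + [100 - 8 * f for f in flood[5:]]

_, slope_west = gwr_fit_at(2, locations, flood, price, bandwidth=1.5)
_, slope_east = gwr_fit_at(12, locations, flood, price, bandwidth=1.5)
# A very large bandwidth gives near-equal weights, i.e. an ordinary global OLS fit.
_, slope_global = gwr_fit_at(7, locations, flood, price, bandwidth=1e6)
print(slope_west, slope_east, slope_global)
```

The local slopes recover the two distinct neighbourhood effects, while the global fit averages them into a single coefficient that describes neither area well. This is the heterogeneity concern the abstract raises about spatially constant coefficients.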

Keywords: flood potential, hedonic price analysis, endogeneity, heterogeneity, geographically-weighted regression

Procedia PDF Downloads 290
2899 Discrete State Prediction Algorithm Design with Self Performance Enhancement Capacity

Authors: Smail Tigani, Mohamed Ouzzif

Abstract:

This work presents a discrete quantitative state prediction algorithm with intelligent behavior that makes it able to self-improve some performance aspects. The distinguishing feature of this algorithm is its capacity to self-rectify the prediction strategy before the final decision. The auto-rectification mechanism is based on two parallel mathematical models. On the one hand, the algorithm predicts the next state based on an event transition matrix updated after each observation. On the other hand, the algorithm extracts the trend of its residues with a linear regression over historical residue data points in order to rectify the first decision if needed. For a normal distribution, the interaction between the two models allows the algorithm to self-optimize its performance and thus make better predictions. A purpose-designed key performance indicator, computed during a Monte Carlo simulation, shows the advantages of the proposed approach compared with the traditional one.
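One possible reading of the two-model scheme described above is sketched below: a transition count matrix gives a first prediction, and a least-squares line through past residues (observed minus predicted state) supplies a correction. The abstract does not give the actual update or rectification rules, so every function name, the rounding step, and the clamping to valid states are assumptions made for illustration only.

```python
def update_counts(counts, prev_state, next_state):
    """Record one observed transition prev_state -> next_state."""
    counts[prev_state][next_state] += 1


def predict_from_counts(counts, current_state):
    """Most frequently observed successor of current_state (lowest index wins ties)."""
    row = counts[current_state]
    return max(range(len(row)), key=lambda s: row[s])


def linear_trend_forecast(values):
    """Fit an OLS line through (0, v0), (1, v1), ... and return its value at the next index."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_t = (n - 1) / 2
    mean_v = sum(values) / n
    stt = sum((t - mean_t) ** 2 for t in range(n))
    stv = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values))
    slope = stv / stt
    return mean_v + slope * (n - mean_t)


def rectified_prediction(counts, current_state, residuals):
    """Shift the matrix-based prediction by the forecast residue, clamped to valid states."""
    base = predict_from_counts(counts, current_state)
    corrected = base + round(linear_trend_forecast(residuals))
    return min(max(corrected, 0), len(counts) - 1)


# Hypothetical three-state system observed cycling 0 -> 1 -> 2 -> 0 ...
n_states = 3
counts = [[0] * n_states for _ in range(n_states)]
observations = [0, 1, 2, 0, 1, 2, 0, 1]
for prev_state, next_state in zip(observations, observations[1:]):
    update_counts(counts, prev_state, next_state)

print(rectified_prediction(counts, 1, residuals=[0.0, 0.0, 0.0]))
```

With no trend in the residues the corrected prediction equals the matrix-based one; a persistent drift in the residues would shift it toward the neighbouring state.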

Keywords: discrete state, Markov Chains, linear regression, auto-adaptive systems, decision making, Monte Carlo Simulation

Procedia PDF Downloads 498
2898 Life in Bequia in the Era of Climate Change: Societal Perception of Adaptation and Vulnerability

Authors: Sherry Ann Ganase, Sandra Sookram

Abstract:

This study examines adaptation measures and factors that influence adaptation decisions in Bequia by using multiple linear regression and a structural equation model (SEM). Using survey data, the results suggest that households are knowledgeable and concerned about climate change but lack knowledge about the measures needed to adapt. The findings from the SEM suggest that a positive relationship exists between vulnerability and adaptation and between vulnerability and perception, along with a negative relationship between perception and adaptation. This suggests that being aware of the terms associated with climate change and having knowledge about climate change are insufficient for implementing adaptation measures; instead, the risk and importance placed on climate change, the vulnerability experienced with household flooding and drainage, and the expected threat of future sea level are the main factors that influence the adaptation decision. The results obtained in this study are beneficial to all, as adaptation requires a collective effort by stakeholders.

Keywords: adaptation, Bequia, multiple linear regression, structural equation model

Procedia PDF Downloads 463
2897 Early Impact Prediction and Key Factors Study of Artificial Intelligence Patents: A Method Based on LightGBM and Interpretable Machine Learning

Authors: Xingyu Gao, Qiang Wu

Abstract:

Patents play a crucial role in protecting innovation and intellectual property. Early prediction of the impact of artificial intelligence (AI) patents helps researchers and companies allocate resources and make better decisions. Understanding the key factors that influence patent impact can assist researchers in gaining a better understanding of the evolution of AI technology and innovation trends. Therefore, identifying highly impactful patents early and providing support for them holds immeasurable value in accelerating technological progress, reducing research and development costs, and mitigating market positioning risks. Despite the extensive research on AI patents, accurately predicting their early impact remains a challenge. Traditional methods often consider only single factors or simple combinations, failing to comprehensively and accurately reflect the actual impact of patents. This paper utilized the artificial intelligence patent database from the United States Patent and Trademark Office and the Len.org patent retrieval platform to obtain specific information on 35,708 AI patents. Using six machine learning models, namely Multiple Linear Regression, Random Forest Regression, XGBoost Regression, LightGBM Regression, Support Vector Machine Regression, and K-Nearest Neighbors Regression, and using early indicators of patents as features, the paper comprehensively predicted the impact of patents from three aspects: technical, social, and economic. These aspects include the technical leadership of patents, the number of citations they receive, and their shared value. The SHAP (Shapley Additive exPlanations) metric was used to explain the predictions of the best model, quantifying the contribution of each feature to the model's predictions. The experimental results on the AI patent dataset indicate that, for all three target variables, LightGBM regression shows the best predictive performance. 
Specifically, patent novelty has the greatest impact on predicting the technical impact of patents and has a positive effect. Additionally, the number of owners, the number of backward citations, and the number of independent claims are all crucial and have a positive influence on predicting technical impact. In predicting the social impact of patents, the number of applicants is considered the most critical input variable, but it has a negative impact on social impact. At the same time, the number of independent claims, the number of owners, and the number of backward citations are also important predictive factors, and they have a positive effect on social impact. For predicting the economic impact of patents, the number of independent claims is considered the most important factor and has a positive impact on economic impact. The number of owners, the number of sibling countries or regions, and the size of the extended patent family also have a positive influence on economic impact. The study primarily relies on data from the United States Patent and Trademark Office for artificial intelligence patents. Future research could consider more comprehensive data sources, including artificial intelligence patent data, from a global perspective. While the study takes into account various factors, there may still be other important features not considered. In the future, factors such as patent implementation and market applications may be considered as they could have an impact on the influence of patents.

Keywords: patent influence, interpretable machine learning, predictive models, SHAP

Procedia PDF Downloads 50
2896 Progressive Type-I Interval Censoring with Binomial Removal-Estimation and Its Properties

Authors: Sonal Budhiraja, Biswabrata Pradhan

Abstract:

This work considers statistical inference based on progressive Type-I interval censored data with random removal. The scheme of progressive Type-I interval censoring with random removal can be described as follows. Suppose n identical items are placed on test at time T0 = 0 and inspected at k pre-specified times T1 < T2 < . . . < Tk, where Tk is the scheduled termination time of the experiment. At inspection time Ti, Ri of the Si surviving units are randomly removed from the experiment. The removal follows a binomial distribution with parameters Si and pi for i = 1, . . . , k, with pk = 1. In this censoring scheme, the number of failures in each inspection interval and the number of randomly removed items at each pre-specified inspection time are observed. Asymptotic properties of the maximum likelihood estimators (MLEs) are established under some regularity conditions. A β-content γ-level tolerance interval (TI) is determined for the two-parameter Weibull lifetime model using the asymptotic properties of the MLEs. The minimum sample size required to achieve the desired β-content γ-level TI is determined. The performance of the MLEs and TI is studied via simulation.
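The censoring scheme described above can be simulated in a few lines. The following Python sketch is illustrative only and is not the authors' code: the function name and the parameter values in the usage lines (Weibull scale and shape, inspection times, removal probabilities) are assumptions chosen for demonstration.

```python
import random


def simulate_censoring(n, times, probs, scale, shape, rng):
    """Simulate one progressive Type-I interval censored sample.

    n      : number of items placed on test at time 0
    times  : inspection times T1 < ... < Tk
    probs  : removal probabilities p1..pk, with pk = 1
    scale, shape : Weibull lifetime parameters (assumed values)
    Returns (failures, removals): failures[i] is the number of failures in
    the i-th inspection interval, removals[i] the number of units removed
    at the i-th inspection time.
    """
    if len(times) != len(probs) or probs[-1] != 1.0:
        raise ValueError("need one probability per inspection, last one 1")
    alive = [rng.weibullvariate(scale, shape) for _ in range(n)]
    failures, removals = [], []
    for t, p in zip(times, probs):
        survivors = [x for x in alive if x > t]
        failures.append(len(alive) - len(survivors))
        # Binomial(S_i, p_i) draw as a sum of Bernoulli trials
        r = sum(rng.random() < p for _ in survivors)
        rng.shuffle(survivors)   # removed units are chosen at random
        alive = survivors[r:]    # drop the r removed units
        removals.append(r)
    return failures, removals


rng = random.Random(2024)  # fixed seed for reproducibility
d, r = simulate_censoring(30, [0.5, 1.0, 1.5], [0.25, 0.25, 1.0], 1.0, 2.0, rng)
print("failures per interval:", d)
print("removals per inspection:", r)
```

Because pk = 1, every unit either fails or is removed by Tk, so the failure and removal counts always sum to n.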

Keywords: asymptotic normality, consistency, regularity conditions, simulation study, tolerance interval

Procedia PDF Downloads 249
2895 An Assessment of Self-Perceived Health after the Death of a Spouse among the Elderly

Authors: Shu-Hsi Ho

Abstract:

Taiwan's population is aging, and the number of widowed elderly people is gradually rising. The issues facing the elderly after the death of a spouse therefore deserve attention. Hence, this study examines the impact of spousal death on the surviving spouse’s self-perceived health and mental health among the elderly in Taiwan. Cross-sectional data and ordered logistic regression models are used to investigate whether marital status is significantly associated with self-perceived health and mental health among widowed older Taiwanese. The results indicate that widowhood has significant negative effects on self-perceived health and mental health for both widows and widowers. Of the two groups, widows might be more likely to show worse mental health than widowers. The results support the belief that marriage provides effective resources for promoting self-perceived health and mental health, particularly for females. In addition, since the social welfare system in Taiwan is imperfect, the findings also suggest that family and social support are strongly associated with the self-perceived health and mental health of elderly widows and widowers.

Keywords: logistic regression models, self-perceived health, widow, widower

Procedia PDF Downloads 463
2894 Examining the Cognitive Abilities and Financial Literacy Among Street Entrepreneurs: Evidence From North-East, India

Authors: Aayushi Lyngwa, Bimal Kishore Sahoo

Abstract:

The study discusses the effect of cognitive ability and the level of education attained by tribal street entrepreneurs on their financial literacy. It is driven by the objective of examining the effect of cognitive ability on financial ability on the one hand and determining the effect of the same on financial literacy on the other. A field experiment was conducted on 203 tribal street vendors in the north-eastern Indian state of Mizoram. In this experiment, each question was assigned a score, yielding a math score (cognitive ability) and a financial score and debt score (financial ability). After that, categories for each of the variables, namely math category (from math score), financial category (from financial score) and debt category (from debt score), were generated to run the regression model. Since the dependent variable is ordinal, an ordered logit regression model was applied. The study shows that street vendors' cognitive and financial abilities are highly correlated. It therefore confirms that cognitive ability positively affects the financial literacy of street vendors through higher educational attainment. It is also found that, with respect to the type of street vendor, regular street vendors are more likely to have better cognitive abilities than temporary street vendors. Additionally, street vendors with greater cognitive and financial abilities earned higher monthly profits and practiced bookkeeping. The study draws particular attention to a group that is economically and socially marginalized in the Indian economy. Its findings contribute to understanding financial literacy in an understudied area and provide policy implications, through inclusive financial system solutions, for an economy limited to tribal street vendors.

Keywords: financial literacy, education, street entrepreneurs, tribals, cognitive ability, financial ability, ordered logit regression

Procedia PDF Downloads 110
2893 Experimental Design and Optimization of Diesel Oil Desulfurization Process by Adsorption Processes

Authors: M. Firoz Kalam, Wilfried Schuetz, Jan Hendrik Bredehoeft

Abstract:

The removal of thiophene sulfur compounds from diesel oil by a batch adsorption process using commercial powdered activated carbon was designed and optimized using a two-level factorial design method. This design analysis was used to determine the effects of the operating parameters governing the adsorption process, such as the amount of adsorbent, temperature and stirring time. The desulfurization efficiency was taken as the response (output) variable. Results showed that the stirring time had the largest effect on sulfur removal efficiency compared with the other operating parameters and their interactions under the experimental ranges studied. A regression model was generated to observe the closeness between predicted and experimental values. Three-dimensional plots and contour plots of the main factors were generated from the regression results to locate the optimal points.
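As a rough illustration of how main effects are read off a two-level factorial design like the one described above, the following Python sketch builds a full factorial design in coded levels (-1, +1) and computes each factor's main effect as the difference between the mean response at the high and low levels. The function names are hypothetical, and the responses are synthetic numbers generated from an assumed formula; they are not data from the study.

```python
from itertools import product


def full_factorial(n_factors):
    """All runs of a two-level full factorial design in coded units."""
    return list(product((-1, 1), repeat=n_factors))


def main_effects(design, responses):
    """Main effect of each factor: mean response at +1 minus mean at -1."""
    effects = []
    for j in range(len(design[0])):
        high = [y for run, y in zip(design, responses) if run[j] == 1]
        low = [y for run, y in zip(design, responses) if run[j] == -1]
        effects.append(sum(high) / len(high) - sum(low) / len(low))
    return effects


# Factors named after the abstract; the response formula is made up
# purely to exercise the code (stirring time given the largest weight).
factors = ["adsorbent amount", "temperature", "stirring time"]
design = full_factorial(len(factors))
synthetic = [60 + 3 * a + 1 * t + 6 * s for a, t, s in design]
for name, effect in zip(factors, main_effects(design, synthetic)):
    print(f"{name}: {effect:+.1f}")
```

In a real analysis the responses would be the measured desulfurization efficiencies for each run, and interaction effects would be estimated in the same way from products of the coded columns.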

Keywords: activated carbon, adsorptive desulfurization, factorial design, process optimization

Procedia PDF Downloads 162
2892 Development of a Turbulent Boundary Layer Wall-pressure Fluctuations Power Spectrum Model Using a Stepwise Regression Algorithm

Authors: Zachary Huffman, Joana Rocha

Abstract:

Wall-pressure fluctuations induced by the turbulent boundary layer (TBL) developed over aircraft are a significant source of aircraft cabin noise. Since the power spectral density (PSD) of these pressure fluctuations is directly correlated with the amount of sound radiated into the cabin, the development of accurate empirical models that predict the PSD has been an important ongoing research topic. The sound emitted can be represented by the pressure fluctuation term in the Reynolds-averaged Navier-Stokes (RANS) equations. Therefore, early TBL empirical models (including those from Lowson, Robertson, Chase, and Howe) were primarily derived by simplifying and solving the RANS equations for pressure fluctuation and adding appropriate scales. Most subsequent models (including the Goody, Efimtsov, Laganelli, Smol’yakov, and Rackl and Weston models) were derived by making modifications to these early models or from physical principles. Overall, these models have had varying levels of accuracy, but, in general, they are most accurate under the specific Reynolds and Mach numbers they were developed for, while being less accurate under other flow conditions. Despite this, recent research into the possibility of using alternative methods for deriving the models has been rather limited. More recent studies have demonstrated that an artificial neural network model was more accurate than traditional models and could be applied more generally, but the accuracy of other machine learning techniques has not been explored. In the current study, an original model is derived using a stepwise regression algorithm in the statistical programming language R and TBL wall-pressure fluctuation PSD data gathered at the Carleton University wind tunnel. The theoretical advantage of a stepwise regression approach is that it will automatically filter out redundant or uncorrelated input variables (through the process of feature selection), and it is computationally faster than machine learning.
The main disadvantage is the potential risk of overfitting. The accuracy of the developed model is assessed by comparing it to independently sourced datasets.

Keywords: aircraft noise, machine learning, power spectral density models, regression models, turbulent boundary layer wall-pressure fluctuations

Procedia PDF Downloads 135
2891 Assessment of the Impact of Traffic Safety Policy in Barcelona, 2010-2019

Authors: Lluís Bermúdez, Isabel Morillo

Abstract:

Road safety involves carrying out a determined and explicit policy to reduce accidents. In the city of Barcelona, the Local Road Safety Plan 2013-2018, in line with the framework established at the European and state level, specifies a series of preventive, corrective and technical measures with the priority objective of reducing the number of serious injuries and fatalities. In this work, based on data from the accidents managed by the local police during the period 2010-2019, an analysis is carried out to verify whether, and to what extent, the measures established in the Plan to reduce the accident rate have had an effect. The analysis focuses on the type of accident and the type of vehicles involved. Different count regression models have been fitted, from which it can be deduced that the number of serious and fatal victims of accidents in the city of Barcelona has been reduced as a result of the measures approved by the authorities.

Keywords: accident reduction, count regression models, road safety, urban traffic

Procedia PDF Downloads 133
2890 Effects of Video Games and Online Chat on Mathematics Performance in High School: An Approach of Multivariate Data Analysis

Authors: Lina Wu, Wenyi Lu, Ye Li

Abstract:

Treating boys who are heavy video game players and girls who are avid online chatters as emblematic of current adolescent culture, this data analysis project verifies the displacement effect of these activities in deteriorating mathematics performance. To evaluate the correlation or regression coefficients between playing video games or chatting online and mathematics performance in comparison with other factors, we use multivariate analysis techniques and take gender differences into account. We find that the most important reason for the negative sign of the displacement effect on mathematics performance is students’ poor academic background. The statistical analysis methods in this project could be applied to study internet users’ academic performance from high school through college.

Keywords: correlation coefficients, displacement effect, multivariate analysis technique, regression coefficients

Procedia PDF Downloads 364
2889 Understanding the Impact of Climate-Induced Rural-Urban Migration on the Technical Efficiency of Maize Production in Malawi

Authors: Innocent Pangapanga-Phiri, Eric Dada Mungatana

Abstract:

This study estimates the effect of climate-induced rural-urban migrants (RUM) on maize productivity. It uses panel data gathered by the National Statistics Office and the World Bank to understand the effect of RUM on the technical efficiency of maize production in rural Malawi. The study runs the two-stage Tobit regression to isolate the real effect of rural-urban migration on the technical efficiency of maize production. The results show that RUM significantly reduces the technical efficiency of maize production. However, the interaction of RUM and climate-smart agriculture has a positive and significant influence on the technical efficiency of maize production, suggesting the need for re-investing migrants’ remittances in agricultural activities.

Keywords: climate-smart agriculture, farm productivity, rural-urban migration, panel stochastic frontier models, two-stage Tobit regression

Procedia PDF Downloads 132
2888 A Regression Model for Predicting Sugar Crystal Size in a Fed-Batch Vacuum Evaporative Crystallizer

Authors: Sunday B. Alabi, Edikan P. Felix, Aniediong M. Umo

Abstract:

Crystal size distribution is of great importance in sugar factories. It determines the market value of granulated sugar and also influences the cost of production of sugar crystals. Typically, sugar is produced using a fed-batch vacuum evaporative crystallizer. The crystallization quality is examined by the crystal size distribution at the end of the process, which is quantified by two parameters: the average crystal size, expressed as the mean aperture (MA), and the width of the distribution, expressed as the coefficient of variation (CV). The lack of real-time measurement of the sugar crystal size hinders its feedback control and the eventual optimisation of the crystallization process. An attractive alternative is to use a soft sensor (model-based method) for online estimation of the sugar crystal size. Unfortunately, the available models for the sugar crystallization process are not suitable, as they do not contain variables that can be measured easily online. The main contribution of this paper is the development of a regression model for estimating the sugar crystal size as a function of input variables which are easy to measure online. This has the potential to provide real-time estimates of crystal size for its effective feedback control. Using 7 input variables, namely initial crystal size (L0), temperature (T), vacuum pressure (P), feed flowrate (Ff), steam flowrate (Fs), initial super-saturation (S0) and crystallization time (t), preliminary studies were carried out using Minitab 14 statistical software. Based on the existing sugar crystallizer models and the typical ranges of these 7 input variables, 128 datasets were obtained from a 2-level factorial experimental design. These datasets were used to obtain a simple but online-implementable 6-input crystal size model. The initial crystal size (L0) does not appear to play a significant role. The goodness of fit of the resulting regression model was evaluated.
The coefficient of determination, R², was 0.994, and the maximum absolute relative error (MARE) was 4.6%. The high R² (~1.0) and the reasonably low MARE indicate that the model is able to predict sugar crystal size accurately as a function of the 6 easy-to-measure online variables. Thus, the model can be used as a soft sensor to provide real-time estimates of sugar crystal size during the sugar crystallization process in a fed-batch vacuum evaporative crystallizer.
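For reference, a two-level full factorial design in 7 factors has 2^7 = 128 runs, which matches the 128 datasets mentioned above. The two goodness-of-fit measures quoted in the abstract can be computed as in the following Python sketch. The function names are hypothetical, the formulas are the standard definitions (the abstract does not spell out its exact MARE formula, so the one used here is an assumption), and the numbers in the usage lines are made up for illustration rather than taken from the paper.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def max_abs_relative_error(actual, predicted):
    """Largest |actual - predicted| / |actual| over all points.

    Assumes no actual value is zero; multiply by 100 for a percentage.
    """
    return max(abs((a - p) / a) for a, p in zip(actual, predicted))


# Made-up crystal sizes (arbitrary units), purely to exercise the code.
measured = [0.50, 0.62, 0.71, 0.85, 0.93]
estimated = [0.51, 0.60, 0.72, 0.86, 0.91]
print("R^2 :", round(r_squared(measured, estimated), 4))
print("MARE:", round(100 * max_abs_relative_error(measured, estimated), 2), "%")
```

R² summarises overall fit, while the maximum relative error bounds the worst single prediction, which is the more relevant figure when the model is used as a soft sensor in a control loop.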

Keywords: crystal size, regression model, soft sensor, sugar, vacuum evaporative crystallizer

Procedia PDF Downloads 208
2887 An Approach for Estimation in Hierarchical Clustered Data Applicable to Rare Diseases

Authors: Daniel C. Bonzo

Abstract:

Practical considerations lead to the use of units of analysis within subjects, e.g., bleeding episodes or treatment-related adverse events, in rare disease settings. This is coupled with data augmentation techniques such as extrapolation to enlarge the subject base. In general, one can think about extrapolation of data as extending information and conclusions from one estimand to another estimand. This approach induces hierarchical clustered data with varying cluster sizes. Extrapolation of clinical trial data is increasingly being accepted by regulatory agencies as a means of generating data in diverse situations during the drug development process. Under certain circumstances, data can be extrapolated to a different population, a different but related indication, and a different but similar product. We consider here the problem of estimation (point and interval) using a mixed-models approach under extrapolation. It is proposed that estimators (point and interval) be constructed using weighting schemes for the clusters, e.g., equally weighted and with weights proportional to cluster size. Simulated data generated under varying scenarios are then used to evaluate the performance of this approach. In conclusion, the evaluation results showed that the approach is a useful means of improving statistical inference in rare disease settings and thus aids not only signal detection but risk-benefit evaluation as well.
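The two weighting schemes named above can be contrasted with a simple point estimator. The Python sketch below is a deliberately simplified illustration, not the mixed-models estimator proposed by the author: it computes a weighted average of cluster means, and the function name, scheme labels and toy data are all assumptions.

```python
def weighted_cluster_mean(clusters, scheme="equal"):
    """Weighted average of per-cluster means.

    clusters : list of non-empty lists, one list of observations per subject
    scheme   : "equal" gives every cluster weight 1;
               "size" gives each cluster a weight proportional to its size
    """
    means = [sum(c) / len(c) for c in clusters]
    if scheme == "equal":
        weights = [1.0] * len(clusters)
    elif scheme == "size":
        weights = [float(len(c)) for c in clusters]
    else:
        raise ValueError("scheme must be 'equal' or 'size'")
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)


# Toy data: one subject with four episodes, one subject with a single episode.
episodes = [[1.0, 1.0, 1.0, 1.0], [3.0]]
print(weighted_cluster_mean(episodes, "equal"))
print(weighted_cluster_mean(episodes, "size"))
```

With equal weights each subject contributes equally regardless of how many episodes were recorded, whereas size-proportional weights reduce to the pooled mean over all episodes, so subjects with many events dominate the estimate.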

Keywords: clustered data, estimand, extrapolation, mixed model

Procedia PDF Downloads 136