Search results for: robust penalized regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4380

Search results for: robust penalized regression

4140 Using the Bootstrap for Problems Statistics

Authors: Brahim Boukabcha, Amar Rebbouh

Abstract:

The bootstrap method based on the idea of exploiting all the information provided by the initial sample, allows us to study the properties of estimators. In this article we will present a theoretical study on the different methods of bootstrapping and using the technique of re-sampling in statistics inference to calculate the standard error of means of an estimator and determining a confidence interval for an estimated parameter. We apply these methods tested in the regression models and Pareto model, giving the best approximations.

Keywords: bootstrap, error standard, bias, jackknife, mean, median, variance, confidence interval, regression models

Procedia PDF Downloads 354
4139 Enhancing Predictive Accuracy in Pharmaceutical Sales through an Ensemble Kernel Gaussian Process Regression Approach

Authors: Shahin Mirshekari, Mohammadreza Moradi, Hossein Jafari, Mehdi Jafari, Mohammad Ensaf

Abstract:

This research employs Gaussian Process Regression (GPR) with an ensemble kernel, integrating Exponential Squared, Revised Matern, and Rational Quadratic kernels to analyze pharmaceutical sales data. Bayesian optimization was used to identify optimal kernel weights: 0.76 for Exponential Squared, 0.21 for Revised Matern, and 0.13 for Rational Quadratic. The ensemble kernel demonstrated superior performance in predictive accuracy, achieving an R² score near 1.0, and significantly lower values in MSE, MAE, and RMSE. These findings highlight the efficacy of ensemble kernels in GPR for predictive analytics in complex pharmaceutical sales datasets.

Keywords: Gaussian process regression, ensemble kernels, bayesian optimization, pharmaceutical sales analysis, time series forecasting, data analysis

Procedia PDF Downloads 28
4138 Solutions of Fuzzy Transportation Problem Using Best Candidates Method and Different Ranking Techniques

Authors: M. S. Annie Christi

Abstract:

Transportation Problem (TP) is based on supply and demand of commodities transported from one source to the different destinations. Usual methods for finding solution of TPs are North-West Corner Rule, Least Cost Method Vogel’s Approximation Method etc. The transportation costs tend to vary at each time. We can use fuzzy numbers which would give solution according to this situation. In this study the Best Candidate Method (BCM) is applied. For ranking Centroid Ranking Technique (CRT) and Robust Ranking Technique have been adopted to transform the fuzzy TP and the above methods are applied to EDWARDS Vacuum Company, Crawley, in West Sussex in the United Kingdom. A Comparative study is also given. We see that the transportation cost can be minimized by the application of CRT under BCM.

Keywords: best candidate method, centroid ranking technique, fuzzy transportation problem, robust ranking technique, transportation problem

Procedia PDF Downloads 258
4137 A Meta Regression Analysis to Detect Price Premium Threshold for Eco-Labeled Seafood

Authors: Cristina Giosuè, Federica Biondo, Sergio Vitale

Abstract:

In the last years, the consumers' awareness for environmental concerns has been increasing, and seafood eco-labels are considered as a possible instrument to improve both seafood markets and sustainable fishing management. In this direction, the aim of this study was to carry out a meta-analysis on consumers’ willingness to pay (WTP) for eco-labeled wild seafood, by a meta-regression. Therefore, only papers published on ISI journals were searched on “Web of Knowledge” and “SciVerse Scopus” platforms, using the combinations of the following key words: seafood, ecolabel, eco-label, willingness, WTP and premium. The dataset was built considering: paper’s and survey’s codes, year of publication, first author’s nationality, species’ taxa and family, sample size, survey’s continent and country, data collection (where and how), gender and age of consumers, brand and ΔWTP. From analysis the interest on eco labeled seafood emerged clearly, in particular in developed countries. In general, consumers declared greater willingness to pay than that actually applied for eco-label products, with difference related to taxa and brand.

Keywords: eco label, meta regression, seafood, willingness to pay

Procedia PDF Downloads 94
4136 Estimating Bridge Deterioration for Small Data Sets Using Regression and Markov Models

Authors: Yina F. Muñoz, Alexander Paz, Hanns De La Fuente-Mella, Joaquin V. Fariña, Guilherme M. Sales

Abstract:

The primary approach for estimating bridge deterioration uses Markov-chain models and regression analysis. Traditional Markov models have problems in estimating the required transition probabilities when a small sample size is used. Often, reliable bridge data have not been taken over large periods, thus large data sets may not be available. This study presents an important change to the traditional approach by using the Small Data Method to estimate transition probabilities. The results illustrate that the Small Data Method and traditional approach both provide similar estimates; however, the former method provides results that are more conservative. That is, Small Data Method provided slightly lower than expected bridge condition ratings compared with the traditional approach. Considering that bridges are critical infrastructures, the Small Data Method, which uses more information and provides more conservative estimates, may be more appropriate when the available sample size is small. In addition, regression analysis was used to calculate bridge deterioration. Condition ratings were determined for bridge groups, and the best regression model was selected for each group. The results obtained were very similar to those obtained when using Markov chains; however, it is desirable to use more data for better results.

Keywords: concrete bridges, deterioration, Markov chains, probability matrix

Procedia PDF Downloads 316
4135 Sliding Mode Power System Stabilizer for Synchronous Generator Stability Improvement

Authors: J. Ritonja, R. Brezovnik, M. Petrun, B. Polajžer

Abstract:

Many modern synchronous generators in power systems are extremely weakly damped. The reasons are cost optimization of the machine building and introduction of the additional control equipment into power systems. Oscillations of the synchronous generators and related stability problems of the power systems are harmful and can lead to failures in operation and to damages. The only useful solution to increase damping of the unwanted oscillations represents the implementation of the power system stabilizers. Power system stabilizers generate the additional control signal which changes synchronous generator field excitation voltage. Modern power system stabilizers are integrated into static excitation systems of the synchronous generators. Available commercial power system stabilizers are based on linear control theory. Due to the nonlinear dynamics of the synchronous generator, current stabilizers do not assure optimal damping of the synchronous generator’s oscillations in the entire operating range. For that reason the use of the robust power system stabilizers which are convenient for the entire operating range is reasonable. There are numerous robust techniques applicable for the power system stabilizers. In this paper the use of sliding mode control for synchronous generator stability improvement is studied. On the basis of the sliding mode theory, the robust power system stabilizer was developed. The main advantages of the sliding mode controller are simple realization of the control algorithm, robustness to parameter variations and elimination of disturbances. The advantage of the proposed sliding mode controller against conventional linear controller was tested for damping of the synchronous generator oscillations in the entire operating range. Obtained results show the improved damping in the entire operating range of the synchronous generator and the increase of the power system stability. The proposed study contributes to the progress in the development of the advanced stabilizer, which will replace conventional linear stabilizers and improve damping of the synchronous generators.

Keywords: control theory, power system stabilizer, robust control, sliding mode control, stability, synchronous generator

Procedia PDF Downloads 199
4134 Optical-Based Lane-Assist System for Rowing Boats

Authors: Stephen Tullis, M. David DiDonato, Hong Sung Park

Abstract:

Rowing boats (shells) are often steered by a small rudder operated by one of the backward-facing rowers; the attention required of that athlete then slightly decreases the power that that athlete can provide. Reducing the steering distraction would then increase the overall boat speed. Races are straight 2000 m courses with each boat in a 13.5 m wide lane marked by small (~15 cm) widely-spaced (~10 m) buoys, and the boat trajectory is affected by both cross-currents and winds. An optical buoy recognition and tracking system has been developed that provides the boat’s location and orientation with respect to the lane edges. This information is provided to the steering athlete as either: a simple overlay on a video display, or fed to a simplified autopilot system giving steering directions to the athlete or directly controlling the rudder. The system is then effectively a “lane-assist” device but with small, widely-spaced lane markers viewed from a very shallow angle due to constraints on camera height. The image is captured with a lightweight 1080p webcam, and most of the image analysis is done in OpenCV. The colour RGB-image is converted to a grayscale using the difference of the red and blue channels, which provides good contrast between the red/yellow buoys and the water, sky, land background and white reflections and noise. Buoy detection is done with thresholding within a tight mask applied to the image. Robust linear regression using Tukey’s biweight estimator of the previously detected buoy locations is used to develop the mask; this avoids the false detection of noise such as waves (reflections) and, in particular, buoys in other lanes. The robust regression also provides the current lane edges in the camera frame that are used to calculate the displacement of the boat from the lane centre (lane location), and its yaw angle. The interception of the detected lane edges provides a lane vanishing point, and yaw angle can be calculated simply based on the displacement of this vanishing point from the camera axis and the image plane distance. Lane location is simply based on the lateral displacement of the vanishing point from any horizontal cut through the lane edges. The boat lane position and yaw are currently fed what is essentially a stripped down marine auto-pilot system. Currently, only the lane location is used in a PID controller of a rudder actuator with integrator anti-windup to deal with saturation of the rudder angle. Low Kp and Kd values decrease unnecessarily fast return to lane centrelines and response to noise, and limiters can be used to avoid lane departure and disqualification. Yaw is not used as a control input, as cross-winds and currents can cause a straight course with considerable yaw or crab angle. Mapping of the controller with rudder angle “overall effectiveness” has not been finalized - very large rudder angles stall and have decreased turning moments, but at less extreme angles the increased rudder drag slows the boat and upsets boat balance. The full system has many features similar to automotive lane-assist systems, but with the added constraints of the lane markers, camera positioning, control response and noise increasing the challenge.

Keywords: auto-pilot, lane-assist, marine, optical, rowing

Procedia PDF Downloads 104
4133 A Robust Software for Advanced Analysis of Space Steel Frames

Authors: Viet-Hung Truong, Seung-Eock Kim

Abstract:

This paper presents a robust software package for practical advanced analysis of space steel framed structures. The pre- and post-processors of the presented software package are coded in the C++ programming language while the solver is written by using the FORTRAN programming language. A user-friendly graphical interface of the presented software is developed to facilitate the modeling process and result interpretation of the problem. The solver employs the stability functions for capturing the second-order effects to minimize modeling and computational time. Both the plastic-hinge and fiber-hinge beam-column elements are available in the presented software. The generalized displacement control method is adopted to solve the nonlinear equilibrium equations.

Keywords: advanced analysis, beam-column, fiber-hinge, plastic hinge, steel frame

Procedia PDF Downloads 283
4132 Robust and Transparent Spread Spectrum Audio Watermarking

Authors: Ali Akbar Attari, Ali Asghar Beheshti Shirazi

Abstract:

In this paper, we propose a blind and robust audio watermarking scheme based on spread spectrum in Discrete Wavelet Transform (DWT) domain. Watermarks are embedded in the low-frequency coefficients, which is less audible. The key idea is dividing the audio signal into small frames, and magnitude of the 6th level of DWT approximation coefficients is modifying based upon the Direct Sequence Spread Spectrum (DSSS) technique. Also, the psychoacoustic model for enhancing in imperceptibility, as well as Savitsky-Golay filter for increasing accuracy in extraction, is used. The experimental results illustrate high robustness against most common attacks, i.e. Gaussian noise addition, Low pass filter, Resampling, Requantizing, MP3 compression, without significant perceptual distortion (ODG is higher than -1). The proposed scheme has about 83 bps data payload.

Keywords: audio watermarking, spread spectrum, discrete wavelet transform, psychoacoustic, Savitsky-Golay filter

Procedia PDF Downloads 171
4131 A New Method to Estimate the Low Income Proportion: Monte Carlo Simulations

Authors: Encarnación Álvarez, Rosa M. García-Fernández, Juan F. Muñoz

Abstract:

Estimation of a proportion has many applications in economics and social studies. A common application is the estimation of the low income proportion, which gives the proportion of people classified as poor into a population. In this paper, we present this poverty indicator and propose to use the logistic regression estimator for the problem of estimating the low income proportion. Various sampling designs are presented. Assuming a real data set obtained from the European Survey on Income and Living Conditions, Monte Carlo simulation studies are carried out to analyze the empirical performance of the logistic regression estimator under the various sampling designs considered in this paper. Results derived from Monte Carlo simulation studies indicate that the logistic regression estimator can be more accurate than the customary estimator under the various sampling designs considered in this paper. The stratified sampling design can also provide more accurate results.

Keywords: poverty line, risk of poverty, auxiliary variable, ratio method

Procedia PDF Downloads 428
4130 Analysing Causal Effect of London Cycle Superhighways on Traffic Congestion

Authors: Prajamitra Bhuyan

Abstract:

Transport operators have a range of intervention options available to improve or enhance their networks. But often such interventions are made in the absence of sound evidence on what outcomes may result. Cycling superhighways were promoted as a sustainable and healthy travel mode which aims to cut traffic congestion. The estimation of the impacts of the cycle superhighways on congestion is complicated due to the non-random assignment of such intervention over the transport network. In this paper, we analyse the causal effect of cycle superhighways utilising pre-innervation and post-intervention information on traffic and road characteristics along with socio-economic factors. We propose a modeling framework based on the propensity score and outcome regression model. The method is also extended to doubly robust set-up. Simulation results show the superiority of the performance of the proposed method over existing competitors. The method is applied to analyse a real dataset on the London transport network, and the result would help effective decision making to improve network performance.

Keywords: average treatment effect, confounder, difference-in-difference, intelligent transportation system, potential outcome

Procedia PDF Downloads 207
4129 Economic Impacts of Sanctuary and Immigration and Customs Enforcement Policies Inclusive and Exclusive Institutions

Authors: Alexander David Natanson

Abstract:

This paper focuses on the effect of Sanctuary and Immigration and Customs Enforcement (ICE) policies on local economies. "Sanctuary cities" refers to municipal jurisdictions that limit their cooperation with the federal government's efforts to enforce immigration. Using county-level data from the American Community Survey and ICE data on economic indicators from 2006 to 2018, this study isolates the effects of local immigration policies on U.S. counties. The investigation is accomplished by simultaneously studying the policies' effects in counties where immigrants' families are persecuted via collaboration with Immigration and Customs Enforcement (ICE), in contrast to counties that provide protections. The analysis includes a difference-in-difference & two-way fixed effect model. Results are robust to nearest-neighbor matching, after the random assignment of treatment, after running estimations using different cutoffs for immigration policies, and with a regression discontinuity model comparing bordering counties with opposite policies. Results are also robust after restricting the data to a single-year policy adoption, using the Sun and Abraham estimator, and with event-study estimation to deal with the staggered treatment issue. In addition, the study reverses the estimation to understand what drives the decision to choose policies to detect the presence of reverse causality biases in the estimated policy impact on economic factors. The evidence demonstrates that providing protections to undocumented immigrants increases economic activity. The estimates show gains in per capita income ranging from 3.1 to 7.2, median wages between 1.7 to 2.6, and GDP between 2.4 to 4.1 percent. Regarding labor, sanctuary counties saw increases in total employment between 2.3 to 4 percent, and the unemployment rate declined from 12 to 17 percent. The data further shows that ICE policies have no statistically significant effects on income, median wages, or GDP but adverse effects on total employment, with declines from 1 to 2 percent, mostly in rural counties, and an increase in unemployment of around 7 percent in urban counties. In addition, results show a decline in the foreign-born population in ICE counties but no changes in sanctuary counties. The study also finds similar results for sanctuary counties when separating the data between urban, rural, educational attainment, gender, ethnic groups, economic quintiles, and the number of business establishments. The takeaway from this study is that institutional inclusion creates the dynamic nature of an economy, as inclusion allows for economic expansion due to the extension of fundamental freedoms to newcomers. Inclusive policies show positive effects on economic outcomes with no evident increase in population. To make sense of these results, the hypothesis and theoretical model propose that inclusive immigration policies play an essential role in conditioning the effect of immigration by decreasing uncertainties and constraints for immigrants' interaction in their communities, decreasing the cost from fear of deportation or the constant fear of criminalization and optimize their human capital.

Keywords: inclusive and exclusive institutions, post matching, fixed effect, time trend, regression discontinuity, difference-in-difference, randomization inference and sun, Abraham estimator

Procedia PDF Downloads 57
4128 An Overbooking Model for Car Rental Service with Different Types of Cars

Authors: Naragain Phumchusri, Kittitach Pongpairoj

Abstract:

Overbooking is a very useful revenue management technique that could help reduce costs caused by either undersales or oversales. In this paper, we propose an overbooking model for two types of cars that can minimize the total cost for car rental service. With two types of cars, there is an upgrade possibility for lower type to upper type. This makes the model more complex than one type of cars scenario. We have found that convexity can be proved in this case. Sensitivity analysis of the parameters is conducted to observe the effects of relevant parameters on the optimal solution. Model simplification is proposed using multiple linear regression analysis, which can help estimate the optimal overbooking level using appropriate independent variables. The results show that the overbooking level from multiple linear regression model is relatively close to the optimal solution (with the adjusted R-squared value of at least 72.8%). To evaluate the performance of the proposed model, the total cost was compared with the case where the decision maker uses a naïve method for the overbooking level. It was found that the total cost from optimal solution is only 0.5 to 1 percent (on average) lower than the cost from regression model, while it is approximately 67% lower than the cost obtained by the naïve method. It indicates that our proposed simplification method using regression analysis can effectively perform in estimating the overbooking level.

Keywords: overbooking, car rental industry, revenue management, stochastic model

Procedia PDF Downloads 146
4127 Coverage Probability Analysis of WiMAX Network under Additive White Gaussian Noise and Predicted Empirical Path Loss Model

Authors: Chaudhuri Manoj Kumar Swain, Susmita Das

Abstract:

This paper explores a detailed procedure of predicting a path loss (PL) model and its application in estimating the coverage probability in a WiMAX network. For this a hybrid approach is followed in predicting an empirical PL model of a 2.65 GHz WiMAX network deployed in a suburban environment. Data collection, statistical analysis, and regression analysis are the phases of operations incorporated in this approach and the importance of each of these phases has been discussed properly. The procedure of collecting data such as received signal strength indicator (RSSI) through experimental set up is demonstrated. From the collected data set, empirical PL and RSSI models are predicted with regression technique. Furthermore, with the aid of the predicted PL model, essential parameters such as PL exponent as well as the coverage probability of the network are evaluated. This research work may assist in the process of deployment and optimisation of any cellular network significantly.

Keywords: WiMAX, RSSI, path loss, coverage probability, regression analysis

Procedia PDF Downloads 140
4126 The Factors Of Supply Chain Collaboration

Authors: Ghada Soltane

Abstract:

The objective of this study was to identify factors impacting supply chain collaboration. a quantitative study was carried out on a sample of 84 Tunisian industrial companies. To verify the research hypotheses and test the direct effect of these factors on supply chain collaboration a multiple regression method was used using SPSS 26 software. The results show that there are four factors direct effects that affect supply chain collaboration in a meaningful and positive way, including: trust, engagement, information sharing and information quality

Keywords: supply chain collaboration, factors of collaboration, principal component analysis, multiple regression

Procedia PDF Downloads 0
4125 Regression Analysis in Estimating Stream-Flow and the Effect of Hierarchical Clustering Analysis: A Case Study in Euphrates-Tigris Basin

Authors: Goksel Ezgi Guzey, Bihrat Onoz

Abstract:

The scarcity of streamflow gauging stations and the increasing effects of global warming cause designing water management systems to be very difficult. This study is a significant contribution to assessing regional regression models for estimating streamflow. In this study, simulated meteorological data was related to the observed streamflow data from 1971 to 2020 for 33 stream gauging stations of the Euphrates-Tigris Basin. Ordinary least squares regression was used to predict flow for 2020-2100 with the simulated meteorological data. CORDEX- EURO and CORDEX-MENA domains were used with 0.11 and 0.22 grids, respectively, to estimate climate conditions under certain climate scenarios. Twelve meteorological variables simulated by two regional climate models, RCA4 and RegCM4, were used as independent variables in the ordinary least squares regression, where the observed streamflow was the dependent variable. The variability of streamflow was then calculated with 5-6 meteorological variables and watershed characteristics such as area and height prior to the application. Of the regression analysis of 31 stream gauging stations' data, the stations were subjected to a clustering analysis, which grouped the stations in two clusters in terms of their hydrometeorological properties. Two streamflow equations were found for the two clusters of stream gauging stations for every domain and every regional climate model, which increased the efficiency of streamflow estimation by a range of 10-15% for all the models. This study underlines the importance of homogeneity of a region in estimating streamflow not only in terms of the geographical location but also in terms of the meteorological characteristics of that region.

Keywords: hydrology, streamflow estimation, climate change, hydrologic modeling, HBV, hydropower

Procedia PDF Downloads 95
4124 DWT-SATS Based Detection of Image Region Cloning

Authors: Michael Zimba

Abstract:

A duplicated image region may be subjected to a number of attacks such as noise addition, compression, reflection, rotation, and scaling with the intention of either merely mating it to its targeted neighborhood or preventing its detection. In this paper, we present an effective and robust method of detecting duplicated regions inclusive of those affected by the various attacks. In order to reduce the dimension of the image, the proposed algorithm firstly performs discrete wavelet transform, DWT, of a suspicious image. However, unlike most existing copy move image forgery (CMIF) detection algorithms operating in the DWT domain which extract only the low frequency sub-band of the DWT of the suspicious image thereby leaving valuable information in the other three sub-bands, the proposed algorithm simultaneously extracts features from all the four sub-bands. The extracted features are not only more accurate representation of image regions but also robust to additive noise, JPEG compression, and affine transformation. Furthermore, principal component analysis-eigenvalue decomposition, PCA-EVD, is applied to reduce the dimension of the features. The extracted features are then sorted using the more computationally efficient Radix Sort algorithm. Finally, same affine transformation selection, SATS, a duplication verification method, is applied to detect duplicated regions. The proposed algorithm is not only fast but also more robust to attacks compared to the related CMIF detection algorithms. The experimental results show high detection rates.

Keywords: affine transformation, discrete wavelet transform, radix sort, SATS

Procedia PDF Downloads 201
4123 Robust Heart Rate Estimation from Multiple Cardiovascular and Non-Cardiovascular Physiological Signals Using Signal Quality Indices and Kalman Filter

Authors: Shalini Rankawat, Mansi Rankawat, Rahul Dubey, Mazad Zaveri

Abstract:

Physiological signals such as electrocardiogram (ECG) and arterial blood pressure (ABP) in the intensive care unit (ICU) are often seriously corrupted by noise, artifacts, and missing data, which lead to errors in the estimation of heart rate (HR) and incidences of false alarm from ICU monitors. Clinical support in ICU requires most reliable heart rate estimation. Cardiac activity, because of its relatively high electrical energy, may introduce artifacts in Electroencephalogram (EEG), Electrooculogram (EOG), and Electromyogram (EMG) recordings. This paper presents a robust heart rate estimation method by detection of R-peaks of ECG artifacts in EEG, EMG & EOG signals, using energy-based function and a novel Signal Quality Index (SQI) assessment technique. SQIs of physiological signals (EEG, EMG, & EOG) were obtained by correlation of nonlinear energy operator (teager energy) of these signals with either ECG or ABP signal. HR is estimated from ECG, ABP, EEG, EMG, and EOG signals from separate Kalman filter based upon individual SQIs. Data fusion of each HR estimate was then performed by weighing each estimate by the Kalman filters’ SQI modified innovations. The fused signal HR estimate is more accurate and robust than any of the individual HR estimate. This method was evaluated on MIMIC II data base of PhysioNet from bedside monitors of ICU patients. The method provides an accurate HR estimate even in the presence of noise and artifacts.

Keywords: ECG, ABP, EEG, EMG, EOG, ECG artifacts, Teager-Kaiser energy, heart rate, signal quality index, Kalman filter, data fusion

Procedia PDF Downloads 670
4122 Further Analysis of Global Robust Stability of Neural Networks with Multiple Time Delays

Authors: Sabri Arik

Abstract:

In this paper, we study the global asymptotic robust stability of delayed neural networks with norm-bounded uncertainties. By employing the Lyapunov stability theory and Homeomorphic mapping theorem, we derive some new types of sufficient conditions ensuring the existence, uniqueness and global asymptotic stability of the equilibrium point for the class of neural networks with discrete time delays under parameter uncertainties and with respect to continuous and slopebounded activation functions. An important aspect of our results is their low computational complexity as the reported results can be verified by checking some properties symmetric matrices associated with the uncertainty sets of network parameters. The obtained results are shown to be generalization of some of the previously published corresponding results. Some comparative numerical examples are also constructed to compare our results with some closely related existing literature results.

Keywords: neural networks, delayed systems, lyapunov functionals, stability analysis

Procedia PDF Downloads 499
4121 The Impact of Governance on Happiness: Evidence from Quantile Regressions

Authors: Chiung-Ju Huang

Abstract:

This study utilizes the quantile regression analysis to examine the impact of governance (including democratic quality and technical quality) on happiness in 101 countries worldwide, classified as “developed countries” and “developing countries”. The empirical results show that the impact of democratic quality and technical quality on happiness is significantly positive for “developed countries”, while is insignificant for “developing countries”. The results suggest that the authorities in developed countries can enhance the level of individual happiness by means of improving the democracy quality and technical quality. However, for developing countries, promoting the quality of governance in order to enhance the level of happiness may not be effective. Policy makers in developed countries may pay more attention on increasing real GDP per capita instead of promoting the quality of governance to enhance individual happiness.

Keywords: governance, happiness, multiple regression, quantile regression

Procedia PDF Downloads 249
4120 Breast Cancer Mortality and Comorbidities in Portugal: A Predictive Model Built with Real World Data

Authors: Cecília M. Antão, Paulo Jorge Nogueira

Abstract:

Breast cancer (BC) is the first cause of cancer mortality among Portuguese women. This retrospective observational study aimed at identifying comorbidities associated with BC female patients admitted to Portuguese public hospitals (2010-2018), investigating the effect of comorbidities on BC mortality rate, and building a predictive model using logistic regression. Results showed that the BC mortality in Portugal decreased in this period and reached 4.37% in 2018. Adjusted odds ratio indicated that secondary malignant neoplasms of liver, of bone and bone marrow, congestive heart failure, and diabetes were associated with an increased chance of dying from breast cancer. Although the Lisbon district (the most populated area) accounted for the largest percentage of BC patients, the logistic regression model showed that, besides patient’s age, being resident in Bragança, Castelo Branco, or Porto districts was directly associated with an increase of the mortality rate.

Keywords: breast cancer, comorbidities, logistic regression, adjusted odds ratio

Procedia PDF Downloads 56
4119 Assessing Relationships between Glandularity and Gray Level by Using Breast Phantoms

Authors: Yun-Xuan Tang, Pei-Yuan Liu, Kun-Mu Lu, Min-Tsung Tseng, Liang-Kuang Chen, Yuh-Feng Tsai, Ching-Wen Lee, Jay Wu

Abstract:

Breast cancer is predominant of malignant tumors in females. The increase in the glandular density increases the risk of breast cancer. BI-RADS is a frequently used density indicator in mammography; however, it significantly overestimates the glandularity. Therefore, it is very important to accurately and quantitatively assess the glandularity by mammography. In this study, 20%, 30% and 50% glandularity phantoms were exposed using a mammography machine at 28, 30 and 31 kVp, and 30, 55, 80 and 105 mAs, respectively. The regions of interest (ROIs) were drawn to assess the gray level. The relationship between the glandularity and gray level under various compression thicknesses, kVp, and mAs was established by the multivariable linear regression. A phantom verification was performed with automatic exposure control (AEC). The regression equation was obtained with an R-square value of 0.928. The average gray levels of the verification phantom were 8708, 8660 and 8434 for 0.952, 0.963 and 0.985 g/cm3, respectively. The percent differences of glandularity to the regression equation were 3.24%, 2.75% and 13.7%. We concluded that the proposed method could be clinically applied in mammography to improve the glandularity estimation and further increase the importance of breast cancer screening.

Keywords: mammography, glandularity, gray value, BI-RADS

Procedia PDF Downloads 464
4118 An Analysis of the Regression Hypothesis from a Shona Broca’s Aphasci Perspective

Authors: Esther Mafunda, Simbarashe Muparangi

Abstract:

The present paper tests the applicability of the Regression Hypothesis on the pathological language dissolution of a Shona male adult with Broca’s aphasia. It particularly assesses the prediction of the Regression Hypothesis, which states that the process according to which language is forgotten will be the reversal of the process according to which it will be acquired. The main aim of the paper is to find out whether mirror symmetries between L1 acquisition and L1 dissolution of tense in Shona and, if so, what might cause these regression patterns. The paper also sought to highlight the practical contributions that Linguistic theory can make to solving language-related problems. Data was collected from a 46-year-old male adult with Broca’s aphasia who was receiving speech therapy at St Giles Rehabilitation Centre in Harare, Zimbabwe. The primary data elicitation method was experimental, using the probe technique. The TART (Test for Assessing Reference Time) Shona version in the form of sequencing pictures was used to access tense by Broca’s aphasic and 3.5-year-old child. Using the SPSS (Statistical Package for Social Studies) and Excel analysis, it was established that the use of the future tense was impaired in Shona Broca’s aphasic whilst the present and past tense was intact. However, though the past tense was intact in the male adult with Broca’s aphasic, a reference to the remote past was made. The use of the future tense was also found to be difficult for the 3,5-year-old speaking child. No difficulties were encountered in using the present and past tenses. This means that mirror symmetries were found between L1 acquisition and L1 dissolution of tense in Shona. On the basis of the results of this research, it can be concluded that the use of tense in a Shona adult with Broca’s aphasia supports the Regression Hypothesis. The findings of this study are important in terms of speech therapy in the context of Zimbabwe. The study also contributes to Bantu linguistics in general and to Shona linguistics in particular. Further studies could also be done focusing on the rest of the Bantu language varieties in terms of aphasia.

Keywords: Broca’s Aphasia, regression hypothesis, Shona, language dissolution

Procedia PDF Downloads 60
4117 Data-Driven Dynamic Overbooking Model for Tour Operators

Authors: Kannapha Amaruchkul

Abstract:

We formulate a dynamic overbooking model for a tour operator, in which most reservations contain at least two people. The cancellation rate and the timing of the cancellation may depend on the group size. We propose two overbooking policies, namely economic- and service-based. In an economic-based policy, we want to minimize the expected oversold and underused cost, whereas, in a service-based policy, we ensure that the probability of an oversold situation does not exceed the pre-specified threshold. To illustrate the applicability of our approach, we use tour package data in 2016-2018 from a tour operator in Thailand to build a data-driven robust optimization model, and we tested the proposed overbooking policy in 2019. We also compare the data-driven approach to the conventional approach of fitting data into a probability distribution.

Keywords: applied stochastic model, data-driven robust optimization, overbooking, revenue management, tour operator

Procedia PDF Downloads 104
4116 Apricot Insurance Portfolio Risk

Authors: Kasirga Yildirak, Ismail Gur

Abstract:

We propose a model to measure hail risk of an Agricultural Insurance portfolio. Hail is one of the major catastrophic event that causes big amount of loss to an insurer. Moreover, it is very hard to predict due to its strange atmospheric characteristics. We make use of parcel based claims data on apricot damage collected by the Turkish Agricultural Insurance Pool (TARSIM). As our ultimate aim is to compute the loadings assigned to specific parcels, we build a portfolio risk model that makes use of PD and the severity of the exposures. PD is computed by Spherical-Linear and Circular –Linear regression models as the data carries coordinate information and seasonality. Severity is mapped into integer brackets so that Probability Generation Function could be employed. Individual regressions are run on each clusters estimated on different criteria. Loss distribution is constructed by Panjer Recursion technique. We also show that one risk-one crop model can easily be extended to the multi risk–multi crop model by assuming conditional independency.

Keywords: hail insurance, spherical regression, circular regression, spherical clustering

Procedia PDF Downloads 228
4115 Enhancing the Interpretation of Group-Level Diagnostic Results from Cognitive Diagnostic Assessment: Application of Quantile Regression and Cluster Analysis

Authors: Wenbo Du, Xiaomei Ma

Abstract:

With the empowerment of Cognitive Diagnostic Assessment (CDA), various domains of language testing and assessment have been investigated to dig out more diagnostic information. What is noticeable is that most of the extant empirical CDA-based research puts much emphasis on individual-level diagnostic purpose with very few concerned about learners’ group-level performance. Even though the personalized diagnostic feedback is the unique feature that differentiates CDA from other assessment tools, group-level diagnostic information cannot be overlooked in that it might be more practical in classroom setting. Additionally, the group-level diagnostic information obtained via current CDA always results in a “flat pattern”, that is, the mastery/non-mastery of all tested skills accounts for the two highest proportion. In that case, the outcome does not bring too much benefits than the original total score. To address these issues, the present study attempts to apply cluster analysis for group classification and quantile regression analysis to pinpoint learners’ performance at different proficiency levels (beginner, intermediate and advanced) thus to enhance the interpretation of the CDA results extracted from a group of EFL learners’ reading performance on a diagnostic reading test designed by PELDiaG research team from a key university in China. The results show that EM method in cluster analysis yield more appropriate classification results than that of CDA, and quantile regression analysis does picture more insightful characteristics of learners with different reading proficiencies. The findings are helpful and practical for instructors to refine EFL reading curriculum and instructional plan tailored based on the group classification results and quantile regression analysis. Meanwhile, these innovative statistical methods could also make up the deficiencies of CDA and push forward the development of language testing and assessment in the future.

Keywords: cognitive diagnostic assessment, diagnostic feedback, EFL reading, quantile regression

Procedia PDF Downloads 120
4114 Machine Learning Framework: Competitive Intelligence and Key Drivers Identification of Market Share Trends among Healthcare Facilities

Authors: Anudeep Appe, Bhanu Poluparthi, Lakshmi Kasivajjula, Udai Mv, Sobha Bagadi, Punya Modi, Aditya Singh, Hemanth Gunupudi, Spenser Troiano, Jeff Paul, Justin Stovall, Justin Yamamoto

Abstract:

The necessity of data-driven decisions in healthcare strategy formulation is rapidly increasing. A reliable framework which helps identify factors impacting a healthcare provider facility or a hospital (from here on termed as facility) market share is of key importance. This pilot study aims at developing a data-driven machine learning-regression framework which aids strategists in formulating key decisions to improve the facility’s market share which in turn impacts in improving the quality of healthcare services. The US (United States) healthcare business is chosen for the study, and the data spanning 60 key facilities in Washington State and about 3 years of historical data is considered. In the current analysis, market share is termed as the ratio of the facility’s encounters to the total encounters among the group of potential competitor facilities. The current study proposes a two-pronged approach of competitor identification and regression approach to evaluate and predict market share, respectively. Leveraged model agnostic technique, SHAP, to quantify the relative importance of features impacting the market share. Typical techniques in literature to quantify the degree of competitiveness among facilities use an empirical method to calculate a competitive factor to interpret the severity of competition. The proposed method identifies a pool of competitors, develops Directed Acyclic Graphs (DAGs) and feature level word vectors, and evaluates the key connected components at the facility level. This technique is robust since its data-driven, which minimizes the bias from empirical techniques. The DAGs factor in partial correlations at various segregations and key demographics of facilities along with a placeholder to factor in various business rules (for ex. quantifying the patient exchanges, provider references, and sister facilities). Identified are the multiple groups of competitors among facilities. Leveraging the competitors' identified developed and fine-tuned Random Forest Regression model to predict the market share. To identify key drivers of market share at an overall level, permutation feature importance of the attributes was calculated. For relative quantification of features at a facility level, incorporated SHAP (SHapley Additive exPlanations), a model agnostic explainer. This helped to identify and rank the attributes at each facility which impacts the market share. This approach proposes an amalgamation of the two popular and efficient modeling practices, viz., machine learning with graphs and tree-based regression techniques to reduce the bias. With these, we helped to drive strategic business decisions.

Keywords: competition, DAGs, facility, healthcare, machine learning, market share, random forest, SHAP

Procedia PDF Downloads 55
4113 Developing an Advanced Algorithm Capable of Classifying News, Articles and Other Textual Documents Using Text Mining Techniques

Authors: R. B. Knudsen, O. T. Rasmussen, R. A. Alphinas

Abstract:

The reason for conducting this research is to develop an algorithm that is capable of classifying news articles from the automobile industry, according to the competitive actions that they entail, with the use of Text Mining (TM) methods. It is needed to test how to properly preprocess the data for this research by preparing pipelines which fits each algorithm the best. The pipelines are tested along with nine different classification algorithms in the realm of regression, support vector machines, and neural networks. Preliminary testing for identifying the optimal pipelines and algorithms resulted in the selection of two algorithms with two different pipelines. The two algorithms are Logistic Regression (LR) and Artificial Neural Network (ANN). These algorithms are optimized further, where several parameters of each algorithm are tested. The best result is achieved with the ANN. The final model yields an accuracy of 0.79, a precision of 0.80, a recall of 0.78, and an F1 score of 0.76. By removing three of the classes that created noise, the final algorithm is capable of reaching an accuracy of 94%.

Keywords: Artificial Neural network, Competitive dynamics, Logistic Regression, Text classification, Text mining

Procedia PDF Downloads 93
4112 Multi-Point Dieless Forming Product Defect Reduction Using Reliability-Based Robust Process Optimization

Authors: Misganaw Abebe Baye, Ji-Woo Park, Beom-Soo Kang

Abstract:

The product quality of multi-point dieless forming (MDF) is identified to be dependent on the process parameters. Moreover, a certain variation of friction and material properties may have a substantially worse influence on the final product quality. This study proposed on how to compensate the MDF product defects by minimizing the sensitivity of noise parameter variations. This can be attained by reliability-based robust optimization (RRO) technique to obtain the optimal process setting of the controllable parameters. Initially two MDF Finite Element (FE) simulations of AA3003-H14 saddle shape showed a substantial amount of dimpling, wrinkling, and shape error. FE analyses are consequently applied on ABAQUS commercial software to obtain the correlation between the control process setting and noise variation with regard to the product defects. The best prediction models are chosen from the family of metamodels to swap the computational expensive FE simulation. Genetic algorithm (GA) is applied to determine the optimal process settings of the control parameters. Monte Carlo Analysis (MCA) is executed to determine how the noise parameter variation affects the final product quality. Finally, the RRO FE simulation and the experimental result show that the amendment of the control parameters in the final forming process leads to a considerably better-quality product.

Keywords: dimpling, multi-point dieless forming, reliability-based robust optimization, shape error, variation, wrinkling

Procedia PDF Downloads 222
4111 Study on Optimal Control Strategy of PM2.5 in Wuhan, China

Authors: Qiuling Xie, Shanliang Zhu, Zongdi Sun

Abstract:

In this paper, we analyzed the correlation relationship among PM2.5 from other five Air Quality Indices (AQIs) based on the grey relational degree, and built a multivariate nonlinear regression equation model of PM2.5 and the five monitoring indexes. For the optimal control problem of PM2.5, we took the partial large Cauchy distribution of membership equation as satisfaction function. We established a nonlinear programming model with the goal of maximum performance to price ratio. And the optimal control scheme is given.

Keywords: grey relational degree, multiple linear regression, membership function, nonlinear programming

Procedia PDF Downloads 261