Search results for: intuitionistic fuzzy regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3910

Search results for: intuitionistic fuzzy regression

3160 Binary Logistic Regression Model in Predicting the Employability of Senior High School Graduates

Authors: Cromwell F. Gopo, Joy L. Picar

Abstract:

This study aimed to predict the employability of senior high school graduates for S.Y. 2018- 2019 in the Davao del Norte Division through quantitative research design using the descriptive status and predictive approaches among the indicated parameters, namely gender, school type, academics, academic award recipient, skills, values, and strand. The respondents of the study were the 33 secondary schools offering senior high school programs identified through simple random sampling, which resulted in 1,530 cases of graduates’ secondary data, which were analyzed using frequency, percentage, mean, standard deviation, and binary logistic regression. Results showed that the majority of the senior high school graduates who come from large schools were females. Further, less than half of these graduates received any academic award in any semester. In general, the graduates’ performance in academics, skills, and values were proficient. Moreover, less than half of the graduates were not employed. Then, those who were employed were either contractual, casual, or part-time workers dominated by GAS graduates. Further, the predictors of employability were gender and the Information and Communications Technology (ICT) strand, while the remaining variables did not add significantly to the model. The null hypothesis had been rejected as the coefficients of the predictors in the binary logistic regression equation did not take the value of 0. After utilizing the model, it was concluded that Technical-Vocational-Livelihood (TVL) graduates except ICT had greater estimates of employability.

Keywords: employability, senior high school graduates, Davao del Norte, Philippines

Procedia PDF Downloads 152
3159 Designing Supplier Partnership Success Factors in the Coal Mining Industry

Authors: Ahmad Afif, Teuku Yuri M. Zagloel

Abstract:

Sustainable supply chain management is a new pattern that has emerged recently in industry and companies. The procurement process is one of the key factors for efficiency in supply chain management practices. Partnership is one of the procurement strategies for strategic items. The success factors of the partnership must be determined to avoid things that endanger the financial and operational status of the company. The current supplier partnership research focuses on the selection of general criteria and sustainable supplier selection. Currently, there is still limited research on the success factors of supplier partnerships that focus on strategic items in the coal mining industry. Meanwhile, the procurement of coal mining has its own characteristics, and there are regulations related to the procurement of goods. Therefore, this research was conducted to determine the categories of goods that are included in the strategic items and to design the success factors of supplier partnerships. The main factors studied are general, financial, production, reputation, synergies, and sustainable. The research was conducted using the Kraljic method to determine the categories of goods that are included in the strategic items. To design a supplier partnership success factor using the Hybrid Multi Criteria Decision Making method. Integrated Fuzzy AHP-Fuzzy TOPSIS is used to determine the weight of the success factors of supplier partnerships and to rank suppliers on the factors used.

Keywords: supplier, partnership, strategic item, success factors, and coal mining industry

Procedia PDF Downloads 131
3158 Prediction of Coronary Artery Stenosis Severity Based on Machine Learning Algorithms

Authors: Yu-Jia Jian, Emily Chia-Yu Su, Hui-Ling Hsu, Jian-Jhih Chen

Abstract:

Coronary artery is the major supplier of myocardial blood flow. When fat and cholesterol are deposit in the coronary arterial wall, narrowing and stenosis of the artery occurs, which may lead to myocardial ischemia and eventually infarction. According to the World Health Organization (WHO), estimated 740 million people have died of coronary heart disease in 2015. According to Statistics from Ministry of Health and Welfare in Taiwan, heart disease (except for hypertensive diseases) ranked the second among the top 10 causes of death from 2013 to 2016, and it still shows a growing trend. According to American Heart Association (AHA), the risk factors for coronary heart disease including: age (> 65 years), sex (men to women with 2:1 ratio), obesity, diabetes, hypertension, hyperlipidemia, smoking, family history, lack of exercise and more. We have collected a dataset of 421 patients from a hospital located in northern Taiwan who received coronary computed tomography (CT) angiography. There were 300 males (71.26%) and 121 females (28.74%), with age ranging from 24 to 92 years, and a mean age of 56.3 years. Prior to coronary CT angiography, basic data of the patients, including age, gender, obesity index (BMI), diastolic blood pressure, systolic blood pressure, diabetes, hypertension, hyperlipidemia, smoking, family history of coronary heart disease and exercise habits, were collected and used as input variables. The output variable of the prediction module is the degree of coronary artery stenosis. The output variable of the prediction module is the narrow constriction of the coronary artery. In this study, the dataset was randomly divided into 80% as training set and 20% as test set. Four machine learning algorithms, including logistic regression, stepwise regression, neural network and decision tree, were incorporated to generate prediction results. We used area under curve (AUC) / accuracy (Acc.) to compare the four models, the best model is neural network, followed by stepwise logistic regression, decision tree, and logistic regression, with 0.68 / 79 %, 0.68 / 74%, 0.65 / 78%, and 0.65 / 74%, respectively. Sensitivity of neural network was 27.3%, specificity was 90.8%, stepwise Logistic regression sensitivity was 18.2%, specificity was 92.3%, decision tree sensitivity was 13.6%, specificity was 100%, logistic regression sensitivity was 27.3%, specificity 89.2%. From the result of this study, we hope to improve the accuracy by improving the module parameters or other methods in the future and we hope to solve the problem of low sensitivity by adjusting the imbalanced proportion of positive and negative data.

Keywords: decision support, computed tomography, coronary artery, machine learning

Procedia PDF Downloads 229
3157 Determinants of Poverty: A Logit Regression Analysis of Zakat Applicants

Authors: Zunaidah Ab Hasan, Azhana Othman, Abd Halim Mohd Noor, Nor Shahrina Mohd Rafien

Abstract:

Zakat is a portion of wealth contributed from financially able Muslims to be distributed to predetermine recipients; main among them are the poor and the needy. Distribution of the zakat fund is given with the objective to lift the recipients from poverty. Due to the multidimensional and multifaceted nature of poverty, it is imperative that the causes of poverty are properly identified for assistance given by zakat authorities reached the intended target. Despite, various studies undertaken to identify the poor correctly, there are reports of the poor not receiving the adequate assistance required from zakat. Thus, this study examines the determinants of poverty among applicants for zakat assistance distributed by the State Islamic Religious Council in Malacca (SIRCM). Malacca is a state in Malaysia. The respondents were based on the list of names of new zakat applicants for the month of April and May 2014 provided by SIRCM. A binary logistic regression was estimated based on this data with either zakat applications is rejected or accepted as the dependent variable and set of demographic variables and health as the explanatory variables. Overall, the logistic model successfully predicted factors of acceptance of zakat applications. Three independent variables namely gender, age; size of households and health significantly explain the likelihood of a successful zakat application. Among others, the finding suggests the importance of focusing on providing education opportunity in helping the poor.

Keywords: logistic regression, zakat distribution, status of zakat applications, poverty, education

Procedia PDF Downloads 336
3156 A Study of the Establishment of the Evaluation Index System for Tourist Attraction Disaster Resilience

Authors: Chung-Hung Tsai, Ya-Ping Li

Abstract:

Tourism industry is highly depended on the natural environment and climate. Compared to other industries, it is more susceptible to environment and climate. Taiwan belongs to a sea island country and located in the subtropical monsoon zone. The events of climate variability, frequency of typhoons and rainfalls raged are caused regularly serious disaster. In traditional disaster assessment, it usually focuses on the disaster damage and risk assessment, which is short of the features from different industries to understand the impact of the restoring force in post-disaster resilience and the main factors that constitute resilience. The object of this study is based on disaster recovery experience of tourism area and to understand the main factors affecting the tourist area of disaster resilience. The combinations of literature review and interviews with experts are prepared an early indicator system of the disaster resilience. Then, it is screened through a Fuzzy Delphi Method and Analytic Network Process for weight analysis. Finally, this study will establish the tourism disaster resilience evaluation index system considering the Taiwan's tourism industry characteristics. We hope that be able to enhance disaster resilience after tourist areas and increases the sustainability of industrial development. It is expected to provide government departments the tourism industry as the future owner of the assets in extreme climates responses.

Keywords: resilience, Fuzzy Delphi Method, Analytic Network Process, industrial development

Procedia PDF Downloads 404
3155 Modelling the Impact of Installation of Heat Cost Allocators in District Heating Systems Using Machine Learning

Authors: Danica Maljkovic, Igor Balen, Bojana Dalbelo Basic

Abstract:

Following the regulation of EU Directive on Energy Efficiency, specifically Article 9, individual metering in district heating systems has to be introduced by the end of 2016. These directions have been implemented in member state’s legal framework, Croatia is one of these states. The directive allows installation of both heat metering devices and heat cost allocators. Mainly due to bad communication and PR, the general public false image was created that the heat cost allocators are devices that save energy. Although this notion is wrong, the aim of this work is to develop a model that would precisely express the influence of installation heat cost allocators on potential energy savings in each unit within multifamily buildings. At the same time, in recent years, a science of machine learning has gain larger application in various fields, as it is proven to give good results in cases where large amounts of data are to be processed with an aim to recognize a pattern and correlation of each of the relevant parameter as well as in the cases where the problem is too complex for a human intelligence to solve. A special method of machine learning, decision tree method, has proven an accuracy of over 92% in prediction general building consumption. In this paper, a machine learning algorithms will be used to isolate the sole impact of installation of heat cost allocators on a single building in multifamily houses connected to district heating systems. Special emphasises will be given regression analysis, logistic regression, support vector machines, decision trees and random forest method.

Keywords: district heating, heat cost allocator, energy efficiency, machine learning, decision tree model, regression analysis, logistic regression, support vector machines, decision trees and random forest method

Procedia PDF Downloads 249
3154 Ground Motion Modeling Using the Least Absolute Shrinkage and Selection Operator

Authors: Yildiz Stella Dak, Jale Tezcan

Abstract:

Ground motion models that relate a strong motion parameter of interest to a set of predictive seismological variables describing the earthquake source, the propagation path of the seismic wave, and the local site conditions constitute a critical component of seismic hazard analyses. When a sufficient number of strong motion records are available, ground motion relations are developed using statistical analysis of the recorded ground motion data. In regions lacking a sufficient number of recordings, a synthetic database is developed using stochastic, theoretical or hybrid approaches. Regardless of the manner the database was developed, ground motion relations are developed using regression analysis. Development of a ground motion relation is a challenging process which inevitably requires the modeler to make subjective decisions regarding the inclusion criteria of the recordings, the functional form of the model and the set of seismological variables to be included in the model. Because these decisions are critically important to the validity and the applicability of the model, there is a continuous interest on procedures that will facilitate the development of ground motion models. This paper proposes the use of the Least Absolute Shrinkage and Selection Operator (LASSO) in selecting the set predictive seismological variables to be used in developing a ground motion relation. The LASSO can be described as a penalized regression technique with a built-in capability of variable selection. Similar to the ridge regression, the LASSO is based on the idea of shrinking the regression coefficients to reduce the variance of the model. Unlike ridge regression, where the coefficients are shrunk but never set equal to zero, the LASSO sets some of the coefficients exactly to zero, effectively performing variable selection. Given a set of candidate input variables and the output variable of interest, LASSO allows ranking the input variables in terms of their relative importance, thereby facilitating the selection of the set of variables to be included in the model. Because the risk of overfitting increases as the ratio of the number of predictors to the number of recordings increases, selection of a compact set of variables is important in cases where a small number of recordings are available. In addition, identification of a small set of variables can improve the interpretability of the resulting model, especially when there is a large number of candidate predictors. A practical application of the proposed approach is presented, using more than 600 recordings from the National Geospatial-Intelligence Agency (NGA) database, where the effect of a set of seismological predictors on the 5% damped maximum direction spectral acceleration is investigated. The set of candidate predictors considered are Magnitude, Rrup, Vs30. Using LASSO, the relative importance of the candidate predictors has been ranked. Regression models with increasing levels of complexity were constructed using one, two, three, and four best predictors, and the models’ ability to explain the observed variance in the target variable have been compared. The bias-variance trade-off in the context of model selection is discussed.

Keywords: ground motion modeling, least absolute shrinkage and selection operator, penalized regression, variable selection

Procedia PDF Downloads 330
3153 Quality Parameters of Offset Printing Wastewater

Authors: Kiurski S. Jelena, Kecić S. Vesna, Aksentijević M. Snežana

Abstract:

Samples of tap and wastewater were collected in three offset printing facilities in Novi Sad, Serbia. Ten physicochemical parameters were analyzed within all collected samples: pH, conductivity, m - alkalinity, p - alkalinity, acidity, carbonate concentration, hydrogen carbonate concentration, active oxygen content, chloride concentration and total alkali content. All measurements were conducted using the standard analytical and instrumental methods. Comparing the obtained results for tap water and wastewater, a clear quality difference was noticeable, since all physicochemical parameters were significantly higher within wastewater samples. The study also involves the application of simple linear regression analysis on the obtained dataset. By using software package ORIGIN 5 the pH value was mutually correlated with other physicochemical parameters. Based on the obtained values of Pearson coefficient of determination a strong positive correlation between chloride concentration and pH (r = -0.943), as well as between acidity and pH (r = -0.855) was determined. In addition, statistically significant difference was obtained only between acidity and chloride concentration with pH values, since the values of parameter F (247.634 and 182.536) were higher than Fcritical (5.59). In this way, results of statistical analysis highlighted the most influential parameter of water contamination in offset printing, in the form of acidity and chloride concentration. The results showed that variable dependence could be represented by the general regression model: y = a0 + a1x+ k, which further resulted with matching graphic regressions.

Keywords: pollution, printing industry, simple linear regression analysis, wastewater

Procedia PDF Downloads 235
3152 A Hybrid-Evolutionary Optimizer for Modeling the Process of Obtaining Bricks

Authors: Marius Gavrilescu, Sabina-Adriana Floria, Florin Leon, Silvia Curteanu, Costel Anton

Abstract:

Natural sciences provide a wide range of experimental data whose related problems require study and modeling beyond the capabilities of conventional methodologies. Such problems have solution spaces whose complexity and high dimensionality require correspondingly complex regression methods for proper characterization. In this context, we propose an optimization method which consists in a hybrid dual optimizer setup: a global optimizer based on a modified variant of the popular Imperialist Competitive Algorithm (ICA), and a local optimizer based on a gradient descent approach. The ICA is modified such that intermediate solution populations are more quickly and efficiently pruned of low-fitness individuals by appropriately altering the assimilation, revolution and competition phases, which, combined with an initialization strategy based on low-discrepancy sampling, allows for a more effective exploration of the corresponding solution space. Subsequently, gradient-based optimization is used locally to seek the optimal solution in the neighborhoods of the solutions found through the modified ICA. We use this combined approach to find the optimal configuration and weights of a fully-connected neural network, resulting in regression models used to characterize the process of obtained bricks using silicon-based materials. Installations in the raw ceramics industry, i.e., bricks, are characterized by significant energy consumption and large quantities of emissions. Thus, the purpose of our approach is to determine by simulation the working conditions, including the manufacturing mix recipe with the addition of different materials, to minimize the emissions represented by CO and CH4. Our approach determines regression models which perform significantly better than those found using the traditional ICA for the aforementioned problem, resulting in better convergence and a substantially lower error.

Keywords: optimization, biologically inspired algorithm, regression models, bricks, emissions

Procedia PDF Downloads 82
3151 Econometric Analysis of West African Countries’ Container Terminal Throughput and Gross Domestic Products

Authors: Kehinde Peter Oyeduntan, Kayode Oshinubi

Abstract:

The west African ports have been experiencing large inflow and outflow of containerized cargo in the last decades, and this has created a quest amongst the countries to attain the status of hub port for the sub-region. This study analyzed the relationship between the container throughput and Gross Domestic Products (GDP) of nine west African countries, using Simple Linear Regression (SLR), Polynomial Regression Model (PRM) and Support Vector Machines (SVM) with a time series of 20 years. The results showed that there exists a high correlation between the GDP and container throughput. The model also predicted the container throughput in west Africa for the next 20 years. The findings and recommendations presented in this research will guide policy makers and help improve the management of container ports and terminals in west Africa, thereby boosting the economy.

Keywords: container, ports, terminals, throughput

Procedia PDF Downloads 215
3150 Fuzzy Expert Approach for Risk Mitigation on Functional Urban Areas Affected by Anthropogenic Ground Movements

Authors: Agnieszka A. Malinowska, R. Hejmanowski

Abstract:

A number of European cities are strongly affected by ground movements caused by anthropogenic activities or post-anthropogenic metamorphosis. Those are mainly water pumping, current mining operation, the collapse of post-mining underground voids or mining-induced earthquakes. These activities lead to large and small-scale ground displacements and a ground ruptures. The ground movements occurring in urban areas could considerably affect stability and safety of structures and infrastructures. The complexity of the ground deformation phenomenon in relation to the structures and infrastructures vulnerability leads to considerable constraints in assessing the threat of those objects. However, the increase of access to the free software and satellite data could pave the way for developing new methods and strategies for environmental risk mitigation and management. Open source geographical information systems (OS GIS), may support data integration, management, and risk analysis. Lately, developed methods based on fuzzy logic and experts methods for buildings and infrastructure damage risk assessment could be integrated into OS GIS. Those methods were verified base on back analysis proving their accuracy. Moreover, those methods could be supported by ground displacement observation. Based on freely available data from European Space Agency and free software, ground deformation could be estimated. The main innovation presented in the paper is the application of open source software (OS GIS) for integration developed models and assessment of the threat of urban areas. Those approaches will be reinforced by analysis of ground movement based on free satellite data. Those data would support the verification of ground movement prediction models. Moreover, satellite data will enable our mapping of ground deformation in urbanized areas. Developed models and methods have been implemented in one of the urban areas hazarded by underground mining activity. Vulnerability maps supported by satellite ground movement observation would mitigate the hazards of land displacements in urban areas close to mines.

Keywords: fuzzy logic, open source geographic information science (OS GIS), risk assessment on urbanized areas, satellite interferometry (InSAR)

Procedia PDF Downloads 159
3149 The Influence of the Vocational Teachers Empowerment toward the Vocational High Schools’ Performance Based on the Education National Standards of Indonesia

Authors: Abdul Haris Setiawan

Abstract:

Teachers empowerment is one of the important factors considered to contribute significantly to the achievement of the national education goals. This study was conducted to determine the influence on the vocational teachers empowerment toward the performance of the vocational high schools based on the Education National Standards of Indonesia. The population of the study was all vocational teachers at the State Vocational High schools in Surakarta, Central Java Province, Indonesia. The sampling technique used proportional random sampling technique. This study used a quantitative descriptive statistical analysis techniques. The data was collected using questionnaires. The data has been collected and then tested using analysis requirements test. Having tested using the requirements analysis and then the data processed using regression analysis between the independent and dependent variables to determine the effect and the regression equation. The results of the study found that the level of vocational high schools’ performance based on the Education National Standards of Indonesia was 74.29%, including in the high category; the level of vocational teachers empowerment was 76.20%, including in the high category; there was a positive influence of vocational teachers empowerment toward the vocational high schools’ performance based on the Education National Standards of Indonesia with a correlation coefficient of 0,886, and a contribution of 78.50% with the regression equation Y = 79.431 +0.534 X.

Keywords: vocational teachers, empowerment, vocational high school, the education national standards

Procedia PDF Downloads 394
3148 Prediction of Index-Mechanical Properties of Pyroclastic Rock Utilizing Electrical Resistivity Method

Authors: İsmail İnce

Abstract:

The aim of this study is to determine index and mechanical properties of pyroclastic rock in a practical way by means of electrical resistivity method. For this purpose, electrical resistivity, uniaxial compressive strength, point load strength, P-wave velocity, density and porosity values of 10 different pyroclastic rocks were measured in the laboratory. A simple regression analysis was made among the index-mechanical properties of the samples compatible with electrical resistivity values. A strong exponentially relation was found between index-mechanical properties and electrical resistivity values. The electrical resistivity method can be used to assess the engineering properties of the rock from which it is difficult to obtain regular shaped samples as a non-destructive method.

Keywords: electrical resistivity, index-mechanical properties, pyroclastic rocks, regression analysis

Procedia PDF Downloads 473
3147 Using Machine Learning to Enhance Win Ratio for College Ice Hockey Teams

Authors: Sadixa Sanjel, Ahmed Sadek, Naseef Mansoor, Zelalem Denekew

Abstract:

Collegiate ice hockey (NCAA) sports analytics is different from the national level hockey (NHL). We apply and compare multiple machine learning models such as Linear Regression, Random Forest, and Neural Networks to predict the win ratio for a team based on their statistics. Data exploration helps determine which statistics are most useful in increasing the win ratio, which would be beneficial to coaches and team managers. We ran experiments to select the best model and chose Random Forest as the best performing. We conclude with how to bridge the gap between the college and national levels of sports analytics and the use of machine learning to enhance team performance despite not having a lot of metrics or budget for automatic tracking.

Keywords: NCAA, NHL, sports analytics, random forest, regression, neural networks, game predictions

Procedia PDF Downloads 114
3146 A Survey on Quasi-Likelihood Estimation Approaches for Longitudinal Set-ups

Authors: Naushad Mamode Khan

Abstract:

The Com-Poisson (CMP) model is one of the most popular discrete generalized linear models (GLMS) that handles both equi-, over- and under-dispersed data. In longitudinal context, an integer-valued autoregressive (INAR(1)) process that incorporates covariate specification has been developed to model longitudinal CMP counts. However, the joint likelihood CMP function is difficult to specify and thus restricts the likelihood based estimating methodology. The joint generalized quasilikelihood approach (GQL-I) was instead considered but is rather computationally intensive and may not even estimate the regression effects due to a complex and frequently ill conditioned covariance structure. This paper proposes a new GQL approach for estimating the regression parameters (GQLIII) that are based on a single score vector representation. The performance of GQL-III is compared with GQL-I and separate marginal GQLs (GQL-II) through some simulation experiments and is proved to yield equally efficient estimates as GQL-I and is far more computationally stable.

Keywords: longitudinal, com-Poisson, ill-conditioned, INAR(1), GLMS, GQL

Procedia PDF Downloads 355
3145 The Relationship between Coping Styles and Internet Addiction among High School Students

Authors: Adil Kaval, Digdem Muge Siyez

Abstract:

With the negative effects of internet use in a person's life, the use of the Internet has become an issue. This subject was mostly considered as internet addiction, and it was investigated. In literature, it is noteworthy that some theoretical models have been proposed to explain the reasons for internet addiction. In addition to these theoretical models, it may be thought that the coping style for stressing events can be a predictor of internet addiction. It was aimed to test with logistic regression the effect of high school students' coping styles on internet addiction levels. Sample of the study consisted of 770 Turkish adolescents (471 girls, 299 boys) selected from high schools in the 2017-2018 academic year in İzmir province. Internet Addiction Test, Coping Scale for Child and Adolescents and a demographic information form were used in this study. The results of the logistic regression analysis indicated that the model of coping styles predicted internet addiction provides a statistically significant prediction of internet addiction. Gender does not predict whether or not to be addicted to the internet. The active coping style is not effective on internet addiction levels, while the avoiding and negative coping style are effective on internet addiction levels. With this model, % 79.1 of internet addiction in high school is estimated. The Negelkerke pseudo R2 indicated that the model accounted for %35 of the total variance. The results of this study on Turkish adolescents are similar to the results of other studies in the literature. It can be argued that avoiding and negative coping styles are important risk factors in the development of internet addiction.

Keywords: adolescents, coping, internet addiction, regression analysis

Procedia PDF Downloads 174
3144 Efficient Credit Card Fraud Detection Based on Multiple ML Algorithms

Authors: Neha Ahirwar

Abstract:

In the contemporary digital era, the rise of credit card fraud poses a significant threat to both financial institutions and consumers. As fraudulent activities become more sophisticated, there is an escalating demand for robust and effective fraud detection mechanisms. Advanced machine learning algorithms have become crucial tools in addressing this challenge. This paper conducts a thorough examination of the design and evaluation of a credit card fraud detection system, utilizing four prominent machine learning algorithms: random forest, logistic regression, decision tree, and XGBoost. The surge in digital transactions has opened avenues for fraudsters to exploit vulnerabilities within payment systems. Consequently, there is an urgent need for proactive and adaptable fraud detection systems. This study addresses this imperative by exploring the efficacy of machine learning algorithms in identifying fraudulent credit card transactions. The selection of random forest, logistic regression, decision tree, and XGBoost for scrutiny in this study is based on their documented effectiveness in diverse domains, particularly in credit card fraud detection. These algorithms are renowned for their capability to model intricate patterns and provide accurate predictions. Each algorithm is implemented and evaluated for its performance in a controlled environment, utilizing a diverse dataset comprising both genuine and fraudulent credit card transactions.

Keywords: efficient credit card fraud detection, random forest, logistic regression, XGBoost, decision tree

Procedia PDF Downloads 67
3143 An Analysis of Classification of Imbalanced Datasets by Using Synthetic Minority Over-Sampling Technique

Authors: Ghada A. Alfattni

Abstract:

Analysing unbalanced datasets is one of the challenges that practitioners in machine learning field face. However, many researches have been carried out to determine the effectiveness of the use of the synthetic minority over-sampling technique (SMOTE) to address this issue. The aim of this study was therefore to compare the effectiveness of the SMOTE over different models on unbalanced datasets. Three classification models (Logistic Regression, Support Vector Machine and Nearest Neighbour) were tested with multiple datasets, then the same datasets were oversampled by using SMOTE and applied again to the three models to compare the differences in the performances. Results of experiments show that the highest number of nearest neighbours gives lower values of error rates. 

Keywords: imbalanced datasets, SMOTE, machine learning, logistic regression, support vector machine, nearest neighbour

Procedia PDF Downloads 350
3142 Comparing Performance Indicators among Mechanistic, Organic, and Bureaucratic Organizations

Authors: Benchamat Laksaniyanon, Padcharee Phasuk, Rungtawan Boonphanakan

Abstract:

With globalization, organizations had to adjust to an unstable environment in order to survive in a competitive arena. Typically within the field of management, different types of organizations include mechanistic, bureaucratic and organic ones. In fact, bureaucratic and mechanistic organizations have some characteristics in common. Bureaucracy is one type of Thailand organization which adapted from mechanistic concept to develop an organization that is suitable for the characteristic and culture of Thailand. The objective of this study is to compare the adjustment strategies of both organizations in order to find key performance indicators (KPI) suitable for improving organization in Thailand. The methodology employed is binary logistic regression. The results of this study will be valuable for developing future management strategies for both bureaucratic and mechanistic organizations.

Keywords: mechanistic, bureaucratic and organic organization, binary logistic regression, key performance indicators (KPI)

Procedia PDF Downloads 359
3141 An Application of Integrated Multi-Objective Particles Swarm Optimization and Genetic Algorithm Metaheuristic through Fuzzy Logic for Optimization of Vehicle Routing Problems in Sugar Industry

Authors: Mukhtiar Singh, Sumeet Nagar

Abstract:

Vehicle routing problem (VRP) is a combinatorial optimization and nonlinear programming problem aiming to optimize decisions regarding given set of routes for a fleet of vehicles in order to provide cost-effective and efficient delivery of both services and goods to the intended customers. This paper proposes the application of integrated particle swarm optimization (PSO) and genetic optimization algorithm (GA) to address the Vehicle routing problem in sugarcane industry in India. Suger industry is very prominent agro-based industry in India due to its impacts on rural livelihood and estimated to be employing around 5 lakhs workers directly in sugar mills. Due to various inadequacies, inefficiencies and inappropriateness associated with the current vehicle routing model it costs huge money loss to the industry which needs to be addressed in proper context. The proposed algorithm utilizes the crossover operation that originally appears in genetic algorithm (GA) to improve its flexibility and manipulation more readily and avoid being trapped in local optimum, and simultaneously for improving the convergence speed of the algorithm, level set theory is also added to it. We employ the hybrid approach to an example of VRP and compare its result with those generated by PSO, GA, and parallel PSO algorithms. The experimental comparison results indicate that the performance of hybrid algorithm is superior to others, and it will become an effective approach for solving discrete combinatory problems.

Keywords: fuzzy logic, genetic algorithm, particle swarm optimization, vehicle routing problem

Procedia PDF Downloads 394
3140 Exploring Factors Affecting Electricity Production in Malaysia

Authors: Endang Jati Mat Sahid, Hussain Ali Bekhet

Abstract:

Ability to supply reliable and secure electricity has been one of the crucial components of economic development for any country. Forecasting of electricity production is therefore very important for accurate investment planning of generation power plants. In this study, we aim to examine and analyze the factors that affect electricity generation. Multiple regression models were used to find the relationship between various variables and electricity production. The models will simultaneously determine the effects of the variables on electricity generation. Many variables influencing electricity generation, i.e. natural gas (NG), coal (CO), fuel oil (FO), renewable energy (RE), gross domestic product (GDP) and fuel prices (FP), were examined for Malaysia. The results demonstrate that NG, CO, and FO were the main factors influencing electricity generation growth. This study then identified a number of policy implications resulting from the empirical results.

Keywords: energy policy, energy security, electricity production, Malaysia, the regression model

Procedia PDF Downloads 164
3139 Form of Distribution of Traffic Accident and Environment Factors of Road Affecting of Traffic Accident in Dusit District, Only Area Responsible of Samsen Police Station

Authors: Musthaya Patchanee

Abstract:

This research aimed to study form of traffic distribution and environmental factors of road that affect traffic accidents in Dusit District, only areas responsible of Samsen Police Station. Data used in this analysis is the secondary data of traffic accident case from year 2011. Observed area units are 15 traffic lines that are under responsible of Samsen Police Station. Technique and method used are the Cartographic Method, the Correlation Analysis, and the Multiple Regression Analysis. The results of form of traffic accidents show that, the Samsen Road area had most traffic accidents (24.29%), second was Rachvithi Road (18.10%), third was Sukhothai Road (15.71%), fourth was Rachasrima Road (12.38%), and fifth was Amnuaysongkram Road (7.62%). The result from Dusit District, only areas responsible of Samsen police station, has suggested that the scale of accidents have high positive correlation with statistic significant at level 0.05 and the frequency of travel (r=0.857). Traffic intersection point (r=0.763)and traffic control equipments (r=0.713) are relevant factors respectively. By using the Multiple Regression Analysis, travel frequency is the only one that has considerable influences on traffic accidents in Dusit district only Samsen Police Station area. Also, a factor in frequency of travel can explain the change in traffic accidents scale to 73.40 (R2 = 0.734). By using the Multiple regression summation from analysis was Y ̂=-7.977+0.044X6.

Keywords: form of traffic distribution, environmental factors of road, traffic accidents, Dusit district

Procedia PDF Downloads 391
3138 Modeling of Traffic Turning Movement

Authors: Michael Tilahun Mulugeta

Abstract:

Pedestrians are the most vulnerable road users as they are more exposed to the risk of collusion. Pedestrian safety at road intersections still remains the most vital and yet unsolved issue in Addis Ababa, Ethiopia. One of the critical points in pedestrian safety is the occurrence of conflict between turning vehicle and pedestrians at un-signalized intersection. However, a better understanding of the factors that affect the likelihood of the conflicts would help provide direction for countermeasures aimed at reducing the number of crashes. This paper has sorted to explore a model to describe the relation between traffic conflicts and influencing factors using Multiple Linear regression methodology. In this research the main focus is to study the interaction of turning (left & right) vehicle with pedestrian at unsignalized intersections. The specific objectives also to determine factors that affect the number of potential conflicts and develop a model of potential conflict.

Keywords: potential, regression analysis, pedestrian, conflicts

Procedia PDF Downloads 66
3137 Understanding the Linkages of Human Development and Fertility Change in Districts of Uttar Pradesh

Authors: Mamta Rajbhar, Sanjay K. Mohanty

Abstract:

India's progress in achieving replacement level of fertility is largely contingent on fertility reduction in the state of Uttar Pradesh as it accounts 17% of India's population with a low level of development. Though the TFR in the state has declined from 5.1 in 1991 to 3.4 by 2011, it conceals large differences in fertility level across districts. Using data from multiple sources this paper tests the hypothesis that the improvement in human development significantly reduces the fertility levels in districts of Uttar Pradesh. The unit of analyses is district, and fertility estimates are derived using the reverse survival method(RSM) while human development indices(HDI) are are estimated using uniform methodology adopted by UNDP for three period. The correlation and linear regression models are used to examine the relationship of fertility change and human development indices across districts. Result show the large variation and significant change in fertility level among the districts of Uttar Pradesh. During 1991-2011, eight districts had experienced a decline of TFR by 10-20%, 30 districts by 20-30% and 32 districts had experienced decline of more than 30%. On human development aspect, 17 districts recorded increase of more than 0.170 in HDI, 18 districts in the range of 0.150-0.170, 29 districts between 0.125-0.150 and six districts in the range of 0.1-0.125 during 1991-2011. Study shows significant negative relationship between HDI and TFR. HDI alone explains 70% variation in TFR. Also, the regression coefficient of TFR and HDI has become stronger over time; from -0.524 in 1991, -0.7477 by 2001 and -0.7181 by 2010. The regression analyses indicate that 0.1 point increase in HDI value will lead to 0.78 point decline in TFR. The HDI alone explains 70% variation in TFR. Improving the HDI will certainly reduce the fertility level in the districts.

Keywords: Fertility, HDI, Uttar Pradesh

Procedia PDF Downloads 250
3136 Dry Relaxation Shrinkage Prediction of Bordeaux Fiber Using a Feed Forward Neural

Authors: Baeza S. Roberto

Abstract:

The knitted fabric suffers a deformation in its dimensions due to stretching and tension factors, transverse and longitudinal respectively, during the process in rectilinear knitting machines so it performs a dry relaxation shrinkage procedure and thermal action of prefixed to obtain stable conditions in the knitting. This paper presents a dry relaxation shrinkage prediction of Bordeaux fiber using a feed forward neural network and linear regression models. Six operational alternatives of shrinkage were predicted. A comparison of the results was performed finding neural network models with higher levels of explanation of the variability and prediction. The presence of different reposes are included. The models were obtained through a neural toolbox of Matlab and Minitab software with real data in a knitting company of Southern Guanajuato. The results allow predicting dry relaxation shrinkage of each alternative operation.

Keywords: neural network, dry relaxation, knitting, linear regression

Procedia PDF Downloads 585
3135 A Comparative Analysis Approach Based on Fuzzy AHP, TOPSIS and PROMETHEE for the Selection Problem of GSCM Solutions

Authors: Omar Boutkhoum, Mohamed Hanine, Abdessadek Bendarag

Abstract:

Sustainable economic growth is nowadays driving firms to extend toward the adoption of many green supply chain management (GSCM) solutions. However, the evaluation and selection of these solutions is a matter of concern that needs very serious decisions, involving complexity owing to the presence of various associated factors. To resolve this problem, a comparative analysis approach based on multi-criteria decision-making methods is proposed for adequate evaluation of sustainable supply chain management solutions. In the present paper, we propose an integrated decision-making model based on FAHP (Fuzzy Analytic Hierarchy Process), TOPSIS (Technique for Order of Preference by Similarity to Ideal Solution) and PROMETHEE (Preference Ranking Organisation METHod for Enrichment Evaluations) to contribute to a better understanding and development of new sustainable strategies for industrial organizations. Due to the varied importance of the selected criteria, FAHP is used to identify the evaluation criteria and assign the importance weights for each criterion, while TOPSIS and PROMETHEE methods employ these weighted criteria as inputs to evaluate and rank the alternatives. The main objective is to provide a comparative analysis based on TOPSIS and PROMETHEE processes to help make sound and reasoned decisions related to the selection problem of GSCM solution.

Keywords: GSCM solutions, multi-criteria analysis, decision support system, TOPSIS, FAHP, PROMETHEE

Procedia PDF Downloads 163
3134 Paraoxonase 1 (PON 1) Arylesterase Activity and Apolipoprotein B: Predictors of Myocardial Infarction

Authors: Mukund Ramchandra Mogarekar, Pankaj Kumar, Shraddha Vilas More

Abstract:

Background: Myocardial infarction (MI) is defined as myocardial cell death due to prolonged ischemia as a consequence of atherosclerosis. TC, low-density lipoprotein cholesterol (LDL-C), very low-density lipoprotein cholesterol (VLDL-C), Apo B, and lipoprotein(a) was found as atherogenic factors while high-density lipoprotein cholesterol (HDL-C) was anti-atherogenic. Methods and Results: The study group consists of 40, MI subjects and 40 healthy individuals in control group. PON 1 Arylesterase activity (ARE) was measured by using phenylacetate. Phenotyping was done by double substrate method, serum AOPP by using chloramine T and Apo B by Turbidimetric immunoassay. PON 1 ARE activities were significantly lower (p< 0.05) and AOPPs & Apo B were higher in MI subjects (p> 0.05). Trimodal distribution of QQ, QR, and RR phenotypes of study population showed no significant difference among cases and controls (p> 0.05). Univariate binary logistic regression analysis showed independent association of TC, HDL, LDL, AOPP, Apo B, and PON 1 ARE activity with MI and multiple forward binary logistic regression showed PON 1 ARE activity and serum Apo B as an independent predictor of MI. Conclusions: Decrease in PON 1 ARE activity in MI subjects than in controls suggests increased oxidative stress in MI which is reflected by significantly increased AOPP and Apo B. PON1 polymorphism of QQ, QR and RR showed no significant difference in protection against MI. Univariate and multiple binary logistic regression showed PON1 ARE activity and serum Apo B as an independent predictor of MI.

Keywords: advanced oxidation protein product, apolipoprotein B, PON 1 arylesterase activity, myocardial infarction

Procedia PDF Downloads 266
3133 Assessment of Green Infrastructure for Sustainable Urban Water Management

Authors: Suraj Sharma

Abstract:

Green infrastructure (GI) offers a contemporary approach for reducing the risk of flooding, improve water quality, and harvesting stormwater for sustainable use. GI promotes landscape planning to enhance sustainable development and urban resilience. However, the existing literature is lacking in ensuring the comprehensive assessment of GI performance in terms of ecosystem function and services for social, ecological, and economical system resilience. We propose a robust indicator set and fuzzy comprehensive evaluation (FCE) for quantitative and qualitative analysis for sustainable water management to assess the capacity of urban resilience. Green infrastructure in urban resilience water management system (GIUR-WMS) supports decision-making for GI planning through scenario comparisons with urban resilience capacity index. To demonstrate the GIUR-WMS, we develop five scenarios for five sectors of Chandigarh (12, 26, 14, 17, and 34) to test common type of GI (rain barrel, rain gardens, detention basins, porous pavements, and open spaces). The result shows the open spaces achieve the highest green infrastructure urban resilience index of 4.22/5. To implement the open space scenario in urban sites, suitable vacant can be converted to green spaces (example: forest, low impact recreation areas, and detention basins) GIUR-WMS is easy to replicate, customize and apply to cities of different sizes to assess environmental, social and ecological dimensions.

Keywords: green infrastructure, assessment, urban resilience, water management system, fuzzy comprehensive evaluation

Procedia PDF Downloads 143
3132 Learning-Oriented School Education: Indicator Construction and Taiwan's Implementation Performance

Authors: Meiju Chen, Chaoyu Guo, Chia Wei Tang

Abstract:

The present study's purpose is twofold: first, to construct indicators for learning-oriented school education and, second, to conduct a survey to examine how learning-oriented education has been implemented in junior high schools after the launch of the 12-year compulsory curriculum. For indicator system construction, we compiled relevant literature to develop a preliminary indicator list model and then conducted two rounds of a questionnaire survey to gain comprehensive feedback from experts to finalize our indicator model. In the survey's first round, 12 experts were invited to evaluate the indicators' appropriateness. Based on the experts' consensus, we determined our final indicator list and used it to develop the Fuzzy Delphi questionnaire to finalize the indicator system and each indicator's relative value. For the fact-finding survey, we collected 454 valid samples to examine how the concept of learning-oriented education is adopted and implemented in the junior high school context. We also used this data in our importance-performance analysis to explore the strengths and weaknesses of school education in Taiwan. The results suggest that the indicator system for learning-oriented school education must consist of seven dimensions and 34 indicators. Among the seven dimensions, 'student learning' and 'curriculum planning and implementation' are the most important yet underperforming dimensions that need immediate improvement. We anticipate that the indicator system will be a useful tool for other countries' evaluation of schools' performance in learning-oriented education.

Keywords: learning-oriented education, school education, fuzzy Delphi method, importance-performance analysis

Procedia PDF Downloads 143
3131 The Potential Factors Relating to the Decision of Return Migration of Myanmar Migrant Workers: A Case Study in Prachuap Khiri Khan Province

Authors: Musthaya Patchanee

Abstract:

The aim of this research is to study potential factors relating to the decision of return migration of Myanmar migrant workers in Prachuap Khiri Khan Province by conducting a random sampling of 400 people aged between 15-59 who migrated from Myanmar. The information collected through interviews was analyzed to find a percentage and mean using the Stepwise Multiple Regression Analysis. The results have shown that 33.25% of Myanmar migrant workers want to return to their home country within the next 1-5 years, 46.25%, in 6-10 years and the rest, in over 10 years. The factors relating to such decision can be concluded that the scale of the decision of return migration has a positive relationship with a statistical significance at 0.05 with a conformity with friends and relatives (r=0.886), a relationship with family and community (r=0.782), possession of land in hometown (r=0.756) and educational level (r=0.699). However, the factor of property possession in Prachuap Khiri Khan is the only factor with a high negative relationship (r=0.-537). From the Stepwise Multiple Regression Analysis, the results have shown that the conformity with friends and relatives and educational level factors are influential to the decision of return migration of Myanmar migrant workers in Prachuap Khiri Khan Province, which can predict the decision at 86.60% and the multiple regression equation from the analysis is Y= 6.744+1.198 conformity + 0.647 education.

Keywords: decision of return migration, factors of return migration, Myanmar migrant workers, Prachuap Khiri Khan Province

Procedia PDF Downloads 541