Search results for: censored regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 778

Search results for: censored regression

478 Electricity Load Modeling: An Application to Italian Market

Authors: Giovanni Masala, Stefania Marica

Abstract:

Forecasting electricity load plays a crucial role regards decision making and planning for economical purposes. Besides, in the light of the recent privatization and deregulation of the power industry, the forecasting of future electricity load turned out to be a very challenging problem. Empirical data about electricity load highlights a clear seasonal behavior (higher load during the winter season), which is partly due to climatic effects. We also emphasize the presence of load periodicity at a weekly basis (electricity load is usually lower on weekends or holidays) and at daily basis (electricity load is clearly influenced by the hour). Finally, a long-term trend may depend on the general economic situation (for example, industrial production affects electricity load). All these features must be captured by the model. The purpose of this paper is then to build an hourly electricity load model. The deterministic component of the model requires non-linear regression and Fourier series while we will investigate the stochastic component through econometrical tools. The calibration of the parameters’ model will be performed by using data coming from the Italian market in a 6 year period (2007- 2012). Then, we will perform a Monte Carlo simulation in order to compare the simulated data respect to the real data (both in-sample and out-of-sample inspection). The reliability of the model will be deduced thanks to standard tests which highlight a good fitting of the simulated values.

Keywords: ARMA-GARCH process, electricity load, fitting tests, Fourier series, Monte Carlo simulation, non-linear regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1486
477 Regional Analysis of Streamflow Drought: A Case Study for Southwestern Iran

Authors: M. Byzedi, B. Saghafian

Abstract:

Droughts are complex, natural hazards that, to a varying degree, affect some parts of the world every year. The range of drought impacts is related to drought occurring in different stages of the hydrological cycle and usually different types of droughts, such as meteorological, agricultural, hydrological, and socioeconomical are distinguished. Streamflow drought was analyzed by the method of truncation level (at 70% level) on daily discharges measured in 54 hydrometric stations in southwestern Iran. Frequency analysis was carried out for annual maximum series (AMS) of drought deficit volume and duration series. Some factors including physiographic, climatic, geologic, and vegetation cover were studied as influential factors in the regional analysis. According to the results of factor analysis, six most effective factors were identified as area, rainfall from December to February, the percent of area with Normalized Difference Vegetation Index (NDVI) <0.1, the percent of convex area, drainage density and the minimum of watershed elevation that explained 90.9% of variance. The homogenous regions were determined by cluster analysis and discriminate function analysis. Suitable multivariate regression models were evaluated for streamflow drought deficit volume with 2 years return period. The significance level of regression models was 0.01. The results showed that the watershed area is the most effective factor with high correlation with deficit volume. Also, drought duration was not a suitable drought index for regional analysis.

Keywords: Iran, Streamflow drought, truncation level method, regional analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1744
476 Evaluation of the Impact of Dataset Characteristics for Classification Problems in Biological Applications

Authors: Kanthida Kusonmano, Michael Netzer, Bernhard Pfeifer, Christian Baumgartner, Klaus R. Liedl, Armin Graber

Abstract:

Availability of high dimensional biological datasets such as from gene expression, proteomic, and metabolic experiments can be leveraged for the diagnosis and prognosis of diseases. Many classification methods in this area have been studied to predict disease states and separate between predefined classes such as patients with a special disease versus healthy controls. However, most of the existing research only focuses on a specific dataset. There is a lack of generic comparison between classifiers, which might provide a guideline for biologists or bioinformaticians to select the proper algorithm for new datasets. In this study, we compare the performance of popular classifiers, which are Support Vector Machine (SVM), Logistic Regression, k-Nearest Neighbor (k-NN), Naive Bayes, Decision Tree, and Random Forest based on mock datasets. We mimic common biological scenarios simulating various proportions of real discriminating biomarkers and different effect sizes thereof. The result shows that SVM performs quite stable and reaches a higher AUC compared to other methods. This may be explained due to the ability of SVM to minimize the probability of error. Moreover, Decision Tree with its good applicability for diagnosis and prognosis shows good performance in our experimental setup. Logistic Regression and Random Forest, however, strongly depend on the ratio of discriminators and perform better when having a higher number of discriminators.

Keywords: Classification, High dimensional data, Machine learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2384
475 Predictor Factors for Treatment Failure among Patients on Second Line Antiretroviral Therapy

Authors: Mohd. A. M. Rahim, Yahaya Hassan, Mathumalar L. Fahrni

Abstract:

Second line antiretroviral therapy (ART) regimen is used when patients fail their first line regimen. There are many factors such as non-adherence, drug resistance as well as virological and immunological failure that lead to second line highly active antiretroviral therapy (HAART) regimen treatment failure. This study was aimed at determining predictor factors to treatment failure with second line HAART and analyzing median survival time. An observational, retrospective study was conducted in Sungai Buloh Hospital (HSB) to assess current status of HIV patients treated with second line HAART regimen. Convenience sampling was used and 104 patients were included based on the study’s inclusion and exclusion criteria. Data was collected for six months i.e. from July until December 2013. Data was then analysed using SPSS version 18. Kaplan-Meier and Cox regression analyses were used to measure median survival times and predictor factors for treatment failure. The study population consisted mainly of male subjects, aged 30- 45 years, who were heterosexual, and had HIV infection for less than 6 years. The most common second line HAART regimen given was lopinavir/ritonavir (LPV/r)-based combination. Kaplan-Meier analysis showed that patients on LPV/r demonstrated longer median survival times than patients on indinavir/ritonavir (IDV/r) based combination (p<0.001). The commonest reason for a treatment to fail with second line HAART was non-adherence. Based on Cox regression analysis, other predictor factors for treatment failure with second line HAART regimen were age and mode of HIV transmission.

Keywords: Adherence, antiretroviral therapy, second line, treatment failure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2717
474 A Three Elements Vector Valued Structure’s Ultimate Strength-Strong Motion-Intensity Measure

Authors: A. Nicknam, N. Eftekhari, A. Mazarei, M. Ganjvar

Abstract:

This article presents an alternative collapse capacity intensity measure in the three elements form which is influenced by the spectral ordinates at periods longer than that of the first mode period at near and far source sites. A parameter, denoted by β, is defined by which the spectral ordinate effects, up to the effective period (2T1), on the intensity measure are taken into account. The methodology permits to meet the hazard-levelled target extreme event in the probabilistic and deterministic forms. A MATLAB code is developed involving OpenSees to calculate the collapse capacities of the 8 archetype RC structures having 2 to 20 stories for regression process. The incremental dynamic analysis (IDA) method is used to calculate the structure’s collapse values accounting for the element stiffness and strength deterioration. The general near field set presented by FEMA is used in a series of performing nonlinear analyses. 8 linear relationships are developed for the 8structutres leading to the correlation coefficient up to 0.93. A collapse capacity near field prediction equation is developed taking into account the results of regression processes obtained from the 8 structures. The proposed prediction equation is validated against a set of actual near field records leading to a good agreement. Implementation of the proposed equation to the four archetype RC structures demonstrated different collapse capacities at near field site compared to those of FEMA. The reasons of differences are believed to be due to accounting for the spectral shape effects.

Keywords: Collapse capacity, fragility analysis, spectral shape effects, IDA method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1794
473 Comparative Evaluation of Accuracy of Selected Machine Learning Classification Techniques for Diagnosis of Cancer: A Data Mining Approach

Authors: Rajvir Kaur, Jeewani Anupama Ginige

Abstract:

With recent trends in Big Data and advancements in Information and Communication Technologies, the healthcare industry is at the stage of its transition from clinician oriented to technology oriented. Many people around the world die of cancer because the diagnosis of disease was not done at an early stage. Nowadays, the computational methods in the form of Machine Learning (ML) are used to develop automated decision support systems that can diagnose cancer with high confidence in a timely manner. This paper aims to carry out the comparative evaluation of a selected set of ML classifiers on two existing datasets: breast cancer and cervical cancer. The ML classifiers compared in this study are Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Logistic Regression, Ensemble (Bagged Tree) and Artificial Neural Networks (ANN). The evaluation is carried out based on standard evaluation metrics Precision (P), Recall (R), F1-score and Accuracy. The experimental results based on the evaluation metrics show that ANN showed the highest-level accuracy (99.4%) when tested with breast cancer dataset. On the other hand, when these ML classifiers are tested with the cervical cancer dataset, Ensemble (Bagged Tree) technique gave better accuracy (93.1%) in comparison to other classifiers.

Keywords: Artificial neural networks, breast cancer, cancer dataset, classifiers, cervical cancer, F-score, logistic regression, machine learning, precision, recall, support vector machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1553
472 Evaluation of the Beach Erosion Process in Varadero, Matanzas, Cuba: Effects of Different Hurricane Trajectories

Authors: Ana Gabriela Diaz, Luis Fermín Córdova, Jr., Roberto Lamazares

Abstract:

The island of Cuba, the largest of the Greater Antilles, is located in the tropical North Atlantic. It is annually affected by numerous weather events, which have caused severe damage to our coastal areas. In the same way that many other coastlines around the world, the beautiful beaches of the Hicacos Peninsula also suffer from erosion. This leads to a structural regression of the coastline. If measures are not taken, the hotels will be exposed to the advance of the sea, and it will be a serious problem for the economy. With the aim of studying the intensity of this type of activity, specialists of group of coastal and marine engineering from CIH, in the framework of the research conducted within the project MEGACOSTAS 2, provide their research to simulate extreme events and assess their impact in coastal areas, mainly regarding the definition of flood volumes and morphodynamic changes in sandy beaches. The main objective of this work is the evaluation of the process of Varadero beach erosion (the coastal sector has an important impact in the country's economy) on the Hicacos Peninsula for different paths of hurricanes. The mathematical model XBeach, which was integrated into the Coastal engineering system introduced by the project of MEGACOSTA 2 to determine the area and the more critical profiles for the path of hurricanes under study, was applied. The results of this project have shown that Center area is the greatest dynamic area in the simulation of the three paths of hurricanes under study, showing high erosion volumes and the greatest average length of regression of the coastline, from 15- 22 m.

Keywords: Beach, erosion, mathematical model, coastal areas.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1219
471 Potentials of Raphia hookeri Wine in Livelihood Sustenance among Rural and Urban Populations in Nigeria

Authors: A. A. Aiyeloja, A.T. Oladele, O. Tumulo

Abstract:

Raphia wine is an important forest product with cultural significance besides its use as medicine and food in southern Nigeria. This work aims to evaluate the profitability of Raphia wine production and marketing in Sapele Local Government Area, Nigeria. Four communities (Sapele, Ogiede, Okuoke and Elume) were randomly selected for data collection via questionnaires among producers and marketers. A total of 50 producers and 34 marketers were randomly selected for interview. Data was analyzed using descriptive statistics, profit margin, multiple regression and rate of returns on investment (RORI). Annual average profit was highest in Okuoke (Producers – N90, 000.00, Marketers - N70, 000.00) and least in Sapele (Producers N50, 000.00, Marketers – N45, 000.00). Calculated RORI for marketers were Elume (40.0%), Okuoke (25.0%), Ogiede (33.3%) and Sapele (50.0%). Regression results showed that location has significant effects (0.000, ρ ≤ 0.05) on profit margins. Male (58.8%) and female (41.2%) invest in Raphia wine marketing, while males (100.0%) dominate production. Results showed that Raphia wine has potentials to generate household income, enhance food security and improve quality of life in rural, semi-urban and urban communities. Improved marketing channels, storage facilities and credit facilities via cooperative groups are recommended for producers and marketers by concerned agencies.

Keywords: Raphia wine, Profit margin, RORI, Livelihood, Nigeria.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2426
470 A Linear Regression Model for Estimating Anxiety Index Using Wide Area Frontal Lobe Brain Blood Volume

Authors: Takashi Kaburagi, Masashi Takenaka, Yosuke Kurihara, Takashi Matsumoto

Abstract:

Major depressive disorder (MDD) is one of the most common mental illnesses today. It is believed to be caused by a combination of several factors, including stress. Stress can be quantitatively evaluated using the State-Trait Anxiety Inventory (STAI), one of the best indices to evaluate anxiety. Although STAI scores are widely used in applications ranging from clinical diagnosis to basic research, the scores are calculated based on a self-reported questionnaire. An objective evaluation is required because the subject may intentionally change his/her answers if multiple tests are carried out. In this article, we present a modified index called the “multi-channel Laterality Index at Rest (mc-LIR)” by recording the brain activity from a wider area of the frontal lobe using multi-channel functional near-infrared spectroscopy (fNIRS). The presented index aims to measure multiple positions near the Fpz defined by the international 10-20 system positioning. Using 24 subjects, the dependencies on the number of measuring points used to calculate the mc-LIR and its correlation coefficients with the STAI scores are reported. Furthermore, a simple linear regression was performed to estimate the STAI scores from mc-LIR. The cross-validation error is also reported. The experimental results show that using multiple positions near the Fpz will improve the correlation coefficients and estimation than those using only two positions.

Keywords: Stress, functional near-infrared spectroscopy, frontal lobe, state-trait anxiety inventory score.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1166
469 Foreign Direct Investment on Economic Growth by Industries in Central and Eastern European Countries

Authors: Shorena Pharjiani

Abstract:

Present empirical paper investigates the relationship between FDI and economic growth by 10 selected industries in 10 Central and Eastern European countries from the period 1995 to 2012. Different estimation approaches were used to explore the connection between FDI and economic growth, for example OLS, RE, FE with and without time dummies. Obtained empirical results leads to some main consequences: First, the Central and East European countries (CEEC) attracted foreign direct investment, which raised the productivity of industries they entered in. It should be concluded that the linkage between FDI and output growth by industries is positive and significant enough to suggest that foreign firm’s participation enhanced the productivity of the industries they occupied. There had been an endogeneity problem in the regression and fixed effects estimation approach was used which partially corrected the regression analysis in order to make the results less biased. Second, it should be stressed that the results show that time has an important role in making FDI operational for enhancing output growth by industries via total factor productivity. Third, R&D positively affected economic growth and at the same time, it should take some time for research and development to influence economic growth. Fourth, the general trends masked crucial differences at the country level: over the last 20 years, the analysis of the tables and figures at the country level show that the main recipients of FDI of the 11 Central and Eastern European countries were Hungary, Poland and the Czech Republic. The main reason was that these countries had more open door policies for attracting the FDI. Fifth, according to the graphical analysis, while Hungary had the highest FDI inflow in this region, it was not reflected in the GDP growth as much as in other Central and Eastern European countries.

Keywords: Central and East European countries (CEEC), economic growth, FDI, panel data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1665
468 Customer Churn Prediction Using Four Machine Learning Algorithms Integrating Feature Selection and Normalization in the Telecom Sector

Authors: Alanoud Moraya Aldalan, Abdulaziz Almaleh

Abstract:

A crucial part of maintaining a customer-oriented business in the telecommunications industry is understanding the reasons and factors that lead to customer churn. Competition between telecom companies has greatly increased in recent years, which has made it more important to understand customers’ needs in this strong market. For those who are looking to turn over their service providers, understanding their needs is especially important. Predictive churn is now a mandatory requirement for retaining customers in the telecommunications industry. Machine learning can be used to accomplish this. Churn Prediction has become a very important topic in terms of machine learning classification in the telecommunications industry. Understanding the factors of customer churn and how they behave is very important to building an effective churn prediction model. This paper aims to predict churn and identify factors of customers’ churn based on their past service usage history. Aiming at this objective, the study makes use of feature selection, normalization, and feature engineering. Then, this study compared the performance of four different machine learning algorithms on the Orange dataset: Logistic Regression, Random Forest, Decision Tree, and Gradient Boosting. Evaluation of the performance was conducted by using the F1 score and ROC-AUC. Comparing the results of this study with existing models has proven to produce better results. The results showed the Gradients Boosting with feature selection technique outperformed in this study by achieving a 99% F1-score and 99% AUC, and all other experiments achieved good results as well.

Keywords: Machine Learning, Gradient Boosting, Logistic Regression, Churn, Random Forest, Decision Tree, ROC, AUC, F1-score.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 408
467 Replicating Brain’s Resting State Functional Connectivity Network Using a Multi-Factor Hub-Based Model

Authors: B. L. Ho, L. Shi, D. F. Wang, V. C. T. Mok

Abstract:

The brain’s functional connectivity while temporally non-stationary does express consistency at a macro spatial level. The study of stable resting state connectivity patterns hence provides opportunities for identification of diseases if such stability is severely perturbed. A mathematical model replicating the brain’s spatial connections will be useful for understanding brain’s representative geometry and complements the empirical model where it falls short. Empirical computations tend to involve large matrices and become infeasible with fine parcellation. However, the proposed analytical model has no such computational problems. To improve replicability, 92 subject data are obtained from two open sources. The proposed methodology, inspired by financial theory, uses multivariate regression to find relationships of every cortical region of interest (ROI) with some pre-identified hubs. These hubs acted as representatives for the entire cortical surface. A variance-covariance framework of all ROIs is then built based on these relationships to link up all the ROIs. The result is a high level of match between model and empirical correlations in the range of 0.59 to 0.66 after adjusting for sample size; an increase of almost forty percent. More significantly, the model framework provides an intuitive way to delineate between systemic drivers and idiosyncratic noise while reducing dimensions by more than 30 folds, hence, providing a way to conduct attribution analysis. Due to its analytical nature and simple structure, the model is useful as a standalone toolkit for network dependency analysis or as a module for other mathematical models.

Keywords: Functional magnetic resonance imaging, multivariate regression, network hubs, resting state functional connectivity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 807
466 Evaluation of Short-Term Load Forecasting Techniques Applied for Smart Micro Grids

Authors: Xiaolei Hu, Enrico Ferrera, Riccardo Tomasi, Claudio Pastrone

Abstract:

Load Forecasting plays a key role in making today's and future's Smart Energy Grids sustainable and reliable. Accurate power consumption prediction allows utilities to organize in advance their resources or to execute Demand Response strategies more effectively, which enables several features such as higher sustainability, better quality of service, and affordable electricity tariffs. It is easy yet effective to apply Load Forecasting at larger geographic scale, i.e. Smart Micro Grids, wherein the lower available grid flexibility makes accurate prediction more critical in Demand Response applications. This paper analyses the application of short-term load forecasting in a concrete scenario, proposed within the EU-funded GreenCom project, which collect load data from single loads and households belonging to a Smart Micro Grid. Three short-term load forecasting techniques, i.e. linear regression, artificial neural networks, and radial basis function network, are considered, compared, and evaluated through absolute forecast errors and training time. The influence of weather conditions in Load Forecasting is also evaluated. A new definition of Gain is introduced in this paper, which innovatively serves as an indicator of short-term prediction capabilities of time spam consistency. Two models, 24- and 1-hour-ahead forecasting, are built to comprehensively compare these three techniques.

Keywords: Short-term load forecasting, smart micro grid, linear regression, artificial neural networks, radial basis function network, Gain.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2602
465 The Impact of Socio-Economic and Type of Religion on the Behavior of Obedience among Arab-Israeli Teenagers

Authors: Sadhana Ghnayem

Abstract:

This article examines the relationship between several socio-economic and background variables of Arab-Israeli families and their effect on the conflict management style of forcing, where teenage children are expected to obey their parents without questioning. The article explores the inter-generational gap and the desire of Arab-Israeli parents to force their teenage children to obey without questioning. The independent variables include: the sex of the parent, religion (Christian or Muslim), income of the parent, years of education of the parent, and the sex of the teenage child. We use the dependent variable of “Obedience Without Questioning” that is reported twice: by each of the parents as well as by the children. We circulated a questionnaire and collected data from a sample of 180 parents and their adolescent child living in the Galilee area during 2018. In this questionnaire we asked each of the parent and his/her teenage child about whether the latter is expected to follow the instructions of the former without questioning. The outcome of this article indicates, first, that Christian-Arab families are less authoritarian than Muslims families in demanding sheer obedience from their children. Second, female parents indicate more than male parents that their teenage child indeed obeys without questioning. Third, there is a negative correlation between the variable “Income” and “Obedience without Questioning.” Yet, the regression coefficient of this variable is close zero. Fourth, there is a positive correlation between years of education and obedience reported by the children. In other words, more educated parents are more likely to demand obedience from their children.  Finally, after running the regression, the study also found that the impact of the variables of religion as well as the sex of the child on the dependent variable of obedience is also significant at above 95 and 90%, respectively.

Keywords: Arab-Israeli parents, Obedience, Forcing, Inter-generational gap.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 793
464 Meta Model for Optimum Design Objective Function of Steel Frames Subjected to Seismic Loads

Authors: Salah R. Al Zaidee, Ali S. Mahdi

Abstract:

Except for simple problems of statically determinate structures, optimum design problems in structural engineering have implicit objective functions where structural analysis and design are essential within each searching loop. With these implicit functions, the structural engineer is usually enforced to write his/her own computer code for analysis, design, and searching for optimum design among many feasible candidates and cannot take advantage of available software for structural analysis, design, and searching for the optimum solution. The meta-model is a regression model used to transform an implicit objective function into objective one and leads in turn to decouple the structural analysis and design processes from the optimum searching process. With the meta-model, well-known software for structural analysis and design can be used in sequence with optimum searching software. In this paper, the meta-model has been used to develop an explicit objective function for plane steel frames subjected to dead, live, and seismic forces. Frame topology is assumed as predefined based on architectural and functional requirements. Columns and beams sections and different connections details are the main design variables in this study. Columns and beams are grouped to reduce the number of design variables and to make the problem similar to that adopted in engineering practice. Data for the implicit objective function have been generated based on analysis and assessment for many design proposals with CSI SAP software. These data have been used later in SPSS software to develop a pure quadratic nonlinear regression model for the explicit objective function. Good correlations with a coefficient, R2, in the range from 0.88 to 0.99 have been noted between the original implicit functions and the corresponding explicit functions generated with meta-model.

Keywords: Meta-modal, objective function, steel frames, seismic analysis, design.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1333
463 Forecast of the Small Wind Turbines Sales with Replacement Purchases and with or without Account of Price Changes

Authors: V. Churkin, M. Lopatin

Abstract:

The purpose of the paper is to estimate the US small wind turbines market potential and forecast the small wind turbines sales in the US. The forecasting method is based on the application of the Bass model and the generalized Bass model of innovations diffusion under replacement purchases. In the work an exponential distribution is used for modeling of replacement purchases. Only one parameter of such distribution is determined by average lifetime of small wind turbines. The identification of the model parameters is based on nonlinear regression analysis on the basis of the annual sales statistics which has been published by the American Wind Energy Association (AWEA) since 2001 up to 2012. The estimation of the US average market potential of small wind turbines (for adoption purchases) without account of price changes is 57080 (confidence interval from 49294 to 64866 at P = 0.95) under average lifetime of wind turbines 15 years, and 62402 (confidence interval from 54154 to 70648 at P = 0.95) under average lifetime of wind turbines 20 years. In the first case the explained variance is 90,7%, while in the second - 91,8%. The effect of the wind turbines price changes on their sales was estimated using generalized Bass model. This required a price forecast. To do this, the polynomial regression function, which is based on the Berkeley Lab statistics, was used. The estimation of the US average market potential of small wind turbines (for adoption purchases) in that case is 42542 (confidence interval from 32863 to 52221 at P = 0.95) under average lifetime of wind turbines 15 years, and 47426 (confidence interval from 36092 to 58760 at P = 0.95) under average lifetime of wind turbines 20 years. In the first case the explained variance is 95,3%, while in the second – 95,3%.

Keywords: Bass model, generalized Bass model, replacement purchases, sales forecasting of innovations, statistics of sales of small wind turbines in the United States.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1883
462 Full-genomic Network Inference for Non-model organisms: A Case Study for the Fungal Pathogen Candida albicans

Authors: Jörg Linde, Ekaterina Buyko, Robert Altwasser, Udo Hahn, Reinhard Guthke

Abstract:

Reverse engineering of full-genomic interaction networks based on compendia of expression data has been successfully applied for a number of model organisms. This study adapts these approaches for an important non-model organism: The major human fungal pathogen Candida albicans. During the infection process, the pathogen can adapt to a wide range of environmental niches and reversibly changes its growth form. Given the importance of these processes, it is important to know how they are regulated. This study presents a reverse engineering strategy able to infer fullgenomic interaction networks for C. albicans based on a linear regression, utilizing the sparseness criterion (LASSO). To overcome the limited amount of expression data and small number of known interactions, we utilize different prior-knowledge sources guiding the network inference to a knowledge driven solution. Since, no database of known interactions for C. albicans exists, we use a textmining system which utilizes full-text research papers to identify known regulatory interactions. By comparing with these known regulatory interactions, we find an optimal value for global modelling parameters weighting the influence of the sparseness criterion and the prior-knowledge. Furthermore, we show that soft integration of prior-knowledge additionally improves the performance. Finally, we compare the performance of our approach to state of the art network inference approaches.

Keywords: Pathogen, network inference, text-mining, Candida albicans, LASSO, mutual information, reverse engineering, linear regression, modelling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1673
461 An Automated Stock Investment System Using Machine Learning Techniques: An Application in Australia

Authors: Carol Anne Hargreaves

Abstract:

A key issue in stock investment is how to select representative features for stock selection. The objective of this paper is to firstly determine whether an automated stock investment system, using machine learning techniques, may be used to identify a portfolio of growth stocks that are highly likely to provide returns better than the stock market index. The second objective is to identify the technical features that best characterize whether a stock’s price is likely to go up and to identify the most important factors and their contribution to predicting the likelihood of the stock price going up. Unsupervised machine learning techniques, such as cluster analysis, were applied to the stock data to identify a cluster of stocks that was likely to go up in price – portfolio 1. Next, the principal component analysis technique was used to select stocks that were rated high on component one and component two – portfolio 2. Thirdly, a supervised machine learning technique, the logistic regression method, was used to select stocks with a high probability of their price going up – portfolio 3. The predictive models were validated with metrics such as, sensitivity (recall), specificity and overall accuracy for all models. All accuracy measures were above 70%. All portfolios outperformed the market by more than eight times. The top three stocks were selected for each of the three stock portfolios and traded in the market for one month. After one month the return for each stock portfolio was computed and compared with the stock market index returns. The returns for all three stock portfolios was 23.87% for the principal component analysis stock portfolio, 11.65% for the logistic regression portfolio and 8.88% for the K-means cluster portfolio while the stock market performance was 0.38%. This study confirms that an automated stock investment system using machine learning techniques can identify top performing stock portfolios that outperform the stock market.

Keywords: Machine learning, stock market trading, logistic principal component analysis, automated stock investment system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1098
460 Machine Learning Techniques in Bank Credit Analysis

Authors: Fernanda M. Assef, Maria Teresinha A. Steiner

Abstract:

The aim of this paper is to compare and discuss better classifier algorithm options for credit risk assessment by applying different Machine Learning techniques. Using records from a Brazilian financial institution, this study uses a database of 5,432 companies that are clients of the bank, where 2,600 clients are classified as non-defaulters, 1,551 are classified as defaulters and 1,281 are temporarily defaulters, meaning that the clients are overdue on their payments for up 180 days. For each case, a total of 15 attributes was considered for a one-against-all assessment using four different techniques: Artificial Neural Networks Multilayer Perceptron (ANN-MLP), Artificial Neural Networks Radial Basis Functions (ANN-RBF), Logistic Regression (LR) and finally Support Vector Machines (SVM). For each method, different parameters were analyzed in order to obtain different results when the best of each technique was compared. Initially the data were coded in thermometer code (numerical attributes) or dummy coding (for nominal attributes). The methods were then evaluated for each parameter and the best result of each technique was compared in terms of accuracy, false positives, false negatives, true positives and true negatives. This comparison showed that the best method, in terms of accuracy, was ANN-RBF (79.20% for non-defaulter classification, 97.74% for defaulters and 75.37% for the temporarily defaulter classification). However, the best accuracy does not always represent the best technique. For instance, on the classification of temporarily defaulters, this technique, in terms of false positives, was surpassed by SVM, which had the lowest rate (0.07%) of false positive classifications. All these intrinsic details are discussed considering the results found, and an overview of what was presented is shown in the conclusion of this study.

Keywords: Artificial Neural Networks, ANNs, classifier algorithms, credit risk assessment, logistic regression, machine learning, support vector machines.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1281
459 Dispersion Rate of Spilled Oil in Water Column under Non-Breaking Water Waves

Authors: Hanifeh Imanian, Morteza Kolahdoozan

Abstract:

The purpose of this study is to present a mathematical phrase for calculating the dispersion rate of spilled oil in water column under non-breaking waves. In this regard, a multiphase numerical model is applied for which waves and oil phase were computed concurrently, and accuracy of its hydraulic calculations have been proven. More than 200 various scenarios of oil spilling in wave waters were simulated using the multiphase numerical model and its outcome were collected in a database. The recorded results were investigated to identify the major parameters affected vertical oil dispersion and finally 6 parameters were identified as main independent factors. Furthermore, some statistical tests were conducted to identify any relationship between the dependent variable (dispersed oil mass in the water column) and independent variables (water wave specifications containing height, length and wave period and spilled oil characteristics including density, viscosity and spilled oil mass). Finally, a mathematical-statistical relationship is proposed to predict dispersed oil in marine waters. To verify the proposed relationship, a laboratory example available in the literature was selected. Oil mass rate penetrated in water body computed by statistical regression was in accordance with experimental data was predicted. On this occasion, it was necessary to verify the proposed mathematical phrase. In a selected laboratory case available in the literature, mass oil rate penetrated in water body computed by suggested regression. Results showed good agreement with experimental data. The validated mathematical-statistical phrase is a useful tool for oil dispersion prediction in oil spill events in marine areas.

Keywords: Dispersion, marine environment, mathematical-statistical relationship, oil spill.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1146
458 The Loess Regression Relationship Between Age and BMI for both Sydney World Masters Games Athletes and the Australian National Population

Authors: Joe Walsh, Mike Climstein, Ian Timothy Heazlewood, Stephen Burke, Jyrki Kettunen, Kent Adams, Mark DeBeliso

Abstract:

Thousands of masters athletes participate quadrennially in the World Masters Games (WMG), yet this cohort of athletes remains proportionately under-investigated. Due to a growing global obesity pandemic in context of benefits of physical activity across the lifespan, the BMI trends for this unique population was of particular interest. The nexus between health, physical activity and aging is complex and has raised much interest in recent times due to the realization that a multifaceted approach is necessary in order to counteract the obesity pandemic. By investigating age based trends within a population adhering to competitive sport at older ages, further insight might be gleaned to assist in understanding one of many factors influencing this relationship.BMI was derived using data gathered on a total of 6,071 masters athletes (51.9% male, 48.1% female) aged 25 to 91 years ( =51.5, s =±9.7), competing at the Sydney World Masters Games (2009). Using linear and loess regression it was demonstrated that the usual tendency for prevalence of higher BMI increasing with age was reversed in the sample. This trend in reversal was repeated for both male and female only sub-sets of the sample participants, indicating the possibility of improved prevalence of BMI with increasing age for both the sample as a whole and these individual sub-groups.This evidence of improved classification in one index of health (reduced BMI) for masters athletes (when compared to the general population) implies there are either improved levels of this index of health with aging due to adherence to sport or possibly the reduced BMI is advantageous and contributes to this cohort adhering (or being attracted) to masters sport at older ages.

Keywords: Aging, masters athlete, Quetelet Index, sport

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1712
457 A Quantitative Model for Determining the Area of the “Core and Structural System Elements” of Tall Office Buildings

Authors: Görkem Arslan Kılınç

Abstract:

Due to the high construction, operation, and maintenance costs of tall buildings, quantification of the area in the plan layout which provides a financial return is an important design criterion. The area of the “core and the structural system elements” does not provide financial return but must exist in the plan layout. Some characteristic items of tall office buildings affect the size of these areas. From this point of view, 15 tall office buildings were systematically investigated. The typical office floor plans of these buildings were re-produced digitally. The area of the “core and the structural system elements” in each building and the characteristic items of each building were calculated. These characteristic items are the size of the long and short plan edge, plan length/width ratio, size of the core long and short edge, core length/width ratio, core area, slenderness, building height, number of floors, and floor height. These items were analyzed by correlation and regression analyses. Results of this paper put forward that; characteristic items which affect the area of "core and structural system elements" are plan long and short edge size, core short edge size, building height, and the number of floors. A one-unit increase in plan short side size increases the area of the "core and structural system elements" in the plan by 12,378 m2. An increase in core short edge size increases the area of the core and structural system elements in the plan by 25,650 m2. Subsequent studies can be conducted by expanding the sample of the study and considering the geographical location of the building.

Keywords: Core area, correlation analysis, floor area, regression analysis, space efficiency, tall office buildings.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 506
456 Meta Model Based EA for Complex Optimization

Authors: Maumita Bhattacharya

Abstract:

Evolutionary Algorithms are population-based, stochastic search techniques, widely used as efficient global optimizers. However, many real life optimization problems often require finding optimal solution to complex high dimensional, multimodal problems involving computationally very expensive fitness function evaluations. Use of evolutionary algorithms in such problem domains is thus practically prohibitive. An attractive alternative is to build meta models or use an approximation of the actual fitness functions to be evaluated. These meta models are order of magnitude cheaper to evaluate compared to the actual function evaluation. Many regression and interpolation tools are available to build such meta models. This paper briefly discusses the architectures and use of such meta-modeling tools in an evolutionary optimization context. We further present two evolutionary algorithm frameworks which involve use of meta models for fitness function evaluation. The first framework, namely the Dynamic Approximate Fitness based Hybrid EA (DAFHEA) model [14] reduces computation time by controlled use of meta-models (in this case approximate model generated by Support Vector Machine regression) to partially replace the actual function evaluation by approximate function evaluation. However, the underlying assumption in DAFHEA is that the training samples for the metamodel are generated from a single uniform model. This does not take into account uncertain scenarios involving noisy fitness functions. The second model, DAFHEA-II, an enhanced version of the original DAFHEA framework, incorporates a multiple-model based learning approach for the support vector machine approximator to handle noisy functions [15]. Empirical results obtained by evaluating the frameworks using several benchmark functions demonstrate their efficiency

Keywords: Meta model, Evolutionary algorithm, Stochastictechnique, Fitness function, Optimization, Support vector machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2067
455 Work Engagement of Malaysian Nurses: Exploring the Impact of Hope and Resilience

Authors: Noraini Othman, Aizzat Mohd Nasurdin

Abstract:

The purpose of this study was to investigate the relationship between hope and resilience with work engagement. A total of 422 staff nurses working in three public hospitals in Peninsular Malaysia participated in this study. Statistical results using regression analysis revealed that hope and resilience were positively related to work engagement. Possible reasons for these findings, as well as their implications and future research directions are discussed.

Keywords: hope, nurses, resilience, work engagement

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3752
454 An Economic Analysis of Phu Kradueng National Park

Authors: Chutarat Boontho

Abstract:

The purposes of this study were as follows to evaluate the economic value of Phu Kradueng National Park by the travel cost method (TCM) and the contingent valuation method (CVM) and to estimate the demand for traveling and the willingness to pay. The data for this study were collected by conducting two large scale surveys on users and non-users. A total of 1,016 users and 1,034 non-users were interviewed. The data were analyzed using multiple linear regression analysis, logistic regression model and the consumer surplus (CS) was the integral of demand function for trips. The survey found, were as follows: 1)Using the travel cost method which provides an estimate of direct benefits to park users, we found that visitors- total willingness to pay per visit was 2,284.57 bath, of which 958.29 bath was travel cost, 1,129.82 bath was expenditure for accommodation, food, and services, and 166.66 bath was consumer surplus or the visitors -net gain or satisfaction from the visit (the integral of demand function for trips). 2) Thai visitors to Phu Kradueng National Park were further willing to pay an average of 646.84 bath per head per year to ensure the continued existence of Phu Kradueng National Park and to preserve their option to use it in the future. 3) Thai non-visitors, on the other hand, are willing to pay an average of 212.61 bath per head per year for the option and existence value provided by the Park. 4) The total economic value of Phu Kradueng National Park to Thai visitors and non-visitors taken together stands today at 9,249.55 million bath per year. 5) The users- average willingness to pay for access to Phu Kradueng National Park rises from 40 bath to 84.66 bath per head per trip for improved services such as road improvement, increased cleanliness, and upgraded information. This paper was needed to investigate of the potential market demand for bio prospecting in Phu Kradueng national Park and to investigate how a larger share of the economic benefits of tourism could be distributed income to the local residents.

Keywords: Contingent Valuation Method, Travel Cost Method, Consumer surplus.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1788
453 Integrated Mass Rapid Transit System for Smart City Project in Western India

Authors: Debasis Sarkar, Jatan Talati

Abstract:

This paper is an attempt to develop an Integrated Mass Rapid Transit System (MRTS) for a smart city project in Western India. Integrated transportation is one of the enablers of smart transportation for providing a seamless intercity as well as regional level transportation experience. The success of a smart city project at the city level for transportation is providing proper integration to different mass rapid transit modes by way of integrating information, physical, network of routes fares, etc. The methodology adopted for this study was primary data research through questionnaire survey. The respondents of the questionnaire survey have responded on the issues about their perceptions on the ways and means to improve public transport services in urban cities. The respondents were also required to identify the factors and attributes which might motivate more people to shift towards the public mode. Also, the respondents were questioned about the factors which they feel might restrain the integration of various modes of MRTS. Furthermore, this study also focuses on developing a utility equation for respondents with the help of multiple linear regression analysis and its probability to shift to public transport for certain factors listed in the questionnaire. It has been observed that for shifting to public transport, the most important factors that need to be considered were travel time saving and comfort rating. Also, an Integrated MRTS can be obtained by combining metro rail with BRTS, metro rail with monorail, monorail with BRTS and metro rail with Indian railways. Providing a common smart card to transport users for accessing all the different available modes would be a pragmatic solution towards integration of the available modes of MRTS.

Keywords: Mass rapid transit systems, smart city, metro rail, bus rapid transit system, multiple linear regression, smart card, automated fare collection system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1226
452 Assessment of Path Loss Prediction Models for Wireless Propagation Channels at L-Band Frequency over Different Micro-Cellular Environments of Ekiti State, Southwestern Nigeria

Authors: C. I. Abiodun, S. O. Azi, J. S. Ojo, P. Akinyemi

Abstract:

The design of accurate and reliable mobile communication systems depends majorly on the suitability of path loss prediction methods and the adaptability of the methods to various environments of interest. In this research, the results of the adaptability of radio channel behavior are presented based on practical measurements carried out in the 1800 MHz frequency band. The measurements are carried out in typical urban, suburban and rural environments in Ekiti State, Southwestern part of Nigeria. A total number of seven base stations of MTN GSM service located in the studied environments were monitored. Path loss and break point distances were deduced from the measured received signal strength (RSS) and a practical path loss model is proposed based on the deduced break point distances. The proposed two slope model, regression line and four existing path loss models were compared with the measured path loss values. The standard deviations of each model with respect to the measured path loss were estimated for each base station. The proposed model and regression line exhibited lowest standard deviations followed by the Cost231-Hata model when compared with the Erceg Ericsson and SUI models. Generally, the proposed two-slope model shows closest agreement with the measured values with a mean error values of 2 to 6 dB. These results show that, either the proposed two slope model or Cost 231-Hata model may be used to predict path loss values in mobile micro cell coverage in the well-considered environments. Information from this work will be useful for link design of microwave band wireless access systems in the region.

Keywords: Break-point distances, path loss models, path loss exponent, received signal strength.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 819
451 Estimating the Life-Distribution Parameters of Weibull-Life PV Systems Utilizing Non-Parametric Analysis

Authors: Saleem Z. Ramadan

Abstract:

In this paper, a model is proposed to determine the life distribution parameters of the useful life region for the PV system utilizing a combination of non-parametric and linear regression analysis for the failure data of these systems. Results showed that this method is dependable for analyzing failure time data for such reliable systems when the data is scarce.

Keywords: Masking, Bathtub model, reliability, non-parametric analysis, useful life.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1843
450 Surface Water Flow of Urban Areas and Sustainable Urban Planning

Authors: Sheetal Sharma

Abstract:

Urban planning is associated with land transformation from natural areas to modified and developed ones which leads to modification of natural environment. The basic knowledge of relationship between both should be ascertained before proceeding for the development of natural areas. Changes on land surface due to build up pavements, roads and similar land cover, affect surface water flow. There is a gap between urban planning and basic knowledge of hydrological processes which should be known to the planners. The paper aims to identify these variations in surface flow due to urbanization for a temporal scale of 40 years using Storm Water Management Mode (SWMM) and again correlating these findings with the urban planning guidelines in study area along with geological background to find out the suitable combinations of land cover, soil and guidelines. For the purpose of identifying the changes in surface flows, 19 catchments were identified with different geology and growth in 40 years facing different ground water levels fluctuations. The increasing built up, varying surface runoff are studied using Arc GIS and SWMM modeling, regression analysis for runoff. Resulting runoff for various land covers and soil groups with varying built up conditions were observed. The modeling procedures also included observations for varying precipitation and constant built up in all catchments. All these observations were combined for individual catchment and single regression curve was obtained for runoff. Thus, it was observed that alluvial with suitable land cover was better for infiltration and least generation of runoff but excess built up could not be sustained on alluvial soil. Similarly, basalt had least recharge and most runoff demanding maximum vegetation over it. Sandstone resulted in good recharging if planned with more open spaces and natural soils with intermittent vegetation. Hence, these observations made a keystone base for planners while planning various land uses on different soils. This paper contributes and provides a solution to basic knowledge gap, which urban planners face during development of natural surfaces.

Keywords: Runoff, built up, roughness, recharge, temporal changes.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1459
449 Aircraft Gas Turbine Engines Technical Condition Identification System

Authors: A. M. Pashayev, C. Ardil, D. D. Askerov, R. A. Sadiqov, P. S. Abdullayev

Abstract:

In this paper is shown that the probability-statistic methods application, especially at the early stage of the aviation gas turbine engine (GTE) technical condition diagnosing, when the flight information has property of the fuzzy, limitation and uncertainty is unfounded. Hence is considered the efficiency of application of new technology Soft Computing at these diagnosing stages with the using of the Fuzzy Logic and Neural Networks methods. Training with high accuracy of fuzzy multiple linear and non-linear models (fuzzy regression equations) which received on the statistical fuzzy data basis is made. Thus for GTE technical condition more adequate model making are analysed dynamics of skewness and kurtosis coefficients' changes. Researches of skewness and kurtosis coefficients values- changes show that, distributions of GTE work parameters have fuzzy character. Hence consideration of fuzzy skewness and kurtosis coefficients is expedient. Investigation of the basic characteristics changes- dynamics of GTE work parameters allows to draw conclusion on necessity of the Fuzzy Statistical Analysis at preliminary identification of the engines' technical condition. Researches of correlation coefficients values- changes shows also on their fuzzy character. Therefore for models choice the application of the Fuzzy Correlation Analysis results is offered. For checking of models adequacy is considered the Fuzzy Multiple Correlation Coefficient of Fuzzy Multiple Regression. At the information sufficiency is offered to use recurrent algorithm of aviation GTE technical condition identification (Hard Computing technology is used) on measurements of input and output parameters of the multiple linear and non-linear generalised models at presence of noise measured (the new recursive Least Squares Method (LSM)). The developed GTE condition monitoring system provides stage-bystage estimation of engine technical conditions. As application of the given technique the estimation of the new operating aviation engine temperature condition was made.

Keywords: Gas turbine engines, neural networks, fuzzy logic, fuzzy statistics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1904