Search results for: panel data regression models
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 29333

Search results for: panel data regression models

29033 Consumption Insurance against the Chronic Illness: Evidence from Thailand

Authors: Yuthapoom Thanakijborisut

Abstract:

This paper studies consumption insurance against the chronic illness in Thailand. The study estimates the impact of household consumption in the chronic illness on consumption growth. Chronic illness is the health care costs of a person or a household’s decision in treatment for the long term; the causes and effects of the household’s ability for smooth consumption. The chronic illnesses are measured in health status when at least one member within the household faces the chronic illness. The data used is from the Household Social Economic Panel Survey conducted during 2007 and 2012. The survey collected data from approximately 6,000 households from every province, both inside and outside municipal areas in Thailand. The study estimates the change in household consumption by using an ordinary least squares (OLS) regression model. The result shows that the members within the household facing the chronic illness would reduce the consumption by around 4%. This case indicates that consumption insurance in Thailand is quite sufficient against chronic illness.

Keywords: consumption insurance, chronic illness, health care, Thailand

Procedia PDF Downloads 213
29032 Modeling Geogenic Groundwater Contamination Risk with the Groundwater Assessment Platform (GAP)

Authors: Joel Podgorski, Manouchehr Amini, Annette Johnson, Michael Berg

Abstract:

One-third of the world’s population relies on groundwater for its drinking water. Natural geogenic arsenic and fluoride contaminate ~10% of wells. Prolonged exposure to high levels of arsenic can result in various internal cancers, while high levels of fluoride are responsible for the development of dental and crippling skeletal fluorosis. In poor urban and rural settings, the provision of drinking water free of geogenic contamination can be a major challenge. In order to efficiently apply limited resources in the testing of wells, water resource managers need to know where geogenically contaminated groundwater is likely to occur. The Groundwater Assessment Platform (GAP) fulfills this need by providing state-of-the-art global arsenic and fluoride contamination hazard maps as well as enabling users to create their own groundwater quality models. The global risk models were produced by logistic regression of arsenic and fluoride measurements using predictor variables of various soil, geological and climate parameters. The maps display the probability of encountering concentrations of arsenic or fluoride exceeding the World Health Organization’s (WHO) stipulated concentration limits of 10 µg/L or 1.5 mg/L, respectively. In addition to a reconsideration of the relevant geochemical settings, these second-generation maps represent a great improvement over the previous risk maps due to a significant increase in data quantity and resolution. For example, there is a 10-fold increase in the number of measured data points, and the resolution of predictor variables is generally 60 times greater. These same predictor variable datasets are available on the GAP platform for visualization as well as for use with a modeling tool. The latter requires that users upload their own concentration measurements and select the predictor variables that they wish to incorporate in their models. In addition, users can upload additional predictor variable datasets either as features or coverages. Such models can represent an improvement over the global models already supplied, since (a) users may be able to use their own, more detailed datasets of measured concentrations and (b) the various processes leading to arsenic and fluoride groundwater contamination can be isolated more effectively on a smaller scale, thereby resulting in a more accurate model. All maps, including user-created risk models, can be downloaded as PDFs. There is also the option to share data in a secure environment as well as the possibility to collaborate in a secure environment through the creation of communities. In summary, GAP provides users with the means to reliably and efficiently produce models specific to their region of interest by making available the latest datasets of predictor variables along with the necessary modeling infrastructure.

Keywords: arsenic, fluoride, groundwater contamination, logistic regression

Procedia PDF Downloads 317
29031 Optimal Designof Brush Roll for Semiconductor Wafer Using CFD Analysis

Authors: Byeong-Sam Kim, Kyoungwoo Park

Abstract:

This research analyzes structure of flat panel display (FPD) such as LCD as quantitative through CFD analysis and modeling change to minimize the badness rate and rate of production decrease by damage of large scale plater at wafer heating chamber at semi-conductor manufacturing process. This glass panel and wafer device with atmospheric pressure or chemical vapor deposition equipment for transporting and transferring wafers, robot hands carry these longer and wider wafers can also be easily handled. As a contact handling system composed of several problems in increased potential for fracture or warping. A non-contact handling system is required to solve this problem. The panel and wafer warping makes it difficult to carry out conventional contact to analysis. We propose a new non-contact transportation system with combining air suction and blowout. The numerical analysis and experimental is, therefore, should be performed to obtain compared to results achieved with non-contact solutions. This wafer panel noncontact handler shows its strength in maintaining high cleanliness levels for semiconductor production processes.

Keywords: flat panel display, non contact transportation, heat treatment process, CFD analysis

Procedia PDF Downloads 388
29030 Comparison of Different k-NN Models for Speed Prediction in an Urban Traffic Network

Authors: Seyoung Kim, Jeongmin Kim, Kwang Ryel Ryu

Abstract:

A database that records average traffic speeds measured at five-minute intervals for all the links in the traffic network of a metropolitan city. While learning from this data the models that can predict future traffic speed would be beneficial for the applications such as the car navigation system, building predictive models for every link becomes a nontrivial job if the number of links in a given network is huge. An advantage of adopting k-nearest neighbor (k-NN) as predictive models is that it does not require any explicit model building. Instead, k-NN takes a long time to make a prediction because it needs to search for the k-nearest neighbors in the database at prediction time. In this paper, we investigate how much we can speed up k-NN in making traffic speed predictions by reducing the amount of data to be searched for without a significant sacrifice of prediction accuracy. The rationale behind this is that we had a better look at only the recent data because the traffic patterns not only repeat daily or weekly but also change over time. In our experiments, we build several different k-NN models employing different sets of features which are the current and past traffic speeds of the target link and the neighbor links in its up/down-stream. The performances of these models are compared by measuring the average prediction accuracy and the average time taken to make a prediction using various amounts of data.

Keywords: big data, k-NN, machine learning, traffic speed prediction

Procedia PDF Downloads 333
29029 Effect of Corrosion on the Shear Buckling Strength

Authors: Myoung-Jin Lee, Sung-Jin Lee, Young-Kon Park, Jin-Wook Kim, Bo-Kyoung Kim, Song-Hun Chong, Sun-Ii Kim

Abstract:

The ability to resist the shear strength arises mainly from the web panel of steel girders and as such, the shear buckling strength of these girders has been extensively investigated. For example, Blaser’s reported that when buckling occurs, the tension field has an effect after the buckling strength of the steel is reached. The findings of these studies have been applied by AASHTO, AISC, and to the European Code that provides guidelines for designs aimed at preventing shear buckling. Steel girders are susceptible to corrosion resulting from exposure to natural elements such as rainfall, humidity, and temperature. This corrosion leads to a reduction in the size of the web panel section, thereby resulting in a decrease in the shear strength. The decrease in the panel section has a significant effect on the maintenance section of the bridge. However, in most conventional designs, the influence of corrosion is overlooked during the calculation of the shear buckling strength and hence over-design is common. Therefore, in this study, a steel girder with an A/D of 1:1, as well as a 6-mm-, 16-mm-, and 12-mm-thick web panel, flange, and intermediate reinforcing material, respectively, were used. The total length was set to that (3200 mm) of the default model. The effect of corrosion shear buckling was investigated by determining the volume amount of corrosion, shape of the erosion patterns, and the angular change in the tensile field of the shear buckling strength. This study provides the basic data that will enable designs that incorporate values closer (than those used in most conventional designs) to the actual shear buckling strength.

Keywords: corrosion, shear buckling strength, steel girder, shear strength

Procedia PDF Downloads 348
29028 A Quadratic Model to Early Predict the Blastocyst Stage with a Time Lapse Incubator

Authors: Cecile Edel, Sandrine Giscard D'Estaing, Elsa Labrune, Jacqueline Lornage, Mehdi Benchaib

Abstract:

Introduction: The use of incubator equipped with time-lapse technology in Artificial Reproductive Technology (ART) allows a continuous surveillance. With morphocinetic parameters, algorithms are available to predict the potential outcome of an embryo. However, the different proposed time-lapse algorithms do not take account the missing data, and then some embryos could not be classified. The aim of this work is to construct a predictive model even in the case of missing data. Materials and methods: Patients: A retrospective study was performed, in biology laboratory of reproduction at the hospital ‘Femme Mère Enfant’ (Lyon, France) between 1 May 2013 and 30 April 2015. Embryos (n= 557) obtained from couples (n=108) were cultured in a time-lapse incubator (Embryoscope®, Vitrolife, Goteborg, Sweden). Time-lapse incubator: The morphocinetic parameters obtained during the three first days of embryo life were used to build the predictive model. Predictive model: A quadratic regression was performed between the number of cells and time. N = a. T² + b. T + c. N: number of cells at T time (T in hours). The regression coefficients were calculated with Excel software (Microsoft, Redmond, WA, USA), a program with Visual Basic for Application (VBA) (Microsoft) was written for this purpose. The quadratic equation was used to find a value that allows to predict the blastocyst formation: the synthetize value. The area under the curve (AUC) obtained from the ROC curve was used to appreciate the performance of the regression coefficients and the synthetize value. A cut-off value has been calculated for each regression coefficient and for the synthetize value to obtain two groups where the difference of blastocyst formation rate according to the cut-off values was maximal. The data were analyzed with SPSS (IBM, Il, Chicago, USA). Results: Among the 557 embryos, 79.7% had reached the blastocyst stage. The synthetize value corresponds to the value calculated with time value equal to 99, the highest AUC was then obtained. The AUC for regression coefficient ‘a’ was 0.648 (p < 0.001), 0.363 (p < 0.001) for the regression coefficient ‘b’, 0.633 (p < 0.001) for the regression coefficient ‘c’, and 0.659 (p < 0.001) for the synthetize value. The results are presented as follow: blastocyst formation rate under cut-off value versus blastocyst rate formation above cut-off value. For the regression coefficient ‘a’ the optimum cut-off value was -1.14.10-3 (61.3% versus 84.3%, p < 0.001), 0.26 for the regression coefficient ‘b’ (83.9% versus 63.1%, p < 0.001), -4.4 for the regression coefficient ‘c’ (62.2% versus 83.1%, p < 0.001) and 8.89 for the synthetize value (58.6% versus 85.0%, p < 0.001). Conclusion: This quadratic regression allows to predict the outcome of an embryo even in case of missing data. Three regression coefficients and a synthetize value could represent the identity card of an embryo. ‘a’ regression coefficient represents the acceleration of cells division, ‘b’ regression coefficient represents the speed of cell division. We could hypothesize that ‘c’ regression coefficient could represent the intrinsic potential of an embryo. This intrinsic potential could be dependent from oocyte originating the embryo. These hypotheses should be confirmed by studies analyzing relationship between regression coefficients and ART parameters.

Keywords: ART procedure, blastocyst formation, time-lapse incubator, quadratic model

Procedia PDF Downloads 287
29027 Bank, Stock Market Efficiency and Economic Growth: Lessons for ASEAN-5

Authors: Tan Swee Liang

Abstract:

This paper estimates bank and stock market efficiency associations with real per capita GDP growth by examining panel-data across three different regions using Panel-Corrected Standard Errors (PCSE) regression developed by Beck and Katz (1995). Data from five economies in ASEAN (Singapore, Malaysia, Thailand, Philippines, and Indonesia), five economies in Asia (Japan, China, Hong Kong SAR, South Korea, and India) and seven economies in OECD (Australia, Canada, Denmark, Norway, Sweden, United Kingdom U.K., and United States U.S.), between 1990 and 2017 are used. Empirical findings suggest one, for Asia-5 high bank net interest margin means greater bank profitability, hence spurring economic growth. Two, for OECD-7 low bank overhead costs (as a share of total assets) may reflect weak competition and weak investment in providing superior banking services, hence dampening economic growth. Three, stock market turnover ratio has negative association with OECD-7 economic growth, but a positive association with Asia-5, which suggest the relationship between liquidity and growth is ambiguous. Lastly, for ASEAN-5 high bank overhead costs (as a share of total assets) may suggest expenses have not been channelled efficiently to income generating activities. One practical implication of the findings is that policy makers should take necessary measures toward financial liberalisation policies that boost growth through the efficiency channel, so that funds are efficiently allocated through the financial system between financial and real sectors.

Keywords: financial development, banking system, capital markets, economic growth

Procedia PDF Downloads 115
29026 Towards the Development of Uncertainties Resilient Business Model for Driving the Solar Panel Industry in Nigeria Power Sector

Authors: Balarabe Z. Ahmad, Anne-Lorène Vernay

Abstract:

The emergence of electricity in Nigeria was dated back to 1896. The power plants have the potential to generate 12,522 MW of electric power. Whereas current dispatch is about 4,000 MW, access to electrification is about 60%, with consumption at 0.14 MWh/capita. The government embarked on energy reforms to mitigate energy poverty. The reform targeted the provision of electricity access to 75% of the population by 2020 and 90% by 2030. Growth of total electricity demand by a factor of 5 by 2035 had been projected. This means that Nigeria will require almost 530 TWh of electricity which can be delivered through generators with a capacity of 65 GW. Analogously, the geographical location of Nigeria has placed it in an advantageous position as the source of solar energy; the availability of a high sunshine belt is obvious in the country. The implication is that the far North, where energy poverty is high, equally has about twice the solar radiation as against southern Nigeria. Hence, the chance of generating solar electricity is 66% possible at 11850 x 103 GWh per year, which is one hundred times the current electricity consumption rate in the country. Harvesting these huge potentials may be a mirage if the entrepreneurs in the solar panel business are left with the conventional business models that are not uncertainty resilient. Currently, business entities in RE in Nigeria are uncertain of; accessing the national grid, purchasing potentials of cooperating organizations, currency fluctuation and interest rate increases. Uncertainties such as the security of projects and government policy are issues entrepreneurs must navigate to remain sustainable in the solar panel industry in Nigeria. The aim of this paper is to identify how entrepreneurial firms consider uncertainties in developing workable business models for commercializing solar energy projects in Nigeria. In an attempt to develop a novel business model, the paper investigated how entrepreneurial firms assess and navigate uncertainties. The roles of key stakeholders in helping entrepreneurs to manage uncertainties in the Nigeria RE sector were probed in the ongoing study. The study explored empirical uncertainties that are peculiar to RE entrepreneurs in Nigeria. A mixed-mode of research was embraced using qualitative data from face-to-face interviews conducted on the Solar Energy Entrepreneurs and the experts drawn from key stakeholders. Content analysis of the interview was done using Atlas. It is a nine qualitative tool. The result suggested that all stakeholders are required to synergize in developing an uncertainty resilient business model. It was opined that the RE entrepreneurs need modifications in the business recommendations encapsulated in the energy policy in Nigeria to strengthen their capability in delivering solar energy solutions to the yawning Nigerians.

Keywords: uncertainties, entrepreneurial, business model, solar-panel

Procedia PDF Downloads 124
29025 Regional Flood-Duration-Frequency Models for Norway

Authors: Danielle M. Barna, Kolbjørn Engeland, Thordis Thorarinsdottir, Chong-Yu Xu

Abstract:

Design flood values give estimates of flood magnitude within a given return period and are essential to making adaptive decisions around land use planning, infrastructure design, and disaster mitigation. Often design flood values are needed at locations with insufficient data. Additionally, in hydrologic applications where flood retention is important (e.g., floodplain management and reservoir design), design flood values are required at different flood durations. A statistical approach to this problem is a development of a regression model for extremes where some of the parameters are dependent on flood duration in addition to being covariate-dependent. In hydrology, this is called a regional flood-duration-frequency (regional-QDF) model. Typically, the underlying statistical distribution is chosen to be the Generalized Extreme Value (GEV) distribution. However, as the support of the GEV distribution depends on both its parameters and the range of the data, special care must be taken with the development of the regional model. In particular, we find that the GEV is problematic when developing a GAMLSS-type analysis due to the difficulty of proposing a link function that is independent of the unknown parameters and the observed data. We discuss these challenges in the context of developing a regional QDF model for Norway.

Keywords: design flood values, bayesian statistics, regression modeling of extremes, extreme value analysis, GEV

Procedia PDF Downloads 46
29024 Evaluating Traffic Congestion Using the Bayesian Dirichlet Process Mixture of Generalized Linear Models

Authors: Ren Moses, Emmanuel Kidando, Eren Ozguven, Yassir Abdelrazig

Abstract:

This study applied traffic speed and occupancy to develop clustering models that identify different traffic conditions. Particularly, these models are based on the Dirichlet Process Mixture of Generalized Linear regression (DML) and change-point regression (CR). The model frameworks were implemented using 2015 historical traffic data aggregated at a 15-minute interval from an Interstate 295 freeway in Jacksonville, Florida. Using the deviance information criterion (DIC) to identify the appropriate number of mixture components, three traffic states were identified as free-flow, transitional, and congested condition. Results of the DML revealed that traffic occupancy is statistically significant in influencing the reduction of traffic speed in each of the identified states. Influence on the free-flow and the congested state was estimated to be higher than the transitional flow condition in both evening and morning peak periods. Estimation of the critical speed threshold using CR revealed that 47 mph and 48 mph are speed thresholds for congested and transitional traffic condition during the morning peak hours and evening peak hours, respectively. Free-flow speed thresholds for morning and evening peak hours were estimated at 64 mph and 66 mph, respectively. The proposed approaches will facilitate accurate detection and prediction of traffic congestion for developing effective countermeasures.

Keywords: traffic congestion, multistate speed distribution, traffic occupancy, Dirichlet process mixtures of generalized linear model, Bayesian change-point detection

Procedia PDF Downloads 267
29023 The Use of Geographically Weighted Regression for Deforestation Analysis: Case Study in Brazilian Cerrado

Authors: Ana Paula Camelo, Keila Sanches

Abstract:

The Geographically Weighted Regression (GWR) was proposed in geography literature to allow relationship in a regression model to vary over space. In Brazil, the agricultural exploitation of the Cerrado Biome is the main cause of deforestation. In this study, we propose a methodology using geostatistical methods to characterize the spatial dependence of deforestation in the Cerrado based on agricultural production indicators. Therefore, it was used the set of exploratory spatial data analysis tools (ESDA) and confirmatory analysis using GWR. It was made the calibration a non-spatial model, evaluation the nature of the regression curve, election of the variables by stepwise process and multicollinearity analysis. After the evaluation of the non-spatial model was processed the spatial-regression model, statistic evaluation of the intercept and verification of its effect on calibration. In an analysis of Spearman’s correlation the results between deforestation and livestock was +0.783 and with soybeans +0.405. The model presented R²=0.936 and showed a strong spatial dependence of agricultural activity of soybeans associated to maize and cotton crops. The GWR is a very effective tool presenting results closer to the reality of deforestation in the Cerrado when compared with other analysis.

Keywords: deforestation, geographically weighted regression, land use, spatial analysis

Procedia PDF Downloads 334
29022 The Effect of Particle Porosity in Mixed Matrix Membrane Permeation Models

Authors: Z. Sadeghi, M. R. Omidkhah, M. E. Masoomi

Abstract:

The purpose of this paper is to examine gas transport behavior of mixed matrix membranes (MMMs) combined with porous particles. Main existing models are categorized in two main groups; two-phase (ideal contact) and three-phase (non-ideal contact). A new coefficient, J, was obtained to express equations for estimating effect of the particle porosity in two-phase and three-phase models. Modified models evaluates with existing models and experimental data using Matlab software. Comparison of gas permeability of proposed modified models with existing models in different MMMs shows a better prediction of gas permeability in MMMs.

Keywords: mixed matrix membrane, permeation models, porous particles, porosity

Procedia PDF Downloads 349
29021 Prediction of Mechanical Strength of Multiscale Hybrid Reinforced Cementitious Composite

Authors: Salam Alrekabi, A. B. Cundy, Mohammed Haloob Al-Majidi

Abstract:

Novel multiscale hybrid reinforced cementitious composites based on carbon nanotubes (MHRCC-CNT), and carbon nanofibers (MHRCC-CNF) are new types of cement-based material fabricated with micro steel fibers and nanofilaments, featuring superior strain hardening, ductility, and energy absorption. This study focused on established models to predict the compressive strength, and direct and splitting tensile strengths of the produced cementitious composites. The analysis was carried out based on the experimental data presented by the previous author’s study, regression analysis, and the established models that available in the literature. The obtained models showed small differences in the predictions and target values with experimental verification indicated that the estimation of the mechanical properties could be achieved with good accuracy.

Keywords: multiscale hybrid reinforced cementitious composites, carbon nanotubes, carbon nanofibers, mechanical strength prediction

Procedia PDF Downloads 142
29020 Technical and Practical Aspects of Sizing a Autonomous PV System

Authors: Abdelhak Bouchakour, Mustafa Brahami, Layachi Zaghba

Abstract:

The use of photovoltaic energy offers an inexhaustible supply of energy but also a clean and non-polluting energy, which is a definite advantage. The geographical location of Algeria promotes the development of the use of this energy. Indeed, given the importance of the intensity of the radiation received and the duration of sunshine. For this reason, the objective of our work is to develop a data-processing tool (software) of calculation and optimization of dimensioning of the photovoltaic installations. Our approach of optimization is basing on mathematical models, which amongst other things describe the operation of each part of the installation, the energy production, the storage and the consumption of energy.

Keywords: solar panel, solar radiation, inverter, optimization

Procedia PDF Downloads 581
29019 Big Data Analysis with Rhipe

Authors: Byung Ho Jung, Ji Eun Shin, Dong Hoon Lim

Abstract:

Rhipe that integrates R and Hadoop environment made it possible to process and analyze massive amounts of data using a distributed processing environment. In this paper, we implemented multiple regression analysis using Rhipe with various data sizes of actual data. Experimental results for comparing the performance of our Rhipe with stats and biglm packages available on bigmemory, showed that our Rhipe was more fast than other packages owing to paralleling processing with increasing the number of map tasks as the size of data increases. We also compared the computing speeds of pseudo-distributed and fully-distributed modes for configuring Hadoop cluster. The results showed that fully-distributed mode was faster than pseudo-distributed mode, and computing speeds of fully-distributed mode were faster as the number of data nodes increases.

Keywords: big data, Hadoop, Parallel regression analysis, R, Rhipe

Procedia PDF Downloads 478
29018 Foreign Direct Investment, Economic Growth and CO2 Emissions: Evidence from WAIFEM Member Countries

Authors: Nasiru Inuwa, Haruna Usman Modibbo, Yahya Zakari Abdullahi

Abstract:

The purpose of this paper is to investigate the effects of foreign direct investment (FDI), economic growth on carbon emissions in context of WAIFEM member countries. The Im-Pesaran-Shin panel unit root test, Kao residual based test panel cointegration technique and panel Granger causality tests over the period 1980-2012 within a multivariate framework were applied. The results of cointegration test revealed a long run equilibrium relationship among CO2 emissions, economic growth and foreign direct investment. The results of Granger causality tests revealed a unidirectional causality running from economic growth to CO2 emissions for the panel of WAIFEM countries at the 5% level. Also, Granger causality runs from economic growth to foreign direct investment without feedback. However, no causality relationship between foreign direct investment and CO2 emissions for the panel of WAIFEM countries was observed. The study therefore, suggest that policy makers from WAIFEM member countries should design policies aim at attracting more foreign direct investments inflow as well the adoption of cleaner production technologies in order to reduce CO2 emissions.

Keywords: economic growth, CO2 emissions, causality, WAIFEM

Procedia PDF Downloads 535
29017 Validation of Escherichia coli O157:H7 Inactivation on Apple-Carrot Juice Treated with Manothermosonication by Kinetic Models

Authors: Ozan Kahraman, Hao Feng

Abstract:

Several models such as Weibull, Modified Gompertz, Biphasic linear, and Log-logistic models have been proposed in order to describe non-linear inactivation kinetics and used to fit non-linear inactivation data of several microorganisms for inactivation by heat, high pressure processing or pulsed electric field. First-order kinetic parameters (D-values and z-values) have often been used in order to identify microbial inactivation by non-thermal processing methods such as ultrasound. Most ultrasonic inactivation studies employed first-order kinetic parameters (D-values and z-values) in order to describe the reduction on microbial survival count. This study was conducted to analyze the E. coli O157:H7 inactivation data by using five microbial survival models (First-order, Weibull, Modified Gompertz, Biphasic linear and Log-logistic). First-order, Weibull, Modified Gompertz, Biphasic linear and Log-logistic kinetic models were used for fitting inactivation curves of Escherichia coli O157:H7. The residual sum of squares and the total sum of squares criteria were used to evaluate the models. The statistical indices of the kinetic models were used to fit inactivation data for E. coli O157:H7 by MTS at three temperatures (40, 50, and 60 0C) and three pressures (100, 200, and 300 kPa). Based on the statistical indices and visual observations, the Weibull and Biphasic models were best fitting of the data for MTS treatment as shown by high R2 values. The non-linear kinetic models, including the Modified Gompertz, First-order, and Log-logistic models did not provide any better fit to data from MTS compared the Weibull and Biphasic models. It was observed that the data found in this study did not follow the first-order kinetics. It is possibly because of the cells which are sensitive to ultrasound treatment were inactivated first, resulting in a fast inactivation period, while those resistant to ultrasound were killed slowly. The Weibull and biphasic models were found as more flexible in order to determine the survival curves of E. coli O157:H7 treated by MTS on apple-carrot juice.

Keywords: Weibull, Biphasic, MTS, kinetic models, E.coli O157:H7

Procedia PDF Downloads 342
29016 Geographic Information System for District Level Energy Performance Simulations

Authors: Avichal Malhotra, Jerome Frisch, Christoph van Treeck

Abstract:

The utilization of semantic, cadastral and topological data from geographic information systems (GIS) has exponentially increased for building and urban-scale energy performance simulations. Urban planners, simulation scientists, and researchers use virtual 3D city models for energy analysis, algorithms and simulation tools. For dynamic energy simulations at city and district level, this paper provides an overview of the available GIS data models and their levels of detail. Adhering to different norms and standards, these models also intend to describe building and construction industry data. For further investigations, CityGML data models are considered for simulations. Though geographical information modelling has considerably many different implementations, extensions of virtual city data can also be made for domain specific applications. Highlighting the use of the extended CityGML models for energy researches, a brief introduction to the Energy Application Domain Extension (ADE) along with its significance is made. Consequently, addressing specific input simulation data, a workflow using Modelica underlining the usage of GIS information and the quantification of its significance over annual heating energy demand is presented in this paper.

Keywords: CityGML, EnergyADE, energy performance simulation, GIS

Procedia PDF Downloads 146
29015 Comparison of Multivariate Adaptive Regression Splines and Random Forest Regression in Predicting Forced Expiratory Volume in One Second

Authors: P. V. Pramila , V. Mahesh

Abstract:

Pulmonary Function Tests are important non-invasive diagnostic tests to assess respiratory impairments and provides quantifiable measures of lung function. Spirometry is the most frequently used measure of lung function and plays an essential role in the diagnosis and management of pulmonary diseases. However, the test requires considerable patient effort and cooperation, markedly related to the age of patients esulting in incomplete data sets. This paper presents, a nonlinear model built using Multivariate adaptive regression splines and Random forest regression model to predict the missing spirometric features. Random forest based feature selection is used to enhance both the generalization capability and the model interpretability. In the present study, flow-volume data are recorded for N= 198 subjects. The ranked order of feature importance index calculated by the random forests model shows that the spirometric features FVC, FEF 25, PEF,FEF 25-75, FEF50, and the demographic parameter height are the important descriptors. A comparison of performance assessment of both models prove that, the prediction ability of MARS with the `top two ranked features namely the FVC and FEF 25 is higher, yielding a model fit of R2= 0.96 and R2= 0.99 for normal and abnormal subjects. The Root Mean Square Error analysis of the RF model and the MARS model also shows that the latter is capable of predicting the missing values of FEV1 with a notably lower error value of 0.0191 (normal subjects) and 0.0106 (abnormal subjects). It is concluded that combining feature selection with a prediction model provides a minimum subset of predominant features to train the model, yielding better prediction performance. This analysis can assist clinicians with a intelligence support system in the medical diagnosis and improvement of clinical care.

Keywords: FEV, multivariate adaptive regression splines pulmonary function test, random forest

Procedia PDF Downloads 280
29014 Using Machine Learning to Enhance Win Ratio for College Ice Hockey Teams

Authors: Sadixa Sanjel, Ahmed Sadek, Naseef Mansoor, Zelalem Denekew

Abstract:

Collegiate ice hockey (NCAA) sports analytics is different from the national level hockey (NHL). We apply and compare multiple machine learning models such as Linear Regression, Random Forest, and Neural Networks to predict the win ratio for a team based on their statistics. Data exploration helps determine which statistics are most useful in increasing the win ratio, which would be beneficial to coaches and team managers. We ran experiments to select the best model and chose Random Forest as the best performing. We conclude with how to bridge the gap between the college and national levels of sports analytics and the use of machine learning to enhance team performance despite not having a lot of metrics or budget for automatic tracking.

Keywords: NCAA, NHL, sports analytics, random forest, regression, neural networks, game predictions

Procedia PDF Downloads 85
29013 The Effects of Corporate Governance on Firm’s Financial Performance: A Study of Family and Non-family Owned Firms in Pakistan

Authors: Saad Bin Nasir

Abstract:

This research will examine the impact of corporate governance on firm performance in family and non-family owned firms in Pakistan. For the purpose of this research, corporate governance mechanisms which included are board size, board composition, leadership structure, board meetings are taken as independent variable and firm performance taken as dependent variable and it will be measured with return on asset and return on equity. Firm size and firm’s age will be taken as control variables. Secondary data will collect from audited annul reports of companies and panel data regression model will applied, to check the impact of corporate governance on firm performance.

Keywords: board size, board composition, Leadership Structure, board meetings, firm performance, family and non-family owned firms

Procedia PDF Downloads 352
29012 A Survey on Quasi-Likelihood Estimation Approaches for Longitudinal Set-ups

Authors: Naushad Mamode Khan

Abstract:

The Com-Poisson (CMP) model is one of the most popular discrete generalized linear models (GLMS) that handles both equi-, over- and under-dispersed data. In longitudinal context, an integer-valued autoregressive (INAR(1)) process that incorporates covariate specification has been developed to model longitudinal CMP counts. However, the joint likelihood CMP function is difficult to specify and thus restricts the likelihood based estimating methodology. The joint generalized quasilikelihood approach (GQL-I) was instead considered but is rather computationally intensive and may not even estimate the regression effects due to a complex and frequently ill conditioned covariance structure. This paper proposes a new GQL approach for estimating the regression parameters (GQLIII) that are based on a single score vector representation. The performance of GQL-III is compared with GQL-I and separate marginal GQLs (GQL-II) through some simulation experiments and is proved to yield equally efficient estimates as GQL-I and is far more computationally stable.

Keywords: longitudinal, com-Poisson, ill-conditioned, INAR(1), GLMS, GQL

Procedia PDF Downloads 335
29011 Partial Least Square Regression for High-Dimentional and High-Correlated Data

Authors: Mohammed Abdullah Alshahrani

Abstract:

The research focuses on investigating the use of partial least squares (PLS) methodology for addressing challenges associated with high-dimensional correlated data. Recent technological advancements have led to experiments producing data characterized by a large number of variables compared to observations, with substantial inter-variable correlations. Such data patterns are common in chemometrics, where near-infrared (NIR) spectrometer calibrations record chemical absorbance levels across hundreds of wavelengths, and in genomics, where thousands of genomic regions' copy number alterations (CNA) are recorded from cancer patients. PLS serves as a widely used method for analyzing high-dimensional data, functioning as a regression tool in chemometrics and a classification method in genomics. It handles data complexity by creating latent variables (components) from original variables. However, applying PLS can present challenges. The study investigates key areas to address these challenges, including unifying interpretations across three main PLS algorithms and exploring unusual negative shrinkage factors encountered during model fitting. The research presents an alternative approach to addressing the interpretation challenge of predictor weights associated with PLS. Sparse estimation of predictor weights is employed using a penalty function combining a lasso penalty for sparsity and a Cauchy distribution-based penalty to account for variable dependencies. The results demonstrate sparse and grouped weight estimates, aiding interpretation and prediction tasks in genomic data analysis. High-dimensional data scenarios, where predictors outnumber observations, are common in regression analysis applications. Ordinary least squares regression (OLS), the standard method, performs inadequately with high-dimensional and highly correlated data. Copy number alterations (CNA) in key genes have been linked to disease phenotypes, highlighting the importance of accurate classification of gene expression data in bioinformatics and biology using regularized methods like PLS for regression and classification.

Keywords: partial least square regression, genetics data, negative filter factors, high dimensional data, high correlated data

Procedia PDF Downloads 14
29010 Impact of Board Characteristics on Financial Performance: A Study of Manufacturing Sector of Pakistan

Authors: Saad Bin Nasir

Abstract:

The research will examine the role of corporate governance (CG) practices on firm’s financial performance. Population of this research will be manufacture sector of Pakistan. For the purposes of measurement of impact of corporate governance practices such as board size, board independence, ceo/chairman duality, will take as independent variables and for the measurement of firm’s performance return on assets and return on equity will take as dependent variables. Panel data regression model will be used to estimate the impact of CG on firm performance.

Keywords: corporate governance, board size, board independence, leadership

Procedia PDF Downloads 494
29009 Thin-Layer Drying Characteristics and Modelling of Instant Coffee Solution

Authors: Apolinar Picado, Ronald Solís, Rafael Gamero

Abstract:

The thin-layer drying characteristics of instant coffee solution were investigated in a laboratory tunnel dryer. Drying experiments were carried out at three temperatures (80, 100 and 120 °C) and an air velocity of 1.2 m/s. Drying experimental data obtained are fitted to six (6) thin-layer drying models using the non-linear least squares regression analysis. The acceptability of the thin-layer drying model has been based on a value of the correlation coefficient that should be close to one, and low values for root mean square error (RMSE) and chi-square (x²). According to this evaluation, the most suitable model for describing drying process of thin-layer instant coffee solution is the Page model. Further, the effective moisture diffusivity and the activation energy were computed employing the drying experimental data. The effective moisture diffusivity values varied from 1.6133 × 10⁻⁹ to 1.6224 × 10⁻⁹ m²/s over the temperature range studied and the activation energy was estimated to be 162.62 J/mol.

Keywords: activation energy, diffusivity, instant coffee, thin-layer models

Procedia PDF Downloads 231
29008 Imputing Missing Data in Electronic Health Records: A Comparison of Linear and Non-Linear Imputation Models

Authors: Alireza Vafaei Sadr, Vida Abedi, Jiang Li, Ramin Zand

Abstract:

Missing data is a common challenge in medical research and can lead to biased or incomplete results. When the data bias leaks into models, it further exacerbates health disparities; biased algorithms can lead to misclassification and reduced resource allocation and monitoring as part of prevention strategies for certain minorities and vulnerable segments of patient populations, which in turn further reduce data footprint from the same population – thus, a vicious cycle. This study compares the performance of six imputation techniques grouped into Linear and Non-Linear models on two different realworld electronic health records (EHRs) datasets, representing 17864 patient records. The mean absolute percentage error (MAPE) and root mean squared error (RMSE) are used as performance metrics, and the results show that the Linear models outperformed the Non-Linear models in terms of both metrics. These results suggest that sometimes Linear models might be an optimal choice for imputation in laboratory variables in terms of imputation efficiency and uncertainty of predicted values.

Keywords: EHR, machine learning, imputation, laboratory variables, algorithmic bias

Procedia PDF Downloads 53
29007 Influential Parameters in Estimating Soil Properties from Cone Penetrating Test: An Artificial Neural Network Study

Authors: Ahmed G. Mahgoub, Dahlia H. Hafez, Mostafa A. Abu Kiefa

Abstract:

The Cone Penetration Test (CPT) is a common in-situ test which generally investigates a much greater volume of soil more quickly than possible from sampling and laboratory tests. Therefore, it has the potential to realize both cost savings and assessment of soil properties rapidly and continuously. The principle objective of this paper is to demonstrate the feasibility and efficiency of using artificial neural networks (ANNs) to predict the soil angle of internal friction (Φ) and the soil modulus of elasticity (E) from CPT results considering the uncertainties and non-linearities of the soil. In addition, ANNs are used to study the influence of different parameters and recommend which parameters should be included as input parameters to improve the prediction. Neural networks discover relationships in the input data sets through the iterative presentation of the data and intrinsic mapping characteristics of neural topologies. General Regression Neural Network (GRNN) is one of the powerful neural network architectures which is utilized in this study. A large amount of field and experimental data including CPT results, plate load tests, direct shear box, grain size distribution and calculated data of overburden pressure was obtained from a large project in the United Arab Emirates. This data was used for the training and the validation of the neural network. A comparison was made between the obtained results from the ANN's approach, and some common traditional correlations that predict Φ and E from CPT results with respect to the actual results of the collected data. The results show that the ANN is a very powerful tool. Very good agreement was obtained between estimated results from ANN and actual measured results with comparison to other correlations available in the literature. The study recommends some easily available parameters that should be included in the estimation of the soil properties to improve the prediction models. It is shown that the use of friction ration in the estimation of Φ and the use of fines content in the estimation of E considerable improve the prediction models.

Keywords: angle of internal friction, cone penetrating test, general regression neural network, soil modulus of elasticity

Procedia PDF Downloads 398
29006 Performance Analysis of Proprietary and Non-Proprietary Tools for Regression Testing Using Genetic Algorithm

Authors: K. Hema Shankari, R. Thirumalaiselvi, N. V. Balasubramanian

Abstract:

The present paper addresses to the research in the area of regression testing with emphasis on automated tools as well as prioritization of test cases. The uniqueness of regression testing and its cyclic nature is pointed out. The difference in approach between industry, with business model as basis, and academia, with focus on data mining, is highlighted. Test Metrics are discussed as a prelude to our formula for prioritization; a case study is further discussed to illustrate this methodology. An industrial case study is also described in the paper, where the number of test cases is so large that they have to be grouped as Test Suites. In such situations, a genetic algorithm proposed by us can be used to reconfigure these Test Suites in each cycle of regression testing. The comparison is made between a proprietary tool and an open source tool using the above-mentioned metrics. Our approach is clarified through several tables.

Keywords: APFD metric, genetic algorithm, regression testing, RFT tool, test case prioritization, selenium tool

Procedia PDF Downloads 403
29005 Instability Index Method and Logistic Regression to Assess Landslide Susceptibility in County Route 89, Taiwan

Authors: Y. H. Wu, Ji-Yuan Lin, Yu-Ming Liou

Abstract:

This study aims to set up the landslide susceptibility map of County Route 89 at Ren-Ai Township in Nantou County using the Instability Index Method and Logistic regression. Seven susceptibility factors including Slope Angle, Aspect, Elevation, Distance to fold, Distance to River, Distance to Road and Accumulated Rainfall were obtained by GIS based on the Typhoon Toraji landslide area identified by Industrial Technology Research Institute in 2001. To calculate the landslide percentage of each factor and acquire the weight and grade the grid by means of Instability Index Method. In this study, landslide susceptibility can be classified into four grades: high, medium high, medium low and low, in order to determine the advantages and disadvantages of the two models. The precision of this model is verified by classification error matrix and SRC curve. These results suggest that the logistic regression model is a preferred method than instability index in the assessment of landslide susceptibility. It is suitable for the landslide prediction and precaution in this area in the future.

Keywords: instability index method, logistic regression, landslide susceptibility, SRC curve

Procedia PDF Downloads 261
29004 Increment of Panel Flutter Margin Using Adaptive Stiffeners

Authors: S. Raja, K. M. Parammasivam, V. Aghilesh

Abstract:

Fluid-structure interaction is a crucial consideration in the design of many engineering systems such as flight vehicles and bridges. Aircraft lifting surfaces and turbine blades can fail due to oscillations caused by fluid-structure interaction. Hence, it is focussed to study the fluid-structure interaction in the present research. First, the effect of free vibration over the panel is studied. It is well known that the deformation of a panel and flow induced forces affects one another. The selected panel has a span 300mm, chord 300mm and thickness 2 mm. The project is to study, the effect of cross-sectional area and the stiffener location is carried out for the same panel. The stiffener spacing is varied along both the chordwise and span-wise direction. Then for that optimal location the ideal stiffener length is identified. The effect of stiffener cross-section shapes (T, I, Hat, Z) over flutter velocity has been conducted. The flutter velocities of the selected panel with two rectangular stiffeners of cantilever configuration are estimated using MSC NASTRAN software package. As the flow passes over the panel, deformation takes place which further changes the flow structure over it. With increasing velocity, the deformation goes on increasing, but the stiffness of the system tries to dampen the excitation and maintain equilibrium. But beyond a critical velocity, the system damping suddenly becomes ineffective, so it loses its equilibrium. This estimated in NASTRAN using PK method. The first 10 modal frequencies of a simple panel and stiffened panel are estimated numerically and are validated with open literature. A grid independence study is also carried out and the modal frequency values remain the same for element lengths less than 20 mm. The current investigation concludes that the span-wise stiffener placement is more effective than the chord-wise placement. The maximum flutter velocity achieved for chord-wise placement is 204 m/s while for a span-wise arrangement it is augmented to 963 m/s for the stiffeners location of ¼ and ¾ of the chord from the panel edge (50% of chord from either side of the mid-chord line). The flutter velocity is directly proportional to the stiffener cross-sectional area. A significant increment in flutter velocity from 218m/s to 1024m/s is observed for the stiffener lengths varying from 50% to 60% of the span. The maximum flutter velocity above Mach 3 is achieved. It is also observed that for a stiffened panel, the full effect of stiffener can be achieved only when the stiffener end is clamped. Stiffeners with Z cross section incremented the flutter velocity from 142m/s (Panel with no stiffener) to 328 m/s, which is 2.3 times that of simple panel.

Keywords: stiffener placement, stiffener cross-sectional area, stiffener length, stiffener cross sectional area shape

Procedia PDF Downloads 269