Search results for: principal component regression (PCR)
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5969

5819 Model Averaging for Poisson Regression

Authors: Zhou Jianhong

Abstract:

Model averaging is a desirable approach to dealing with model uncertainty; however, it has rarely been explored for Poisson regression. In this paper, we propose a model averaging procedure based on an unbiased estimator of the expected Kullback-Leibler distance for Poisson regression. A simulation study shows that the proposed model average estimator outperforms some other commonly used model selection and model average estimators in some situations. The proposed method is further applied to a real data example, which again demonstrates its advantage.
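
As a rough illustration of what averaging over candidate Poisson regressions involves, the sketch below combines the predicted means of several candidate models with weights derived from an information criterion. It does not reproduce the abstract's Kullback-Leibler-based weight choice; it swaps in the simpler and widely used smoothed-AIC weighting as a stand-in. The criterion values and linear predictors are hypothetical numbers chosen only for the example.

```python
import math

def averaging_weights(criteria):
    """Smoothed information-criterion weights: w_k proportional to exp(-IC_k / 2).

    Assumption: AIC-type criteria are used here as a stand-in for the
    Kullback-Leibler-based criterion described in the abstract.
    """
    best = min(criteria)  # subtract the minimum for numerical stability
    raw = [math.exp(-(c - best) / 2.0) for c in criteria]
    total = sum(raw)
    return [r / total for r in raw]

def averaged_poisson_mean(weights, linear_predictors):
    """Weighted average of the candidate models' predicted Poisson means.

    Each candidate model k predicts mu_k = exp(eta_k) under the log link.
    """
    return sum(w * math.exp(eta) for w, eta in zip(weights, linear_predictors))

# Hypothetical criterion values for three candidate models.
criteria = [100.0, 102.0, 110.0]
weights = averaging_weights(criteria)

# Hypothetical linear predictors of the three models for one observation.
etas = [1.20, 1.35, 0.90]
mu_avg = averaged_poisson_mean(weights, etas)
```

The model with the lowest criterion value receives the largest weight, but no candidate is discarded outright, which is the basic difference from model selection.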

Keywords: model averaging, Poisson regression, Kullback-Leibler distance, statistics

Procedia PDF Downloads 491
5818 Principal Well-Being at Hong Kong: A Quantitative Investigation

Authors: Junjun Chen, Yingxiu Li

Abstract:

The occupational well-being of school principals has played a vital role in the pursuit of individual and school wellness and success. However, principals’ well-being worldwide is under increasing threat because of the challenging and complex nature of their work and growing demands for school standardisation and accountability. Pressure is particularly acute in the post-pandemic future as principals attempt to deal with the impact of the pandemic on top of more regular demands. This is particularly true in Hong Kong, as school principals are increasingly wedged between unparalleled political, social, and academic responsibilities. Recognizing the semantic breadth of well-being, scholars have not settled on a single, mutually agreeable definition but agree that the concept of well-being has multiple dimensions across various disciplines. The multidimensional approach promises more precise assessments of the relationships between well-being and other concepts than the ‘affect-only’ approach or other single domains for capturing the essence of principal well-being. The multidimensional concept of well-being is therefore adopted in this study. This study aimed to understand the situation of principal well-being and its influential drivers with a sample of 670 principals from Hong Kong and Mainland China. The researchers sent an online survey to the participants after the outbreak of COVID-19. All participants were well informed about the purposes and procedure of the project and the confidentiality of the data prior to filling in the questionnaire. Confirmatory factor analysis and structural equation modelling, performed with Mplus, were employed to analyse the dataset. The data analysis procedure involved the following three steps. First, the descriptive statistics (e.g., mean and standard deviation) were calculated. 
Second, confirmatory factor analysis (CFA) with maximum likelihood estimation was used to trim the principal well-being measurement. Third, structural equation modelling (SEM) was employed to test the influential factors of principal well-being. The results of this study indicated that overall principal well-being was above the average mean score. The principals gave the highest ranking to their psychological and social well-being (M = 5.21). This was followed by spiritual (M = 5.14; SD = .77), cognitive (M = 5.14; SD = .77), emotional (M = 4.96; SD = .79), and physical well-being (M = 3.15; SD = .73). Participants ranked their physical well-being the lowest. Moreover, professional autonomy, supervisor and collegial support, school physical conditions, professional networking, and social media showed a significant impact on principal well-being. The findings of this study will potentially enhance not only principal well-being, but also the functioning of an individual principal and a school without sacrificing principal well-being for quality education in the process. This will eventually move one step forward for a new future - a wellness society advocated by OECD. Importantly, well-being is an inside job that begins with choosing to have wellness, whilst supports to become a wellness principal are also imperative.

Keywords: well-being, school principals, quantitative, influential factors

Procedia PDF Downloads 58
5817 Establishment of the Regression Uncertainty of the Critical Heat Flux Power Correlation for an Advanced Fuel Bundle

Authors: L. Q. Yuan, J. Yang, A. Siddiqui

Abstract:

A new regression uncertainty analysis methodology was applied to determine the uncertainties of the critical heat flux (CHF) power correlation for an advanced 43-element bundle design, which was developed by Canadian Nuclear Laboratories (CNL) to achieve improved economics, resource utilization and energy sustainability. The new methodology is considered more appropriate than the traditional methodology in the assessment of the experimental uncertainty associated with regressions. The methodology was first assessed using both the Monte Carlo Method (MCM) and the Taylor Series Method (TSM) for a simple linear regression model, and then extended successfully to a non-linear CHF power regression model (CHF power as a function of inlet temperature, outlet pressure and mass flow rate). The regression uncertainty assessed by MCM agrees well with that by TSM. An equation to evaluate the CHF power regression uncertainty was developed and expressed as a function of independent variables that determine the CHF power.
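
The comparison between the Monte Carlo Method and the Taylor Series Method for a simple linear regression can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' methodology: the x values are treated as exact, every y value carries the same independent Gaussian standard uncertainty `u_y`, and the data points are hypothetical. Under those assumptions the fitted value at `x0` is a linear function of the y values, so first-order propagation gives u(y0) = u_y * sqrt(1/n + (x0 - x_mean)^2 / Sxx).

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b

def tsm_uncertainty(xs, x0, u_y):
    """Taylor-series (first-order propagation) uncertainty of the fit at x0."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return u_y * (1.0 / n + (x0 - x_bar) ** 2 / sxx) ** 0.5

def mcm_uncertainty(xs, ys, x0, u_y, trials=20000, seed=0):
    """Monte Carlo uncertainty: perturb y, refit, and take the spread at x0."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(trials):
        noisy = [y + rng.gauss(0.0, u_y) for y in ys]
        a, b = fit_line(xs, noisy)
        predictions.append(a + b * x0)
    return statistics.stdev(predictions)

# Hypothetical measurements; u_y = 0.2 is an assumed standard uncertainty.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
u_tsm = tsm_uncertainty(xs, 4.5, 0.2)
u_mcm = mcm_uncertainty(xs, ys, 4.5, 0.2)
```

For a model that is linear in its parameters the two estimates should agree up to Monte Carlo sampling noise. For a non-linear model such as the CHF power correlation, the Taylor series result is only a first-order approximation, so the agreement has to be checked rather than assumed.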

Keywords: CHF experiment, CHF correlation, regression uncertainty, Monte Carlo Method, Taylor Series Method

Procedia PDF Downloads 392
5816 Social Economy Effects on Wetlands Change in China during Three Decades Rapid Growth Period

Authors: Ying Ge

Abstract:

Wetlands are one of the essential types of ecosystems in the world. They are of great value to human society thanks to their special ecosystem functions and services, such as protecting biodiversity, regulating hydrology and climate, and providing essential habitats, products, and tourism resources. However, wetlands worldwide are degrading severely due to climate change, accelerated urbanization, and rapid economic development. Both natural and human factors drive wetland change, and their influence varies among wetland types. Thus, the objectives of this study were (1) to compare the changes in China’s wetland area during the three-decade rapid growth period (1978-2008); and (2) to analyze the effects of socio-economic and environmental factors on wetland change (area loss and change of wetland types) in China during the period of high-speed economic development. The socio-economic influencing factors include population, income, education, development of agriculture, industry, infrastructure, wastewater amount, etc. Several statistical methods (canonical correlation analysis, principal component analysis, and regression analysis) were employed to analyze the relationship between socio-economic indicators and wetland area change. This study will determine the relevant socio-economic driving factors of wetland change, which is of great significance for wetland protection and management.

Keywords: socioeconomic effects, China, wetland change, wetland type

Procedia PDF Downloads 50
5815 Antibacterial Evaluation, in Silico ADME and QSAR Studies of Some Benzimidazole Derivatives

Authors: Strahinja Kovačević, Lidija Jevrić, Miloš Kuzmanović, Sanja Podunavac-Kuzmanović

Abstract:

In this paper, various derivatives of benzimidazole have been evaluated against the Gram-negative bacterium Escherichia coli. For all investigated compounds, the minimum inhibitory concentration (MIC) was determined. Quantitative structure-activity relationship (QSAR) analysis attempts to find consistent relationships between the variations in the values of molecular properties and the biological activity for a series of compounds so that these rules can be used to evaluate new chemical entities. The correlation between MIC and some absorption, distribution, metabolism and excretion (ADME) parameters was investigated, and mathematical models for predicting the antibacterial activity of this class of compounds were developed. The quality of the multiple linear regression (MLR) models was validated by the leave-one-out (LOO) technique, as well as by the calculation of statistical parameters for the developed models, and the results are discussed on the basis of the statistical data. The results of this study indicate that ADME parameters have a significant effect on the antibacterial activity of this class of compounds. Principal component analysis (PCA) and agglomerative hierarchical clustering algorithms (HCA) confirmed that the investigated molecules can be classified into groups on the basis of the ADME parameters: Madin-Darby Canine Kidney cell permeability (MDCK), plasma protein binding (PPB%), human intestinal absorption (HIA%) and human colon carcinoma cell permeability (Caco-2).

Keywords: benzimidazoles, QSAR, ADME, in silico

Procedia PDF Downloads 349
5814 Non-Parametric Regression over Its Parametric Couterparts with Large Sample Size

Authors: Jude Opara, Esemokumo Perewarebo Akpos

Abstract:

This paper compares non-parametric linear regression with its parametric counterpart for a large sample size. A data set of anthropometric measurements of primary school pupils was used for the analysis, based on 50 randomly selected pupils. The data were subjected to a normality test using the Anderson-Darling technique, which showed that the residuals from the commonly used least squares method for fitting an equation to a set of (x,y) data points are not normally distributed (i.e. they do not follow a Gaussian distribution). The algorithms for the nonparametric Theil’s regression are stated in this paper, as well as its parametric OLS counterpart. The analysis was carried out with the programming language software known as “R Development”. The results showed that there is a significant relationship between the response and the explanatory variable for both the parametric and non-parametric regression. To assess the efficiency of one method over the other, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) were used, and the nonparametric regression was found to perform better than its parametric counterpart owing to its lower AIC and BIC values. The study recommends that future researchers carry out similar work that examines the data set for outliers, removes any that are detected, and re-analyzes the data to compare results.
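
To make the contrast between Theil's regression and OLS concrete, here is a minimal sketch of both estimators. The Theil slope is the median of all pairwise slopes, and the intercept is taken here as the median of the residual offsets, which is one common convention; the abstract does not say which variant the authors used. The data are hypothetical: a clean line y = 1 + 2x with one gross outlier in the last point.

```python
from statistics import median

def theil_fit(xs, ys):
    """Theil's estimator: median of pairwise slopes, then median offset."""
    slopes = [
        (ys[j] - ys[i]) / (xs[j] - xs[i])
        for i in range(len(xs))
        for j in range(i + 1, len(xs))
        if xs[j] != xs[i]  # skip pairs with identical x (undefined slope)
    ]
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return intercept, slope

def ols_fit(xs, ys):
    """Ordinary least-squares intercept and slope."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data: y = 1 + 2x, except the last point, which is an outlier.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 40.0]

theil_intercept, theil_slope = theil_fit(xs, ys)
ols_intercept, ols_slope = ols_fit(xs, ys)
```

On this data the Theil fit recovers the underlying line, while the single outlier pulls the OLS slope well above 2. This robustness is the usual reason for preferring the non-parametric estimator when residuals are not Gaussian.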

Keywords: Theil’s regression, Bayesian information criterion, Akaike information criterion, OLS

Procedia PDF Downloads 278
5813 Quantitative Structure-Property Relationship Study of Base Dissociation Constants of Some Benzimidazoles

Authors: Sanja O. Podunavac-Kuzmanović, Lidija R. Jevrić, Strahinja Z. Kovačević

Abstract:

Benzimidazoles are a group of compounds with significant antibacterial, antifungal and anticancer activity. The studied compounds consist of the main benzimidazole structure with different combinations of substituents. This study is based on two-dimensional and three-dimensional molecular modeling and the calculation of molecular descriptors (physicochemical and lipophilicity descriptors) of structurally diverse benzimidazoles. Molecular modeling was carried out using ChemBio3D Ultra version 14.0 software. The obtained 3D models were subjected to energy minimization using the molecular mechanics force field method (MM2). The cutoff for structure optimization was set at a gradient of 0.1 kcal/(Å·mol). The obtained set of molecular descriptors was used in principal component analysis (PCA) to detect possible similarities and dissimilarities among the studied derivatives. After the molecular modeling, quantitative structure-property relationship (QSPR) analysis was applied to obtain mathematical models that can be used to predict the pKb values of structurally similar benzimidazoles. The obtained models are based on statistically valid multiple linear regression (MLR) equations. The calculated cross-validation parameters indicate the high predictive ability of the established QSPR models. This study is financially supported by COST action CM1306 and the project No. 114-451-347/2015-02, financially supported by the Provincial Secretariat for Science and Technological Development of Vojvodina.

Keywords: benzimidazoles, chemometrics, molecular modeling, molecular descriptors, QSPR

Procedia PDF Downloads 258
5812 Electricity Load Modeling: An Application to Italian Market

Authors: Giovanni Masala, Stefania Marica

Abstract:

Forecasting electricity load plays a crucial role in decision making and planning for economic purposes. Moreover, in light of the recent privatization and deregulation of the power industry, forecasting future electricity load has turned out to be a very challenging problem. Empirical data on electricity load show clear seasonal behavior (higher load during the winter season), which is partly due to climatic effects. We also emphasize the presence of load periodicity on a weekly basis (electricity load is usually lower on weekends or holidays) and on a daily basis (electricity load is clearly influenced by the hour). Finally, a long-term trend may depend on the general economic situation (for example, industrial production affects electricity load). All these features must be captured by the model. The purpose of this paper is therefore to build an hourly electricity load model. The deterministic component of the model requires non-linear regression and Fourier series, while the stochastic component is investigated with econometric tools. The model parameters are calibrated using data from the Italian market over a 6-year period (2007-2012). We then perform a Monte Carlo simulation to compare the simulated data with the real data (both in-sample and out-of-sample inspection). The reliability of the model is assessed with standard tests, which indicate a good fit of the simulated values.
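
The Fourier-series part of such a deterministic component can be illustrated with a small sketch that fits daily harmonics to an hourly series. It covers only the periodic daily pattern and leaves out the trend, weekly and seasonal terms, and the stochastic component. It assumes the series spans a whole number of days and that fewer than 12 harmonics are requested, so that the sine and cosine regressors are mutually orthogonal and the least-squares coefficients reduce to simple averages. The function names and the synthetic load are hypothetical.

```python
import math

def fit_daily_harmonics(load, period=24, n_harmonics=2):
    """Least-squares Fourier coefficients of an hourly series.

    Assumes len(load) is a multiple of `period` and n_harmonics < period / 2,
    in which case the harmonic regressors are orthogonal on the sample grid.
    """
    n = len(load)
    if n % period != 0 or n_harmonics >= period / 2:
        raise ValueError("need whole periods and n_harmonics < period / 2")
    mean = sum(load) / n
    coeffs = []
    for k in range(1, n_harmonics + 1):
        w = 2.0 * math.pi * k / period
        a_k = 2.0 / n * sum(y * math.cos(w * t) for t, y in enumerate(load))
        b_k = 2.0 / n * sum(y * math.sin(w * t) for t, y in enumerate(load))
        coeffs.append((a_k, b_k))
    return mean, coeffs

def evaluate(t, mean, coeffs, period=24):
    """Deterministic daily pattern at hour index t."""
    value = mean
    for k, (a_k, b_k) in enumerate(coeffs, start=1):
        w = 2.0 * math.pi * k / period
        value += a_k * math.cos(w * t) + b_k * math.sin(w * t)
    return value

# Synthetic two-day hourly load (hypothetical units and shape).
load = [
    30.0 + 5.0 * math.cos(2 * math.pi * t / 24) - 3.0 * math.sin(4 * math.pi * t / 24)
    for t in range(48)
]
mean, coeffs = fit_daily_harmonics(load)
```

The residual `load[t] - evaluate(t, mean, coeffs)` is what a stochastic model, such as the ARMA-GARCH process named in the keywords, would then be fitted to.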

Keywords: ARMA-GARCH process, electricity load, fitting tests, Fourier series, Monte Carlo simulation, non-linear regression

Procedia PDF Downloads 377
5811 Knowledge Management Factors Affecting the Level of Commitment

Authors: Abbas Keramati, Abtin Boostani, Mohammad Jamal Sadeghi

Abstract:

This paper examines the influence of knowledge management factors on organizational commitment for employees in the oil and gas drilling industry of Iran. We determine which knowledge factors have the greatest impact on personnel loyalty and commitment to the organization, using data collected from a survey of over 300 full-time personnel working in three large companies active in the oil and gas drilling industry of Iran. Principal Component Analysis (PCA) is used to specify the effect of knowledge factors on the organizational commitment of the personnel in the studied organizations. Our findings show that knowledge and expertise, in-service training, the value of knowledge, and the application of individuals’ knowledge in the organization, grouped as the factor “learning and perception of personnel from the value of knowledge within the organization”, have the greatest impact on organizational commitment. After this factor, the factors “existence of knowledge and knowledge sharing environment in the organization”, “existence of potential knowledge exchanging in the organization”, and “organizational knowledge level” have the most impact on the organizational commitment of personnel, in that order.

Keywords: drilling industry, knowledge management, organizational commitment, loyalty, principal component analysis

Procedia PDF Downloads 326
5810 Prediction of Childbearing Orientations According to Couples' Sexual Review Component

Authors: Razieh Rezaeekalantari

Abstract:

Objective: The purpose of this study was to investigate the prediction of parenting orientations in terms of the components of couples' sexual review. Methods: This was a descriptive correlational study. The population consisted of 500 couples referring to Sari Health Center. Two hundred and fifteen (215) people were selected randomly using the Krejcie-Morgan sample size table. For data collection, the childbearing orientations scale and the Multidimensional Sexual Self-Concept Questionnaire were used. Result: For data analysis, the mean and standard deviation were used, and regression correlation and inferential statistics were used to test the research hypothesis. Conclusion: The findings indicate that there is not a significant relationship between the tendency to childbearing and the predictive value of sexual review (r = 0.84) with significant level (sig = 219.19) (P < 0.05). So, with 95% confidence, we conclude that there is not a meaningful relationship between sexual orientation and tendency to child-rearing.

Keywords: couples referring, health center, sexual review component, parenting orientations

Procedia PDF Downloads 200
5809 Study on Principals Using Change Leadership to Promote School Innovation: A Case Study of a Primary School in Taiwan

Authors: Chih-Wen Fan

Abstract:

Background/Research goals: School improvement requires change leadership, which often means discomfort. Principals are the key people who determine the effectiveness of schools. In an era in which organizations pursue speed and effectiveness, school administration has to be accountable and innovative. Effective principals work to improve achievement by focusing on improving administrative and teaching quality. However, there is a lack of literature presenting relevant case studies on school change leadership. This article explores how principals can use change leadership to drive school change. It analyzes the driving factors of the principal's changes in the case school, the beliefs behind change leadership, the specific methods, and their impact. Methods: This study applies the case study research method. The researchers selected an older primary school located in an urban area that was transformed into a high-performance primary school after changes were enacted by the principal. The selected case was recommended by three supervisors of the Education Department. The case school underwent leadership change by the new principal during his term, and won an award from the Ministry of Education. A total of 8 teachers were interviewed. The coded data include interviews and documents. Expected results/conclusions: The conclusions of the study are as follows: (1) Principal Lin's change leadership is driven by internal and external environmental development and pressures for change. (2) The principal's belief in change leadership is to recognize the sense of crisis, and to create a climate of change and demand for change. (3) The principal's specific actions are intended to identify key members, resolve resistance, use innovative thinking, and promote organizational learning. 
(4) Principal Lin's change leadership can enhance the professional functions of all employees through appropriate authorization. (5) The effectiveness of change leadership lies in teachers' participation in decision-making; the school's reputation has been enhanced through featured courses.

Keywords: change leadership, empowerment, crisis awareness, case study

Procedia PDF Downloads 108
5808 Content Based Video Retrieval System Using Principal Object Analysis

Authors: Van Thinh Bui, Anh Tuan Tran, Quoc Viet Ngo, The Bao Pham

Abstract:

Video retrieval is the problem of searching for videos or clips whose content is relatively close to an input image or video. Applications include selecting a video in a folder or recognizing a person in security camera footage. However, recent approaches remain challenged by the diversity of video types, frame transitions and camera positions. Choosing an appropriate measure for the problem is also an open question. To overcome these obstacles, we propose a content-based video retrieval system whose main steps yield good performance. From a main video, we extract keyframes and principal objects using the Segmentation of Aggregating Superpixels (SAS) algorithm. After that, Speeded Up Robust Features (SURF) are extracted from those principal objects. Then, the “Bag-of-words” model together with SVM classification is applied to obtain the retrieval result. Our system is evaluated on over 300 diverse videos, ranging from music, history, movies, sports, and natural scenes to TV programs. Its performance compares promisingly with other approaches.

Keywords: video retrieval, principal objects, keyframe, segmentation of aggregating superpixels, speeded up robust features, bag-of-words, SVM

Procedia PDF Downloads 278
5807 Use of Multistage Transition Regression Models for Credit Card Income Prediction

Authors: Denys Osipenko, Jonathan Crook

Abstract:

Because of the variety of card holders’ behaviour types and income sources, each consumer account can move between a variety of states. An account can be inactive, transactor, revolver, delinquent, or defaulted, and each state requires an individual model for income prediction. Estimating transition probabilities between statuses at the account level helps to avoid the memorylessness of the Markov chain approach. This paper investigates approaches to estimating transition probabilities for credit card income prediction at the account level. The key empirical question is which approach gives more accurate results: multinomial logistic regression or multistage conditional logistic regression with a binary target. Both models have shown moderate predictive power. Prediction accuracy for conditional logistic regression depends on the order of stages for the conditional binary logistic regression. On the other hand, multinomial logistic regression is easier to use and gives integrated estimates for all states without priorities. Further investigations can therefore concentrate on alternative modeling approaches such as discrete choice models.
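
The structural difference between the two approaches can be shown with a short sketch. In a multistage conditional setup, each binary stage gives the probability of one state given that the account is in none of the earlier states, so the state probabilities follow from the chain rule and depend on the stage order. A multinomial model produces all state probabilities at once from one score per state. The stage order, probabilities, and scores below are hypothetical; the abstract does not give the authors' ordering or fitted values.

```python
import math

def multistage_to_state_probs(states, stage_probs):
    """Chain binary conditional probabilities into state probabilities.

    stage_probs[i] is P(state i | account is in none of states 0..i-1);
    the last state receives whatever probability mass remains.
    """
    if len(states) != len(stage_probs) + 1:
        raise ValueError("need one stage per state except the last")
    probs = {}
    remaining = 1.0
    for state, p in zip(states, stage_probs):
        probs[state] = remaining * p
        remaining *= 1.0 - p
    probs[states[-1]] = remaining
    return probs

def multinomial_probs(scores):
    """Softmax over per-state linear scores (multinomial logit)."""
    top = max(scores.values())  # shift scores for numerical stability
    exps = {state: math.exp(value - top) for state, value in scores.items()}
    total = sum(exps.values())
    return {state: e / total for state, e in exps.items()}

# Account states named in the abstract; the stage order is an assumption.
states = ["inactive", "transactor", "revolver", "delinquent", "defaulted"]

# Hypothetical outputs of four binary conditional models for one account.
staged = multistage_to_state_probs(states, [0.2, 0.5, 0.75, 0.5])

# Hypothetical linear scores of a multinomial model for the same account.
joint = multinomial_probs(
    {"inactive": 0.1, "transactor": 0.9, "revolver": 0.6,
     "delinquent": -1.0, "defaulted": -1.2}
)
```

Reordering `states` changes which conditional models have to be estimated, which is one way to see why the abstract reports that accuracy depends on the order of stages.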

Keywords: multinomial regression, conditional logistic regression, credit account state, transition probability

Procedia PDF Downloads 462
5806 Internet Purchases in European Union Countries: Multiple Linear Regression Approach

Authors: Ksenija Dumičić, Anita Čeh Časni, Irena Palić

Abstract:

This paper examines the influence of economic and Information and Communication Technology (ICT) development on the recently increasing Internet purchases by individuals in European Union member states. After a growing trend in Internet purchases in the EU27 was noticed, all possible regressions analysis was applied using nine independent variables for 2011. Finally, two linear regression models were studied in detail. The simple linear regression analysis confirmed the research hypothesis that Internet purchases in the analysed EU countries are positively correlated with the statistically significant variable Gross Domestic Product per capita (GDPpc). The multiple linear regression model with four regressors reflecting the level of ICT development also indicates that ICT development is crucial for explaining Internet purchases by individuals, confirming the research hypothesis.

Keywords: European Union, Internet purchases, multiple linear regression model, outlier

Procedia PDF Downloads 279
5805 Quantitative Structure-Activity Relationship Analysis of Binding Affinity of a Series of Anti-Prion Compounds to Human Prion Protein

Authors: Strahinja Kovačević, Sanja Podunavac-Kuzmanović, Lidija Jevrić, Milica Karadžić

Abstract:

The present study is based on the quantitative structure-activity relationship (QSAR) analysis of eighteen compounds with anti-prion activity. The structures and anti-prion activities (expressed in response units, RU%) of the analyzed compounds are taken from the ChEMBL database. In the first step of the analysis, 85 molecular descriptors were calculated, and hierarchical cluster analysis (HCA) and principal component analysis (PCA) were carried out on them in order to detect potentially significant similarities or dissimilarities among the studied compounds. The calculated molecular descriptors were physicochemical, lipophilicity and ADMET (absorption, distribution, metabolism, excretion and toxicity) descriptors. The first stage of the QSAR analysis was simple linear regression modeling. It resulted in one acceptable model that correlates Henry's law constant with RU% units. The obtained 2D-QSAR model was validated by cross-validation as an internal validation method. The validation procedure confirmed the model’s quality, and therefore it can be used for the prediction of anti-prion activity. The next stage of the analysis of anti-prion activity will include 3D-QSAR and molecular docking approaches in order to select the most promising compounds for the treatment of prion diseases. These results are part of the project No. 114-451-268/2016-02, financially supported by the Provincial Secretariat for Science and Technological Development of AP Vojvodina.
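
Internal validation of a one-descriptor regression model by cross-validation can be sketched as below. The abstract does not say which cross-validation scheme was used, so the example assumes leave-one-out and reports the cross-validated coefficient Q² = 1 - PRESS / SS_tot. The descriptor and activity values are hypothetical placeholders, not data from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b

def loo_q2(xs, ys):
    """Leave-one-out cross-validated Q^2 for a simple linear regression."""
    press = 0.0  # predictive residual sum of squares
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)  # refit without compound i
        press += (ys[i] - (a + b * xs[i])) ** 2
    y_bar = sum(ys) / len(ys)
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - press / ss_tot

# Hypothetical descriptor values (x) and activities (y) for eight compounds.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
activity = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 6.8, 8.1]
q2 = loo_q2(descriptor, activity)
```

A Q² close to 1 means each compound is predicted well by a model fitted without it. Values far below the ordinary R² would suggest the fit depends heavily on individual compounds.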

Keywords: anti-prion activity, chemometrics, molecular modeling, QSAR

Procedia PDF Downloads 274
5804 Modeling Thermal Changes of Urban Blocks in Relation to the Landscape Structure and Configuration in Guilan Province

Authors: Roshanak Afrakhteh, Abdolrasoul Salman Mahini, Mahdi Motagh, Hamidreza Kamyab

Abstract:

Urban Heat Islands (UHIs) are distinctive urban areas characterized by densely populated central cores surrounded by less densely populated peripheral lands. These areas experience elevated temperatures, primarily due to impermeable surfaces and specific land use patterns. The consequences of these temperature variations are far-reaching, impacting the environment and society negatively, leading to increased energy consumption, air pollution, and public health concerns. This paper emphasizes the need for simplified approaches to comprehend UHI temperature dynamics and explains how urban development patterns contribute to land surface temperature variation. To illustrate this relationship, the study focuses on the Guilan Plain, utilizing techniques like principal component analysis and generalized additive models. The research centered on mapping land use and land surface temperature in the low-lying area of Guilan province. Satellite data from Landsat sensors for three different time periods (2002, 2012, and 2021) were employed. Using eCognition software, a spatial unit known as a "city block" was utilized through object-based analysis. The study also applied the normalized difference vegetation index (NDVI) method to estimate land surface radiance. Predictive variables for urban land surface temperature within residential city blocks were identified and categorized as intrinsic (related to the block's structure) and neighboring (related to adjacent blocks) variables. Principal Component Analysis (PCA) was used to select significant variables, and a Generalized Additive Model (GAM) approach, implemented using R's mgcv package, modeled the relationship between urban land surface temperature and predictor variables. Notable findings included variations in urban temperature across different years attributed to environmental and climatic factors. 
Block size, shared boundary, mother polygon area, and perimeter-to-area ratio were identified as main variables for the generalized additive regression model. This model showed non-linear relationships, with block size, shared boundary, and mother polygon area positively correlated with temperature, while the perimeter-to-area ratio displayed a negative trend. The discussion highlights the challenges of predicting urban surface temperature and the significance of block size in determining urban temperature patterns. It also underscores the importance of spatial configuration and unit structure in shaping urban temperature patterns. In conclusion, this study contributes to the growing body of research on the connection between land use patterns and urban surface temperature. Block size, along with block dispersion and aggregation, emerged as key factors influencing urban surface temperature in residential areas. The proposed methodology enhances our understanding of parameter significance in shaping urban temperature patterns across various regions, particularly in Iran.

Keywords: urban heat island, land surface temperature, LST modeling, GAM, Gilan province

Procedia PDF Downloads 48
5803 Deleterious SNP’s Detection Using Machine Learning

Authors: Hamza Zidoum

Abstract:

This paper investigates the impact of human genetic variation on the function of human proteins using machine-learning algorithms. Single-nucleotide polymorphisms are the most common form of human genome variation. We focus on single amino-acid polymorphisms located in the coding region, as they can affect protein function and lead to pathologic phenotypic change. We use several supervised machine learning methods to identify structural properties correlated with an increased risk of a missense mutation being damaging. SVM combined with Principal Component Analysis gives the best performance.

Keywords: single-nucleotide polymorphism, machine learning, feature selection, SVM

Procedia PDF Downloads 350
5802 Real Estate Rigidities: The Effect of Cash Transactions and the Impact of Demonetisation on Them

Authors: Dishant Shahi, Aradhya Shandilya, Nand Kumar

Abstract:

We study the impact of the black component, referred to as the X component in the text, on real estate transactions. The X component not only acts as friction in transactions but also leads to dysfunction in the real estate capital market. Its effect is presented using a model of an economy that resembles India's with respect to property deals. The rigidities that hinder smooth transactions in property or land deals are depicted, and their impact on the economy as a whole is modelled. The effect of the subprime crisis (2007) on the Indian housing capital market, and the role the X component played during it, is also covered in one of the sections. Throughout the text, we use 4-quadrant graphs to study the supply and demand causalities involved in commercial real estate. Finally, we examine demonetisation, the latest move by the Indian Government to control the large amount of black money in circulation, as a measure to counter the overvaluation of property assets arising from the X component, along with its impact on housing, rent, and the capital market.

Keywords: X-component, 4Q graph, real estate, capital markets, demonetisation, consumer sentiments

Procedia PDF Downloads 335
5801 Optimization of Slider Crank Mechanism Using Design of Experiments and Multi-Linear Regression

Authors: Galal Elkobrosy, Amr M. Abdelrazek, Bassuny M. Elsouhily, Mohamed E. Khidr

Abstract:

Crank shaft length, connecting rod length, crank angle, engine rpm, cylinder bore, mass of piston and compression ratio are the inputs that control the performance of the slider crank mechanism and hence its efficiency. Several combinations of these seven inputs are used and compared. The throughput engine torque predicted by the simulation is analyzed through two different regression models, with and without interaction terms, developed according to multi-linear regression using LU decomposition to solve the system of algebraic equations. These models are validated. A regression model in seven inputs including their interaction terms lowered the polynomial degree from 3rd degree to 1st degree and suggested valid predictions and stable explanations.
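
Fitting a multi-linear regression with and without interaction terms by solving the normal equations with an LU decomposition can be sketched as follows. This is an illustrative reconstruction under assumptions, not the authors' code: it uses partial pivoting, pairwise products as the interaction terms, and two hypothetical inputs instead of the seven listed in the abstract.

```python
from itertools import combinations

def lu_solve(a, b):
    """Solve a*x = b by in-place LU decomposition with partial pivoting."""
    n = len(a)
    lu = [row[:] for row in a]
    rhs = b[:]
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(lu[r][k]))
        if lu[pivot][k] == 0.0:
            raise ValueError("singular matrix")
        lu[k], lu[pivot] = lu[pivot], lu[k]
        rhs[k], rhs[pivot] = rhs[pivot], rhs[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]  # multiplier, stored in the L part
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    for i in range(n):  # forward substitution with unit-diagonal L
        for j in range(i):
            rhs[i] -= lu[i][j] * rhs[j]
    for i in reversed(range(n)):  # back substitution with U
        for j in range(i + 1, n):
            rhs[i] -= lu[i][j] * rhs[j]
        rhs[i] /= lu[i][i]
    return rhs

def design_row(inputs, interactions):
    """Intercept, main effects, and optionally pairwise interaction terms."""
    row = [1.0] + list(inputs)
    if interactions:
        row += [u * v for u, v in combinations(inputs, 2)]
    return row

def fit(samples, ys, interactions=False):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    x = [design_row(s, interactions) for s in samples]
    p = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(x, ys)) for i in range(p)]
    return lu_solve(xtx, xty)

# Hypothetical two-input full-factorial design with a known response surface.
samples = [(x1, x2) for x1 in (1.0, 2.0, 3.0) for x2 in (1.0, 2.0, 3.0)]
ys = [1.0 + 2.0 * x1 - x2 + 0.5 * x1 * x2 for x1, x2 in samples]
beta_main = fit(samples, ys)
beta_inter = fit(samples, ys, interactions=True)
```

Here `beta_inter` is ordered as intercept, x1, x2, x1*x2. With seven inputs the interaction model has 1 + 7 + 21 = 29 coefficients, so the design needs at least that many distinct runs for the normal equations to be solvable.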

Keywords: design of experiments, regression analysis, SI engine, statistical modeling

Procedia PDF Downloads 159
5800 Development of a Data-Driven Method for Diagnosing the State of Health of Battery Cells, Based on the Use of an Electrochemical Aging Model, with a View to Their Use in Second Life

Authors: Desplanches Maxime

Abstract:

Accurate estimation of the remaining useful life of lithium-ion batteries for electronic devices is crucial. Data-driven methodologies encounter challenges related to data volume and acquisition protocols, particularly in capturing a comprehensive range of aging indicators. To address these limitations, we propose a hybrid approach that integrates an electrochemical model with state-of-the-art data analysis techniques, yielding a comprehensive database. Our methodology involves infusing an aging phenomenon into a Newman model, leading to the creation of an extensive database capturing various aging states based on non-destructive parameters. This database serves as a robust foundation for subsequent analysis. Leveraging advanced data analysis techniques, notably principal component analysis and t-Distributed Stochastic Neighbor Embedding, we extract pivotal information from the data. This information is harnessed to construct a regression function using either random forest or support vector machine algorithms. The resulting predictor demonstrates a 5% error margin in estimating remaining battery life, providing actionable insights for optimizing usage. Furthermore, the database was built from the Newman model calibrated for aging and performance using data from a European project called Teesmat. The model was then initialized numerous times with different aging values, for instance, with varying thicknesses of SEI (Solid Electrolyte Interphase). This comprehensive approach ensures a thorough exploration of battery aging dynamics, enhancing the accuracy and reliability of our predictive model. Of particular importance is our reliance on the database generated through the integration of the electrochemical model. This database serves as a crucial asset in advancing our understanding of aging states. 
Beyond its capability for precise remaining life predictions, this database-driven approach offers valuable insights for optimizing battery usage and adapting the predictor to various scenarios. This underscores the practical significance of our method in facilitating better decision-making regarding lithium-ion battery management.

Keywords: Li-ion battery, aging, diagnostics, data analysis, prediction, machine learning, electrochemical model, regression

Procedia PDF Downloads 37
5799 Identification and Classification of Fiber-Fortified Semolina by Near-Infrared Spectroscopy (NIR)

Authors: Amanda T. Badaró, Douglas F. Barbin, Sofia T. Garcia, Maria Teresa P. S. Clerici, Amanda R. Ferreira

Abstract:

Food fortification is the intentional addition of a nutrient to a food matrix and has been widely used to overcome the lack of nutrients in the diet or to increase the nutritional value of food. Fortified food must meet the demand of the population, taking into account their habits and the risks that these foods may cause. Wheat and its by-products, such as semolina, have been strongly recommended as food vehicles since they are widely consumed and used in the production of other foods. These products have been strategically used to add some nutrients, such as fibers. Methods of analysis and quantification of these kinds of components are destructive and require lengthy sample preparation and analysis. Therefore, the industry has searched for faster and less invasive methods, such as Near-Infrared Spectroscopy (NIR). NIR is a rapid and cost-effective method; however, it is based on indirect measurements and yields a large amount of data. Therefore, NIR spectroscopy requires calibration with mathematical and statistical tools (chemometrics), such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA), to extract analytical information from the corresponding spectra. PCA is well suited for NIR, since it can handle many spectra at a time and can be used for unsupervised classification. An advantage of PCA, which is also a data reduction technique, is that it reduces the spectral data to a smaller number of latent variables for further interpretation. On the other hand, LDA is a supervised method that searches for the Canonical Variables (CV) with the maximum separation among different categories. In LDA, the first CV is the direction of maximum ratio between inter- and intra-class variances. The present work used a portable near-infrared (NIR) spectrometer for identification and classification of pure and fiber-fortified semolina samples. 
The fiber was added to semolina in two different concentrations, and after spectra acquisition, the data were analyzed with PCA and LDA to identify and discriminate the samples. The results showed that NIR spectroscopy combined with PCA was very effective in identifying pure and fiber-fortified semolina. Additionally, the classification rate of the samples using LDA was between 78.3% and 95% for calibration and between 75% and 95% for cross-validation. Thus, after multivariate analysis with PCA and LDA, it was possible to verify that NIR combined with chemometric methods is able to identify and classify the different samples in a fast and non-destructive way.
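The PCA-then-LDA workflow described in this abstract can be sketched as follows. This is only an illustration on synthetic "spectra": the two classes, the 50 wavelengths, and the offset added to the first ten bands are all assumed values, not the semolina data. PCA is computed through a singular value decomposition, and the first canonical variable is obtained as the Fisher direction for two classes.

```python
import numpy as np

# Synthetic stand-in for NIR spectra (assumption): 2 classes, 50 wavelengths.
rng = np.random.default_rng(1)
n_per_class, n_wavelengths = 30, 50
spectra = rng.normal(0.0, 1.0, size=(2 * n_per_class, n_wavelengths))
labels = np.repeat([0, 1], n_per_class)
spectra[labels == 1, :10] += 2.0  # class 1 differs in the first ten bands

# PCA: centre the data, take the SVD, keep the first k score vectors.
centred = spectra - spectra.mean(axis=0)
U, s, Vt = np.linalg.svd(centred, full_matrices=False)
k = 5
pc_scores = centred @ Vt[:k].T
explained = s[:k] ** 2 / np.sum(s ** 2)  # share of variance per component

# LDA on the PCA scores: the first canonical variable maximises the ratio of
# between-class to within-class variance (Fisher direction for two classes).
mean0 = pc_scores[labels == 0].mean(axis=0)
mean1 = pc_scores[labels == 1].mean(axis=0)
d0 = pc_scores[labels == 0] - mean0
d1 = pc_scores[labels == 1] - mean1
within = d0.T @ d0 + d1.T @ d1
w = np.linalg.solve(within, mean1 - mean0)
cv1 = pc_scores @ w

# Classify by thresholding halfway between the projected class means.
threshold = 0.5 * (cv1[labels == 0].mean() + cv1[labels == 1].mean())
predicted = (cv1 > threshold).astype(int)
accuracy = float((predicted == labels).mean())
```

The accuracy computed here is a training (calibration) figure; a cross-validated figure, as reported in the abstract, would require refitting on held-out folds.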

Keywords: Chemometrics, fiber, linear discriminant analysis, near-infrared spectroscopy, principal component analysis, semolina

Procedia PDF Downloads 186
5798 An Epsilon Hierarchical Fuzzy Twin Support Vector Regression

Authors: Arindam Chaudhuri

Abstract:

The research presents epsilon-hierarchical fuzzy twin support vector regression (epsilon-HFTSVR) based on epsilon-fuzzy twin support vector regression (epsilon-FTSVR) and epsilon-twin support vector regression (epsilon-TSVR). Epsilon-FTSVR is obtained by incorporating trapezoidal fuzzy numbers into epsilon-TSVR, which takes care of the uncertainty existing in forecasting problems. Epsilon-FTSVR determines a pair of epsilon-insensitive proximal functions by solving two related quadratic programming problems. The structural risk minimization principle is implemented by introducing a regularization term in the primal problems of epsilon-FTSVR. This yields stable positive definite dual problems, which improves regression performance. Epsilon-FTSVR is then reformulated as epsilon-HFTSVR, consisting of a set of hierarchical layers each containing epsilon-FTSVR. Experimental results on both synthetic and real datasets reveal that epsilon-HFTSVR has remarkable generalization performance with minimum training time.

Keywords: regression, epsilon-TSVR, epsilon-FTSVR, epsilon-HFTSVR

Procedia PDF Downloads 336
5797 Polycyclic Aromatic Hydrocarbons: Pollution and Ecological Risk Assessment in Surface Soil of the Tezpur Town, on the North Bank of the Brahmaputra River, Assam, India

Authors: Kali Prasad Sarma, Nibedita Baul, Jinu Deka

Abstract:

In the present study, the pollution level of polycyclic aromatic hydrocarbons (PAHs) in the surface soil of the historic Tezpur town, located on the north bank of the River Brahmaputra, was evaluated. To determine the seasonal distribution and concentration levels of the 16 USEPA priority PAHs, surface soil samples were collected from 12 sampling sites with various land use types. The total concentration of the 16 PAHs (∑16 PAHs) varied from 242.68 µg kg⁻¹ to 7901.89 µg kg⁻¹. The concentration of total probable carcinogenic PAHs ranged between 7.285 µg kg⁻¹ and 479.184 µg kg⁻¹ across seasons, while the concentration of BaP, the most carcinogenic PAH, ranged from below the detection limit (BDL) to 50.01 µg kg⁻¹. The composition profiles of PAHs in the 3 seasons were characterized by two ring types: (1) 4-ring PAHs contributed the highest percentage of total PAHs (43.75%), (2) while in the pre- and post-monsoon seasons 3-ring compounds dominated the PAH profile, contributing 65.58% and 74.41% respectively. A high PAH concentration with significant seasonality and a high abundance of LMW PAHs was observed in Tezpur town. Soil PAH toxicity was evaluated using toxic equivalency factors (TEFs), which quantify the carcinogenic potential of other PAHs relative to BaP and are used to estimate the benzo[a]pyrene-equivalent concentration (BaPeq). The calculated BaPeq value signifies considerable risk from contact with soil PAHs. We applied cluster analysis and principal component analysis (PCA) with multivariate linear regression (MLR) to apportion the sources of PAHs in the surface soil of Tezpur town, based on the measured PAH concentrations. The results indicate that petrogenic and pyrogenic sources are the important sources of PAHs. A combination of chemometric and molecular indices was used to identify the sources of PAHs, which could be attributed to vehicle emissions, a mixed source input, natural gas combustion, wood or biomass burning, and coal combustion. 
Source apportionment using absolute principal component scores–multiple linear regression (APCS-MLR) showed that the main sources of PAHs are mixed sources comprising diesel and biomass combustion and petroleum spills (22.3%), vehicle emissions (13.55%), diesel and natural gas burning (9.15%), wood and biomass burning (38.05%), and coal combustion (16.95%). Pyrogenic input was found to be the dominant source of PAHs, with a larger contribution from vehicular exhaust. PAHs are often co-emitted with other environmental pollutants such as heavy metals because of their similar sources. A positive correlation was observed between PAHs and Cr and Pb (r² = 0.54 and 0.55 respectively) in the monsoon season, and between PAHs and Cd and Pb (r² = 0.54 and 0.61 respectively), indicating a common source. A strong correlation was observed between PAHs and OC during the pre- and post-monsoon seasons (r² = 0.46 and r² = 0.65 respectively), whereas during the monsoon season no significant correlation was observed (r² = 0.24).
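The BaPeq calculation mentioned in this abstract is a weighted sum: each PAH concentration is multiplied by its toxic equivalency factor and the products are added. The sketch below shows only the arithmetic. The compound names `PAH_A` and `PAH_B`, the concentrations, and the TEF values other than BaP's (which is 1 by definition) are placeholders chosen for illustration, not values from the study.

```python
# Placeholder inputs (hypothetical): concentrations in µg/kg and TEFs relative to BaP.
concentrations = {"BaP": 10.0, "PAH_A": 40.0, "PAH_B": 200.0}
tef = {"BaP": 1.0, "PAH_A": 0.1, "PAH_B": 0.01}

def bap_equivalent(conc, factors):
    """BaPeq = sum over compounds of concentration_i * TEF_i."""
    return sum(conc[name] * factors[name] for name in conc)

# Total BaP-equivalent concentration in µg/kg: 10*1 + 40*0.1 + 200*0.01.
bapeq_total = bap_equivalent(concentrations, tef)

# Share of the total contributed by each compound.
bapeq_share = {name: concentrations[name] * tef[name] / bapeq_total
               for name in concentrations}
```

With these placeholder numbers the total is 16 µg/kg BaPeq, and BaP itself accounts for most of it even though it has the lowest concentration.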

Keywords: polycyclic aromatic hydrocarbon, Tezpur town, chemometric analysis, ecological risk assessment, pollution

Procedia PDF Downloads 189
5796 Impact of the Action Antropic in the Desertification of Steppe in Algeria

Authors: Kadi-Hanifi Halima

Abstract:

Stipa tenacissima (alfa) is a plant of great ecological value (it protects against desertification) and economic importance (paper industry). It also has pastoral value because of its inflorescence. It once occupied large areas between the Tell Atlas and the Saharan Atlas, but these alfa areas have now regressed considerably, at an estimated rate of 1% per year. The principal cause is human activity; drought is only an aggravating circumstance. The eradication of this species would have serious consequences for the equilibrium of the entire steppe ecosystem. We therefore consider it necessary and urgent to characterize the alfa ecosystem in all its aspects (climatic, floristic, and edaphic); this diagnosis could guide actions to combat desertification.

Keywords: desertification, anthropic action, soils, Stipa tenacissima

Procedia PDF Downloads 283
5795 Islamic Equity Markets Response to Volatility of Bitcoin

Authors: Zakaria S. G. Hegazy, Walid M. A. Ahmed

Abstract:

This paper examines the dependence structure of Islamic stock markets on Bitcoin’s realized volatility components in bear, normal, and bull market periods. A quantile regression approach is employed, after adjusting raw returns with respect to a broad set of relevant global factors and accounting for structural breaks in the data. The results reveal that upside volatility tends to exert negative influences on Islamic developed-market returns more in bear than in bull market conditions, while downside volatility positively affects returns during bear and bull conditions. For emerging markets, we find that the upside (downside) component exerts lagged negative (positive) effects on returns in bear (all) market regimes. By and large, the dependence structures turn out to be asymmetric. Our evidence provides essential implications for investors.

Keywords: cryptocurrency markets, bitcoin, realized volatility measures, asymmetry, quantile regression

Procedia PDF Downloads 155
5794 Disability and Quality of Life in Low Back Pain: A Cross-Sectional Study

Authors: Zarina Zahari, Maria Justine, Kamaria Kamaruddin

Abstract:

Low back pain (LBP) is a major musculoskeletal problem in the global population. This study aimed to examine the relationship between pain, disability, and quality of life in patients with non-specific LBP. One hundred participants with LBP were recruited in this cross-sectional study (mean age = 42.23±11.34 years). Pain was measured using an 11-point Numerical Rating Scale. Disability was assessed using the revised Oswestry low back pain disability questionnaire (ODQ), and quality of life (QoL) was evaluated using the SF-36 v2. The majority of participants (58%) presented with moderate pain, and 49% experienced severe disability. Pain and disability were significantly and negatively correlated (r = -0.712, p<0.05). Pain and QoL were significantly and positively correlated for both the Physical Health Component Summary (PHCS) (r = 0.840, p<0.05) and the Mental Health Component Summary (MHCS) (r = 0.446, p<0.05). Regression analysis indicated that pain was a predictor of both disability and QoL (PHCS and MHCS), accounting for 51%, 71%, and 21% of the variance respectively. This indicates that pain is an important factor in predicting disability and QoL in LBP sufferers.
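The link between the correlations and the "variance accounted for" figures in this abstract can be checked with a small calculation. In a linear regression with a single predictor, the coefficient of determination equals the squared Pearson correlation. The sketch below assumes that each outcome was regressed on pain alone, which the abstract does not state explicitly.

```python
# Correlations between pain and each outcome, as reported in the abstract.
reported_r = {"disability": -0.712, "PHCS": 0.840, "MHCS": 0.446}

# With one predictor, R^2 = r^2; expressed here as a percentage of variance.
explained_pct = {name: round(100.0 * r ** 2, 1) for name, r in reported_r.items()}
```

The squared correlations come to roughly 50.7%, 70.6%, and 19.9%, which is close to the reported 51%, 71%, and 21%. The small gap for MHCS may come from rounding or from a model specification that differs from the single-predictor assumption made here.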

Keywords: disability, low back pain, pain, quality of life

Procedia PDF Downloads 504
5793 Nonparametric Truncated Spline Regression Model on the Data of Human Development Index in Indonesia

Authors: Kornelius Ronald Demu, Dewi Retno Sari Saputro, Purnami Widyaningsih

Abstract:

The Human Development Index (HDI) is a standard measure of a country's human development. Several factors may influence it, such as life expectancy, gross domestic product (GDP) based on the province's annual expenditure, the number of poor people, and the percentage of illiterate people. The scatter plots between HDI and these factors do not follow a specific pattern or form. Therefore, a nonparametric regression model can be applied to Indonesia's HDI data. The estimate of the regression curve in a nonparametric regression model is flexible because it follows the shape of the data pattern. One nonparametric regression method is the truncated spline, which is a modification of segmented polynomial functions. The estimator of a truncated spline regression model is affected by the selection of the optimal knot points, the points at which the truncated spline function changes behavior. The optimal knot points are determined by the minimum value of generalized cross validation (GCV). In this article, a truncated spline nonparametric regression model is applied to the HDI data. The best truncated spline regression model for Indonesia's HDI data was obtained with the optimal knot point combination 5-5-5-4. Life expectancy and the percentage of illiterate people were the significant factors affecting the HDI in Indonesia. The coefficient of determination is 94.54%, which means the regression model is good enough to be applied to the HDI data in Indonesia.
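Knot selection by GCV, as described in this abstract, can be sketched for a single predictor. The example below builds a linear truncated power basis, fits it by least squares, and picks the candidate knot with the smallest GCV score. The synthetic data (a kink at x = 4), the candidate knots, and the function names are assumptions for illustration; the study itself uses several predictors and multiple knots.

```python
import numpy as np

def truncated_basis(x, knots, degree=1):
    """Polynomial terms 1, x, ..., x^degree plus one truncated power term per knot."""
    cols = [x ** d for d in range(degree + 1)]
    cols += [np.maximum(x - k, 0.0) ** degree for k in knots]
    return np.column_stack(cols)

def gcv_score(x, y, knots, degree=1):
    """GCV = MSE / (trace(I - H) / n)^2, where H is the hat matrix of the fit."""
    B = truncated_basis(x, knots, degree)
    H = B @ np.linalg.pinv(B)          # projection onto the spline basis
    resid = y - H @ y
    n = len(y)
    mse = float(resid @ resid) / n
    return mse / (1.0 - np.trace(H) / n) ** 2

# Synthetic data (hypothetical): piecewise-linear trend with a slope change at x = 4.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 101)
y = 1.0 + 0.5 * x + 2.0 * np.maximum(x - 4.0, 0.0) + rng.normal(0.0, 0.1, x.size)

# Evaluate each candidate knot and keep the one with the minimum GCV.
candidates = [2.0, 4.0, 6.0, 8.0]
gcv_by_knot = {k: gcv_score(x, y, [k]) for k in candidates}
best_knot = min(gcv_by_knot, key=gcv_by_knot.get)
```

The same scoring function accepts several knots at once, so a search over knot combinations only needs a loop over candidate lists.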

Keywords: generalized cross validation (GCV), Human Development Index (HDI), knots point, nonparametric regression, truncated spline

Procedia PDF Downloads 307
5792 Regression Model Evaluation on Depth Camera Data for Gaze Estimation

Authors: James Purnama, Riri Fitri Sari

Abstract:

We investigate the machine learning algorithm selection problem in terms of depth-image-based eye gaze estimation, with respect to its essential difficulty in reducing the number of required training samples and the training time. Statistics-based prediction accuracy measures are increasingly used to assess prediction or estimation in gaze estimation. This article uses Root Mean Squared Error (RMSE) and R-Squared to assess machine learning methods on depth camera data for gaze estimation. Four machine learning methods are evaluated: Random Forest Regression, Regression Tree, Support Vector Machine (SVM), and Linear Regression. The experimental results show that Random Forest Regression has the lowest RMSE and the highest R-Squared, which means that it is the best of the four methods.
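The two evaluation statistics named in this abstract have short definitions, sketched below. The target values and the two sets of predictions are made-up numbers used only to show how a lower RMSE and a higher R-Squared identify the better model; they are not gaze data.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot, the share of variance explained by the predictions."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Made-up targets and predictions from two hypothetical models.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
pred_a = np.array([1.1, 1.9, 3.2, 3.8])   # close to the targets
pred_b = np.full(4, y_true.mean())        # always predicts the mean

results = {
    "model_a": (rmse(y_true, pred_a), r_squared(y_true, pred_a)),
    "model_b": (rmse(y_true, pred_b), r_squared(y_true, pred_b)),
}
```

A model that always predicts the mean has an R-Squared of zero, which makes it a convenient baseline when comparing regressors.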

Keywords: gaze estimation, gaze tracking, eye tracking, kinect, regression model, orange python

Procedia PDF Downloads 511
5791 Pyramid Binary Pattern for Age Invariant Face Verification

Authors: Saroj Bijarnia, Preety Singh

Abstract:

We propose a simple and effective biometric system based on face verification across aging using a new variant of texture feature, the Pyramid Binary Pattern. This employs the Local Binary Pattern along with its hierarchical information. The dimension of the generated texture feature vector is reduced using Principal Component Analysis, and a Support Vector Machine is used for classification. Our proposed method achieves an accuracy of 92.24% and can be used in an automated age-invariant face verification system.

Keywords: biometrics, age invariant, verification, support vector machine

Procedia PDF Downloads 319
5790 Rd-PLS Regression: From the Analysis of Two Blocks of Variables to Path Modeling

Authors: E. Tchandao Mangamana, V. Cariou, E. Vigneau, R. Glele Kakai, E. M. Qannari

Abstract:

A new definition of a latent variable associated with a dataset makes it possible to propose variants of the PLS2 regression and the multi-block PLS (MB-PLS). We shall refer to these variants as Rd-PLS regression and Rd-MB-PLS respectively because they are inspired by both Redundancy analysis and PLS regression. Usually, a latent variable t associated with a dataset Z is defined as a linear combination of the variables of Z with the constraint that the length of the loading weights vector equals 1. Formally, t=Zw with ‖w‖=1. Denoting by Z' the transpose of Z, we define herein, a latent variable by t=ZZ’q with the constraint that the auxiliary variable q has a norm equal to 1. This new definition of a latent variable entails that, as previously, t is a linear combination of the variables in Z and, in addition, the loading vector w=Z’q is constrained to be a linear combination of the rows of Z. More importantly, t could be interpreted as a kind of projection of the auxiliary variable q onto the space generated by the variables in Z, since it is collinear to the first PLS1 component of q onto Z. Consider the situation in which we aim to predict a dataset Y from another dataset X. These two datasets relate to the same individuals and are assumed to be centered. Let us consider a latent variable u=YY’q to which we associate the variable t= XX’YY’q. Rd-PLS consists in seeking q (and therefore u and t) so that the covariance between t and u is maximum. The solution to this problem is straightforward and consists in setting q to the eigenvector of YY’XX’YY’ associated with the largest eigenvalue. For the determination of higher order components, we deflate X and Y with respect to the latent variable t. Extending Rd-PLS to the context of multi-block data is relatively easy. Starting from a latent variable u=YY’q, we consider its ‘projection’ on the space generated by the variables of each block Xk (k=1, ..., K) namely, tk= XkXk'YY’q. 
Thereafter, Rd-MB-PLS seeks q in order to maximize the average of the covariances of u with tk (k=1, ..., K). The solution to this problem is given by q, eigenvector of YY’XX’YY’, where X is the dataset obtained by horizontally merging datasets Xk (k=1, ..., K). For the determination of latent variables of order higher than 1, we use a deflation of Y and Xk with respect to the variable t= XX’YY’q. In the same vein, extending Rd-MB-PLS to the path modeling setting is straightforward. Methods are illustrated on the basis of case studies and performance of Rd-PLS and Rd-MB-PLS in terms of prediction is compared to that of PLS2 and MB-PLS.
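The first Rd-PLS component described in this abstract can be sketched numerically: q is taken as the leading eigenvector of YY'XX'YY', and the latent variables follow as u = YY'q and t = XX'YY'q. The random datasets below are assumptions made for the example, and the deflation step uses the usual projection onto t, which is one common reading of "deflate X and Y with respect to t" rather than a detail confirmed by the abstract.

```python
import numpy as np

# Synthetic centred datasets on the same individuals (hypothetical data).
rng = np.random.default_rng(0)
n = 40
X = rng.normal(size=(n, 6))
Y = X[:, :2] @ rng.normal(size=(2, 3)) + 0.1 * rng.normal(size=(n, 3))
X = X - X.mean(axis=0)
Y = Y - Y.mean(axis=0)

Kx = X @ X.T            # XX'
Ky = Y @ Y.T            # YY'
M = Ky @ Kx @ Ky        # YY'XX'YY'
M = (M + M.T) / 2.0     # remove round-off asymmetry before eigh

# q: unit-norm eigenvector associated with the largest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(M)
q = eigvecs[:, -1]

u = Ky @ q              # latent variable of Y
t = Kx @ Ky @ q         # its counterpart in the space of X
criterion = float(t @ u)  # proportional to cov(t, u); equals the top eigenvalue

# Assumed deflation: remove from X and Y their projection onto t.
P = np.outer(t, t) / float(t @ t)
X_deflated = X - P @ X
Y_deflated = Y - P @ Y
```

Repeating the same steps on the deflated datasets would give the second component under this assumed deflation scheme.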

Keywords: multiblock data analysis, partial least squares regression, path modeling, redundancy analysis

Procedia PDF Downloads 113