Search results for: multivariate linear regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 6106

5836 Factor Associated with Uncertainty Undergoing Hematopoietic Stem Cell Transplantation

Authors: Sandra Adarve, Jhon Osorio

Abstract:

Uncertainty has been studied in patients with different types of cancer, but not in patients with hematologic cancer who are undergoing transplantation. The purpose of this study was to identify factors associated with uncertainty in adult patients with malignant hemato-oncological diseases who are scheduled to undergo hematopoietic stem cell transplantation, based on Merle Mishel's uncertainty theory. This was a cross-sectional study with an analytical purpose. The study sample included 50 patients with leukemia, myeloma, and lymphoma selected by non-probability convenience and purposive sampling. Sociodemographic and clinical variables were measured. Mishel's Scale of Uncertainty in Illness was used to measure uncertainty. Bivariate and multivariate analyses were performed to explore the relationships and associations between the different variables and uncertainty level. For this analysis, the distribution of the uncertainty scale values was evaluated with the Shapiro-Wilk normality test to identify the statistical tests to be used. The multivariate analysis was conducted through stepwise logistic regression. Patients were 18-74 years old, with a mean age of 44.8. The disease course had a median duration of 9.5 months, and the waiting time for transplantation was < 20 days for 50% of the patients. Regarding the uncertainty scale, a mean score of 95.46 was identified. When the dimensions of the scale were analyzed, the mean score was 25.6 for the framework of stimuli, 47.4 for cognitive ability, and 22.8 for structure providers. Age was found to correlate with the total uncertainty score (p=0.012). Additionally, statistically significant differences in uncertainty score were found by religious creed (p=0.023), education level (p=0.012), family history of cancer (p=0.001), presence of comorbidities (p=0.023), and previous radiotherapy treatment (p=0.022). After logistic regression, previous radiotherapy treatment (OR=0.04, 95% CI: 0.004-0.48) and family history of cancer (OR=30.7, 95% CI: 2.7-349) were found to be factors associated with a high level of uncertainty. Uncertainty is present at high levels in patients who are about to undergo bone marrow transplantation, and it is the responsibility of the nurse to assess the level of uncertainty and the presence of factors that may contribute to it. Once it has been assessed, uncertainty should be addressed through the identified associated factors, especially those related to cognitive capacity. This implies designing and implementing intervention strategies to improve knowledge of the disease and of the therapeutic procedures that the patients will undergo. All interventions should favor the adaptation of these patients to their current experience and contribute to seeing uncertainty as an opportunity for growth and transcendence.

Keywords: hematopoietic stem cell transplantation, hematologic diseases, nursing, uncertainty

Procedia PDF Downloads 119
5835 Integrated Nested Laplace Approximations For Quantile Regression

Authors: Kajingulu Malandala, Ranganai Edmore

Abstract:

The asymmetric Laplace distribution (ADL) is commonly used as the likelihood function in Bayesian quantile regression, and it offers different families of likelihood methods for quantile regression. Notwithstanding its popularity and practicality, the ADL is not smooth, which makes its likelihood difficult to maximize. Furthermore, Bayesian inference is time-consuming, and the choice of likelihood may mislead the inference, as Bayes' theorem does not automatically establish the posterior inference. In addition, the ADL does not account for greater skewness and kurtosis. This paper develops a new quantile regression approach for count data based on the inverse of the cumulative distribution function of the Poisson, binomial, and Delaporte distributions, using integrated nested Laplace approximations. Our results validate the benefit of using integrated nested Laplace approximations and support the approach for count data.

Keywords: quantile regression, Delaporte distribution, count data, integrated nested Laplace approximation

Procedia PDF Downloads 134
5834 The Use of Geographically Weighted Regression for Deforestation Analysis: Case Study in Brazilian Cerrado

Authors: Ana Paula Camelo, Keila Sanches

Abstract:

Geographically Weighted Regression (GWR) was proposed in the geography literature to allow the relationships in a regression model to vary over space. In Brazil, the agricultural exploitation of the Cerrado Biome is the main cause of deforestation. In this study, we propose a methodology using geostatistical methods to characterize the spatial dependence of deforestation in the Cerrado based on agricultural production indicators. To this end, a set of exploratory spatial data analysis (ESDA) tools was used, followed by confirmatory analysis using GWR. A non-spatial model was first calibrated, the nature of the regression curve was evaluated, variables were selected by a stepwise process, and multicollinearity was analyzed. After the non-spatial model was evaluated, the spatial regression model was fitted, the intercept was statistically evaluated, and its effect on calibration was verified. Spearman's correlation between deforestation and livestock was +0.783, and between deforestation and soybeans +0.405. The model presented R²=0.936 and showed a strong spatial dependence of soybean agricultural activity associated with maize and cotton crops. GWR is a very effective tool, presenting results closer to the reality of deforestation in the Cerrado than other analyses.
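
As a rough illustration of the idea behind GWR (regression coefficients that vary over space), the sketch below fits one weighted least-squares regression per location. The Gaussian distance kernel, the fixed `bandwidth`, the function name `gwr_coefficients`, and the synthetic data are all assumptions made for this example; they are not taken from the study.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Fit a separate weighted least-squares model at every location.

    Assumption: weights come from a Gaussian kernel on Euclidean distance.
    """
    n = X.shape[0]
    Xd = np.column_stack([np.ones(n), X])      # design matrix with intercept
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)  # nearby points weigh more
        XtW = Xd.T * w                           # scale each observation by its weight
        betas[i] = np.linalg.solve(XtW @ Xd, XtW @ y)
    return betas

# Synthetic check: a relationship that is the same everywhere (y = 2 + 3x)
# should give the same local coefficients at every location.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 1, size=(50, 2))
x = rng.normal(size=(50, 1))
y = 2.0 + 3.0 * x[:, 0]
betas = gwr_coefficients(coords, x, y, bandwidth=0.5)
```

With real data, the rows of `betas` would differ by location, and mapping them is what reveals spatial dependence.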

Keywords: deforestation, geographically weighted regression, land use, spatial analysis

Procedia PDF Downloads 329
5833 Dietary Pattern derived by Reduced Rank Regression is Associated with Reduced Cognitive Impairment Risk in Singaporean Older Adults

Authors: Kaisy Xinhong Ye, Su Lin Lim, Jialiang Li, Lei Feng

Abstract:

Background: Multiple healthful dietary patterns have been linked with dementia, but few studies have looked at the role of diet in cognitive health in Asians, whose eating habits are very different from those of their counterparts in the West. This study aimed to derive a dietary pattern that is associated with the risk of cognitive impairment (CI) in the Singaporean population. Method: The analysis was based on 719 community-dwelling older adults aged 60 and above. Dietary intake was measured using a validated semi-quantitative food-frequency questionnaire (FFQ). Reduced rank regression (RRR) was used to extract dietary patterns from 45 food groups, specifying sugar, dietary fiber, vitamin A, calcium, and the ratio of polyunsaturated fat to saturated fat intake (P:S ratio) as response variables. The RRR-derived dietary patterns were subsequently investigated using multivariate logistic regression models to look for associations with the risk of CI. Results: A dietary pattern characterized by greater intakes of green leafy vegetables, red-orange vegetables, wholegrains, tofu, and nuts, and lower intakes of biscuits, pastries, local sweets, coffee, poultry with skin, sugar added to beverages, malt beverages, roti, butter, and fast food was associated with reduced risk of CI [multivariable-adjusted OR comparing extreme quintiles, 0.29 (95% CI: 0.11, 0.77); P-trend = 0.03]. This pattern was positively correlated with P:S ratio, vitamin A, and dietary fiber and negatively correlated with sugar. Conclusion: A dietary pattern providing a high P:S ratio, vitamin A, and dietary fiber, and a low level of sugar may reduce the risk of cognitive impairment in old age. The findings have significance in guiding local Singaporeans toward dementia prevention through food-based dietary approaches.

Keywords: dementia, cognitive impairment, diet, nutrient, elderly

Procedia PDF Downloads 46
5832 Calculation of Pressure-Varying Langmuir and Brunauer-Emmett-Teller Isotherm Adsorption Parameters

Authors: Trevor C. Brown, David J. Miron

Abstract:

Gas-solid physical adsorption methods are central to the characterization and optimization of the effective surface area, pore size and porosity for applications such as heterogeneous catalysis, and gas separation and storage. Properties such as adsorption uptake, capacity, equilibrium constants and Gibbs free energy are dependent on the composition and structure of both the gas and the adsorbent. However, challenges remain in accurately calculating these properties from experimental data. Gas adsorption experiments involve measuring the amounts of gas adsorbed over a range of pressures under isothermal conditions. Various constant-parameter models, such as the Langmuir and Brunauer-Emmett-Teller (BET) theories, are used to provide information on adsorbate and adsorbent properties from the isotherm data. These models typically do not provide accurate interpretations across the full range of pressures and temperatures. The Langmuir adsorption isotherm is a simple approximation for modelling equilibrium adsorption data and has been effective in estimating surface areas and catalytic rate laws, particularly for high surface area solids. The Langmuir isotherm assumes the systematic filling of identical adsorption sites to a monolayer coverage. The BET model is based on the Langmuir isotherm and allows for the formation of multiple layers. These additional layers do not interact with the first layer, and their energetics are equal to those of the adsorbate as a bulk liquid. The BET method is widely used to measure the specific surface area of materials. Both the Langmuir and BET models assume that the affinity of the gas for all adsorption sites is identical, so the calculated adsorbent uptake at the monolayer and the equilibrium constant are independent of coverage and pressure. Accurate representations of adsorption data have been achieved by extending the Langmuir and BET models to include pressure-varying uptake capacities and equilibrium constants. These parameters are determined using a novel regression technique called flexible least squares for time-varying linear regression. For isothermal adsorption, the adsorption parameters are assumed to vary slowly and smoothly with increasing pressure. The flexible least squares for pressure-varying linear regression (FLS-PVLR) approach assumes two distinct types of discrepancy terms, dynamic and measurement, for all parameters in the linear equation used to simulate the data. Dynamic terms account for pressure variation in successive parameter vectors, and measurement terms account for differences between observed and theoretically predicted outcomes via linear regression. The resultant pressure-varying parameters are optimized by minimizing both dynamic and measurement residual squared errors. Validation of this methodology has been achieved by simulating adsorption data for n-butane and isobutane on activated carbon at 298 K, 323 K and 348 K and for nitrogen on mesoporous alumina at 77 K with pressure-varying Langmuir and BET adsorption parameters (equilibrium constants and uptake capacities). This modeling provides information on the adsorbent (accessible surface area and micropore volume), adsorbate (molecular areas and volumes) and thermodynamic (Gibbs free energies) variations of the adsorption sites.
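
The two kinds of discrepancy terms described above can be sketched as a single stacked least-squares problem. This is a minimal illustration, not the authors' implementation: the trade-off weight `mu`, the function name `flexible_least_squares`, the linearized Langmuir form p/q = 1/(K·q_m) + p/q_m, and the synthetic constant-parameter data are all assumptions made for the example.

```python
import numpy as np

def flexible_least_squares(X, y, mu):
    """Estimate one parameter vector per observation.

    Minimizes  sum_t (y_t - x_t . b_t)^2  +  mu * sum_t ||b_{t+1} - b_t||^2,
    i.e. measurement errors plus (weighted) dynamic errors.
    `mu` is an assumed smoothness weight, not a value from the abstract.
    """
    T, K = X.shape
    n_dyn = (T - 1) * K
    A = np.zeros((T + n_dyn, T * K))
    b = np.zeros(T + n_dyn)
    # Measurement rows: y_t should match x_t . b_t
    for t in range(T):
        A[t, t * K:(t + 1) * K] = X[t]
        b[t] = y[t]
    # Dynamic rows: successive parameter vectors should change slowly
    s = np.sqrt(mu)
    for t in range(T - 1):
        for k in range(K):
            r = T + t * K + k
            A[r, t * K + k] = -s
            A[r, (t + 1) * K + k] = s
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    return sol.reshape(T, K)

# Synthetic Langmuir data with constant parameters (assumed values).
q_m, K_eq = 2.0, 0.5
p = np.linspace(0.1, 10.0, 30)            # pressures
q = q_m * K_eq * p / (1.0 + K_eq * p)     # Langmuir uptake
X = np.column_stack([np.ones_like(p), p]) # regressors: intercept and pressure
params = flexible_least_squares(X, p / q, mu=100.0)
# Each row of params is [1/(K_eq*q_m), 1/q_m] at that pressure.
```

For data that truly follow a constant-parameter Langmuir isotherm, every row of `params` is the same; systematic drift across rows is what would indicate pressure-varying parameters.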

Keywords: Langmuir adsorption isotherm, BET adsorption isotherm, pressure-varying adsorption parameters, adsorbate and adsorbent properties and energetics

Procedia PDF Downloads 194
5831 Blood Glucose Measurement and Analysis: Methodology

Authors: I. M. Abd Rahim, H. Abdul Rahim, R. Ghazali

Abstract:

Researchers have developed numerous non-invasive blood glucose measurement techniques, and near-infrared (NIR) is a promising technique at present. However, there is some disagreement on the optimal wavelength range to use as the reference for glucose in the blood. This paper focuses on the experimental data collection technique and on the method used to analyze the data obtained from the experiment. Selecting suitable linear and non-linear model structures is essential in a prediction system, as the system developed needs to be reasonably accurate.

Keywords: linear, near-infrared (NIR), non-invasive, non-linear, prediction system

Procedia PDF Downloads 428
5830 Forecasting Stock Indexes Using Bayesian Additive Regression Tree

Authors: Darren Zou

Abstract:

Forecasting the stock market is a very challenging task. Various economic indicators such as GDP, exchange rates, interest rates, and unemployment have a substantial impact on the stock market. Time series models are the traditional methods used to predict stock market changes. In this paper, a machine learning method, the Bayesian Additive Regression Tree (BART), is used to predict stock market indexes based on multiple economic indicators. BART can be used to model heterogeneous treatment effects, and thereby works well when models are misspecified. It can also handle non-linear main effects and multi-way interactions without much input from financial analysts. In this research, BART is proposed to provide reliable predictions of day-to-day stock market activity. A comparison of the results from BART with those from a time series method shows that BART performs well and has better prediction capability than the traditional methods.

Keywords: BART, Bayesian, predict, stock

Procedia PDF Downloads 96
5829 Ranking Effective Factors on Strategic Planning to Achieve Organization Objectives in Fuzzy Multivariate Decision-Making Technique

Authors: Elahe Memari, Ahmad Aslizadeh, Ahmad Memari

Abstract:

Today, strategic planning is counted among the most important duties of senior directors in every organization. Strategic planning allows organizations to implement compiled strategies and gain greater competitive advantage than their competitors. The present research aims to prepare and rank the strategies arising from the factors that affect strategic planning at the State Road Management and Transportation Organization, in order to show the organization's managers the role of organizational factors in the efficiency of the process. The connections between six main factors at the State Road Management and Transportation Organization were studied: improvement of strategic thinking in senior managers, improvement of the organization's business process, rationalization of resource allocation in different parts of the organization, coordination and conformity of the strategic plan with the organization's needs, adjustment of the organization's activities to environmental changes, and reinforcement of organizational culture. All of these factors were confirmed by the tests performed and then ranked using a fuzzy multivariate decision-making technique.

Keywords: Fuzzy TOPSIS, improvement of organization business process, multivariate decision-making, strategic planning

Procedia PDF Downloads 375
5828 Temperature Rises Characteristics of Distinct Double-Sided Flat Permanent Magnet Linear Generator for Free Piston Engines for Hybrid Vehicles

Authors: Ismail Rahama Adam Hamid

Abstract:

This paper presents the development of a thermal model for a flat, double-sided linear generator designed for use in free-piston engines. The study examines the influence of temperature on the performance of the permanent magnet linear generator, an integral and pivotal component of the system. This research places particular emphasis on the Neodymium Iron Boron (NdFeB) permanent magnet, which serves as the source of the magnetic field for the linear generator. In this study, an internal combustion engine, which tends to produce heat, is connected to a generator. The model considers the temperature rise from both the combustion process and the thermal contributions of current-carrying conductors and frictional forces. Using the Computational Fluid Dynamics (CFD) method, a thermal model of the NdFeB magnet within the linear generator is constructed and analyzed. Furthermore, the temperature field is examined to ensure that the linear generator operates under stable conditions without the risk of demagnetization.

Keywords: free piston engine, permanent magnet, linear generator, demagnetization, simulation

Procedia PDF Downloads 13
5827 Weighted Rank Regression with Adaptive Penalty Function

Authors: Kang-Mo Jung

Abstract:

The use of regularization in statistical methods has become popular. The least absolute shrinkage and selection operator (LASSO) framework has become the standard tool for sparse regression. However, it is well known that the LASSO is sensitive to outliers and leverage points. We consider a new robust estimator composed of a weighted loss function of the pairwise differences of residuals and an adaptive penalty function that regulates the tuning parameter for each variable. Rank regression is resistant to regression outliers, but not to leverage points. By adopting a weighted loss function, the proposed method is robust to leverage points in the predictor variables. Furthermore, the adaptive penalty function provides good statistical properties in variable selection, such as the oracle property and consistency. We develop an efficient algorithm to compute the proposed estimator using basic functions in R. We use an optimal tuning parameter based on the Bayesian information criterion (BIC). Numerical simulation shows that the proposed estimator is effective for analyzing real data sets and contaminated data.
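
One plausible way to write down an objective of the kind described, a weighted loss on pairwise residual differences plus an adaptive L1 penalty, is sketched below. The exact weighting scheme and penalty used in the paper are not given in the abstract, so the product weights `w_i * w_j`, the penalty `lam * |beta_j| / |beta_init_j|`, and all names here are assumptions for illustration only. The sketch is written in Python rather than R.

```python
import numpy as np

def weighted_rank_objective(beta, X, y, weights, lam, beta_init):
    """Weighted pairwise-difference (rank-type) loss with an adaptive L1 penalty.

    Assumptions: observation weights enter as products w_i * w_j, and the
    adaptive penalty divides by an initial estimate with no zero entries.
    """
    r = y - X @ beta                              # residuals
    diff = np.abs(r[:, None] - r[None, :])        # |r_i - r_j| for all pairs
    ww = weights[:, None] * weights[None, :]      # down-weight leverage points
    iu = np.triu_indices(len(y), k=1)             # count each pair i < j once
    loss = np.sum(ww[iu] * diff[iu])
    penalty = lam * np.sum(np.abs(beta) / np.abs(beta_init))
    return loss + penalty

# Tiny worked example: the third observation is an outlier.
X = np.array([[1.0], [2.0], [4.0]])
y = np.array([1.0, 2.0, 10.0])
beta = np.array([1.0])
beta_init = np.array([2.0])

obj_equal = weighted_rank_objective(beta, X, y, np.ones(3), 2.0, beta_init)
obj_down = weighted_rank_objective(beta, X, y, np.array([1.0, 1.0, 0.5]), 2.0, beta_init)
```

Halving the weight of the outlying observation halves its contribution to the loss, which is the mechanism by which a weighted loss can limit the influence of leverage points.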

Keywords: adaptive penalty function, robust penalized regression, variable selection, weighted rank regression

Procedia PDF Downloads 431
5826 Applying Multivariate and Univariate Analysis of Variance on Socioeconomic, Health, and Security Variables in Jordan

Authors: Faisal G. Khamis, Ghaleb A. El-Refae

Abstract:

Many researchers have studied socioeconomic, health, and security variables in developed countries; however, very few studies have used multivariate analysis in developing countries. The current study contributes to the scarce literature on the determinants of the variance in socioeconomic, health, and security factors. The questions raised were whether the independent variables (IVs) of governorate and year affect the socioeconomic, health, and security dependent variables (DVs) in Jordan, whether the marginal mean of each DV in each governorate and in each year is significant, which governorates are similar in the mean differences of each DV, and whether these DVs vary. The main objectives were to determine the sources of variance in the DVs, collectively and separately, and to test which governorates are similar and which diverge for each DV. The research design was a time series and cross-sectional analysis. The main hypotheses are that the IVs affect the DVs collectively and separately. Multivariate and univariate analyses of variance were carried out to test these hypotheses. The data covered the population of 12 governorates in Jordan over the 15 available years (2000–2015) and were compiled from several Jordanian statistical yearbooks. We investigated the effect of the two factors of governorate and year on the four DVs of divorce rate, mortality rate, unemployment percentage, and crime rate. All DVs were transformed to follow a multivariate normal distribution. We calculated descriptive statistics for each DV. Based on the multivariate analysis of variance, we found a significant effect of the IVs on the DVs with p < .001. Based on the univariate analysis, we found a significant effect of the IVs on each DV with p < .001, except that the effect of the year factor on unemployment was not significant, with p = .642. The grand and marginal means of each DV in each governorate and each year were significant based on a 95% confidence interval. Most governorates are not similar in DVs, with p < .001. We concluded that the two factors produce significant effects on the DVs, collectively and separately. Based on these findings, the government can distribute its financial and physical resources to governorates more efficiently. Identifying the sources of variance that contribute to the variation in the DVs yields insights that can inform focused variation prevention efforts.

Keywords: ANOVA, crime, divorce, governorate, hypothesis test, Jordan, MANOVA, means, mortality, unemployment, year

Procedia PDF Downloads 239
5825 Solving Extended Linear Complementarity Problems (XLCP) - Wood and Environment

Authors: Liberto Pombal, Christian Dieter Jaekel

Abstract:

The objective of this work is to establish theoretical and numerical conditions for solving Extended Linear Complementarity Problems (XLCP), with emphasis on the Horizontal Linear Complementarity Problem (HLCP). Two new strategies for solving complementarity problems are presented, using differentiable and penalized functions, which result in a natural formalization for the horizontal linear case. The computational results of all suggested strategies are also discussed in depth in this paper. In practice, this allows solving and optimizing, in an innovative way, the (forestry) problems of the value chain of the industrial wood sector in Angola.

Keywords: complementarity, box constrained, optimality conditions, wood and environment

Procedia PDF Downloads 23
5824 Donoho-Stark’s and Hardy’s Uncertainty Principles for the Short-Time Quaternion Offset Linear Canonical Transform

Authors: Mohammad Younus Bhat

Abstract:

The quaternion offset linear canonical transform (QOLCT), which is a time-shifted and frequency-modulated version of the quaternion linear canonical transform (QLCT), provides a more general framework for most existing signal processing tools. For the generalized QOLCT, the classical Heisenberg's and Lieb's uncertainty principles have been studied recently. In this paper, we first define the short-time quaternion offset linear canonical transform (ST-QOLCT) and derive its relationship with the quaternion Fourier transform (QFT). The crux of the paper lies in the generalization of several well-known uncertainty principles for the ST-QOLCT, including Donoho-Stark's uncertainty principle, Hardy's uncertainty principle, Beurling's uncertainty principle, and the logarithmic uncertainty principle.

Keywords: Quaternion Fourier transform, Quaternion offset linear canonical transform, short-time quaternion offset linear canonical transform, uncertainty principle

Procedia PDF Downloads 160
5823 Healthy Lifestyle and Risky Behaviors amongst Students of Physical Education High Schools

Authors: Amin Amani, Masomeh Reihany Shirvan, Mahla Nabizadeh Mashizi, Mohadese Khoshtinat, Mohammad Elyas Ansarinia

Abstract:

The purpose of this study is to examine the relationship between a healthy lifestyle and risky behavior in physical education students of Bojnourd schools. The study sample consisted of teenagers studying in the second and third grades of Bojnourd's high schools. Using stratified sampling, 604 students in the second grade and 600 students in the third grade were tested from physical education schools in Bojnourd. For sample selection, the population was divided into 4 areas: North, East, West, and South. The sample size of each stratum was then determined according to the number of students in each area. Two questionnaires were used to collect data in this study, which consisted of three parts: demographic data, Iranian teenagers' risk taking (IARS), and prevention methods with emphasis on the importance of the family's role. Central tendency and dispersion indices such as standard deviation, multiple variance analysis, and multivariate regression analysis were used. Results showed that the observed F is significant (P ≤ 0.01) and that 21% of the variance in risky behavior is explained by lack of awareness. Given the significance of the regression, the coefficients of teenagers' risky behavior in the prediction equation showed that each of the teenagers' risky behaviors can have an impact on a healthy lifestyle.

Keywords: healthy lifestyle, high-risk behavior, students, physical education

Procedia PDF Downloads 157
5822 Evaluation of Short-Term Load Forecasting Techniques Applied for Smart Micro-Grids

Authors: Xiaolei Hu, Enrico Ferrera, Riccardo Tomasi, Claudio Pastrone

Abstract:

Load Forecasting plays a key role in making current and future Smart Energy Grids sustainable and reliable. Accurate power consumption prediction allows utilities to organize their resources in advance or to execute Demand Response strategies more effectively, which enables several features such as higher sustainability, better quality of service, and affordable electricity tariffs. It is easy yet effective to apply Load Forecasting at larger geographic scale, i.e. Smart Micro Grids, wherein the lower available grid flexibility makes accurate prediction more critical in Demand Response applications. This paper analyses the application of short-term load forecasting in a concrete scenario, proposed within the EU-funded GreenCom project, which collects load data from single loads and households belonging to a Smart Micro Grid. Three short-term load forecasting techniques, i.e. linear regression, artificial neural networks, and radial basis function networks, are considered, compared, and evaluated through absolute forecast errors and training time. The influence of weather conditions on Load Forecasting is also evaluated. A new definition of Gain is introduced in this paper, which serves as an indicator of short-term prediction capabilities in terms of time span consistency. Two models, 24- and 1-hour-ahead forecasting, are built to comprehensively compare these three techniques.
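
To make the linear regression baseline and the absolute-error evaluation concrete, here is a minimal 1-hour-ahead sketch. It is only an illustration under stated assumptions: the hourly load series is synthetic (a daily sinusoid plus noise), the 24 lagged values used as predictors are an arbitrary choice, and the helper `make_lagged` is hypothetical. None of this reproduces the GreenCom data or the paper's models.

```python
import numpy as np

def make_lagged(series, n_lags):
    """Return rows of n_lags consecutive past values and the next value as target."""
    n_rows = len(series) - n_lags
    X = np.column_stack([series[i:i + n_rows] for i in range(n_lags)])
    y = series[n_lags:]
    return X, y

# Synthetic hourly load over 14 days (assumed shape: daily cycle plus noise).
rng = np.random.default_rng(0)
hours = np.arange(24 * 14)
load = 1.0 + 0.5 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 0.05, hours.size)

# Predict the next hour from the previous 24 hours.
X, y = make_lagged(load, n_lags=24)
X = np.column_stack([np.ones(len(X)), X])   # add intercept

n_train = 24 * 9                            # first 216 rows for training
coef, *_ = np.linalg.lstsq(X[:n_train], y[:n_train], rcond=None)

pred = X[n_train:] @ coef                   # 1-hour-ahead forecasts
mae = np.mean(np.abs(y[n_train:] - pred))   # mean absolute forecast error
```

A 24-hour-ahead variant would use the same construction with the target shifted 24 steps forward, which is one way the two forecasting horizons could be compared on equal footing.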

Keywords: short-term load forecasting, smart micro grid, linear regression, artificial neural networks, radial basis function network, gain

Procedia PDF Downloads 438
5821 Linear Codes Afforded by the Permutation Representations of Finite Simple Groups and Their Support Designs

Authors: Amin Saeidi

Abstract:

Using a representation-theoretic approach and considering G to be a finite primitive permutation group of degree n, our aim is to determine linear codes of length n that admit G as a permutation automorphism group. We can show that in some cases, every binary linear code admitting G as a permutation automorphism group is a submodule of a permutation module defined by a primitive action of G. As an illustration of the method, we consider the sporadic simple group M₁₁ and the unitary group U(3,3). We also construct some point- and block-primitive 1-designs from the supports of some codewords of the codes in the discussion.

Keywords: linear code, permutation representation, support design, simple group

Procedia PDF Downloads 47
5820 Study on the DC Linear Stepper Motor to Industrial Applications

Authors: Nolvi Francisco Baggio Filho, Roniele Belusso

Abstract:

Many industrial processes require precise linear motion. Usually, this movement is achieved with rotary motors combined with electrical control systems and mechanical systems such as gears, pulleys and bearings. Other types of devices are based on linear motors, where the linear motion is obtained directly. The Linear Stepper Motor (MLP) is an excellent solution for industrial applications that require precise positioning and high speed. This study presents an MLP formed by a static linear structure of ferromagnetic material and a mover structure on which three coils are mounted. Mechanical suspension systems allow linear movement between the static and mover parts while maintaining a constant air gap. The operating principle is based on the tendency of magnetic flux to align along the path of least reluctance. The force is proportional to the intensity of the electric current, and the speed is proportional to the excitation frequency of the coils. The study of this device is based on numerical and experimental analysis to verify the relationship between the electric current applied and the planar force developed. In addition, the magnetic field in the air gap region is also monitored.

Keywords: linear stepper motor, planar traction force, reluctance magnetic, industry applications

Procedia PDF Downloads 475
5819 MapReduce Logistic Regression Algorithms with RHadoop

Authors: Byung Ho Jung, Dong Hoon Lim

Abstract:

Logistic regression is a statistical method for analyzing a dataset in which one or more independent variables determine an outcome. Logistic regression is used extensively in numerous disciplines, including the medical and social science fields. In this paper, we address the problem of estimating the parameters of logistic regression based on the MapReduce framework with RHadoop, which integrates the R and Hadoop environments and is applicable to large-scale data. Three learning algorithms for logistic regression are considered, namely the gradient descent method, the cost minimization method, and the Newton-Raphson method. The Newton-Raphson method does not require a learning rate, while the gradient descent and cost minimization methods require a manually chosen learning rate. The experimental results demonstrated that our learning algorithms using RHadoop scale well and efficiently process large data sets on commodity hardware. We also compared the performance of our Newton-Raphson method with the gradient descent and cost minimization methods. The results showed that the Newton-Raphson method appeared to be the most robust on all data tested.
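
The difference between the two update rules (a learning rate for gradient descent, none for Newton-Raphson) can be seen in a small single-machine sketch. This is not the paper's MapReduce/RHadoop implementation and is written in Python with NumPy rather than R; the synthetic data, the iteration counts, and the learning rate `lr=0.5` are assumptions chosen for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_newton(X, y, n_iter=25):
    """Newton-Raphson: step = H^{-1} * gradient, no learning rate needed."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        grad = X.T @ (y - p)                 # gradient of the log-likelihood
        hess = (X.T * (p * (1 - p))) @ X     # X^T W X with W = diag(p(1-p))
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def fit_gradient_descent(X, y, lr=0.5, n_iter=5000):
    """Gradient ascent on the mean log-likelihood; lr must be chosen by hand."""
    beta = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        beta = beta + lr * (X.T @ (y - p)) / n
    return beta

# Synthetic data with assumed true coefficients.
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
true_beta = np.array([-0.5, 1.0, -2.0])
y = (rng.uniform(size=n) < sigmoid(X @ true_beta)).astype(float)

beta_newton = fit_newton(X, y)
beta_gd = fit_gradient_descent(X, y)
```

Both routines target the same maximum-likelihood estimate; Newton-Raphson typically needs only a handful of iterations, at the cost of forming and solving a Hessian system at each step.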

Keywords: big data, logistic regression, MapReduce, RHadoop

Procedia PDF Downloads 245
5818 Chemometric Analysis of Raw Milk Quality Originating from Conventional and Organic Dairy Farming in AP Vojvodina, Serbia

Authors: Sanja Podunavac-Kuzmanović, Denis Kučević, Strahinja Kovačević, Milica Karadžić, Lidija Jevrić

Abstract:

The present study describes the application of chemometric methods to the analysis of milk samples collected at a conventional dairy farm and an organic dairy farm in AP Vojvodina, Republic of Serbia. The chemometric analysis included univariate regression modeling and the Analysis of Variance (ANOVA) method. ANOVA was used to determine the differences in fatty acid content between the milk samples from the conventional and organic farms. The results of the ANOVA testing indicate a highly statistically significant difference in fatty acid content (saturated vs. unsaturated fatty acids) between the two dairy farming systems. In addition, linear univariate models were obtained by modeling the linear relationships between milk fat content and saturated fatty acid content, and between milk fat content and unsaturated fatty acid content. The models based on the milk samples from organic farming are statistically better than the models based on the milk samples from conventional farming.

Keywords: chemometrics, milk, organic farming, quality control

Procedia PDF Downloads 209
5817 Distinguishing between Bacterial and Viral Infections Based on Peripheral Human Blood Tests Using Infrared Microscopy and Multivariate Analysis

Authors: H. Agbaria, A. Salman, M. Huleihel, G. Beck, D. H. Rich, S. Mordechai, J. Kapelushnik

Abstract:

Viral and bacterial infections are responsible for variety of diseases. These infections have similar symptoms like fever, sneezing, inflammation, vomiting, diarrhea and fatigue. Thus, physicians may encounter difficulties in distinguishing between viral and bacterial infections based on these symptoms. Bacterial infections differ from viral infections in many other important respects regarding the response to various medications and the structure of the organisms. In many cases, it is difficult to know the origin of the infection. The physician orders a blood, urine test, or 'culture test' of tissue to diagnose the infection type when it is necessary. Using these methods, the time that elapses between the receipt of patient material and the presentation of the test results to the clinician is typically too long ( > 24 hours). This time is crucial in many cases for saving the life of the patient and for planning the right medical treatment. Thus, rapid identification of bacterial and viral infections in the lab is of great importance for effective treatment especially in cases of emergency. Blood was collected from 50 patients with confirmed viral infection and 50 with confirmed bacterial infection. White blood cells (WBCs) and plasma were isolated and deposited on a zinc selenide slide, dried and measured under a Fourier transform infrared (FTIR) microscope to obtain their infrared absorption spectra. The acquired spectra of WBCs and plasma were analyzed in order to differentiate between the two types of infections. In this study, the potential of FTIR microscopy in tandem with multivariate analysis was evaluated for the identification of the agent that causes the human infection. The method was used to identify the infectious agent type as either bacterial or viral, based on an analysis of the blood components [i.e., white blood cells (WBC) and plasma] using their infrared vibrational spectra. 
The time required for the analysis and evaluation after obtaining the blood sample was less than one hour. In the analysis, minute spectral differences in several bands of the FTIR spectra of WBCs were observed between groups of samples with viral and bacterial infections. By employing feature extraction with linear discriminant analysis (LDA), a sensitivity of ~92% and a specificity of ~86% for diagnosis of the infection type were achieved. The present preliminary study suggests that FTIR spectroscopy of WBCs is a potentially feasible and efficient tool for the diagnosis of the infection type.
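
For readers unfamiliar with the two reported figures, the following minimal sketch shows how sensitivity and specificity are computed from confusion-matrix counts. The function name and the counts are hypothetical and purely illustrative; the abstract does not report a confusion matrix or say which infection type was treated as the positive class.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: positive-class samples classified correctly/incorrectly.
    tn/fp: negative-class samples classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)  # share of true positives that were detected
    specificity = tn / (tn + fp)  # share of true negatives that were detected
    return sensitivity, specificity


# Hypothetical counts, NOT taken from the study.
sens, spec = sensitivity_specificity(tp=9, fn=1, tn=8, fp=2)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```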

Keywords: viral infection, bacterial infection, linear discriminant analysis, plasma, white blood cells, infrared spectroscopy

Procedia PDF Downloads 190
5816 A Generalized Weighted Loss for Support Vextor Classification and Multilayer Perceptron

Authors: Filippo Portera

Abstract:

Standard algorithms usually employ a loss in which, for a regression task, each error is simply the absolute difference between the true value and the prediction. In the present work, we present several error-weighting schemes that generalize this established routine. We study both a binary classification model for Support Vector Classification and a regression net for the Multilayer Perceptron. Results show that the error is never worse than with the standard procedure and is better in several cases.
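
As a rough illustration of the idea, the sketch below generalizes the plain mean absolute error by attaching a weight to each sample's error. The function name and the normalization by the sum of weights are assumptions made for this example; the abstract does not specify the weighting schemes the author actually proposes.

```python
def weighted_absolute_loss(y_true, y_pred, weights=None):
    """Weighted mean of absolute errors.

    With weights=None every error counts equally, which reduces to the
    standard mean absolute error. The per-sample weights are a hypothetical
    stand-in for an error-weighting scheme.
    """
    if weights is None:
        weights = [1.0] * len(y_true)
    if not (len(y_true) == len(y_pred) == len(weights)):
        raise ValueError("y_true, y_pred and weights must have equal length")
    total = sum(w * abs(t - p) for t, p, w in zip(y_true, y_pred, weights))
    return total / sum(weights)


# Standard (unweighted) loss versus a loss that doubles the last sample's weight.
plain = weighted_absolute_loss([1, 2, 3], [1, 1, 5])
weighted = weighted_absolute_loss([1, 2, 3], [1, 1, 5], weights=[1, 1, 2])
```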

Keywords: loss, binary-classification, MLP, weights, regression

Procedia PDF Downloads 63
5815 Use of Multivariate Statistical Techniques for Water Quality Monitoring Network Assessment, Case of Study: Jequetepeque River Basin

Authors: Jose Flores, Nadia Gamboa

Abstract:

Proper water quality management requires the establishment of a monitoring network. Therefore, the efficiency of water quality monitoring networks must be evaluated to ensure high-quality data collection for critical chemical quality parameters. Unfortunately, in some Latin American countries water quality monitoring programs are not sustainable in terms of recording historical data or environmentally representative sites, wasting time, money and valuable information. In this study, multivariate statistical techniques, such as principal components analysis (PCA) and hierarchical cluster analysis (HCA), are applied to identify the most significant monitoring sites as well as critical water quality parameters in the monitoring network of the Jequetepeque River basin, in northern Peru. The Jequetepeque River basin, like others in Peru, shows socio-environmental conflicts due to the economic activities developed in this area. Water pollution by trace elements in the upper part of the basin is mainly related to mining activity, and the loss of agricultural land to salinization is caused by the extensive use of groundwater in the lower part of the basin. Since the 1980s, the water quality in the basin has been assessed non-continuously by public and private organizations, and recently the National Water Authority established permanent water quality networks in 45 basins in Peru. Although many countries use multivariate statistical techniques for assessing water quality monitoring networks, those instruments have never been applied for that purpose in Peru. For this reason, the main contribution of this study is to demonstrate that multivariate statistical techniques could serve as an instrument for optimizing monitoring networks, using the smallest number of monitoring sites and the most significant water quality parameters, which would reduce costs and improve water quality management in Peru. 
The main socio-economic activities developed in the basin and the principal stakeholders related to its water management are also identified. Finally, water quality management programs are discussed in terms of their efficiency and sustainability.

Keywords: PCA, HCA, Jequetepeque, multivariate statistical

Procedia PDF Downloads 329
5814 Interference among Lambsquarters and Oil Rapeseed Cultivars

Authors: Reza Siyami, Bahram Mirshekari

Abstract:

Seed and oil yield of rapeseed is considerably affected by weed interference, including mustard (Sinapis arvensis L.), lambsquarters (Chenopodium album L.) and redroot pigweed (Amaranthus retroflexus L.), throughout the East Azerbaijan province in Iran. To formulate the relationship between the four independent growth variables measured in our experiment and a dependent variable, multiple regression analysis was carried out with the weed leaf number per plant (X1), green cover percentage (X2), LAI (X3) and leaf area per plant (X4) as independent variables and rapeseed oil yield as the dependent variable. The multiple regression equation is as follows: Seed essential oil yield (kg/ha) = 0.156 + 0.0325 (X1) + 0.0489 (X2) + 0.0415 (X3) + 0.133 (X4). Furthermore, stepwise regression analysis was carried out on the data to test the significance of the independent variables affecting the oil yield as the dependent variable. The resulting stepwise regression equation is as follows: Oil yield = 4.42 + 0.0841 (X2) + 0.0801 (X3); R2 = 81.5. The stepwise regression analysis verified that the green cover percentage and LAI of the weed had a marked increasing effect on the oil yield of rapeseed.
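
The two fitted equations quoted above can be written as small functions for evaluation. The coefficients are copied from the abstract; the function names and the example inputs are hypothetical and do not come from the experiment.

```python
def full_model_oil_yield(x1, x2, x3, x4):
    """Multiple regression equation quoted in the abstract (kg/ha).

    x1: weed leaf number per plant, x2: green cover percentage,
    x3: LAI, x4: leaf area per plant.
    """
    return 0.156 + 0.0325 * x1 + 0.0489 * x2 + 0.0415 * x3 + 0.133 * x4


def stepwise_model_oil_yield(x2, x3):
    """Stepwise regression equation quoted in the abstract (X2 and X3 only)."""
    return 4.42 + 0.0841 * x2 + 0.0801 * x3


# Hypothetical input values, for illustration only.
print(stepwise_model_oil_yield(x2=10.0, x3=2.0))
```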

Keywords: green cover percentage, independent variable, interference, regression

Procedia PDF Downloads 389
5813 Machine Learning Techniques for Estimating Ground Motion Parameters

Authors: Farid Khosravikia, Patricia Clayton

Abstract:

The main objective of this study is to evaluate the advantages and disadvantages of various machine learning techniques in forecasting ground-motion intensity measures given source characteristics, source-to-site distance, and local site condition. Intensity measures such as peak ground acceleration and velocity (PGA and PGV, respectively), as well as 5% damped elastic pseudospectral accelerations at different periods (PSA), are indicators of the strength of shaking at the ground surface. Estimating these variables for future earthquake events is a key step in seismic hazard assessment and potentially in the subsequent risk assessment of different types of structures. Typically, linear regression-based models, with pre-defined equations and coefficients, are used in ground motion prediction. However, due to the restrictions of linear regression methods, such models may not capture more complex nonlinear behaviors that exist in the data. Thus, this study comparatively investigates the potential benefits of employing other machine learning techniques, such as Artificial Neural Network, Random Forest, and Support Vector Machine, as statistical methods in ground motion prediction. The algorithms are adjusted to quantify event-to-event and site-to-site variability of the ground motions by implementing them as random effects in the proposed models to reduce the aleatory uncertainty. All the algorithms are trained using a selected database of 4,528 ground motions, including 376 seismic events with magnitude 3 to 5.8, recorded over the hypocentral distance range of 4 to 500 km in Oklahoma, Kansas, and Texas since 2005. This database was chosen because of the recent increase in the seismicity rate of these states attributed to petroleum production and wastewater disposal activities, which necessitates further investigation of the ground motion models developed for these states. 
The accuracy of the models in predicting intensity measures, their generalization capability for future data, and their usability are discussed in the evaluation process. The results indicate that the algorithms satisfy some physically sound characteristics, such as magnitude scaling and distance dependency, without requiring pre-defined equations or coefficients. Moreover, it is shown that, when sufficient data is available, all the alternative algorithms tend to provide more accurate estimates than the conventional linear regression-based method, and Random Forest in particular outperforms the other algorithms. However, the conventional method is a better tool when limited data is available.

Keywords: artificial neural network, ground-motion models, machine learning, random forest, support vector machine

Procedia PDF Downloads 92
5812 Low SPOP Expression and High MDM2 Expression Are Associated with Tumor Progression and Predict Poor Prognosis in Hepatocellular Carcinoma

Authors: Chang Liang, Weizhi Gong, Yan Zhang

Abstract:

Purpose: Hepatocellular carcinoma (HCC) is a malignant tumor with a high mortality rate and poor prognosis worldwide. Murine double minute 2 (MDM2) regulates the tumor suppressor p53, increasing cancer risk and accelerating tumor progression. Speckle-type POX virus and zinc finger protein (SPOP), a key subunit of Cullin-Ring E3 ligase, inhibits tumorigenesis and progression through the ubiquitination of its downstream substrates. This study aimed to clarify whether SPOP and MDM2 are mutually regulated in HCC and how SPOP and MDM2 correlate with the prognosis of HCC patients. Methods: First, the expression of SPOP and MDM2 in HCC tissues was examined using the TCGA database. Then, 53 paired samples of HCC tumor and adjacent tissues were collected to evaluate the expression of SPOP and MDM2 using immunohistochemistry. The chi-square test or Fisher’s exact test was used to analyze the relationship between clinicopathological features and the expression levels of SPOP and MDM2. In addition, Kaplan–Meier curve analysis and the log-rank test were used to investigate the effects of SPOP and MDM2 on the survival of HCC patients. Last, a multivariate Cox proportional hazards regression model was used to analyze whether the different expression levels of SPOP and MDM2 were independent risk factors for the prognosis of HCC patients. Results: Bioinformatics analysis revealed that low expression of SPOP and high expression of MDM2 were related to a worse prognosis in HCC patients. The relationships of SPOP and MDM2 expression with tumor stem-like features showed opposite trends. Immunohistochemistry showed that SPOP protein expression was significantly downregulated, while MDM2 protein expression was significantly upregulated, in HCC tissue compared with para-cancerous tissue. Tumors with low SPOP expression were related to worse T stage and Barcelona Clinic Liver Cancer (BCLC) stage, while tumors with high MDM2 expression were related to worse T stage, M stage, and BCLC stage. 
Kaplan–Meier curves showed that HCC patients with high SPOP expression and low MDM2 expression had better survival than those with low SPOP expression and high MDM2 expression (P < 0.05). A multivariate Cox proportional hazards regression model confirmed that a high MDM2 expression level was an independent risk factor for poor prognosis in HCC patients (P < 0.05). Conclusion: The expression of SPOP protein was significantly downregulated, while the expression of MDM2 was significantly upregulated, in HCC. Low expression of SPOP and high expression of MDM2 were associated with malignant progression and poor prognosis in HCC patients, indicating a potential therapeutic target for HCC patients.
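
To make the survival analysis above more concrete, here is a minimal sketch of the Kaplan–Meier product-limit estimator, which multiplies the surviving fraction at each observed event time. The function name and the follow-up data are hypothetical; this is not the authors' code, and no patient data from the study are used.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time for each patient.
    events: 1 if the event (e.g. death) was observed, 0 if censored.
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after time t.
        n_at_risk -= removed
    return curve


# Hypothetical follow-up times (months) and event indicators.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

In practice the curves of two expression groups would be estimated separately and compared with a log-rank test, which this sketch does not implement.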

Keywords: hepatocellular carcinoma, murine double minute 2, speckle-type POX virus and zinc finger protein, ubiquitination

Procedia PDF Downloads 103
5811 Imputing Missing Data in Electronic Health Records: A Comparison of Linear and Non-Linear Imputation Models

Authors: Alireza Vafaei Sadr, Vida Abedi, Jiang Li, Ramin Zand

Abstract:

Missing data is a common challenge in medical research and can lead to biased or incomplete results. When data bias leaks into models, it further exacerbates health disparities; biased algorithms can lead to misclassification and to reduced resource allocation and monitoring as part of prevention strategies for certain minorities and vulnerable segments of patient populations, which in turn further reduces the data footprint from the same populations, creating a vicious cycle. This study compares the performance of six imputation techniques, grouped into Linear and Non-Linear models, on two different real-world electronic health record (EHR) datasets representing 17864 patient records. The mean absolute percentage error (MAPE) and root mean squared error (RMSE) are used as performance metrics, and the results show that the Linear models outperformed the Non-Linear models in terms of both metrics. These results suggest that Linear models can sometimes be an optimal choice for imputing laboratory variables in terms of imputation efficiency and uncertainty of predicted values.
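
The two performance metrics named above have standard definitions, sketched here in plain Python. The function names and sample values are illustrative only and are not drawn from the EHR datasets.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical true laboratory values and their imputed counterparts.
true_values = [100.0, 200.0]
imputed_values = [110.0, 180.0]
print(mape(true_values, imputed_values), rmse(true_values, imputed_values))
```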

Keywords: EHR, machine learning, imputation, laboratory variables, algorithmic bias

Procedia PDF Downloads 48
5810 Copula-Based Estimation of Direct and Indirect Effects in Path Analysis Model

Authors: Alam Ali, Ashok Kumar Pathak

Abstract:

Path analysis is a statistical technique used to evaluate the strength of the direct and indirect effects of variables. One or more structural regression equations are used to estimate a series of parameters in order to find the best fit to the data. Sometimes, exogenous variables do not show significant direct and indirect effects when the assumptions of classical regression (ordinary least squares (OLS)) are violated by the nature of the data. The main aim of this article is to investigate the efficacy of the copula-based regression approach relative to the classical regression approach and to calculate the direct and indirect effects of variables when the data violate the OLS assumptions and the variables are linked through an elliptical copula. We perform this study using a well-organized numerical scheme. Finally, a real data application is presented to demonstrate the superiority of the copula approach.

Keywords: path analysis, copula-based regression models, direct and indirect effects, k-fold cross validation technique

Procedia PDF Downloads 46
5809 Performance Analysis of Proprietary and Non-Proprietary Tools for Regression Testing Using Genetic Algorithm

Authors: K. Hema Shankari, R. Thirumalaiselvi, N. V. Balasubramanian

Abstract:

The present paper addresses research in the area of regression testing, with emphasis on automated tools and the prioritization of test cases. The uniqueness of regression testing and its cyclic nature are pointed out. The difference in approach between industry, with the business model as its basis, and academia, with its focus on data mining, is highlighted. Test metrics are discussed as a prelude to our formula for prioritization, and a case study is discussed to illustrate this methodology. An industrial case study is also described, in which the number of test cases is so large that they have to be grouped into Test Suites. In such situations, a genetic algorithm proposed by us can be used to reconfigure these Test Suites in each cycle of regression testing. A proprietary tool and an open source tool are compared using the above-mentioned metrics. Our approach is clarified through several tables.

Keywords: APFD metric, genetic algorithm, regression testing, RFT tool, test case prioritization, selenium tool

Procedia PDF Downloads 401
5808 Statistical Convergence for the Approximation of Linear Positive Operators

Authors: Neha Bhardwaj

Abstract:

In this paper, we consider positive linear operators and study a Voronovskaya-type result for the operator. We then obtain an error estimate in terms of the higher-order modulus of continuity of the function being approximated and establish its A-statistical convergence. We also compute the corresponding rate of A-statistical convergence for the positive linear operators.

Keywords: Poisson distribution, Voronovskaya, modulus of continuity, a-statistical convergence

Procedia PDF Downloads 298
5807 A Hybrid Model Tree and Logistic Regression Model for Prediction of Soil Shear Strength in Clay

Authors: Ehsan Mehryaar, Seyed Armin Motahari Tabari

Abstract:

Without a doubt, soil shear strength is the most important property of the soil. The majority of fatal and catastrophic geological accidents are related to shear strength failure of the soil. Therefore, its prediction is a matter of high importance. However, measuring the shear strength is usually a cumbersome task that may require complicated laboratory testing. Predicting it from common, easily obtained soil properties can therefore simplify projects substantially. In this paper, a hybrid model based on the classification and regression tree algorithm and logistic regression is proposed, in which each leaf of the tree is an independent regression model. A database of 189 points for clay soil, including moisture content, liquid limit, plastic limit, clay content, and shear strength, is collected. The performance of the developed model is compared with that of existing models and equations using the root mean squared error and the coefficient of correlation.

Keywords: model tree, CART, logistic regression, soil shear strength

Procedia PDF Downloads 165