Search results for: regression uncertainty
3942 Application of IF Rough Data on Knowledge Towards Malaria of Rural Tribal Communities in Tripura
Authors: Chhaya Gangwal, R. N. Bhaumik, Shishir Kumar
Abstract:
Handling uncertainty and impreciseness of knowledge appears to be a challenging task in Information Systems. Intuitionistic fuzzy (IF) and rough set theory enhances databases by allowing it for the management of uncertainty and impreciseness. This paper presents a new efficient query optimization technique for the multi-valued or imprecise IF rough database. The usefulness of this technique was illustrated on malaria knowledge from the rural tribal communities of Tripura where most of the information is multi-valued and imprecise. Then, the querying about knowledge on malaria is executed into SQL server to make the implementation of IF rough data querying simpler.Keywords: intuitionistic fuzzy set, rough set, relational database, IF rough relational database
Procedia PDF Downloads 4453941 A Probability Analysis of Construction Project Schedule Using Risk Management Tool
Authors: A. L. Agarwal, D. A. Mahajan
Abstract:
Construction industry tumbled along with other industry/sectors during recent economic crash. Construction business could not regain thereafter and still pass through slowdown phase, resulted many real estate as well as infrastructure projects not completed on schedule and within budget. There are many theories, tools, techniques with software packages available in the market to analyze construction schedule. This study focuses on the construction project schedule and uncertainties associated with construction activities. The infrastructure construction project has been considered for the analysis of uncertainty on project activities affecting project duration and analysis is done using @RISK software. Different simulation results arising from three probability distribution functions are compiled to benefit construction project managers to plan more realistic schedule of various construction activities as well as project completion to document in the contract and avoid compensations or claims arising out of missing the planned schedule.Keywords: construction project, distributions, project schedule, uncertainty
Procedia PDF Downloads 3503940 Ordinal Regression with Fenton-Wilkinson Order Statistics: A Case Study of an Orienteering Race
Authors: Joonas Pääkkönen
Abstract:
In sports, individuals and teams are typically interested in final rankings. Final results, such as times or distances, dictate these rankings, also known as places. Places can be further associated with ordered random variables, commonly referred to as order statistics. In this work, we introduce a simple, yet accurate order statistical ordinal regression function that predicts relay race places with changeover-times. We call this function the Fenton-Wilkinson Order Statistics model. This model is built on the following educated assumption: individual leg-times follow log-normal distributions. Moreover, our key idea is to utilize Fenton-Wilkinson approximations of changeover-times alongside an estimator for the total number of teams as in the notorious German tank problem. This original place regression function is sigmoidal and thus correctly predicts the existence of a small number of elite teams that significantly outperform the rest of the teams. Our model also describes how place increases linearly with changeover-time at the inflection point of the log-normal distribution function. With real-world data from Jukola 2019, a massive orienteering relay race, the model is shown to be highly accurate even when the size of the training set is only 5% of the whole data set. Numerical results also show that our model exhibits smaller place prediction root-mean-square-errors than linear regression, mord regression and Gaussian process regression.Keywords: Fenton-Wilkinson approximation, German tank problem, log-normal distribution, order statistics, ordinal regression, orienteering, sports analytics, sports modeling
Procedia PDF Downloads 1253939 The Predictors of Student Engagement: Instructional Support vs Emotional Support
Authors: Tahani Salman Alangari
Abstract:
Student success can be impacted by internal factors such as their emotional well-being and external factors such as organizational support and instructional support in the classroom. This study is to identify at least one factor that forecasts student engagement. It is a cross-sectional, conducted on 6206 teachers and encompassed three years of data collection and observations of math instruction in approximately 50 schools and 300 classrooms. A multiple linear regression revealed that a model predicting student engagement from emotional support, classroom organization, and instructional support was significant. Four linear regression models were tested using hierarchical regression to examine the effects of independent variables: emotional support was the highest predictor of student engagement while instructional support was the lowest.Keywords: student engagement, emotional support, organizational support, instructional support, well-being
Procedia PDF Downloads 813938 Modeling Standpipe Pressure Using Multivariable Regression Analysis by Combining Drilling Parameters and a Herschel-Bulkley Model
Authors: Seydou Sinde
Abstract:
The aims of this paper are to formulate mathematical expressions that can be used to estimate the standpipe pressure (SPP). The developed formulas take into account the main factors that, directly or indirectly, affect the behavior of SPP values. Fluid rheology and well hydraulics are some of these essential factors. Mud Plastic viscosity, yield point, flow power, consistency index, flow rate, drillstring, and annular geometries are represented by the frictional pressure (Pf), which is one of the input independent parameters and is calculated, in this paper, using Herschel-Bulkley rheological model. Other input independent parameters include the rate of penetration (ROP), applied load or weight on the bit (WOB), bit revolutions per minute (RPM), bit torque (TRQ), and hole inclination and direction coupled in the hole curvature or dogleg (DL). The technique of repeating parameters and Buckingham PI theorem are used to reduce the number of the input independent parameters into the dimensionless revolutions per minute (RPMd), the dimensionless torque (TRQd), and the dogleg, which is already in the dimensionless form of radians. Multivariable linear and polynomial regression technique using PTC Mathcad Prime 4.0 is used to analyze and determine the exact relationships between the dependent parameter, which is SPP, and the remaining three dimensionless groups. Three models proved sufficiently satisfactory to estimate the standpipe pressure: multivariable linear regression model 1 containing three regression coefficients for vertical wells; multivariable linear regression model 2 containing four regression coefficients for deviated wells; and multivariable polynomial quadratic regression model containing six regression coefficients for both vertical and deviated wells. Although that the linear regression model 2 (with four coefficients) is relatively more complex and contains an additional term over the linear regression model 1 (with three coefficients), the former did not really add significant improvements to the later except for some minor values. Thus, the effect of the hole curvature or dogleg is insignificant and can be omitted from the input independent parameters without significant losses of accuracy. The polynomial quadratic regression model is considered the most accurate model due to its relatively higher accuracy for most of the cases. Data of nine wells from the Middle East were used to run the developed models with satisfactory results provided by all of them, even if the multivariable polynomial quadratic regression model gave the best and most accurate results. Development of these models is useful not only to monitor and predict, with accuracy, the values of SPP but also to early control and check for the integrity of the well hydraulics as well as to take the corrective actions should any unexpected problems appear, such as pipe washouts, jet plugging, excessive mud losses, fluid gains, kicks, etc.Keywords: standpipe, pressure, hydraulics, nondimensionalization, parameters, regression
Procedia PDF Downloads 843937 Estimation of Functional Response Model by Supervised Functional Principal Component Analysis
Authors: Hyon I. Paek, Sang Rim Kim, Hyon A. Ryu
Abstract:
In functional linear regression, one typical problem is to reduce dimension. Compared with multivariate linear regression, functional linear regression is regarded as an infinite-dimensional case, and the main task is to reduce dimensions of functional response and functional predictors. One common approach is to adapt functional principal component analysis (FPCA) on functional predictors and then use a few leading functional principal components (FPC) to predict the functional model. The leading FPCs estimated by the typical FPCA explain a major variation of the functional predictor, but these leading FPCs may not be mostly correlated with the functional response, so they may not be significant in the prediction for response. In this paper, we propose a supervised functional principal component analysis method for a functional response model with FPCs obtained by considering the correlation of the functional response. Our method would have a better prediction accuracy than the typical FPCA method.Keywords: supervised, functional principal component analysis, functional response, functional linear regression
Procedia PDF Downloads 763936 Magneto-Rheological Damper Based Semi-Active Robust H∞ Control of Civil Structures with Parametric Uncertainties
Authors: Vedat Senol, Gursoy Turan, Anders Helmersson, Vortechz Andersson
Abstract:
In developing a mathematical model of a real structure, the simulation results of the model may not match the real structural response. This is a general problem that arises during dynamic motion of the structure, which may be modeled by means of parameter variations in the stiffness, damping, and mass matrices. These changes in parameters need to be estimated, and the mathematical model is updated to obtain higher control performances and robustness. In this study, a linear fractional transformation (LFT) is utilized for uncertainty modeling. Further, a general approach to the design of an H∞ control of a magneto-rheological damper (MRD) for vibration reduction in a building with mass, damping, and stiffness uncertainties is presented.Keywords: uncertainty modeling, structural control, MR Damper, H∞, robust control
Procedia PDF Downloads 1383935 Analyzing the Influence of Hydrometeorlogical Extremes, Geological Setting, and Social Demographic on Public Health
Authors: Irfan Ahmad Afip
Abstract:
This main research objective is to accurately identify the possibility for a Leptospirosis outbreak severity of a certain area based on its input features into a multivariate regression model. The research question is the possibility of an outbreak in a specific area being influenced by this feature, such as social demographics and hydrometeorological extremes. If the occurrence of an outbreak is being subjected to these features, then the epidemic severity for an area will be different depending on its environmental setting because the features will influence the possibility and severity of an outbreak. Specifically, this research objective was three-fold, namely: (a) to identify the relevant multivariate features and visualize the patterns data, (b) to develop a multivariate regression model based from the selected features and determine the possibility for Leptospirosis outbreak in an area, and (c) to compare the predictive ability of multivariate regression model and machine learning algorithms. Several secondary data features were collected locations in the state of Negeri Sembilan, Malaysia, based on the possibility it would be relevant to determine the outbreak severity in the area. The relevant features then will become an input in a multivariate regression model; a linear regression model is a simple and quick solution for creating prognostic capabilities. A multivariate regression model has proven more precise prognostic capabilities than univariate models. The expected outcome from this research is to establish a correlation between the features of social demographic and hydrometeorological with Leptospirosis bacteria; it will also become a contributor for understanding the underlying relationship between the pathogen and the ecosystem. The relationship established can be beneficial for the health department or urban planner to inspect and prepare for future outcomes in event detection and system health monitoring.Keywords: geographical information system, hydrometeorological, leptospirosis, multivariate regression
Procedia PDF Downloads 1153934 Uncertainty in Near-Term Global Surface Warming Linked to Pacific Trade Wind Variability
Authors: M. Hadi Bordbar, Matthew England, Alex Sen Gupta, Agus Santoso, Andrea Taschetto, Thomas Martin, Wonsun Park, Mojib Latif
Abstract:
Climate models generally simulate long-term reductions in the Pacific Walker Circulation with increasing atmospheric greenhouse gases. However, over two recent decades (1992-2011) there was a strong intensification of the Pacific Trade Winds that is linked with a slowdown in global surface warming. Using large ensembles of multiple climate models forced by increasing atmospheric greenhouse gas concentrations and starting from different ocean and/or atmospheric initial conditions, we reveal very diverse 20-year trends in the tropical Pacific climate associated with a considerable uncertainty in the globally averaged surface air temperature (SAT) in each model ensemble. This result suggests low confidence in our ability to accurately predict SAT trends over 20-year timescale only from external forcing. We show, however, that the uncertainty can be reduced when the initial oceanic state is adequately known and well represented in the model. Our analyses suggest that internal variability in the Pacific trade winds can mask the anthropogenic signal over a 20-year time frame, and drive transitions between periods of accelerated global warming and temporary slowdown periods.Keywords: trade winds, walker circulation, hiatus in the global surface warming, internal climate variability
Procedia PDF Downloads 2683933 On Estimating the Headcount Index by Using the Logistic Regression Estimator
Authors: Encarnación Álvarez, Rosa M. García-Fernández, Juan F. Muñoz, Francisco J. Blanco-Encomienda
Abstract:
The problem of estimating a proportion has important applications in the field of economics, and in general, in many areas such as social sciences. A common application in economics is the estimation of the headcount index. In this paper, we define the general headcount index as a proportion. Furthermore, we introduce a new quantitative method for estimating the headcount index. In particular, we suggest to use the logistic regression estimator for the problem of estimating the headcount index. Assuming a real data set, results derived from Monte Carlo simulation studies indicate that the logistic regression estimator can be more accurate than the traditional estimator of the headcount index.Keywords: poverty line, poor, risk of poverty, Monte Carlo simulations, sample
Procedia PDF Downloads 4233932 Stating Best Commercialization Method: An Unanswered Question from Scholars and Practitioners
Authors: Saheed A. Gbadegeshin
Abstract:
Commercialization method is a means to make inventions available at the market for final consumption. It is described as an important tool for keeping business enterprises sustainable and improving national economic growth. Thus, there are several scholarly publications on it, either presenting or testing different methods for commercialization. However, young entrepreneurs, technologists and scientists would like to know the best method to commercialize their innovations. Then, this question arises: What is the best commercialization method? To answer the question, a systematic literature review was conducted, and practitioners were interviewed. The literary results revealed that there are many methods but new methods are needed to improve commercialization especially during these times of economic crisis and political uncertainty. Similarly, the empirical results showed there are several methods, but the best method is the one that reduces costs, reduces the risks associated with uncertainty, and improves customer participation and acceptability. Therefore, it was concluded that new commercialization method is essential for today's high technologies and a method was presented.Keywords: commercialization method, technology, knowledge, intellectual property, innovation, invention
Procedia PDF Downloads 3433931 Heart Attack Prediction Using Several Machine Learning Methods
Authors: Suzan Anwar, Utkarsh Goyal
Abstract:
Heart rate (HR) is a predictor of cardiovascular, cerebrovascular, and all-cause mortality in the general population, as well as in patients with cardio and cerebrovascular diseases. Machine learning (ML) significantly improves the accuracy of cardiovascular risk prediction, increasing the number of patients identified who could benefit from preventive treatment while avoiding unnecessary treatment of others. This research examines relationship between the individual's various heart health inputs like age, sex, cp, trestbps, thalach, oldpeaketc, and the likelihood of developing heart disease. Machine learning techniques like logistic regression and decision tree, and Python are used. The results of testing and evaluating the model using the Heart Failure Prediction Dataset show the chance of a person having a heart disease with variable accuracy. Logistic regression has yielded an accuracy of 80.48% without data handling. With data handling (normalization, standardscaler), the logistic regression resulted in improved accuracy of 87.80%, decision tree 100%, random forest 100%, and SVM 100%.Keywords: heart rate, machine learning, SVM, decision tree, logistic regression, random forest
Procedia PDF Downloads 1383930 Efficient Model Selection in Linear and Non-Linear Quantile Regression by Cross-Validation
Authors: Yoonsuh Jung, Steven N. MacEachern
Abstract:
Check loss function is used to define quantile regression. In the prospect of cross validation, it is also employed as a validation function when underlying truth is unknown. However, our empirical study indicates that the validation with check loss often leads to choosing an over estimated fits. In this work, we suggest a modified or L2-adjusted check loss which rounds the sharp corner in the middle of check loss. It has a large effect of guarding against over fitted model in some extent. Through various simulation settings of linear and non-linear regressions, the improvement of check loss by L2 adjustment is empirically examined. This adjustment is devised to shrink to zero as sample size grows.Keywords: cross-validation, model selection, quantile regression, tuning parameter selection
Procedia PDF Downloads 4383929 Instability Index Method and Logistic Regression to Assess Landslide Susceptibility in County Route 89, Taiwan
Authors: Y. H. Wu, Ji-Yuan Lin, Yu-Ming Liou
Abstract:
This study aims to set up the landslide susceptibility map of County Route 89 at Ren-Ai Township in Nantou County using the Instability Index Method and Logistic regression. Seven susceptibility factors including Slope Angle, Aspect, Elevation, Distance to fold, Distance to River, Distance to Road and Accumulated Rainfall were obtained by GIS based on the Typhoon Toraji landslide area identified by Industrial Technology Research Institute in 2001. To calculate the landslide percentage of each factor and acquire the weight and grade the grid by means of Instability Index Method. In this study, landslide susceptibility can be classified into four grades: high, medium high, medium low and low, in order to determine the advantages and disadvantages of the two models. The precision of this model is verified by classification error matrix and SRC curve. These results suggest that the logistic regression model is a preferred method than instability index in the assessment of landslide susceptibility. It is suitable for the landslide prediction and precaution in this area in the future.Keywords: instability index method, logistic regression, landslide susceptibility, SRC curve
Procedia PDF Downloads 2923928 Objective Assessment of the Evolution of Microplastic Contamination in Sediments from a Vast Coastal Area
Authors: Vanessa Morgado, Ricardo Bettencourt da Silva, Carla Palma
Abstract:
The environmental pollution by microplastics is well recognized. Microplastics were already detected in various matrices from distinct environmental compartments worldwide, some from remote areas. Various methodologies and techniques have been used to determine microplastic in such matrices, for instance, sediment samples from the ocean bottom. In order to determine microplastics in a sediment matrix, the sample is typically sieved through a 5 mm mesh, digested to remove the organic matter, and density separated to isolate microplastics from the denser part of the sediment. The physical analysis of microplastic consists of visual analysis under a stereomicroscope to determine particle size, colour, and shape. The chemical analysis is performed by an infrared spectrometer coupled to a microscope (micro-FTIR), allowing to the identification of the chemical composition of microplastic, i.e., the type of polymer. Creating legislation and policies to control and manage (micro)plastic pollution is essential to protect the environment, namely the coastal areas. The regulation is defined from the known relevance and trends of the pollution type. This work discusses the assessment of contamination trends of a 700 km² oceanic area affected by contamination heterogeneity, sampling representativeness, and the uncertainty of the analysis of collected samples. The methodology developed consists of objectively identifying meaningful variations of microplastic contamination by the Monte Carlo simulation of all uncertainty sources. This work allowed us to unequivocally conclude that the contamination level of the studied area did not vary significantly between two consecutive years (2018 and 2019) and that PET microplastics are the major type of polymer. The comparison of contamination levels was performed for a 99% confidence level. The developed know-how is crucial for the objective and binding determination of microplastic contamination in relevant environmental compartments.Keywords: measurement uncertainty, micro-ATR-FTIR, microplastics, ocean contamination, sampling uncertainty
Procedia PDF Downloads 903927 Location Uncertainty – A Probablistic Solution for Automatic Train Control
Authors: Monish Sengupta, Benjamin Heydecker, Daniel Woodland
Abstract:
New train control systems rely mainly on Automatic Train Protection (ATP) and Automatic Train Operation (ATO) dynamically to control the speed and hence performance. The ATP and the ATO form the vital element within the CBTC (Communication Based Train Control) and within the ERTMS (European Rail Traffic Management System) system architectures. Reliable and accurate measurement of train location, speed and acceleration are vital to the operation of train control systems. In the past, all CBTC and ERTMS system have deployed a balise or equivalent to correct the uncertainty element of the train location. Typically a CBTC train is allowed to miss only one balise on the track, after which the Automatic Train Protection (ATP) system applies emergency brake to halt the service. This is because the location uncertainty, which grows within the train control system, cannot tolerate missing more than one balise. Balises contribute a significant amount towards wayside maintenance and studies have shown that balises on the track also forms a constraint for future track layout change and change in speed profile.This paper investigates the causes of the location uncertainty that is currently experienced and considers whether it is possible to identify an effective filter to ascertain, in conjunction with appropriate sensors, more accurate speed, distance and location for a CBTC driven train without the need of any external balises. An appropriate sensor fusion algorithm and intelligent sensor selection methodology will be deployed to ascertain the railway location and speed measurement at its highest precision. Similar techniques are already in use in aviation, satellite, submarine and other navigation systems. Developing a model for the speed control and the use of Kalman filter is a key element in this research. This paper will summarize the research undertaken and its significant findings, highlighting the potential for introducing alternative approaches to train positioning that would enable removal of all trackside location correction balises, leading to huge reduction in maintenances and more flexibility in future track design.Keywords: ERTMS, CBTC, ATP, ATO
Procedia PDF Downloads 4103926 Robust Stabilization of Rotational Motion of Underwater Robots against Parameter Uncertainties
Authors: Riku Hayashida, Tomoaki Hashimoto
Abstract:
This paper provides a robust stabilization method for rotational motion of underwater robots against parameter uncertainties. Underwater robots are expected to be used for various work assignments. The large variety of applications of underwater robots motivates researchers to develop control systems and technologies for underwater robots. Several control methods have been proposed so far for the stabilization of nominal system model of underwater robots with no parameter uncertainty. Parameter uncertainties are considered to be obstacles in implementation of the such nominal control methods for underwater robots. The objective of this study is to establish a robust stabilization method for rotational motion of underwater robots against parameter uncertainties. The effectiveness of the proposed method is verified by numerical simulations.Keywords: robust control, stabilization method, underwater robot, parameter uncertainty
Procedia PDF Downloads 1603925 Uncertainty Quantification of Fuel Compositions on Premixed Bio-Syngas Combustion at High-Pressure
Abstract:
Effect of fuel variabilities on premixed combustion of bio-syngas mixtures is of great importance in bio-syngas utilisation. The uncertainties of concentrations of fuel constituents such as H2, CO and CH4 may lead to unpredictable combustion performances, combustion instabilities and hot spots which may deteriorate and damage the combustion hardware. Numerical modelling and simulations can assist in understanding the behaviour of bio-syngas combustion with pre-defined species concentrations, while the evaluation of variabilities of concentrations is expensive. To be more specific, questions such as ‘what is the burning velocity of bio-syngas at specific equivalence ratio?’ have been answered either experimentally or numerically, while questions such as ‘what is the likelihood of burning velocity when precise concentrations of bio-syngas compositions are unknown, but the concentration ranges are pre-described?’ have not yet been answered. Uncertainty quantification (UQ) methods can be used to tackle such questions and assess the effects of fuel compositions. An efficient probabilistic UQ method based on Polynomial Chaos Expansion (PCE) techniques is employed in this study. The method relies on representing random variables (combustion performances) with orthogonal polynomials such as Legendre or Gaussian polynomials. The constructed PCE via Galerkin Projection provides easy access to global sensitivities such as main, joint and total Sobol indices. In this study, impacts of fuel compositions on combustion (adiabatic flame temperature and laminar flame speed) of bio-syngas fuel mixtures are presented invoking this PCE technique at several equivalence ratios. High-pressure effects on bio-syngas combustion instability are obtained using detailed chemical mechanism - the San Diego Mechanism. Guidance on reducing combustion instability from upstream biomass gasification process is provided by quantifying the significant contributions of composition variations to variance of physicochemical properties of bio-syngas combustion. It was found that flame speed is very sensitive to hydrogen variability in bio-syngas, and reducing hydrogen uncertainty from upstream biomass gasification processes can greatly reduce bio-syngas combustion instability. Variation of methane concentration, although thought to be important, has limited impacts on laminar flame instabilities especially for lean combustion. Further studies on the UQ of percentage concentration of hydrogen in bio-syngas can be conducted to guide the safer use of bio-syngas.Keywords: bio-syngas combustion, clean energy utilisation, fuel variability, PCE, targeted uncertainty reduction, uncertainty quantification
Procedia PDF Downloads 2763924 [Keynote Talk]: Evidence Fusion in Decision Making
Authors: Mohammad Abdullah-Al-Wadud
Abstract:
In the current era of automation and artificial intelligence, different systems have been increasingly keeping on depending on decision-making capabilities of machines. Such systems/applications may range from simple classifiers to sophisticated surveillance systems based on traditional sensors and related equipment which are becoming more common in the internet of things (IoT) paradigm. However, the available data for such problems are usually imprecise and incomplete, which leads to uncertainty in decisions made based on traditional probability-based classifiers. This requires a robust fusion framework to combine the available information sources with some degree of certainty. The theory of evidence can provide with such a method for combining evidence from different (may be unreliable) sources/observers. This talk will address the employment of the Dempster-Shafer Theory of evidence in some practical applications.Keywords: decision making, dempster-shafer theory, evidence fusion, incomplete data, uncertainty
Procedia PDF Downloads 4253923 Regret-Regression for Multi-Armed Bandit Problem
Authors: Deyadeen Ali Alshibani
Abstract:
In the literature, the multi-armed bandit problem as a statistical decision model of an agent trying to optimize his decisions while improving his information at the same time. There are several different algorithms models and their applications on this problem. In this paper, we evaluate the Regret-regression through comparing with Q-learning method. A simulation on determination of optimal treatment regime is presented in detail.Keywords: optimal, bandit problem, optimization, dynamic programming
Procedia PDF Downloads 4533922 The Strengths and Limitations of the Statistical Modeling of Complex Social Phenomenon: Focusing on SEM, Path Analysis, or Multiple Regression Models
Authors: Jihye Jeon
Abstract:
This paper analyzes the conceptual framework of three statistical methods, multiple regression, path analysis, and structural equation models. When establishing research model of the statistical modeling of complex social phenomenon, it is important to know the strengths and limitations of three statistical models. This study explored the character, strength, and limitation of each modeling and suggested some strategies for accurate explaining or predicting the causal relationships among variables. Especially, on the studying of depression or mental health, the common mistakes of research modeling were discussed.Keywords: multiple regression, path analysis, structural equation models, statistical modeling, social and psychological phenomenon
Procedia PDF Downloads 6533921 QSRR Analysis of 17-Picolyl and 17-Picolinylidene Androstane Derivatives Based on Partial Least Squares and Principal Component Regression
Authors: Sanja Podunavac-Kuzmanović, Strahinja Kovačević, Lidija Jevrić, Evgenija Djurendić, Jovana Ajduković
Abstract:
There are several methods for determination of the lipophilicity of biologically active compounds, however chromatography has been shown as a very suitable method for this purpose. Chromatographic (C18-RP-HPLC) analysis of a series of 24 17-picolyl and 17-picolinylidene androstane derivatives was carried out. The obtained retention indices (logk, methanol (90%) / water (10%)) were correlated with calculated physicochemical and lipophilicity descriptors. The QSRR analysis was carried out applying principal component regression (PCR) and partial least squares regression (PLS). The PCR and PLS model were selected on the basis of the highest variance and the lowest root mean square error of cross-validation. The obtained PCR and PLS model successfully correlate the calculated molecular descriptors with logk parameter indicating the significance of the lipophilicity of compounds in chromatographic process. On the basis of the obtained results it can be concluded that the obtained logk parameters of the analyzed androstane derivatives can be considered as their chromatographic lipophilicity. These results are the part of the project No. 114-451-347/2015-02, financially supported by the Provincial Secretariat for Science and Technological Development of Vojvodina and CMST COST Action CM1105.Keywords: androstane derivatives, chromatography, molecular structure, principal component regression, partial least squares regression
Procedia PDF Downloads 2773920 Detecting Earnings Management via Statistical and Neural Networks Techniques
Authors: Mohammad Namazi, Mohammad Sadeghzadeh Maharluie
Abstract:
Predicting earnings management is vital for the capital market participants, financial analysts and managers. The aim of this research is attempting to respond to this query: Is there a significant difference between the regression model and neural networks’ models in predicting earnings management, and which one leads to a superior prediction of it? In approaching this question, a Linear Regression (LR) model was compared with two neural networks including Multi-Layer Perceptron (MLP), and Generalized Regression Neural Network (GRNN). The population of this study includes 94 listed companies in Tehran Stock Exchange (TSE) market from 2003 to 2011. After the results of all models were acquired, ANOVA was exerted to test the hypotheses. In general, the summary of statistical results showed that the precision of GRNN did not exhibit a significant difference in comparison with MLP. In addition, the mean square error of the MLP and GRNN showed a significant difference with the multi variable LR model. These findings support the notion of nonlinear behavior of the earnings management. Therefore, it is more appropriate for capital market participants to analyze earnings management based upon neural networks techniques, and not to adopt linear regression models.Keywords: earnings management, generalized linear regression, neural networks multi-layer perceptron, Tehran stock exchange
Procedia PDF Downloads 4223919 Minimizing the Impact of Covariate Detection Limit in Logistic Regression
Authors: Shahadut Hossain, Jacek Wesolowski, Zahirul Hoque
Abstract:
In many epidemiological and environmental studies covariate measurements are subject to the detection limit. In most applications, covariate measurements are usually truncated from below which is known as left-truncation. Because the measuring device, which we use to measure the covariate, fails to detect values falling below the certain threshold. In regression analyses, it causes inflated bias and inaccurate mean squared error (MSE) to the estimators. This paper suggests a response-based regression calibration method to correct the deleterious impact introduced by the covariate detection limit in the estimators of the parameters of simple logistic regression model. Compared to the maximum likelihood method, the proposed method is computationally simpler, and hence easier to implement. It is robust to the violation of distributional assumption about the covariate of interest. In producing correct inference, the performance of the proposed method compared to the other competing methods has been investigated through extensive simulations. A real-life application of the method is also shown using data from a population-based case-control study of non-Hodgkin lymphoma.Keywords: environmental exposure, detection limit, left truncation, bias, ad-hoc substitution
Procedia PDF Downloads 2373918 Communication of Expected Survival Time to Cancer Patients: How It Is Done and How It Should Be Done
Authors: Geir Kirkebøen
Abstract:
Most patients with serious diagnoses want to know their prognosis, in particular their expected survival time. As part of the informed consent process, physicians are legally obligated to communicate such information to patients. However, there is no established (evidence based) ‘best practice’ for how to do this. The two questions explored in this study are: How do physicians communicate expected survival time to patients, and how should it be done? We explored the first, descriptive question in a study with Norwegian oncologists as participants. The study had a scenario and a survey part. In the scenario part, the doctors should imagine that a patient, recently diagnosed with a serious cancer diagnosis, has asked them: ‘How long can I expect to live with such a diagnosis? I want an honest answer from you!’ The doctors should assume that the diagnosis is certain, and that from an extensive recent study they had optimal statistical knowledge, described in detail as a right-skewed survival curve, about how long such patients with this kind of diagnosis could be expected to live. The main finding was that very few of the oncologists would explain to the patient the variation in survival time as described by the survival curve. The majority would not give the patient an answer at all. Of those who gave an answer, the typical answer was that survival time varies a lot, that it is hard to say in a specific case, that we will come back to it later etc. The survey part of the study clearly indicates that the main reason why the oncologists would not deliver the mortality prognosis was discomfort with its uncertainty. The scenario part of the study confirmed this finding. The majority of the oncologists explicitly used the uncertainty, the variation in survival time, as a reason to not give the patient an answer. Many studies show that patients want realistic information about their mortality prognosis, and that they should be given hope. The question then is how to communicate the uncertainty of the prognosis in a realistic and optimistic – hopeful – way. Based on psychological research, our hypothesis is that the best way to do this is by explicitly describing the variation in survival time, the (usually) right skewed survival curve of the prognosis, and emphasize to the patient the (small) possibility of being a ‘lucky outlier’. We tested this hypothesis in two scenario studies with lay people as participants. The data clearly show that people prefer to receive expected survival time as a median value together with explicit information about the survival curve’s right skewedness (e.g., concrete examples of ‘positive outliers’), and that communicating expected survival time this way not only provides people with hope, but also gives them a more realistic understanding compared with the typical way expected survival time is communicated. Our data indicate that it is not the existence of the uncertainty regarding the mortality prognosis that is the problem for patients, but how this uncertainty is, or is not, communicated and explained.Keywords: cancer patients, decision psychology, doctor-patient communication, mortality prognosis
Procedia PDF Downloads 3303917 Comparative Study od Three Artificial Intelligence Techniques for Rain Domain in Precipitation Forecast
Authors: Nabilah Filzah Mohd Radzuan, Andi Putra, Zalinda Othman, Azuraliza Abu Bakar, Abdul Razak Hamdan
Abstract:
Precipitation forecast is important to avoid natural disaster incident which can cause losses in the involved area. This paper reviews three techniques logistic regression, decision tree, and random forest which are used in making precipitation forecast. These combination techniques through the vector auto-regression (VAR) model help in finding the advantages and strengths of each technique in the forecast process. The data-set contains variables of the rain’s domain. Adaptation of artificial intelligence techniques involved in rain domain enables the forecast process to be easier and systematic for precipitation forecast.Keywords: logistic regression, decisions tree, random forest, VAR model
Procedia PDF Downloads 4463916 A Large Dataset Imputation Approach Applied to Country Conflict Prediction Data
Authors: Benjamin Leiby, Darryl Ahner
Abstract:
This study demonstrates an alternative stochastic imputation approach for large datasets when preferred commercial packages struggle to iterate due to numerical problems. A large country conflict dataset motivates the search to impute missing values well over a common threshold of 20% missingness. The methodology capitalizes on correlation while using model residuals to provide the uncertainty in estimating unknown values. Examination of the methodology provides insight toward choosing linear or nonlinear modeling terms. Static tolerances common in most packages are replaced with tailorable tolerances that exploit residuals to fit each data element. The methodology evaluation includes observing computation time, model fit, and the comparison of known values to replaced values created through imputation. Overall, the country conflict dataset illustrates promise with modeling first-order interactions while presenting a need for further refinement that mimics predictive mean matching.Keywords: correlation, country conflict, imputation, stochastic regression
Procedia PDF Downloads 1203915 Flexural Analysis of Palm Fiber Reinforced Hybrid Polymer Matrix Composite
Authors: G.Venkatachalam, Gautham Shankar, Dasarath Raghav, Krishna Kuar, Santhosh Kiran, Bhargav Mahesh
Abstract:
Uncertainty in the availability of fossil fuels in the future and global warming increased the need for more environment-friendly materials. In this work, an attempt is made to fabricate a hybrid polymer matrix composite. The blend is a mixture of General Purpose Resin and Cashew Nut Shell Liquid, a natural resin extracted from cashew plant. Palm fiber, which has high strength, is used as a reinforcement material. The fiber is treated with alkali (NaOH) solution to increase its strength and adhesiveness. Parametric study of flexure strength is carried out by varying alkali concentration, duration of alkali treatment and fiber volume. Taguchi L9 Orthogonal array is followed in the design of experiments procedure for simplification. With the help of ANOVA technique, regression equations are obtained which gives the level of influence of each parameter on the flexure strength of the composite.Keywords: Adhesion, CNSL, Flexural Analysis, Hybrid Matrix Composite, Palm Fiber
Procedia PDF Downloads 4053914 The Computational Psycholinguistic Situational-Fuzzy Self-Controlled Brain and Mind System Under Uncertainty
Authors: Ben Khayut, Lina Fabri, Maya Avikhana
Abstract:
The models of the modern Artificial Narrow Intelligence (ANI) cannot: a) independently and continuously function without of human intelligence, used for retraining and reprogramming the ANI’s models, and b) think, understand, be conscious, cognize, infer, and more in state of Uncertainty, and changes in situations, and environmental objects. To eliminate these shortcomings and build a new generation of Artificial Intelligence systems, the paper proposes a Conception, Model, and Method of Computational Psycholinguistic Cognitive Situational-Fuzzy Self-Controlled Brain and Mind System (CPCSFSCBMSUU) using a neural network as its computational memory, operating under uncertainty, and activating its functions by perception, identification of real objects, fuzzy situational control, forming images of these objects, modeling their psychological, linguistic, cognitive, and neural values of properties and features, the meanings of which are identified, interpreted, generated, and formed taking into account the identified subject area, using the data, information, knowledge, and images, accumulated in the Memory. The functioning of the CPCSFSCBMSUU is carried out by its subsystems of the: fuzzy situational control of all processes, computational perception, identifying of reactions and actions, Psycholinguistic Cognitive Fuzzy Logical Inference, Decision making, Reasoning, Systems Thinking, Planning, Awareness, Consciousness, Cognition, Intuition, Wisdom, analysis and processing of the psycholinguistic, subject, visual, signal, sound and other objects, accumulation and using the data, information and knowledge in the Memory, communication, and interaction with other computing systems, robots and humans in order of solving the joint tasks. To investigate the functional processes of the proposed system, the principles of Situational Control, Fuzzy Logic, Psycholinguistics, Informatics, and modern possibilities of Data Science were applied. The proposed self-controlled System of Brain and Mind is oriented on use as a plug-in in multilingual subject Applications.Keywords: computational brain, mind, psycholinguistic, system, under uncertainty
Procedia PDF Downloads 1773913 A Study of User Awareness and Attitudes Towards Civil-ID Authentication in Oman’s Electronic Services
Authors: Raya Al Khayari, Rasha Al Jassim, Muna Al Balushi, Fatma Al Moqbali, Said El Hajjar
Abstract:
This study utilizes linear regression analysis to investigate the correlation between user account passwords and the probability of civil ID exposure, offering statistical insights into civil ID security. The study employs multiple linear regression (MLR) analysis to further investigate the elements that influence consumers’ views of civil ID security. This aims to increase awareness and improve preventive measures. The results obtained from the MLR analysis provide a thorough comprehension and can guide specific educational and awareness campaigns aimed at promoting improved security procedures. In summary, the study’s results offer significant insights for improving existing security measures and developing more efficient tactics to reduce risks related to civil ID security in Oman. By identifying key factors that impact consumers’ perceptions, organizations can tailor their strategies to address vulnerabilities effectively. Additionally, the findings can inform policymakers on potential regulatory changes to enhance civil ID security in the country.Keywords: civil-id disclosure, awareness, linear regression, multiple regression
Procedia PDF Downloads 59