Search results for: conflicting claim on credit of discovery of ridge regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4632

4482 Zika Virus NS5 Protein Potential Inhibitors: An Enhanced in silico Approach in Drug Discovery

Authors: Pritika Ramharack, Mahmoud E. S. Soliman

Abstract:

The re-emerging Zika virus is an arthropod-borne virus that has been described as having explosive potential as a worldwide pandemic. The initial transmission of the virus was through a mosquito vector; however, evolving modes of transmission have allowed the spread of the disease across continents. The virus has already been linked to irreversible chronic central nervous system (CNS) conditions. The concern of the scientific and clinical community is the consequences of Zika viral mutations, suggesting an urgent need for viral inhibitors. There have been large strides in vaccine development against the virus, but there are still no FDA-approved drugs available. Rapid rational drug design and discovery research is fundamental to the production of potent inhibitors against the virus that will not just mask the virus but destroy it completely. In silico drug design allows for this prompt screening of potential leads, decreasing the consumption of precious time and resources. This study demonstrates an optimized and proven screening technique in the discovery of two potential small-molecule inhibitors of the Zika virus methyltransferase and RNA-dependent RNA polymerase. This in silico “per-residue energy decomposition pharmacophore” virtual screening approach will be critical in aiding scientists in the discovery not only of effective inhibitors of Zika viral targets, but also of a wide range of anti-viral agents.

Keywords: NS5 protein inhibitors, per-residue decomposition, pharmacophore model, virtual screening, Zika virus

Procedia PDF Downloads 212
4481 Knowledge Management: Why Is It So Difficult? From “A Good Idea” to Organizational Contribution

Authors: Lisandro Blas, Héctor Tamanini

Abstract:

From the early 1990s to now, few companies or organizations have been able to implement a knowledge management (KM) system that really works (not only viewed from a measurement model, but with continuity). What are the reasons for this? Some of the reasons may be embedded in how KM is demanded (usefulness, priority, experts, a definition of KM) versus the importance and resources that organizations actually afford it (budget, a responsible owner for a specific KM area, intangibility). Many organizations claim that knowledge management is important, but these claims are not reflected in their subsequent actions. For other tools or management ideas, organizations put economic and human resources to work. Why does this not occur with KM? This paper tries to explain some of these reasons and to deal with these situations through a survey conducted in 2011 for an IAPG (Argentinean Institute of Oil & Gas) congress.

Keywords: knowledge management in organizations, new perspectives, failure in implementation, claim

Procedia PDF Downloads 410
4480 Modeling Optimal Lipophilicity and Drug Performance in Ligand-Receptor Interactions: A Machine Learning Approach to Drug Discovery

Authors: Jay Ananth

Abstract:

The drug discovery process currently requires numerous years of clinical testing as well as money just for a single drug to earn FDA approval. For drugs that even make it this far in the process, there is a very slim chance of receiving FDA approval, resulting in detrimental hurdles to drug accessibility. To minimize these inefficiencies, numerous studies have implemented computational methods, although few computational investigations have focused on a crucial feature of drugs: lipophilicity. Lipophilicity is a physical attribute of a compound that measures its solubility in lipids and is a determinant of drug efficacy. This project leverages Artificial Intelligence to predict the impact of a drug’s lipophilicity on its performance by accounting for factors such as binding affinity and toxicity. The model predicted lipophilicity and binding affinity in the validation set with very high R² scores of 0.921 and 0.788, respectively, while also being applicable to a variety of target receptors. The results expressed a strong positive correlation between lipophilicity and both binding affinity and toxicity. The model helps in both drug development and discovery, providing every pharmaceutical company with recommended lipophilicity levels for drug candidates as well as a rapid assessment of early-stage drugs prior to any testing, eliminating significant amounts of time and resources currently restricting drug accessibility.

Keywords: drug discovery, lipophilicity, ligand-receptor interactions, machine learning, drug development

Procedia PDF Downloads 90
4479 The Role of Microfinance in Economic Development

Authors: Babak Salekmahdy

Abstract:

Microfinance is often seen as a means of repairing credit markets and unleashing the potential contribution of impoverished people who rely on self-employment. Since the 1990s, the microfinance industry has expanded rapidly, opening the path for additional kinds of social entrepreneurship and social investment. However, current data indicate relatively small average effects on consumers, prompting pushback against microfinance. This research reconsiders the claims made for microfinance, stressing the variety of evidence on impacts and the essential (but limited) role of reimbursements. The report concludes by describing a shift in thinking: from microfinance as strictly defined enterprise finance to microfinance as more widely defined household finance. Under this perspective, microfinance provides advantages by supplying liquidity for various requirements, rather than just by increasing income.

Keywords: microfinance, small business, economic development, credit markets

Procedia PDF Downloads 70
4478 Automated Fact-Checking by Incorporating Contextual Knowledge and Multi-Faceted Search

Authors: Wenbo Wang, Yi-Fang Brook Wu

Abstract:

The spread of misinformation and disinformation has become a major concern, particularly with the rise of social media as a primary source of information for many people. As a means to address this phenomenon, automated fact-checking has emerged as a safeguard against the spread of misinformation and disinformation. Existing fact-checking approaches aim to determine whether a news claim is true or false, and they have achieved decent veracity prediction accuracy. However, the state-of-the-art methods rely on manually verified external information to assist the checking model in making judgments, which requires significant human resources. This study introduces a framework, SAC, which focuses on 1) augmenting the representation of a claim by incorporating additional context using general-purpose, comprehensive, and authoritative data; 2) developing a search function to automatically select relevant, new, and credible references; 3) focusing on the parts of the representations of a claim and its reference that are most relevant to the fact-checking task. The experimental results demonstrate that 1) augmenting the representations of claims and references through the use of a knowledge base, combined with the multi-head attention technique, improves fact-checking performance, and 2) SAC with auto-selected references outperforms existing fact-checking approaches with manually selected references. Future directions of this study include 1) exploring knowledge graphs in Wikidata to dynamically augment the representations of claims and references without introducing too much noise, and 2) exploring semantic relations in claims and references to further enhance fact-checking.

Keywords: fact checking, claim verification, deep learning, natural language processing

Procedia PDF Downloads 48
4477 Non-Parametric Regression over Its Parametric Counterparts with Large Sample Size

Authors: Jude Opara, Esemokumo Perewarebo Akpos

Abstract:

This paper compares nonparametric linear regression with its parametric counterpart for large sample sizes. A data set of anthropometric measurements of primary school pupils was taken for the analysis, using 50 randomly selected pupils. The data were subjected to a normality test using the Anderson-Darling technique, and it was discovered that the residuals of the commonly used least-squares method, which fits an equation to a set of (x, y) data points, are not normally distributed (i.e., they do not follow a Gaussian distribution). The algorithms for nonparametric Theil's regression are stated in this paper, as well as those for its parametric OLS counterpart. The R programming language was used for the analysis. The results showed that there exists a significant relationship between the response and the explanatory variable for both the parametric and nonparametric regressions. To compare the efficiency of the two methods, the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) are used, and the nonparametric regression is found to perform better than its parametric counterpart due to its lower AIC and BIC values. The study recommends that future researchers examine a similar data set for the presence of outliers, expunge them if detected, and re-analyze to compare results.

Keywords: Theil’s regression, Bayesian information criterion, Akaike information criterion, OLS

Procedia PDF Downloads 295
4476 Development and Validation of a Coronary Heart Disease Risk Score in Indian Type 2 Diabetes Mellitus Patients

Authors: Faiz N. K. Yusufi, Aquil Ahmed, Jamal Ahmad

Abstract:

Diabetes in India is growing at an alarming rate, and the complications it causes need to be controlled. Coronary heart disease (CHD) is the complication considered for prediction in this study. India has the second largest number of diabetes patients in the world. To the best of our knowledge, there is no CHD risk score for Indian type 2 diabetes patients. Any form of CHD has been taken as the event of interest. A sample of 750 was determined and randomly collected from the Rajiv Gandhi Centre for Diabetes and Endocrinology, J.N.M.C., A.M.U., Aligarh, India. Collected variables include patients' data such as sex, age, height, weight, body mass index (BMI), blood sugar fasting (BSF), post prandial sugar (PP), glycosylated haemoglobin (HbA1c), diastolic blood pressure (DBP), systolic blood pressure (SBP), smoking, alcohol habits, total cholesterol (TC), triglycerides (TG), high density lipoprotein (HDL), low density lipoprotein (LDL), very low density lipoprotein (VLDL), physical activity, duration of diabetes, diet control, history of antihypertensive drug treatment, family history of diabetes, waist circumference, hip circumference, medications, central obesity, and history of CHD. Predictive risk scores for CHD events are designed by Cox proportional hazards regression. Model calibration and discrimination are assessed by the Hosmer-Lemeshow test and the area under the receiver operating characteristic (ROC) curve. Overfitting and underfitting of the model are checked by applying regularization techniques, and the best method is selected among ridge, lasso, and elastic net regression. Youden's index is used to choose the optimal cut-off point from the scores. The five-year probability of CHD is predicted by both the survival function and a two-state Markov chain model, and the better technique is concluded. The risk scores developed for CHD can be calculated by doctors and patients for self-control of diabetes. Furthermore, the five-year probabilities can be used to forecast and maintain the condition of patients.
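
Of the steps listed, the Youden's-index cut-off choice is simple to make concrete. The sketch below uses toy risk scores and outcomes (not the study's data) and exhaustively picks the threshold maximizing J = sensitivity + specificity - 1:

```python
import numpy as np

def youden_cutoff(scores, events):
    """Pick the cut-off maximizing Youden's J = sensitivity + specificity - 1."""
    best_j, best_cut = -1.0, None
    for cut in np.unique(scores):
        pred = scores >= cut
        tp = np.sum(pred & (events == 1))
        fn = np.sum(~pred & (events == 1))
        tn = np.sum(~pred & (events == 0))
        fp = np.sum(pred & (events == 0))
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        if j > best_j:
            best_j, best_cut = j, cut
    return best_cut, best_j

# Toy risk scores and CHD events, invented for illustration only
scores = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
events = np.array([0, 0, 0, 1, 0, 1, 1, 1, 1])
cut, j = youden_cutoff(scores, events)
print(cut, round(j, 2))
```

Here the best threshold trades one missed event for zero false positives; on real data the same scan runs over the ROC operating points.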

Keywords: coronary heart disease, Cox proportional hazards regression, ROC curve, type 2 diabetes mellitus

Procedia PDF Downloads 210
4475 Finding Optimal Solutions to Management Problems with the Use of Econometric and Multiobjective Programming

Authors: M. Moradi Dalini, M. R. Talebi

Abstract:

This research presents a technical method that combines econometrics and multiobjective programming to select and obtain optimal solutions to management problems. It is taken for granted that it is important to analyze which combination of values of the explanatory variables in an econometric model would point to the simultaneous achievement of the best values of the response variables. If a certain degree of conflict is observed among the response variables, we suggest applying a multiobjective method to the results obtained from the regression analysis. In fact, with the use of a multiobjective method, we can make the best decision about the conflicting relationship between the response variables and find an optimal solution. The benefit of combining multiobjective programming and econometrics is an assessment of a balanced "optimal" situation among the responses, because such information can hardly be extracted by econometric techniques alone.
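
A minimal sketch of the idea, assuming two invented single-variable response surfaces (standing in for fitted regressions) that peak at conflicting points: weighted-sum scalarization, one of the simplest multiobjective methods, turns them into a single objective whose optimum is a compromise.

```python
import numpy as np
from scipy.optimize import minimize

# Two hypothetical fitted response surfaces that conflict in x:
# y1 is maximized near x = 2, y2 near x = -1 (coefficients invented).
def y1(x):
    return -(x - 2.0) ** 2

def y2(x):
    return -(x + 1.0) ** 2

def scalarized(x, w=0.5):
    """Weighted-sum scalarization: maximize w*y1 + (1-w)*y2 (minimize the negative)."""
    return -(w * y1(x[0]) + (1 - w) * y2(x[0]))

res = minimize(scalarized, x0=[0.0], args=(0.5,))
print(np.round(res.x, 3))
```

With equal weights the compromise lands midway between the two individual optima (x = 0.5 here); varying w traces out the trade-off between the conflicting responses.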

Keywords: econometrics, multiobjective optimization, management problem, optimization

Procedia PDF Downloads 72
4474 Semiparametric Regression of Truncated Spline Biresponse on Farmer Loyalty and Attachment Modeling

Authors: Adji Achmad Rinaldo Fernandes

Abstract:

Regression analysis is a statistical method able to describe and predict causal relationships between variables. Not every relationship has a known curve shape; there are often relationship patterns whose curve shape cannot be known. Moreover, one cause can have an impact on more than one effect, so the effects themselves may be closely related to each other. Such relationships can be approached with biresponse truncated spline semiparametric regression. The purpose of this study is to examine the function estimator and determine the best biresponse truncated spline semiparametric regression model. The results on secondary data showed that the best model is of quadratic order with a maximum of two knots, with a goodness of fit (adjusted R²) of 88.5%.
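
A truncated spline of the kind used here can be sketched for a single response (the biresponse case essentially stacks two such fits). The data, knot locations, and noise below are invented; only the quadratic order with two knots mirrors the abstract's best model.

```python
import numpy as np

def truncated_basis(x, knots, degree=2):
    """Design matrix: polynomial terms plus truncated power terms (x - k)_+^degree."""
    cols = [x ** d for d in range(degree + 1)]
    cols += [np.clip(x - k, 0, None) ** degree for k in knots]
    return np.column_stack(cols)

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 10, 120))
# Piecewise trend whose shape changes at x = 5, plus noise
y = np.where(x < 5, 0.5 * x ** 2, 12.5 + 3 * (x - 5)) + rng.normal(scale=0.5, size=120)

B = truncated_basis(x, knots=[4.0, 6.0])          # quadratic, two knots
beta, *_ = np.linalg.lstsq(B, y, rcond=None)
yhat = B @ beta

n, p = B.shape
r2 = 1 - np.sum((y - yhat) ** 2) / np.sum((y - np.mean(y)) ** 2)
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p)
print(round(adj_r2, 3))
```

Because the truncated terms switch on only past each knot, the fitted curve can change shape locally, which is why the estimator follows the data pattern.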

Keywords: biresponse, farmer attachment, farmer loyalty, truncated spline

Procedia PDF Downloads 20
4473 Internet Purchases in European Union Countries: Multiple Linear Regression Approach

Authors: Ksenija Dumičić, Anita Čeh Časni, Irena Palić

Abstract:

This paper examines the influence of economic and Information and Communication Technology (ICT) development on the recent increase in Internet purchases by individuals in European Union member states. After a growing trend in Internet purchases in the EU27 was noticed, all-possible-regressions analysis was applied using nine independent variables for 2011. Finally, two linear regression models were studied in detail. The simple linear regression analysis confirmed the research hypothesis that Internet purchases in the analysed EU countries are positively correlated with the statistically significant variable Gross Domestic Product per capita (GDPpc). The multiple linear regression model with four regressors representing the level of ICT development likewise indicates that ICT development is crucial for explaining Internet purchases by individuals, confirming the research hypothesis.
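
The progression from a simple regression on GDPpc to a multiple regression with ICT regressors can be sketched as follows; the country figures and coefficients are synthetic stand-ins, not the study's data.

```python
import numpy as np

def fit_ols(X, y):
    """OLS with intercept; returns coefficients and R-squared."""
    Xd = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    r2 = 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
    return beta, r2

rng = np.random.default_rng(4)
n = 27  # EU27-sized toy sample; all figures are synthetic
gdp_pc = rng.normal(30, 8, n)
ict = 0.5 * gdp_pc + rng.normal(0, 3, n)          # ICT level correlates with GDPpc
purchases = 1.2 * gdp_pc + 0.8 * ict + rng.normal(0, 4, n)

_, r2_simple = fit_ols(gdp_pc.reshape(-1, 1), purchases)
_, r2_multi = fit_ols(np.column_stack([gdp_pc, ict]), purchases)
print(round(r2_simple, 3), round(r2_multi, 3))
```

Since the simple model is nested in the multiple one, R² can only increase with the extra regressor; judging whether the increase matters is what the significance tests in the paper are for.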

Keywords: European Union, Internet purchases, multiple linear regression model, outlier

Procedia PDF Downloads 292
4472 Contemporary Mexican Shadow Politics: The War on Drugs and the Issue of Security

Authors: Lisdey Espinoza Pedraza

Abstract:

Organised crime in Mexico evolves faster than our capacity to understand and explain it. Organised gangs have become successful entrepreneurs in many ways, and they have somehow mimicked the working ways of the authorities; in many cases, they have successfully infiltrated governmental spheres. This business model is only possible under a scheme of rampant impunity. Impunity, however, is not exclusive to the PRI. Neither the PRI, the PAN, nor the PRD can claim a monopoly on corruption, but what is worse is that none can claim full honesty in its acts either. The current security crisis in Mexico reveals a crisis in the Mexican political party system. Corruption today is not only a problem of dishonesty and the correct use of public resources; it is the principal threat to Mexican democracy, governance, and national security.

Keywords: security, war on drugs, drug trafficking, Mexico, Latin America, United States

Procedia PDF Downloads 408
4471 A Comparative Study of Advaita Vedanta’s Doctrine of Illusion (Māyāvāda) as the Basis for the Claim That ‘I Am Brahman’

Authors: Boran Akin Demir

Abstract:

Notions such as ‘I’, ‘self’, and ‘mind’ are typically used synonymously in Western dualist philosophy, in a way that distances itself from the material world. This has rendered it increasingly difficult for the dualist Western philosopher to truly understand the Vedantic claim that all is one, and ultimately that ‘I am Brahman’. In Advaita Vedanta, we are introduced to one of the most exhilarating theories of non-dualism through its Doctrine of Illusion. This paper approaches the issue through a comparative study between seemingly unrelated thinkers – namely, Jalaluddin Rumi, Lao Tzu, and Plato. The broadness of this research in such alternative schools of thought aims to show the underlying unity that successfully presents itself through time and space, thus upholding the philosophy of Advaita Vedanta from all corners of the world.

Keywords: Advaita Vedanta, Brahman, Lao Tzu, Plato, Rumi

Procedia PDF Downloads 137
4470 Profitability Analysis of Investment in Oil Palm Value Chain in Osun State, Nigeria

Authors: Moyosooore A. Babalola, Ayodeji S. Ogunleye

Abstract:

The main focus of the study was to determine the profitability of investment in the oil palm value chain of Osun State, Nigeria, in 2015. The specific objectives were to describe the socio-economic characteristics of oil palm investors (producers, processors, and marketers), to determine the profitability of the investment for investors in the oil palm value chain, and to determine the factors affecting the profitability of the investment for oil palm investors in Osun State. A sample of 100 respondents was selected in this cross-sectional survey. A multistage sampling procedure was used to select producers and processors, while purposive sampling was used for marketers. Data collected were analyzed using the following analytical tools: descriptive statistics, budgetary analysis, and regression analysis. The gross margin results showed that producers and processors were more profitable than marketers in the oil palm value chain, with benefit-cost ratios of 1.93, 1.82, and 1.11, respectively. The multiple regression analysis showed that education and years of experience were significant among marketers and producers, while age and years of experience had a significant influence on the gross margin of processors. Based on these findings, improvement in the level of education of oil palm investors is recommended in order to address the relatively low access to post-primary education among oil palm investors in Osun State. In addition, it is important that training be made available to oil palm investors; this will improve the quality of their years of experience, ensuring a positive influence on their gross margin. Low access to credit among processors and producers could be corrected by making extension services available to them. Marketers would also greatly benefit from subsidized prices on oil palm products to increase their gross margin, as a huge percentage of their total cost comes from acquiring palm oil.
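
The profitability measures used above are simple ratios; a minimal sketch, with figures invented only to reproduce the reported producer- and marketer-level ratios:

```python
def benefit_cost_ratio(revenue, total_cost):
    """BCR > 1 means the activity returns more than it costs."""
    return revenue / total_cost

def gross_margin(revenue, variable_cost):
    """Revenue left after covering variable costs."""
    return revenue - variable_cost

# Hypothetical naira figures chosen to match the reported ratios, not survey data
print(round(benefit_cost_ratio(193_000, 100_000), 2))  # producer-like BCR, 1.93
print(round(benefit_cost_ratio(111_000, 100_000), 2))  # marketer-like BCR, 1.11
print(gross_margin(193_000, 100_000))
```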

Keywords: oil palm, profitability analysis, regression analysis, value chain

Procedia PDF Downloads 349
4469 Copula-Based Estimation of Direct and Indirect Effects in Path Analysis Models

Authors: Alam Ali, Ashok Kumar Pathak

Abstract:

Path analysis is a statistical technique used to evaluate the direct and indirect effects of variables in path models. One or more structural regression equations are used to estimate a series of parameters in path models to find a better fit to the data. However, the assumptions of classical regression models, such as ordinary least squares (OLS), are sometimes violated by the nature of the data, resulting in insignificant direct and indirect effects of the exogenous variables. This article explores the effectiveness of a copula-based regression approach as an alternative to classical regression, specifically when variables are linked through an elliptical copula.
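
One common copula-based alternative to OLS can be sketched under a Gaussian-copula assumption: transform each margin to normal scores through its ranks, then model the dependence on that scale, where the conditional mean is linear no matter how skewed the original margins are. The data below are synthetic, not from any path model in the paper.

```python
import numpy as np
from scipy.stats import norm, rankdata

def normal_scores(v):
    """Probability-integral transform via ranks, then Gaussian quantiles."""
    u = rankdata(v) / (len(v) + 1)   # pseudo-observations in (0, 1)
    return norm.ppf(u)

rng = np.random.default_rng(5)
x = rng.exponential(size=300)               # skewed exogenous variable
y = np.log1p(x) + rng.normal(0, 0.1, 300)   # nonlinear, non-Gaussian link

zx, zy = normal_scores(x), normal_scores(y)
rho = np.corrcoef(zx, zy)[0, 1]   # Gaussian-copula dependence parameter
fitted = rho * zx                 # conditional mean on the normal-score scale
print(round(rho, 2))
```

An OLS fit of y on x would understate this dependence because of the skew and curvature; on the normal-score scale the relation is captured by a single correlation parameter.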

Keywords: path analysis, copula-based regression models, direct and indirect effects, k-fold cross validation technique

Procedia PDF Downloads 20
4468 The Impact of Financial Risk on Banks’ Financial Performance: A Comparative Study of Islamic Banks and Conventional Banks in Pakistan

Authors: Mohammad Yousaf Safi Mohibullah Afghan

Abstract:

This study scrutinizes the impact of credit and liquidity risks on the profitability of Islamic and conventional banks operating in Pakistan. Among the banks, 4 Islamic and 18 conventional banks were selected to enrich the comparison of Islamic banks' performance with that of conventional banks. The selection of banks for the panel is based on quarterly unbalanced data ranging from the first quarter of 2007 to the last quarter of 2017. The data were collected from the banks' websites and the State Bank of Pakistan, and the empirical results are obtained with the Delta-method test. Return on assets and return on equity are used as the principal proxies for the profitability of the banks. Credit and liquidity risks are measured by the ratio of loan loss provision to total loans and the ratio of liquid assets to total liabilities, respectively. Meanwhile, following the previous literature, other variables such as bank size, bank capital, bank branches, and bank employees are used to control for factors whose direct and indirect effects on profitability are understood. In conclusion, the study finds that credit risk affects return on assets and return on equity positively, and that there is no significant difference in terms of credit risk between Islamic and conventional banks. Similarly, liquidity risk has a significant impact on bank profitability, and the marginal effect of liquidity risk is higher for Islamic banks than for conventional banks.

Keywords: Islamic and conventional banks, performance, return on equity, return on assets, Pakistan banking sector, profitability

Procedia PDF Downloads 146
4467 Household Size and Poverty Rate: Evidence from Nepal

Authors: Basan Shrestha

Abstract:

The relationship between household size and poverty is not well understood. Followers of Malthus advocate that an increasing population adds pressure to a dwindling resource base through increasing demand, which would lead to poverty. Others claim that bigger households are richer due to the availability of household labour for income-generating activities. Data from Nepal were analyzed to examine the relationship between household size and poverty rate. The analysis of data from 3,968 Village Development Committees (VDCs)/municipalities located in 75 districts of all five development regions revealed that average household size had a moderate positive correlation with the poverty rate (Pearson's correlation coefficient = 0.44). In a regression analysis, household size explained 20% of the variation in the poverty rate. A higher positive correlation was observed in eastern Nepal (Pearson's correlation coefficient = 0.66), where regression analysis showed that household size explained 43% of the variation in the poverty rate. The relation was weak in the far west, possibly because the incidence of poverty there was high irrespective of household size. Overall, the facts revealed that bigger households were relatively poorer. With the increasing level of awareness and interventions for family planning, it is anticipated that household size will decrease, leading to a decreased poverty rate. In addition, the government needs to devise a mechanism to create employment opportunities for the household labour force to reduce poverty.
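
The reported figures are internally consistent: in simple linear regression the share of explained variation is exactly the square of Pearson's r, so r = 0.44 and r = 0.66 imply roughly the 20% and 43% reported (up to rounding):

```python
# R^2 = r^2 in simple linear regression; check against the reported shares
for r in (0.44, 0.66):
    print(f"r = {r}: R^2 = {r ** 2:.2f}")
```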

Keywords: household size, poverty rate, Nepal, regional development

Procedia PDF Downloads 351
4466 Optimization of Slider Crank Mechanism Using Design of Experiments and Multi-Linear Regression

Authors: Galal Elkobrosy, Amr M. Abdelrazek, Bassuny M. Elsouhily, Mohamed E. Khidr

Abstract:

Crankshaft length, connecting rod length, crank angle, engine rpm, cylinder bore, piston mass, and compression ratio are the inputs that control the performance of the slider-crank mechanism and hence its efficiency. Several combinations of these seven inputs are used and compared. The output engine torque predicted by the simulation is analyzed through two different regression models, with and without interaction terms, developed according to multi-linear regression using LU decomposition to solve the system of algebraic equations. These models are validated. A regression model in the seven inputs including their interaction terms lowered the polynomial degree from 3rd to 1st and gave valid predictions and stable explanations.
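
Fitting such a first-degree model ultimately reduces to solving the normal equations (X'X)b = X'y, which is where the LU decomposition mentioned above enters; adding interaction terms only adds columns to X. A sketch with random stand-ins for the seven inputs (not the actual engine data):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(6)
# Toy stand-ins for the seven inputs (crank length, rod length, ..., compression ratio)
X = rng.normal(size=(60, 7))
Xd = np.column_stack([np.ones(60), X])            # add intercept column
torque = Xd @ rng.normal(size=8) + rng.normal(scale=0.1, size=60)

# Normal equations (X'X) b = X'y solved via LU decomposition
XtX, Xty = Xd.T @ Xd, Xd.T @ torque
beta = lu_solve(lu_factor(XtX), Xty)

ok = np.allclose(beta, np.linalg.lstsq(Xd, torque, rcond=None)[0], atol=1e-8)
print(ok)
```

The LU route matches the least-squares solver because both solve the same linear system; LU simply factors X'X once so it can be reused across right-hand sides.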

Keywords: design of experiments, regression analysis, SI engine, statistical modeling

Procedia PDF Downloads 172
4465 Determinants of Rural Household Effective Demand for Biogas Technology in Southern Ethiopia

Authors: Mesfin Nigussie

Abstract:

The objectives of the study were to identify the factors affecting rural households' willingness to install a biogas plant and the amount they are willing to pay, in order to examine the determinants of effective demand for biogas technology. A multistage sampling technique was employed to select 120 respondents for the study. A binary probit regression model was employed to identify the factors affecting rural households' decision to install biogas technology. The probit model results revealed that household size, total household income, access to biogas-related extension services, access to credit services, proximity to water sources, households' perception of the quality of biogas, a perception index of biogas attributes, households' perception of the installation cost of biogas, and the availability of other energy sources were statistically significant in determining a household's decision to install biogas. A Tobit model was employed to examine the determinants of the amount rural households are willing to pay. Based on the model results, the age of the household head, total annual household income, access to extension services, and the availability of other energy sources were significant variables influencing willingness to pay. Giving due consideration to extension services, the availability of credit or subsidies, improving the quality of biogas technology design, and minimizing installation cost by using locally available materials are the main suggestions of this research to help create effective demand for biogas technology.
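
A binary probit of the kind described can be sketched by direct maximum likelihood; the covariates below ("income", "credit") are hypothetical standardized stand-ins for the survey variables, and the coefficients are invented.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def probit_nll(beta, X, y):
    """Negative log-likelihood of the binary probit: P(y=1|x) = Phi(x'beta)."""
    p = np.clip(norm.cdf(X @ beta), 1e-10, 1 - 1e-10)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

rng = np.random.default_rng(7)
n = 500
income = rng.normal(size=n)   # hypothetical standardized covariates
credit = rng.normal(size=n)
X = np.column_stack([np.ones(n), income, credit])
true_beta = np.array([-0.2, 1.0, 0.5])
y = (norm.cdf(X @ true_beta) > rng.uniform(size=n)).astype(float)

res = minimize(probit_nll, x0=np.zeros(3), args=(X, y), method="BFGS")
print(np.round(res.x, 1))
```

The fitted coefficients recover the data-generating values up to sampling noise; their signs and significance are what the abstract's "statistically significant" statements refer to.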

Keywords: biogas technology, effective demand, probit model, tobit model, willingness to pay

Procedia PDF Downloads 127
4464 The Emotional Implication of the Phraseological Fund Applied in Cognitive Business Negotiation

Authors: Kristine Dzagnidze

Abstract:

The paper centers equally on structural and cognitive linguistics in light of phraseology and its emotional implication. Accordingly, the methods elaborated within the framework of both systematic-structural and linguo-cognitive theories are equally relevant to this research. In other words, in studying the negotiation process, our attention is drawn to defining the peculiarities of negotiations, emotion, style and specifics of cognition, motives, aims, contextual characterizations, and the quality of cultural context and integration. The research also draws on the totality of concepts and methods connected with the stage of development of emotional linguistic thinking, which contextually correlates with the dominance of the anthropocentric-communicative paradigm. The synthesis of structuralist and cognitive perspectives has turned out to be relevant to this research, carried out in the form of intellectual action: on the one hand, the adequacy of the research purpose to the expected results; on the other, the validity of the methodology for formulating the objective conclusions needed for the emotional connotation beyond phraseology. The mechanism described does not claim to discover a new truth; it does, however, offer the possibility of a novel interpretation of the existing content.

Keywords: cognitivism, communication, implication, negotiation

Procedia PDF Downloads 250
4463 An Epsilon Hierarchical Fuzzy Twin Support Vector Regression

Authors: Arindam Chaudhuri

Abstract:

The research presents epsilon-hierarchical fuzzy twin support vector regression (epsilon-HFTSVR) based on epsilon-fuzzy twin support vector regression (epsilon-FTSVR) and epsilon-twin support vector regression (epsilon-TSVR). Epsilon-FTSVR is achieved by incorporating trapezoidal fuzzy numbers into epsilon-TSVR, which takes care of the uncertainty existing in forecasting problems. Epsilon-FTSVR determines a pair of epsilon-insensitive proximal functions by solving two related quadratic programming problems. The structural risk minimization principle is implemented by introducing a regularization term into the primal problems of epsilon-FTSVR. This yields stable, positive definite dual problems, which improves regression performance. Epsilon-FTSVR is then reformulated as epsilon-HFTSVR, consisting of a set of hierarchical layers, each containing epsilon-FTSVR. Experimental results on both synthetic and real datasets reveal that epsilon-HFTSVR has remarkable generalization performance with minimum training time.
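
The epsilon-insensitive loss shared by all the epsilon-TSVR variants is easy to state: residuals inside the epsilon tube cost nothing, and cost grows linearly outside it. A minimal numpy sketch (the fuzzy and hierarchical machinery of the paper is omitted):

```python
import numpy as np

def epsilon_insensitive(y, f, eps=0.1):
    """SVR loss: zero inside the epsilon tube, linear outside it."""
    return np.maximum(np.abs(y - f) - eps, 0.0)

# Residuals straddling the tube boundary, for illustration
residuals = np.array([-0.3, -0.05, 0.0, 0.08, 0.25])
loss = epsilon_insensitive(residuals, 0.0, eps=0.1)
print(loss)
```

The three middle residuals fall inside the tube and incur zero loss; only the two larger ones contribute, which is what makes the estimator tolerant of small forecasting noise.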

Keywords: regression, epsilon-TSVR, epsilon-FTSVR, epsilon-HFTSVR

Procedia PDF Downloads 358
4462 Identifying Diabetic Retinopathy Complication by Predictive Techniques in Indian Type 2 Diabetes Mellitus Patients

Authors: Faiz N. K. Yusufi, Aquil Ahmed, Jamal Ahmad

Abstract:

Predicting the risk of diabetic retinopathy (DR) in Indian type 2 diabetes patients is urgently needed. Although India has the second largest number of diabetic patients after China, to the best of our knowledge no risk score for diabetic complications has ever been investigated there. Diabetic retinopathy is a serious complication and the leading cause of visual impairment across countries. Any form of DR was taken as the event of interest, whether mild, background, or grade I, II, III, or IV DR. A random sample was collected from the Rajiv Gandhi Centre for Diabetes and Endocrinology, J.N.M.C., A.M.U., Aligarh, India. Collected variables include sex, age, height, weight, body mass index (BMI), fasting blood sugar (BSF), post-prandial sugar (PP), glycosylated haemoglobin (HbA1c), diastolic blood pressure (DBP), systolic blood pressure (SBP), smoking and alcohol habits, total cholesterol (TC), triglycerides (TG), high-density lipoprotein (HDL), low-density lipoprotein (LDL), very-low-density lipoprotein (VLDL), physical activity, duration of diabetes, diet control, history of antihypertensive drug treatment, family history of diabetes, waist circumference, hip circumference, medications, central obesity, and history of DR. Cox proportional hazard regression is used to design risk scores for the prediction of retinopathy. Model calibration and discrimination are assessed with the Hosmer-Lemeshow test and the area under the receiver operating characteristic (ROC) curve. Overfitting and underfitting of the model are checked by applying regularization, and the best method is selected among ridge, lasso, and elastic net regression. The optimal cut-off point is chosen by Youden's index. The five-year probability of DR is predicted by both the survival function and a two-state Markov chain model, and the better technique is identified. The risk scores developed can be applied by doctors, and by patients themselves for self-evaluation.
Furthermore, the five-year probabilities can be used to forecast and monitor the condition of patients, providing immense benefit in the practical prediction of DR in T2DM.
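
The cut-off selection step can be illustrated in a short sketch: given predicted risk scores and observed DR outcomes, Youden's index J = sensitivity + specificity - 1 is maximized over the ROC thresholds. The scores below are synthetic stand-ins, not the study's data (a minimal sketch using scikit-learn):

```python
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
# Synthetic risk scores: patients who developed DR tend to score higher
y_true = rng.integers(0, 2, size=500)
scores = y_true * 1.0 + rng.normal(0, 1.2, size=500)

fpr, tpr, thresholds = roc_curve(y_true, scores)
j = tpr - fpr                  # Youden's J statistic at each ROC threshold
best = np.argmax(j)
cutoff = thresholds[best]      # optimal cut-off by Youden's index
```

Patients with a predicted score above `cutoff` would then be flagged as high risk.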

Keywords: Cox proportional hazard regression, diabetic retinopathy, ROC curve, type 2 diabetes mellitus

Procedia PDF Downloads 170
4461 Nonparametric Truncated Spline Regression Model on the Data of Human Development Index in Indonesia

Authors: Kornelius Ronald Demu, Dewi Retno Sari Saputro, Purnami Widyaningsih

Abstract:

The Human Development Index (HDI) is a standard measure of a country's human development. Several factors may influence it, such as life expectancy, gross domestic product (GDP) based on the province's annual expenditure, the number of poor people, and the percentage of illiterate people. The scatter plots between HDI and these factors do not follow a specific pattern or form, so the HDI data for Indonesia can be modelled with nonparametric regression. The estimated regression curve in a nonparametric model is flexible because it follows the shape of the data pattern. One nonparametric regression method is the truncated spline, a modification of segmented polynomial functions. The estimator of a truncated spline regression model depends on the selection of optimal knot points, the focal points of the truncated spline functions. The optimal knot points were determined by the minimum value of generalized cross-validation (GCV). In this article, a truncated spline nonparametric regression model was applied to the HDI data for Indonesia. The best truncated spline regression model used the combination of optimal knot points 5-5-5-4. Life expectancy and the percentage of illiterate people were the factors significantly related to HDI in Indonesia. The coefficient of determination is 94.54%, which means the regression model fits the Indonesian HDI data well.
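
The knot-selection idea can be sketched as follows: fit a truncated power basis for each candidate knot set and keep the set minimizing GCV. The data and candidate knots below are synthetic illustrations, not the HDI data (a minimal sketch, assuming a linear truncated spline):

```python
import numpy as np

def truncated_basis(x, knots, degree=1):
    """Truncated power basis: polynomial terms plus (x - k)_+^degree per knot."""
    cols = [x ** d for d in range(degree + 1)]
    cols += [np.clip(x - k, 0, None) ** degree for k in knots]
    return np.column_stack(cols)

def gcv(x, y, knots, degree=1):
    """Generalized cross-validation score for a candidate set of knots."""
    X = truncated_basis(x, knots, degree)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, p = X.shape                 # trace of the hat matrix is p for a full-rank fit
    return (resid @ resid / n) / (1 - p / n) ** 2

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 10, 200))
y = np.where(x < 5, x, 10 - x) + rng.normal(0, 0.3, 200)   # true kink at x = 5

candidates = [[2.0], [5.0], [8.0]]
best = min(candidates, key=lambda k: gcv(x, y, k))          # knot at the kink wins
```

In practice the candidate grid would cover combinations of knots per predictor, as in the 5-5-5-4 combination reported above.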

Keywords: generalized cross validation (GCV), Human Development Index (HDI), knots point, nonparametric regression, truncated spline

Procedia PDF Downloads 322
4460 Regression Model Evaluation on Depth Camera Data for Gaze Estimation

Authors: James Purnama, Riri Fitri Sari

Abstract:

We investigate the machine learning algorithm selection problem for depth-image-based eye gaze estimation, with respect to the essential difficulty of reducing the number of required training samples and the training time. Statistics-based prediction accuracy measures are increasingly used to assess and evaluate prediction or estimation in gaze estimation. This article uses Root Mean Squared Error (RMSE) and R-squared statistical analysis to assess machine learning methods on depth camera data for gaze estimation. Four machine learning methods were evaluated: Random Forest Regression, Regression Tree, Support Vector Machine (SVM), and Linear Regression. The experimental results show that Random Forest Regression has the lowest RMSE and the highest R-squared, making it the best among the evaluated methods.
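
The evaluation protocol can be sketched with scikit-learn: fit the four regressors on a common split and compare RMSE and R-squared on held-out data. The features below are synthetic stand-ins for depth-image descriptors, not the Kinect data used in the study:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(2)
# Synthetic stand-ins for depth-image features and a nonlinear gaze-angle target
X = rng.uniform(-1, 1, (400, 5))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + 0.05 * rng.normal(size=400)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

models = {
    "random_forest": RandomForestRegressor(random_state=0),
    "regression_tree": DecisionTreeRegressor(random_state=0),
    "svm": SVR(),
    "linear": LinearRegression(),
}
results = {}
for name, model in models.items():
    pred = model.fit(X_tr, y_tr).predict(X_te)
    results[name] = {"rmse": mean_squared_error(y_te, pred) ** 0.5,
                     "r2": r2_score(y_te, pred)}
best = min(results, key=lambda name: results[name]["rmse"])  # lowest test RMSE
```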

Keywords: gaze estimation, gaze tracking, eye tracking, Kinect, regression model, Orange Python

Procedia PDF Downloads 527
4459 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, WangQun Lin

Abstract:

This paper introduces the concept and principles of data cleaning, analyzes the types and causes of dirty data, and proposes the key steps of a typical cleaning process. It puts forward a scalable and versatile data cleaning framework. For data with attribute dependency relations, it formally designs several violation-data discovery algorithms that can identify inconsistent data across all target columns with condition-dependent attributes, whether the data is structured (SQL) or unstructured (NoSQL), and it gives six data cleaning methods based on these algorithms.
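
One family of violation-discovery algorithms can be sketched for the simplest dependency rule, a functional dependency A → B: any group of records that agree on A but disagree on B is inconsistent. A minimal pure-Python sketch (the attribute names are hypothetical):

```python
from collections import defaultdict

def find_violations(rows, lhs, rhs):
    """Return the rows violating the functional dependency lhs -> rhs:
    records that agree on the lhs attributes but disagree on rhs."""
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[a] for a in lhs)
        groups[key].add(row[rhs])
    bad_keys = {k for k, vals in groups.items() if len(vals) > 1}
    return [row for row in rows
            if tuple(row[a] for a in lhs) in bad_keys]

records = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "NYC"},          # conflicts with the row above
    {"zip": "94105", "city": "San Francisco"},
]
dirty = find_violations(records, ["zip"], "city")   # flags the first two rows
```

A repair step would then resolve each flagged group, e.g. by choosing the majority value or consulting a master table.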

Keywords: data cleaning, dependency rules, violation data discovery, data repair

Procedia PDF Downloads 553
4458 Evolving Credit Scoring Models using Genetic Programming and Language Integrated Query Expression Trees

Authors: Alexandru-Ion Marinescu

Abstract:

There exists a plethora of methods in the scientific literature tackling the well-established task of credit score evaluation. In its most abstract form, a credit scoring algorithm takes as input several credit applicant properties, such as age, marital status, employment status, loan duration, etc., and must output a binary response variable (i.e. “GOOD” or “BAD”) stating whether the client is susceptible to payment return delays. Data imbalance is a common occurrence in financial institution databases, with the majority classified as “GOOD” clients (clients that respect the loan return calendar) alongside a small percentage of “BAD” clients. But it is the “BAD” clients we are interested in, since accurately predicting their behavior is crucial in preventing unwanted losses for loan providers. We add to this context the constraint that the algorithm must yield an actual, tractable mathematical formula, which is friendlier towards financial analysts. To this end, we have turned to genetic algorithms and genetic programming, aiming to evolve actual mathematical expressions using specially tailored mutation and crossover operators. As far as data representation is concerned, we employ a very flexible mechanism: LINQ expression trees, readily available in the C# programming language, enabling us to construct executable pieces of code at runtime. As the title implies, they model trees, with intermediate nodes being operators (addition, subtraction, multiplication, division) or mathematical functions (sin, cos, abs, round, etc.) and leaf nodes storing either constants or variables. There is a one-to-one correspondence between the client properties and the formula variables. The mutation and crossover operators work on a flattened version of the tree, obtained via a pre-order traversal.
A consequence of our chosen technique is that we can identify and discard client properties which do not take part in the final score evaluation, effectively acting as a dimensionality reduction scheme. We compare ourselves with state-of-the-art approaches, such as support vector machines, Bayesian networks, and extreme learning machines, to name a few. We benchmark against a total of 8 data sets, among them the well-known Australian and German credit data sets, and the performance indicators are: percentage correctly classified, area under curve, partial Gini index, H-measure, Brier score, and Kolmogorov-Smirnov statistic. Finally, we obtain encouraging results which, although placing us in the lower half of the hierarchy, drive us to further refine the algorithm.
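
The pre-order tree flattening and subtree crossover can be sketched in a language-neutral way; the snippet below uses plain Python lists in place of C#-specific LINQ expression trees, with hypothetical client-property names as variables:

```python
import random

ARITY = {"add": 2, "sub": 2, "mul": 2}   # operator arities; leaves have arity 0

def subtree_end(expr, start):
    """Index one past the subtree rooted at expr[start] in a pre-order list."""
    i, pending = start, 1
    while pending:
        pending += ARITY.get(expr[i], 0) - 1
        i += 1
    return i

def crossover(a, b, rng):
    """Swap a random subtree of a with a random subtree of b (pre-order lists)."""
    i, j = rng.randrange(len(a)), rng.randrange(len(b))
    ai, bj = subtree_end(a, i), subtree_end(b, j)
    return a[:i] + b[j:bj] + a[ai:], b[:j] + a[i:ai] + b[bj:]

def evaluate(expr, env, pos=0):
    """Evaluate a pre-order expression list; returns (value, next position)."""
    tok = expr[pos]
    if tok in ARITY:
        left, pos = evaluate(expr, env, pos + 1)
        right, pos = evaluate(expr, env, pos)
        op = {"add": lambda x, y: x + y, "sub": lambda x, y: x - y,
              "mul": lambda x, y: x * y}[tok]
        return op(left, right), pos
    return env.get(tok, tok), pos + 1    # variable lookup, or a numeric constant

# add(age, mul(2, income))  ->  age + 2 * income
parent_a = ["add", "age", "mul", 2, "income"]
parent_b = ["sub", "loan", "duration"]
child_a, child_b = crossover(parent_a, parent_b, random.Random(0))
value, _ = evaluate(parent_a, {"age": 30, "income": 5})   # 30 + 2*5 = 40
```

Because whole subtrees are swapped, both children remain well-formed prefix expressions, mirroring the pre-order flattening described above.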

Keywords: expression trees, financial credit scoring, genetic algorithm, genetic programming, symbolic evolution

Procedia PDF Downloads 105
4457 Risk, Capital Buffers, and Bank Lending: The Adjustment of Euro Area Banks

Authors: Laurent Maurin, Mervi Toivanen

Abstract:

This paper estimates euro area banks’ internal target capital ratios and investigates whether banks’ adjustment to these targets had an impact on credit supply and securities holdings during the financial crisis of 2005-2011. Using data on listed banks and country-specific macro-variables, a partial adjustment model is estimated in a panel context. The results indicate, firstly, that an increase in the riskiness of banks’ balance sheets has a positive influence on target capital ratios. Secondly, the adjustment towards higher equilibrium capital ratios has a significant impact on banks’ assets. The impact is found to be more sizeable on security holdings than on loans, suggesting a pecking order.
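
A partial adjustment model of this kind can be sketched on simulated data: the capital ratio closes a fraction λ of the gap to its target each period, and regressing the ratio on its own lag recovers both the adjustment speed and the implied internal target. All numbers below are illustrative assumptions, not estimates from the paper:

```python
import numpy as np

rng = np.random.default_rng(3)
T, lam, target = 200, 0.3, 10.0          # assumed adjustment speed and target ratio

# Simulate a capital ratio that closes a fraction lam of the gap each period
k = np.empty(T)
k[0] = 6.0
for t in range(1, T):
    k[t] = k[t - 1] + lam * (target - k[t - 1]) + rng.normal(0, 0.1)

# Partial adjustment regression: k_t = lam*target + (1 - lam)*k_{t-1} + e_t
X = np.column_stack([np.ones(T - 1), k[:-1]])
beta, *_ = np.linalg.lstsq(X, k[1:], rcond=None)
lam_hat = 1 - beta[1]                    # implied adjustment speed
target_hat = beta[0] / lam_hat           # implied internal target ratio
```

In the paper's panel setting the target itself is modelled as a function of bank risk and macro-variables rather than a constant.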

Keywords: Euro area, capital ratios, credit supply, partial adjustment model

Procedia PDF Downloads 438
4456 Recovery of Damages by General Cargo Interest under Bill of Lading Carriage Contract

Authors: Eunice Chiamaka Allen-Ngbale

Abstract:

Cargo claims are brought by cargo interests against carriers when goods are not delivered, delivered short, mis-delivered, or delivered damaged. The objective of the cargo claimant is to recover the loss suffered through an award of damages against the carrier by a court of competent jurisdiction. Whether or not the vessel on which the goods were carried is under charter, the bill of lading plays a central role in the cargo claim. Since the bill of lading is an important international transport document, this paper examines the progress of a cargo claim as governed by the English law of contract. It finds that, beyond contract, other modes of recovery are available to a consignee or endorsee of a bill of lading under the sui generis contract of carriage contained in or evidenced by the bill of lading.

Keywords: bill of lading, cargo interests, carriage contract, transfer of right of suit

Procedia PDF Downloads 135
4455 A Fourier Method for Risk Quantification and Allocation of Credit Portfolios

Authors: Xiaoyu Shen, Fang Fang, Chujun Qiu

Abstract:

Herewith we present a Fourier method for credit risk quantification and allocation in the factor-copula model framework. The key insight is that, compared to directly computing the cumulative distribution function of the portfolio loss via Monte Carlo simulation, it is in fact more efficient to calculate the transform of the distribution function in the Fourier domain; inverting back to the real domain can then be done in just one step and semi-analytically, thanks to the popular COS method (with some adjustments). We also show that the Euler risk allocation problem can be solved in the same way, since it can be transformed into the problem of evaluating a conditional cumulative distribution function. Once the conditional or unconditional cumulative distribution function is known, one can easily calculate various risk metrics. To the best of our knowledge, the proposed method fills a gap in the literature on accurate numerical methods for risk allocation, and it may also serve as a much faster alternative to Monte Carlo simulation for risk quantification in general. It can cope with various factor-copula model choices, which we demonstrate via examples of a two-factor Gaussian copula and a two-factor Gaussian-t hybrid copula. The fast error convergence is proved mathematically and then verified by numerical experiments, in which Value-at-Risk, Expected Shortfall, and conditional Expected Shortfall are taken as examples of commonly used risk metrics. Calculation speed and accuracy are shown to be significantly superior to Monte Carlo simulation for real-sized portfolios. The computational complexity is, by design, driven primarily by the number of factors instead of the number of obligors, as is the case with Monte Carlo simulation.
The limitation of this method lies in the "curse of dimensionality" intrinsic to multi-dimensional numerical integration, which, however, can be relaxed with the help of dimension reduction techniques and/or parallel computing, as we will demonstrate in a separate paper. The potential applications of this method have a wide range: from credit derivatives pricing to economic capital calculation for the banking book, default risk charge and incremental risk charge computation for the trading book, and even other risk types than credit risk.
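
The core inversion step of the COS method can be illustrated on a distribution with a known characteristic function; the standard normal below is a stand-in for the portfolio-loss characteristic function derived from a factor-copula model (a minimal sketch of the density-recovery step, not the authors' full allocation algorithm):

```python
import numpy as np

def cos_density(cf, x, a, b, N):
    """Recover a density on [a, b] from its characteristic function cf
    via a truncated cosine-series (COS) expansion with N terms."""
    k = np.arange(N)
    u = k * np.pi / (b - a)
    F = (2.0 / (b - a)) * np.real(cf(u) * np.exp(-1j * u * a))
    F[0] *= 0.5                             # first cosine coefficient is halved
    return F @ np.cos(np.outer(u, x - a))

# Stand-in: standard normal characteristic function exp(-u^2 / 2)
phi = lambda u: np.exp(-0.5 * u ** 2)
x = np.linspace(-4, 4, 9)
pdf = cos_density(phi, x, a=-10.0, b=10.0, N=128)
# pdf closely matches the standard normal density on the grid
```

Risk metrics such as Value-at-Risk then follow by integrating (or summing the series for) the recovered distribution function.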

Keywords: credit portfolio, risk allocation, factor copula model, the COS method, Fourier method

Procedia PDF Downloads 151
4454 Data Mining As A Tool For Knowledge Management: A Review

Authors: Maram Saleh

Abstract:

Knowledge has become an essential resource in today’s economy and the most important asset for maintaining competitive advantage in organizations. The importance of knowledge has led organizations to manage their knowledge assets and resources through multiple knowledge management stages: knowledge creation, knowledge storage, knowledge sharing, and knowledge use. Research on data mining has continued to grow in recent years in both the business and educational fields. Data mining is one of the most important steps of the knowledge discovery in databases process, aiming to extract implicit, unknown but useful knowledge, and it is considered a significant subfield of knowledge management. Data mining has great potential to help organizations focus on extracting the most important information from their data warehouses. Data mining tools and techniques can predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. This review paper explores the applications of data mining techniques in supporting the knowledge management process as an effective knowledge discovery technique. We identify the relationship between data mining and knowledge management, and then introduce some applications of data mining techniques to knowledge management in real-life domains.

Keywords: data mining, knowledge management, knowledge discovery, knowledge creation

Procedia PDF Downloads 196
4453 Generalized Extreme Value Regression with Binary Dependent Variable: An Application for Predicting Meteorological Drought Probabilities

Authors: Retius Chifurira

Abstract:

The logistic regression model is the most widely used regression model for predicting meteorological drought probabilities. When the dependent variable is extreme, the logistic model fails to adequately capture drought probabilities. To predict drought probabilities adequately, we use a generalized linear model (GLM) with the quantile function of the generalized extreme value distribution (GEVD) as the link function. Maximum likelihood estimation is used to estimate the parameters of the generalized extreme value (GEV) regression model. We compare the performance of the logistic and GEV regression models in predicting drought probabilities for Zimbabwe. Performance is assessed using goodness-of-fit measures, namely the relative root mean square error (RRMSE) and the relative mean absolute error (RMAE). Results show that the GEV regression model performs better than the logistic model, providing a good alternative for predicting drought probabilities. This paper provides the first application of a GLM derived from extreme value theory to predict drought probabilities for a drought-prone country such as Zimbabwe.
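
The difference between the two links can be sketched by mapping the same linear predictor to probabilities through the logistic inverse link and through the GEV distribution function (the inverse of the GEV quantile link). The shape value ξ = 0.2 below is an illustrative assumption, not an estimate from the paper:

```python
import numpy as np
from scipy.stats import genextreme

# Illustrative linear predictor (e.g. from rainfall covariates)
eta = np.linspace(-3, 3, 7)

# Logistic inverse link
p_logistic = 1 / (1 + np.exp(-eta))

# GEV inverse link: p = F_GEV(eta). SciPy's shape parameter c corresponds
# to -xi in the usual GEV parameterisation, so c = -0.2 means xi = 0.2
p_gev = genextreme.cdf(eta, c=-0.2)
```

Unlike the symmetric logistic curve, the GEV link is asymmetric, which is what lets it capture the probabilities of extreme drought events more faithfully.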

Keywords: generalized extreme value distribution, general linear model, mean annual rainfall, meteorological drought probabilities

Procedia PDF Downloads 186