Search results for: regression models.
2815 Optimizing and Evaluating Performance Quality Control of the Production Process of Disposable Essentials Using Approach Vague Goal Programming
Authors: Hadi Gholizadeh, Ali Tajdin
Abstract:
To have effective production planning, it is necessary to control the quality of processes. This paper aims at improving the performance of the disposable essentials process using statistical quality control and goal programming in a vague environment. That is expressed uncertainty because there is always a measurement error in the real world. Therefore, in this study, the conditions are examined in a vague environment that is a distance-based environment. The disposable essentials process in Kach Company was studied. Statistical control tools were used to characterize the existing process for four factor responses including the average of disposable glasses’ weights, heights, crater diameters, and volumes. Goal programming was then utilized to find the combination of optimal factors setting in a vague environment which is measured to apply uncertainty of the initial information when some of the parameters of the models are vague; also, the fuzzy regression model is used to predict the responses of the four described factors. Optimization results show that the process capability index values for disposable glasses’ average of weights, heights, crater diameters and volumes were improved. Such increasing the quality of the products and reducing the waste, which will reduce the cost of the finished product, and ultimately will bring customer satisfaction, and this satisfaction, will mean increased sales.Keywords: Goal programming, quality control, vague environment, disposable glasses’ optimization, fuzzy regression.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10402814 AnQL: A Query Language for Annotation Documents
Authors: Neerja Bhatnagar, Ben A. Juliano, Renee S. Renner
Abstract:
This paper presents data annotation models at five levels of granularity (database, relation, column, tuple, and cell) of relational data to address the problem of unsuitability of most relational databases to express annotations. These models do not require any structural and schematic changes to the underlying database. These models are also flexible, extensible, customizable, database-neutral, and platform-independent. This paper also presents an SQL-like query language, named Annotation Query Language (AnQL), to query annotation documents. AnQL is simple to understand and exploits the already-existent wide knowledge and skill set of SQL.
Keywords: Annotation query language, data annotations, data annotation models, semantic data annotations.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18472813 Volatility Switching between Two Regimes
Authors: Josip Visković, Josip Arnerić, Ante Rozga
Abstract:
Based on the fact that volatility is time varying in high frequency data and that periods of high volatility tend to cluster, the most successful and popular models in modeling time varying volatility are GARCH type models. When financial returns exhibit sudden jumps that are due to structural breaks, standard GARCH models show high volatility persistence, i.e. integrated behavior of the conditional variance. In such situations models in which the parameters are allowed to change over time are more appropriate. This paper compares different GARCH models in terms of their ability to describe structural changes in returns caused by financial crisis at stock markets of six selected central and east European countries. The empirical analysis demonstrates that Markov regime switching GARCH model resolves the problem of excessive persistence and outperforms uni-regime GARCH models in forecasting volatility when sudden switching occurs in response to financial crisis.
Keywords: Central and east European countries, financial crisis, Markov switching GARCH model, transition probabilities.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25212812 ANN Models for Microstrip Line Synthesis and Analysis
Authors: Dr.K.Sri Rama Krishna, J.Lakshmi Narayana, Dr.L.Pratap Reddy
Abstract:
Microstrip lines, widely used for good reason, are broadband in frequency and provide circuits that are compact and light in weight. They are generally economical to produce since they are readily adaptable to hybrid and monolithic integrated circuit (IC) fabrication technologies at RF and microwave frequencies. Although, the existing EM simulation models used for the synthesis and analysis of microstrip lines are reasonably accurate, they are computationally intensive and time consuming. Neural networks recently gained attention as fast and flexible vehicles to microwave modeling, simulation and optimization. After learning and abstracting from microwave data, through a process called training, neural network models are used during microwave design to provide instant answers to the task learned.This paper presents simple and accurate ANN models for the synthesis and analysis of Microstrip lines to more accurately compute the characteristic parameters and the physical dimensions respectively for the required design specifications.Keywords: Neural Models, Algorithms, Microstrip Lines, Analysis, Synthesis
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21512811 Institutional Efficiency of Commonhold Industrial Parks Using a Polynomial Regression Model
Authors: Jeng-Wen Lin, Simon Chien-Yuan Chen
Abstract:
Based on assumptions of neo-classical economics and rational choice / public choice theory, this paper investigates the regulation of industrial land use in Taiwan by homeowners associations (HOAs) as opposed to traditional government administration. The comparison, which applies the transaction cost theory and a polynomial regression analysis, manifested that HOAs are superior to conventional government administration in terms of transaction costs and overall efficiency. A case study that compares Taiwan-s commonhold industrial park, NangKang Software Park, to traditional government counterparts using limited data on the costs and returns was analyzed. This empirical study on the relative efficiency of governmental and private institutions justified the important theoretical proposition. Numerical results prove the efficiency of the established model.Keywords: Homeowners Associations, Institutional Efficiency, Polynomial Regression, Transaction Cost.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15822810 An Owl Ontology for Commonkads Template Knowledge Models
Authors: B. A. Gobin, R. K. Subramanian
Abstract:
This paper gives an overview of how an OWL ontology has been created to represent template knowledge models defined in CML that are provided by CommonKADS. CommonKADS is a mature knowledge engineering methodology which proposes the use of template knowledge model for knowledge modelling. The aim of developing this ontology is to present the template knowledge model in a knowledge representation language that can be easily understood and shared in the knowledge engineering community. Hence OWL is used as it has become a standard for ontology and also it already has user friendly tools for viewing and editing.Keywords: Ontology, OWL, Template Knowledge Models, CommonKADS
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17942809 Comparison of Three Turbulence Models in Wear Prediction of Multi-Size Particulate Flow through Rotating Channel
Authors: Pankaj K. Gupta, Krishnan V. Pagalthivarthi
Abstract:
The present work compares the performance of three turbulence modeling approach (based on the two-equation k -ε model) in predicting erosive wear in multi-size dense slurry flow through rotating channel. All three turbulence models include rotation modification to the production term in the turbulent kineticenergy equation. The two-phase flow field obtained numerically using Galerkin finite element methodology relates the local flow velocity and concentration to the wear rate via a suitable wear model. The wear models for both sliding wear and impact wear mechanisms account for the particle size dependence. Results of predicted wear rates using the three turbulence models are compared for a large number of cases spanning such operating parameters as rotation rate, solids concentration, flow rate, particle size distribution and so forth. The root-mean-square error between FE-generated data and the correlation between maximum wear rate and the operating parameters is found less than 2.5% for all the three models.Keywords: Rotating channel, maximum wear rate, multi-sizeparticulate flow, k −ε turbulence models.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17722808 Modeling of Normal and Atherosclerotic Blood Vessels using Finite Element Methods and Artificial Neural Networks
Authors: K. Kamalanand, S. Srinivasan
Abstract:
Analysis of blood vessel mechanics in normal and diseased conditions is essential for disease research, medical device design and treatment planning. In this work, 3D finite element models of normal vessel and atherosclerotic vessel with 50% plaque deposition were developed. The developed models were meshed using finite number of tetrahedral elements. The developed models were simulated using actual blood pressure signals. Based on the transient analysis performed on the developed models, the parameters such as total displacement, strain energy density and entropy per unit volume were obtained. Further, the obtained parameters were used to develop artificial neural network models for analyzing normal and atherosclerotic blood vessels. In this paper, the objectives of the study, methodology and significant observations are presented.Keywords: Blood vessel, atherosclerosis, finite element model, artificial neural networks
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23082807 Interoperability in Component Based Software Development
Authors: M. Madiajagan, B. Vijayakumar
Abstract:
The ability of information systems to operate in conjunction with each other encompassing communication protocols, hardware, software, application, and data compatibility layers. There has been considerable work in industry on the development of component interoperability models, such as CORBA, (D)COM and JavaBeans. These models are intended to reduce the complexity of software development and to facilitate reuse of off-the-shelf components. The focus of these models is syntactic interface specification, component packaging, inter-component communications, and bindings to a runtime environment. What these models lack is a consideration of architectural concerns – specifying systems of communicating components, explicitly representing loci of component interaction, and exploiting architectural styles that provide well-understood global design solutions. The development of complex business applications is now focused on an assembly of components available on a local area network or on the net. These components must be localized and identified in terms of available services and communication protocol before any request. The first part of the article introduces the base concepts of components and middleware while the following sections describe the different up-todate models of communication and interaction and the last section shows how different models can communicate among themselves.
Keywords: Interoperability, component packaging, communication technology, heterogeneous platform, component interface, middleware.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 27892806 Complex Condition Monitoring System of Aircraft Gas Turbine Engine
Authors: A. M. Pashayev, D. D. Askerov, C. Ardil, R. A. Sadiqov, P. S. Abdullayev
Abstract:
Researches show that probability-statistical methods application, especially at the early stage of the aviation Gas Turbine Engine (GTE) technical condition diagnosing, when the flight information has property of the fuzzy, limitation and uncertainty is unfounded. Hence the efficiency of application of new technology Soft Computing at these diagnosing stages with the using of the Fuzzy Logic and Neural Networks methods is considered. According to the purpose of this problem training with high accuracy of fuzzy multiple linear and non-linear models (fuzzy regression equations) which received on the statistical fuzzy data basis is made. For GTE technical condition more adequate model making dynamics of skewness and kurtosis coefficients- changes are analysed. Researches of skewness and kurtosis coefficients values- changes show that, distributions of GTE workand output parameters of the multiple linear and non-linear generalised models at presence of noise measured (the new recursive Least Squares Method (LSM)). The developed GTE condition monitoring system provides stage-by-stage estimation of engine technical conditions. As application of the given technique the estimation of the new operating aviation engine technical condition was made.Keywords: aviation gas turbine engine, technical condition, fuzzy logic, neural networks, fuzzy statistics
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25452805 Electricity Price Forecasting: A Comparative Analysis with Shallow-ANN and DNN
Authors: Fazıl Gökgöz, Fahrettin Filiz
Abstract:
Electricity prices have sophisticated features such as high volatility, nonlinearity and high frequency that make forecasting quite difficult. Electricity price has a volatile and non-random character so that, it is possible to identify the patterns based on the historical data. Intelligent decision-making requires accurate price forecasting for market traders, retailers, and generation companies. So far, many shallow-ANN (artificial neural networks) models have been published in the literature and showed adequate forecasting results. During the last years, neural networks with many hidden layers, which are referred to as DNN (deep neural networks) have been using in the machine learning community. The goal of this study is to investigate electricity price forecasting performance of the shallow-ANN and DNN models for the Turkish day-ahead electricity market. The forecasting accuracy of the models has been evaluated with publicly available data from the Turkish day-ahead electricity market. Both shallow-ANN and DNN approach would give successful result in forecasting problems. Historical load, price and weather temperature data are used as the input variables for the models. The data set includes power consumption measurements gathered between January 2016 and December 2017 with one-hour resolution. In this regard, forecasting studies have been carried out comparatively with shallow-ANN and DNN models for Turkish electricity markets in the related time period. The main contribution of this study is the investigation of different shallow-ANN and DNN models in the field of electricity price forecast. All models are compared regarding their MAE (Mean Absolute Error) and MSE (Mean Square) results. DNN models give better forecasting performance compare to shallow-ANN. Best five MAE results for DNN models are 0.346, 0.372, 0.392, 0,402 and 0.409.Keywords: Deep learning, artificial neural networks, energy price forecasting, Turkey.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10982804 Human Pose Estimation using Active Shape Models
Authors: Changhyuk Jang, Keechul Jung
Abstract:
Human pose estimation can be executed using Active Shape Models. The existing techniques for applying to human-body research using Active Shape Models, such as human detection, primarily take the form of silhouette of human body. This technique is not able to estimate accurately for human pose to concern two arms and legs, as the silhouette of human body represents the shape as out of round. To solve this problem, we applied the human body model as stick-figure, “skeleton". The skeleton model of human body can give consideration to various shapes of human pose. To obtain effective estimation result, we applied background subtraction and deformed matching algorithm of primary Active Shape Models in the fitting process. The images which were used to make the model were 600 human bodies, and the model has 17 landmark points which indicate body junction and key features of human pose. The maximum iteration for the fitting process was 30 times and the execution time was less than .03 sec.
Keywords: Active shape models, skeleton, pose estimation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24172803 Factors Affecting Slot Machine Performance in an Electronic Gaming Machine Facility
Authors: Etienne Provencal, David L. St-Pierre
Abstract:
A facility exploiting only electronic gambling machines (EGMs) opened in 2007 in Quebec City, Canada under the name of Salons de Jeux du Québec (SdjQ). This facility is one of the first worldwide to rely on that business model. This paper models the performance of such EGMs. The interest from a managerial point of view is to identify the variables that can be controlled or influenced so that a comprehensive model can help improve the overall performance of the business. The EGM individual performance model contains eight different variables under study (Game Title, Progressive jackpot, Bonus Round, Minimum Coin-in, Maximum Coin-in, Denomination, Slant Top and Position). Using data from Quebec City’s SdjQ, a linear regression analysis explains 90.80% of the EGM performance. Moreover, results show a behavior slightly different than that of a casino. The addition of GameTitle as a factor to predict the EGM performance is one of the main contributions of this paper. The choice of the game (GameTitle) is very important. Games having better position do not have significantly better performance than games located elsewhere on the gaming floor. Progressive jackpots have a positive and significant effect on the individual performance of EGMs. The impact of BonusRound on the dependent variable is significant but negative. The effect of Denomination is significant but weakly negative. As expected, the Language of an EGMS does not impact its individual performance. This paper highlights some possible improvements by indicating which features are performing well. Recommendations are given to increase the performance of the EGMs performance.
Keywords: EGM, linear regression, model prediction, slot operations.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15652802 The Impact of Governance on Happiness: Evidence from Quantile Regressions
Authors: Chiung-Ju Huang
Abstract:
This study utilizes the quantile regression analysis to examine the impact of governance (including democratic quality and technical quality) on happiness in 101 countries worldwide, classified as “developed countries” and “developing countries”. The empirical results show that the impact of democratic quality and technical quality on happiness is significantly positive for “developed countries”, while is insignificant for “developing countries”. The results suggest that the authorities in developed countries can enhance the level of individual happiness by means of improving the democracy quality and technical quality. However, for developing countries, promoting the quality of governance in order to enhance the level of happiness may not be effective. Policy makers in developed countries may pay more attention on increasing real GDP per capita instead of promoting the quality of governance to enhance individual happiness.
Keywords: Governance, happiness, multiple regression, quantile regression.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17022801 Non-Methane Hydrocarbons Emission during the Photocopying Process
Authors: Kiurski S. Jelena, Aksentijević M. Snežana, Kecić S. Vesna, Oros B. Ivana
Abstract:
Prosperity of electronic equipment in photocopying environment not only has improved work efficiency, but also has changed indoor air quality. Considering the number of photocopying employed, indoor air quality might be worse than in general office environments. Determining the contribution from any type of equipment to indoor air pollution is a complex matter. Non-methane hydrocarbons are known to have an important role on air quality due to their high reactivity. The presence of hazardous pollutants in indoor air has been detected in one photocopying shop in Novi Sad, Serbia. Air samples were collected and analyzed for five days, during 8-hr working time in three time intervals, whereas three different sampling points were determined. Using multiple linear regression model and software package STATISTICA 10 the concentrations of occupational hazards and microclimates parameters were mutually correlated. Based on the obtained multiple coefficients of determination (0.3751, 0.2389 and 0.1975), a weak positive correlation between the observed variables was determined. Small values of parameter F indicated that there was no statistically significant difference between the concentration levels of nonmethane hydrocarbons and microclimates parameters. The results showed that variable could be presented by the general regression model: y = b0 + b1xi1+ b2xi2. Obtained regression equations allow to measure the quantitative agreement between the variables and thus obtain more accurate knowledge of their mutual relations.Keywords: Indoor air quality, multiple regression analysis, nonmethane hydrocarbons, photocopying process.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19752800 A Robust LS-SVM Regression
Authors: József Valyon, Gábor Horváth
Abstract:
In comparison to the original SVM, which involves a quadratic programming task; LS–SVM simplifies the required computation, but unfortunately the sparseness of standard SVM is lost. Another problem is that LS-SVM is only optimal if the training samples are corrupted by Gaussian noise. In Least Squares SVM (LS–SVM), the nonlinear solution is obtained, by first mapping the input vector to a high dimensional kernel space in a nonlinear fashion, where the solution is calculated from a linear equation set. In this paper a geometric view of the kernel space is introduced, which enables us to develop a new formulation to achieve a sparse and robust estimate.Keywords: Support Vector Machines, Least Squares SupportVector Machines, Regression, Sparse approximation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20632799 Simulation of Reactive Distillation: Comparison of Equilibrium and Nonequilibrium Stage Models
Authors: Asfaw Gezae Daful
Abstract:
In the present study, two distinctly different approaches are followed for modeling of reactive distillation column, the equilibrium stage model and the nonequilibrium stage model. These models are simulated with a computer code developed in the present study using MATLAB programming. In the equilibrium stage models, the vapor and liquid phases are assumed to be in equilibrium and allowance is made for finite reaction rates, where as in the nonequilibrium stage models simultaneous mass transfer and reaction rates are considered. These simulated model results are validated from the experimental data reported in the literature. The simulated results of equilibrium and nonequilibrium models are compared for concentration, temperature and reaction rate profiles in a reactive distillation column for Methyl Tert Butyle Ether (MTBE) production. Both the models show similar trend for the concentration, temperature and reaction rate profiles but the nonequilibrium model predictions are higher and closer to the experimental values reported in the literature.
Keywords: Reactive Distillation, Equilibrium model, Nonequilibrium model, Methyl Tert-Butyl Ether
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 42072798 Classification of Business Models of Italian Bancassurance by Balance Sheet Indicators
Authors: Andrea Bellucci, Martina Tofi
Abstract:
The aim of paper is to analyze business models of bancassurance in Italy for life business. The life insurance business is very developed in the Italian market and banks branches have 80% of the market share. Given its maturity, the life insurance market needs to consolidate its organizational form to allow for the development of non-life business, which nowadays collects few premiums but represents a great opportunity to enlarge the market share of bancassurance using its strength in the distribution channel while the market share of independent agents is decreasing. Starting with the main business model of bancassurance for life business, this paper will analyze the performances of life companies in the Italian market by balance sheet indicators and by main discriminant variables of business models. The study will observe trends from 2013 to 2015 for the Italian market by exploiting a database managed by Associazione Nazionale delle Imprese di Assicurazione (ANIA). The applied approach is based on a bottom-up analysis starting with variables and indicators to define business models’ classification. The statistical classification algorithm proposed by Ward is employed to design business models’ profiles. Results from the analysis will be a representation of the main business models built by their profile related to indicators. In that way, an unsupervised analysis is developed that has the limit of its judgmental dimension based on research opinion, but it is possible to obtain a design of effective business models.
Keywords: Balance sheet indicators, Bancassurance, business models, ward algorithm.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12622797 Acute Coronary Syndrome Prediction Using Data Mining Techniques- An Application
Authors: Tahseen A. Jilani, Huda Yasin, Madiha Yasin, C. Ardil
Abstract:
In this paper we use data mining techniques to investigate factors that contribute significantly to enhancing the risk of acute coronary syndrome. We assume that the dependent variable is diagnosis – with dichotomous values showing presence or absence of disease. We have applied binary regression to the factors affecting the dependent variable. The data set has been taken from two different cardiac hospitals of Karachi, Pakistan. We have total sixteen variables out of which one is assumed dependent and other 15 are independent variables. For better performance of the regression model in predicting acute coronary syndrome, data reduction techniques like principle component analysis is applied. Based on results of data reduction, we have considered only 14 out of sixteen factors.
Keywords: Acute coronary syndrome (ACS), binary logistic regression analyses, myocardial ischemia (MI), principle component analysis, unstable angina (U.A.).
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21142796 Mathematical Rescheduling Models for Railway Services
Authors: Zuraida Alwadood, Adibah Shuib, Norlida Abd Hamid
Abstract:
This paper presents the review of past studies concerning mathematical models for rescheduling passenger railway services, as part of delay management in the occurrence of railway disruption. Many past mathematical models highlighted were aimed at minimizing the service delays experienced by passengers during service disruptions. Integer programming (IP) and mixed-integer programming (MIP) models are critically discussed, focusing on the model approach, decision variables, sets and parameters. Some of them have been tested on real-life data of railway companies worldwide, while a few have been validated on fictive data. Based on selected literatures on train rescheduling, this paper is able to assist researchers in the model formulation by providing comprehensive analyses towards the model building. These analyses would be able to help in the development of new approaches in rescheduling strategies or perhaps to enhance the existing rescheduling models and make them more powerful or more applicable with shorter computing time.
Keywords: Mathematical modelling, Mixed-integer programming, Railway rescheduling, Service delays.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 32522795 New Regression Model and I-Kaz Method for Online Cutting Tool Wear Monitoring
Authors: Jaharah A. Ghani, Muhammad Rizal, Ahmad Sayuti, Mohd Zaki Nuawi, Mohd Nizam Ab. Rahman, Che Hassan Che Haron
Abstract:
This study presents a new method for detecting the cutting tool wear based on the measured cutting force signals using the regression model and I-kaz method. The detection of tool wear was done automatically using the in-house developed regression model and 3D graphic presentation of I-kaz 3D coefficient during machining process. The machining tests were carried out on a CNC turning machine Colchester Master Tornado T4 in dry cutting condition, and Kistler 9255B dynamometer was used to measure the cutting force signals, which then stored and displayed in the DasyLab software. The progression of the cutting tool flank wear land (VB) was indicated by the amount of the cutting force generated. Later, the I-kaz was used to analyze all the cutting force signals from beginning of the cut until the rejection stage of the cutting tool. Results of the IKaz analysis were represented by various characteristic of I-kaz 3D coefficient and 3D graphic presentation. The I-kaz 3D coefficient number decreases when the tool wear increases. This method can be used for real time tool wear monitoring.Keywords: mathematical model, I-kaz method, tool wear
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23992794 The Classification Model for Hard Disk Drive Functional Tests under Sparse Data Conditions
Authors: S. Pattanapairoj, D. Chetchotsak
Abstract:
This paper proposed classification models that would be used as a proxy for hard disk drive (HDD) functional test equitant which required approximately more than two weeks to perform the HDD status classification in either “Pass" or “Fail". These models were constructed by using committee network which consisted of a number of single neural networks. This paper also included the method to solve the problem of sparseness data in failed part, which was called “enforce learning method". Our results reveal that the constructed classification models with the proposed method could perform well in the sparse data conditions and thus the models, which used a few seconds for HDD classification, could be used to substitute the HDD functional tests.Keywords: Sparse data, Classifications, Committee network
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17362793 Developing an Advanced Algorithm Capable of Classifying News, Articles and Other Textual Documents Using Text Mining Techniques
Authors: R. B. Knudsen, O. T. Rasmussen, R. A. Alphinas
Abstract:
The reason for conducting this research is to develop an algorithm that is capable of classifying news articles from the automobile industry, according to the competitive actions that they entail, with the use of Text Mining (TM) methods. It is needed to test how to properly preprocess the data for this research by preparing pipelines which fits each algorithm the best. The pipelines are tested along with nine different classification algorithms in the realm of regression, support vector machines, and neural networks. Preliminary testing for identifying the optimal pipelines and algorithms resulted in the selection of two algorithms with two different pipelines. The two algorithms are Logistic Regression (LR) and Artificial Neural Network (ANN). These algorithms are optimized further, where several parameters of each algorithm are tested. The best result is achieved with the ANN. The final model yields an accuracy of 0.79, a precision of 0.80, a recall of 0.78, and an F1 score of 0.76. By removing three of the classes that created noise, the final algorithm is capable of reaching an accuracy of 94%.
Keywords: Artificial neural network, competitive dynamics, logistic regression, text classification, text mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5362792 Imputing Missing Data in Electronic Health Records: A Comparison of Linear and Non-Linear Imputation Models
Authors: Alireza Vafaei Sadr, Vida Abedi, Jiang Li, Ramin Zand
Abstract:
Missing data is a common challenge in medical research and can lead to biased or incomplete results. When the data bias leaks into models, it further exacerbates health disparities; biased algorithms can lead to misclassification and reduced resource allocation and monitoring as part of prevention strategies for certain minorities and vulnerable segments of patient populations, which in turn further reduce data footprint from the same population – thus, a vicious cycle. This study compares the performance of six imputation techniques grouped into Linear and Non-Linear models, on two different real-world electronic health records (EHRs) datasets, representing 17864 patient records. The mean absolute percentage error (MAPE) and root mean squared error (RMSE) are used as performance metrics, and the results show that the Linear models outperformed the Non-Linear models in terms of both metrics. These results suggest that sometimes Linear models might be an optimal choice for imputation in laboratory variables in terms of imputation efficiency and uncertainty of predicted values.
Keywords: EHR, Machine Learning, imputation, laboratory variables, algorithmic bias.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1782791 Study on Optimal Control Strategy of PM2.5 in Wuhan, China
Authors: Qiuling Xie, Shanliang Zhu, Zongdi Sun
Abstract:
In this paper, we analyzed the correlation relationship among PM2.5 from other five Air Quality Indices (AQIs) based on the grey relational degree, and built a multivariate nonlinear regression equation model of PM2.5 and the five monitoring indexes. For the optimal control problem of PM2.5, we took the partial large Cauchy distribution of membership equation as satisfaction function. We established a nonlinear programming model with the goal of maximum performance to price ratio. And the optimal control scheme is given.
Keywords: Grey relational degree, multiple linear regression, membership function, nonlinear programming.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14082790 A Study on a Research and Development Cost-Estimation Model in Korea
Authors: Babakina Alexandra, Yong Soo Kim
Abstract:
In this study, we analyzed the factors that affect research funds using linear regression analysis to increase the effectiveness of investments in national research projects. We collected 7,916 items of data on research projects that were in the process of being finished or were completed between 2010 and 2011. Data pre-processing and visualization were performed to derive statistically significant results. We identified factors that affected funding using analysis of fit distributions and estimated increasing or decreasing tendencies based on these factors.
Keywords: R&D funding, Cost estimation, Linear regression, Preliminary feasibility study.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22482789 Comparison of Different k-NN Models for Speed Prediction in an Urban Traffic Network
Authors: Seyoung Kim, Jeongmin Kim, Kwang Ryel Ryu
Abstract:
A database that records average traffic speeds measured at five-minute intervals for all the links in the traffic network of a metropolitan city. While learning from this data the models that can predict future traffic speed would be beneficial for the applications such as the car navigation system, building predictive models for every link becomes a nontrivial job if the number of links in a given network is huge. An advantage of adopting k-nearest neighbor (k-NN) as predictive models is that it does not require any explicit model building. Instead, k-NN takes a long time to make a prediction because it needs to search for the k-nearest neighbors in the database at prediction time. In this paper, we investigate how much we can speed up k-NN in making traffic speed predictions by reducing the amount of data to be searched for without a significant sacrifice of prediction accuracy. The rationale behind this is that we had a better look at only the recent data because the traffic patterns not only repeat daily or weekly but also change over time. In our experiments, we build several different k-NN models employing different sets of features which are the current and past traffic speeds of the target link and the neighbor links in its up/down-stream. The performances of these models are compared by measuring the average prediction accuracy and the average time taken to make a prediction using various amounts of data.Keywords: Big data, k-NN, machine learning, traffic speed prediction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13762788 Air Pollution and Respiratory-Related Restricted Activity Days in Tunisia
Authors: Mokhtar Kouki Inès Rekik
Abstract:
This paper focuses on the assessment of the air pollution and morbidity relationship in Tunisia. Air pollution is measured by ozone air concentration and the morbidity is measured by the number of respiratory-related restricted activity days during the 2-week period prior to the interview. Socioeconomic data are also collected in order to adjust for any confounding covariates. Our sample is composed by 407 Tunisian respondents; 44.7% are women, the average age is 35.2, near 69% are living in a house built after 1980, and 27.8% have reported at least one day of respiratory-related restricted activity. The model consists on the regression of the number of respiratory-related restricted activity days on the air quality measure and the socioeconomic covariates. In order to correct for zero-inflation and heterogeneity, we estimate several models (Poisson, negative binomial, zero inflated Poisson, Poisson hurdle, negative binomial hurdle and finite mixture Poisson models). Bootstrapping and post-stratification techniques are used in order to correct for any sample bias. According to the Akaike information criteria, the hurdle negative binomial model has the greatest goodness of fit. The main result indicates that, after adjusting for socioeconomic data, the ozone concentration increases the probability of positive number of restricted activity days.
Keywords: Bootstrapping, hurdle negbin model, overdispersion, ozone concentration, respiratory-related restricted activity days.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21372787 A Hybrid System of Hidden Markov Models and Recurrent Neural Networks for Learning Deterministic Finite State Automata
Authors: Pavan K. Rallabandi, Kailash C. Patidar
Abstract:
In this paper, we present an optimization technique or a learning algorithm using the hybrid architecture by combining the most popular sequence recognition models such as Recurrent Neural Networks (RNNs) and Hidden Markov models (HMMs). In order to improve the sequence/pattern recognition/classification performance by applying a hybrid/neural symbolic approach, a gradient descent learning algorithm is developed using the Real Time Recurrent Learning of Recurrent Neural Network for processing the knowledge represented in trained Hidden Markov Models. The developed hybrid algorithm is implemented on automata theory as a sample test beds and the performance of the designed algorithm is demonstrated and evaluated on learning the deterministic finite state automata.Keywords: Hybrid systems, Hidden Markov Models, Recurrent neural networks, Deterministic finite state automata.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 28852786 Transformation Method CIM to PIM: From Business Processes Models Defined in BPMN to Use Case and Class Models Defined in UML
Authors: Y. Rhazali, Y. Hadi, A. Mouloudi
Abstract:
This paper proposes a method to automatic transformation of CIM level to PIM level respecting the MDA approach. Our proposal is based on creating a good CIM level through well-defined rules allowing as achieving rich models that contain relevant information to facilitate the task of the transformation to the PIM level. We define, thereafter, an appropriate PIM level through various UML diagram. Next, we propose set rules to move from CIM to PIM. Our method follows the MDA approach by considering the business dimension in the CIM level through the use BPMN, standard modeling business of OMG, and the use of UML in PIM advocated by MDA in this level.
Keywords: Model transformation, MDA, CIM, PIM.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3673