Search results for: Posterior Predictive Distributions
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 625

85 Predictive Analytics of Student Performance Determinants in Education

Authors: Mahtab Davari, Charles Edward Okon, Somayeh Aghanavesi

Abstract:

Every institute of learning is usually interested in the performance of enrolled students. The level of these performances determines the approach an institute of study may adopt in rendering academic services. The focus of this paper is to evaluate students' academic performance in given courses of study using machine learning methods. This study evaluated various supervised machine learning classification algorithms such as Logistic Regression (LR), Support Vector Machine (SVM), Random Forest, Decision Tree, K-Nearest Neighbors, Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis, using selected features to predict study performance. The accuracy, precision, recall, and F1 score obtained from a 5-Fold Cross-Validation were used to determine the best classification algorithm to predict students’ performances. SVM (using a linear kernel), LDA, and LR were identified as the best-performing machine learning methods. Also, using the LR model, this study identified students' educational habits such as reading and paying attention in class as strong determinants for a student to have an above-average performance. Other important features include the academic history of the student and work. Demographic factors such as age, gender, high school graduation, etc., had no significant effect on a student's performance.

Keywords: Student performance, supervised machine learning, prediction, classification, cross-validation.
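
The evaluation procedure described in this abstract (accuracy, precision, recall, and F1 score averaged over a 5-fold cross-validation) can be sketched as follows. This is a minimal standard-library illustration, not the authors' code: the helper names and the single-feature threshold classifier are hypothetical stand-ins for the algorithms the paper compares, and labels are assumed to be binary (1 for above-average performance).

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle the sample indices once and deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Convention assumed here: undefined ratios are reported as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def cross_validate(X, y, fit, predict, k=5, seed=0):
    """Train on k-1 folds, score on the held-out fold, average the metrics."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        model = fit([X[j] for j in train_idx], [y[j] for j in train_idx])
        y_pred = [predict(model, X[j]) for j in test_idx]
        scores.append(binary_metrics([y[j] for j in test_idx], y_pred))
    return {name: sum(s[name] for s in scores) / k for name in scores[0]}


def fit_threshold(X, y):
    """Hypothetical stand-in classifier: midpoint between the class means."""
    low = [x for x, label in zip(X, y) if label == 0]
    high = [x for x, label in zip(X, y) if label == 1]
    return (sum(low) / len(low) + sum(high) / len(high)) / 2


def predict_threshold(threshold, x):
    return 1 if x > threshold else 0
```

Calling `cross_validate(X, y, fit_threshold, predict_threshold)` returns one averaged value per metric; swapping in another `fit`/`predict` pair allows several classifiers to be compared on the same folds.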

Downloads: 547
84 Geotechnical Properties and Compressibility Behavior of Organic Dredged Soils

Authors: Inci Develioglu, Hasan Firat Pulat

Abstract:

Sustainable development is one of the most important topics in today's world, and it is also an important research topic for geoenvironmental engineering. Dredging is performed to expand river and port channels, to control floods, and to provide access to harbors. Every year, large amounts of sediment are dredged for these purposes. Dredged marine soils can be reused as filling materials, road and foundation embankments, construction materials and wildlife habitat developments. In this study, the geotechnical engineering properties and compressibility behavior of dredged soil obtained from Izmir Bay were investigated. Samples with four different organic matter contents were obtained, and particle size distribution, consistency limit, pH and specific gravity tests were performed. Consolidation tests were conducted to examine the effect of organic matter content (OMC) on the compressibility behavior of the dredged soil. This study has shown that the OMC has an important effect on the engineering properties of dredged soils. The liquid and plastic limits increased with increasing OMC. The lowest specific gravity belonged to the sample with the maximum OMC. The specific gravity values ranged between 2.52 and 2.76. The maximum void ratio difference belonged to the sample with the highest OMC (Δe11% = 0.38). As the organic matter content of the samples increased, the change in the void ratio also increased. The compression index increased with increasing OMC.

Keywords: Compressibility, consolidation, geotechnical properties, organic matter content, organic soils.

Downloads: 1956
83 Modeling Non-Darcy Natural Convection Flow of a Micropolar Dusty Fluid with Convective Boundary Condition

Authors: F. M. Hady, A. Mahdy, R. A. Mohamed, Omima A. Abo Zaid

Abstract:

This paper presents a numerical study of the effect of several parameters on magnetohydrodynamic (MHD) natural convection heat and mass transfer of a dusty micropolar fluid in a non-Darcy porous regime. In addition, a convective boundary condition is incorporated into the micropolar dusty fluid model. The governing boundary layer equations are converted by similarity transformations into a system of dimensionless equations suitable for numerical treatment. The resulting equations for the momentum, angular momentum, energy, and concentration of the fluid and dust phases, with the appropriate boundary conditions, are solved numerically using the fourth-order Runge-Kutta method. The numerical results show that the magnitude of the velocity of both the fluid phase and the particle phase decreases with increasing magnetic parameter, mass concentration of the dust particles, and Forchheimer number, whereas it rises with increasing convective parameter and Darcy number. The results also indicate that high values of the magnetic parameter, convective parameter, and Forchheimer number raise the temperature distributions, whereas the temperature falls as the mass concentration of the dust particles and the Darcy number increase. The angular velocity is found to increase under the effect of the magnetic parameter and the microrotation parameter.

Keywords: Micropolar dusty fluid, convective heating, natural convection, MHD, porous media.

Downloads: 939
82 Speaker Identification using Neural Networks

Authors: R.V Pawar, P.P.Kajave, S.N.Mali

Abstract:

The speech signal conveys information about the identity of the speaker. The area of speaker identification is concerned with extracting the identity of the person speaking the utterance. As speech interaction with computers becomes more pervasive in activities such as telephony, financial transactions and information retrieval from speech databases, the utility of automatically identifying a speaker based solely on vocal characteristics increases. This paper focuses on text-dependent speaker identification, which deals with detecting a particular speaker from a known population. The system prompts the user to provide a speech utterance. The system identifies the user by comparing the codebook of the speech utterance with those stored in the database and lists the most likely speakers who could have given that utterance. The speech signal is recorded for N speakers, and the features are then extracted. Feature extraction is done by means of LPC coefficients, AMDF calculation, and the DFT. The neural network is trained by applying these features as input parameters. The features are stored in templates for further comparison. The features of the speaker to be identified are extracted and compared with the stored templates using the backpropagation algorithm. Here, the trained network corresponds to the output; the input is the extracted features of the speaker to be identified. The network adjusts its weights, and the best match is found to identify the speaker. The number of epochs required to reach the target determines the network performance.

Keywords: Average Mean Distance function, Backpropagation, Linear Predictive Coding, Multilayered Perceptron.

Downloads: 1891
81 Prediction of Air-Water Two-Phase Frictional Pressure Drop Using Artificial Neural Network

Authors: H. B. Mehta, Vipul M. Patel, Jyotirmay Banerjee

Abstract:

The present paper discusses the prediction of gas-liquid two-phase frictional pressure drop in a 2.12 mm horizontal circular minichannel using Artificial Neural Network (ANN). The experimental results are obtained with air as gas phase and water as liquid phase. The superficial gas velocity is kept in the range of 0.0236 m/s to 0.4722 m/s while the values of 0.0944 m/s, 0.1416 m/s and 0.1889 m/s are considered for superficial liquid velocity. The experimental results are predicted using different Artificial Neural Network (ANN) models. Networks used for prediction are radial basis, generalised regression, linear layer, cascade forward back propagation, feed forward back propagation, feed forward distributed time delay, layer recurrent, and Elman back propagation. Transfer functions used for networks are Linear (PURELIN), Logistic sigmoid (LOGSIG), tangent sigmoid (TANSIG) and Gaussian RBF. Combination of networks and transfer functions give different possible neural network models. These models are compared for Mean Absolute Relative Deviation (MARD) and Mean Relative Deviation (MRD) to identify the best predictive model of ANN.

Keywords: Minichannel, Two-Phase Flow, Frictional Pressure Drop, ANN, MARD, MRD.
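
The two comparison metrics named in this abstract can be sketched as below. The abstract does not give the formulas, so the definitions used here are an assumption based on common usage in the two-phase flow literature: MRD is the mean signed relative deviation of predicted from experimental values, MARD is the mean of its absolute value, and both are expressed in percent. The function names are illustrative.

```python
def mean_relative_deviation(predicted, experimental):
    """MRD in percent: signed deviations can cancel, so it measures bias."""
    deviations = [(p - e) / e for p, e in zip(predicted, experimental)]
    return 100.0 * sum(deviations) / len(deviations)


def mean_absolute_relative_deviation(predicted, experimental):
    """MARD in percent: magnitudes only, so it measures overall scatter."""
    deviations = [abs((p - e) / e) for p, e in zip(predicted, experimental)]
    return 100.0 * sum(deviations) / len(deviations)
```

Under these definitions a model can have an MRD near zero and still a large MARD (over- and under-predictions cancelling), which is one reason to report both when ranking network and transfer function combinations.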

Downloads: 1404
80 A Real-Time Monitoring System of the Supply Chain Conditions, Products and Means of Transport

Authors: Dimitrios E. Kontaxis, George Litainas, Dimitrios P. Ptochos, Vaggelis P. Ptochos, Sotirios P. Ptochos, Dimitrios Beletsis, Konstantinos Kritikakis, Milan Sunaric

Abstract:

Real-time monitoring of the supply chain conditions and procedures is a critical element for the optimal coordination and safety of the deliveries, as well as for the minimization of the delivery time and cost. Real time monitoring requires IoT data streams, which are related to the conditions of the products and the means of transport (e.g., location, temperature/humidity conditions, kinematic state, ambient light conditions, etc.). These streams are generated by battery-based IoT tracking devices, equipped with appropriate sensors, and are transmitted to a cloud-based back-end system. Proper handling and processing of the IoT data streams, using predictive and artificial intelligence algorithms, can provide significant and useful results, which can be exploited by the supply chain stakeholders in order to enhance their financial benefits, as well as the efficiency, security, transparency, coordination and sustainability of the supply chain procedures. The technology, the features and the characteristics of a complete, proprietary system, including hardware, firmware and software tools - developed in the context of a co-funded R&D program - are addressed and presented in this paper. 

Keywords: IoT embedded electronics, real-time monitoring, tracking device, sensor platform

Downloads: 630
79 Monitorization of Junction Temperature Using a Thermal-Test-Device

Authors: B. Arzhanov, A. Correia, P. Delgado, J. Meireles

Abstract:

Due to the higher power loss levels in electronic components, the thermal design of the PCBs (Printed Circuit Boards) of an assembled device has become one of the most important quality factors in electronics. Some of the leading causes of microelectronic component failure are high temperatures, leakage, and thermo-mechanical stress, which makes the reliability of microelectronic packages a concern. This article presents an experimental approach to measure the junction temperature of exposed pad packages. The implemented solution is in a prototype phase and uses a temperature-sensitive parameter (TSP) to measure temperature directly on the die, validating the numerical results provided by Mechanical APDL (Ansys Parametric Design Language) under the same conditions. The physical device under test is composed of a Thermal Test Chip (TTC-1002) assembled in a QFN cavity and soldered to a test board according to JEDEC standards. Monitoring the voltage drop across a forward-biased diode is an indirect but accurate method to obtain the junction temperature of the QFN component for an applied power between 0.3 W and 1.5 W. The temperature distributions on the PCB test board and the QFN cavity surface were monitored by an infrared thermal camera (Goby-384), with control and image processing handled by the Xeneth software. The article provides a set-up to monitor in real time the junction temperature of ICs, namely devices with an exposed pad package (i.e., QFN). It also presents the PCB layout parameters that the designer should use to improve thermal performance and evaluates the impact of voids in the solder interface on the device junction temperature.

Keywords: Quad Flat No-Lead packages, exposed pads, junction temperature, thermal management, measurements.

Downloads: 1710
78 Alternative Methods to Rank the Impact of Object Oriented Metrics in Fault Prediction Modeling using Neural Networks

Authors: Kamaldeep Kaur, Arvinder Kaur, Ruchika Malhotra

Abstract:

The aim of this paper is to rank the impact of Object Oriented (OO) metrics in fault prediction modeling using Artificial Neural Networks (ANNs). Past studies on empirical validation of object oriented metrics as fault predictors using ANNs have focused on the predictive quality of neural networks versus standard statistical techniques. In this empirical study we turn our attention to the capability of ANNs in ranking the impact of these explanatory metrics on fault proneness. In the ANN data analysis approach, there is no clear method of ranking the impact of individual metrics. Five ANN-based techniques that rank object oriented metrics in predicting the fault proneness of classes are studied. These techniques are i) the overall connection weights method, ii) Garson's method, iii) the partial derivatives method, iv) the Input Perturb method, and v) the classical stepwise methods. We develop and evaluate different prediction models based on the ranking of the metrics by the individual techniques. The models based on the overall connection weights and partial derivatives methods have been found to be most accurate.

Keywords: Artificial Neural Networks (ANNs), Backpropagation, Fault Prediction Modeling.
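
Two of the ranking techniques listed in this abstract can be illustrated for a network with one hidden layer and a single output. The sketch below uses one common formulation of each (the overall connection weights method as the sum of input-to-hidden times hidden-to-output weight products, and Garson's method as normalised absolute weight shares); the abstract does not state the exact formulas the authors used, so the function names and weight layout are assumptions.

```python
def connection_weights(w_in, w_out):
    """Overall connection weights: signed importance of each input.

    w_in[i][h] is the weight from input i to hidden unit h;
    w_out[h] is the weight from hidden unit h to the single output.
    """
    return [sum(w * v for w, v in zip(row, w_out)) for row in w_in]


def garson_importance(w_in, w_out):
    """Garson-style relative importance: non-negative shares summing to 1."""
    n_hidden = len(w_out)
    # Total absolute weight arriving at each hidden unit.
    col_sums = [sum(abs(row[h]) for row in w_in) for h in range(n_hidden)]
    raw = [
        sum(
            abs(row[h]) / col_sums[h] * abs(w_out[h])
            for h in range(n_hidden)
            if col_sums[h]
        )
        for row in w_in
    ]
    total = sum(raw)
    return [r / total for r in raw]
```

Sorting the metrics by either score gives a ranking. The connection weights variant keeps the sign of each contribution, whereas the Garson-style variant discards it by taking absolute values.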

Downloads: 1756
77 Profile Calculation in Water Phantom of Symmetric and Asymmetric Photon Beam

Authors: N. Chegeni, M. J. Tahmasebi Birgani

Abstract:

Nowadays, in most radiotherapy departments, the commercial treatment planning systems (TPS) used to calculate dose distributions need to be verified; therefore, quick, easy-to-use and low-cost dose distribution algorithms are desirable to test and verify the performance of the TPS. In this paper, we put forth an analytical method to calculate the phantom scatter contribution and depth dose on the central axis based on the equivalent square concept. This method was then generalized to calculate the profiles at any depth and for several field shapes, regular or irregular, under symmetric and asymmetric photon beam conditions. Varian 2100 C/D and Siemens Primus Plus linacs with 6 and 18 MV photon beams were used for irradiations. Percentage depth doses (PDDs) were measured for a large number of square fields for both energies, and for 45° wedges, which were employed to obtain the profiles at any depth. To assess the accuracy of the calculated profiles, several profile measurements were carried out for some treatment fields. The calculated and measured profiles were compared by gamma-index calculation. All γ-index calculations were based on a 3% dose criterion and a 3 mm distance-to-agreement (DTA) acceptance criterion. The γ values were less than 1 at most points. However, the maximum γ observed was about 1.10 in the penumbra region in most fields and in the central area for the asymmetric fields. This analytical approach provides a generally quick and fairly accurate algorithm to calculate dose distribution for some treatment fields in conventional radiotherapy.

Keywords: Dose distribution, equivalent field, asymmetric field, irregular field.
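
The gamma-index comparison mentioned in this abstract combines a dose-difference criterion and a distance-to-agreement criterion into a single number per point, with γ ≤ 1 counted as a pass. A minimal one-dimensional sketch with the 3%/3 mm criteria is given below. It assumes global normalisation to the maximum measured dose and a brute-force search over the sampled calculated points (no interpolation); neither detail is stated in the abstract, and the function names are illustrative.

```python
import math


def gamma_index_1d(positions_mm, measured, calculated,
                   dose_crit=0.03, dta_mm=3.0):
    """Return the gamma value at every measured point of a 1D profile."""
    # Assumed global normalisation: 3% of the maximum measured dose.
    dose_tol = dose_crit * max(measured)
    gammas = []
    for x_m, d_m in zip(positions_mm, measured):
        # Smallest combined distance/dose discrepancy over calculated points.
        best = min(
            math.sqrt(((x_c - x_m) / dta_mm) ** 2
                      + ((d_c - d_m) / dose_tol) ** 2)
            for x_c, d_c in zip(positions_mm, calculated)
        )
        gammas.append(best)
    return gammas


def pass_rate(gammas):
    """Fraction of points with gamma <= 1."""
    return sum(1 for g in gammas if g <= 1.0) / len(gammas)
```

In steep-gradient regions such as the penumbra, a small spatial shift produces a large dose difference, so the distance term is what usually keeps γ near or below 1 there.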

Downloads: 3042
76 A Discrete Event Simulation Model to Manage Bed Usage for Non-Elective Admissions in a Geriatric Medicine Speciality

Authors: Muhammed Ordu, Eren Demir, Chris Tofallis

Abstract:

Over the past decade, non-elective admissions in the UK have increased significantly. Given limited resources (i.e. beds), service managers are obliged to manage their resources effectively in the face of non-elective admissions, which are mostly admitted to inpatient specialities via A&E departments. Geriatric medicine is one of the specialities with a long length of stay for non-elective admissions. This study aims to develop a discrete event simulation model to understand how possible increases in non-elective demand over the next 12 months affect the bed occupancy rate and to determine the required number of beds in a geriatric medicine speciality in a UK hospital. In our validated simulation model, we take into account observed frequency distributions for non-elective admissions and length of stay, derived from a large dataset covering the period April 2009 to January 2013. An experimental analysis consisting of 16 experiments is carried out to better understand the possible effects of case studies and scenarios related to increases in demand and in the number of beds. As a result, the speciality does not achieve the target level in the base model, although the bed occupancy rate decreases from 125.94% to 96.41% when the number of beds is increased by 30%. In addition, the number of beds required to meet the bed requirement exceeds the number of beds considered in the scenario analysis. This paper sheds light on bed management for service managers in geriatric medicine specialities.

Keywords: Bed management, bed occupancy rate, discrete event simulation, geriatric medicine, non-elective admission.
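
The idea of driving a bed model with observed admission and length-of-stay distributions can be illustrated with a much-simplified event-driven sketch. It is not the authors' model: it assumes a single ward, first-come-first-served queueing when all beds are full, and occupancy measured as bed-days used over bed-days available within a fixed horizon. All names and parameters are hypothetical.

```python
import heapq
import random


def sample_from_frequencies(values, counts, n, seed=0):
    """Draw n values from an observed frequency distribution."""
    return random.Random(seed).choices(values, weights=counts, k=n)


def simulate_beds(arrivals, stays, n_beds, horizon):
    """Simulate bed usage for patients given in arrival order.

    arrivals: sorted arrival times in days; stays: length of stay per patient.
    Returns (occupancy fraction within the horizon, mean wait in days).
    """
    # Min-heap holding the time at which each bed next becomes free.
    free_at = [0.0] * n_beds
    heapq.heapify(free_at)
    bed_days = 0.0
    total_wait = 0.0
    for arrival, stay in zip(arrivals, stays):
        bed_free = heapq.heappop(free_at)   # earliest available bed
        start = max(arrival, bed_free)      # wait if no bed is free yet
        end = start + stay
        heapq.heappush(free_at, end)
        total_wait += start - arrival
        # Count only the part of the stay that falls inside the horizon.
        bed_days += max(0.0, min(end, horizon) - min(start, horizon))
    occupancy = bed_days / (n_beds * horizon)
    mean_wait = total_wait / len(arrivals) if arrivals else 0.0
    return occupancy, mean_wait
```

Re-running `simulate_beds` with a larger `n_beds` or with more arrivals mirrors, in miniature, the kind of scenario comparison the abstract describes. Because beds are never over-filled in this sketch, its occupancy cannot exceed 100%, unlike the demand-based rate reported in the abstract.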

Downloads: 1907
75 Unsteady Flow Simulations for Microchannel Design and Its Fabrication for Nanoparticle Synthesis

Authors: Mrinalini Amritkar, Disha Patil, Swapna Kulkarni, Sukratu Barve, Suresh Gosavi

Abstract:

Micro-mixers play an important role in the lab-on-a-chip applications and micro total analysis systems to acquire the correct level of mixing for any given process. The mixing process can be classified as active or passive according to the use of external energy. Literature of microfluidics reports that most of the work is done on the models of steady laminar flow; however, the study of unsteady laminar flow is an active area of research at present. There are wide applications of this, out of which, we consider nanoparticle synthesis in micro-mixers. In this work, we have developed a model for unsteady flow to study the mixing performance of a passive micro mixer for reactants used for such synthesis. The model is developed in Finite Volume Method (FVM)-based software, OpenFOAM. The model is tested by carrying out the simulations at Re of 0.5. Mixing performance of the micro-mixer is investigated using simulated concentration values of mixed species across the width of the micro-mixer and calculating the variance across a line profile. Experimental validation is done by passing dyes through a Y shape micro-mixer fabricated using polydimethylsiloxane (PDMS) polymer and comparing variances with the simulated ones. Gold nanoparticles are later synthesized through the micro-mixer and collected at two different times leading to significantly different size distributions. These times match with the time scales over which reactant concentrations vary as obtained from simulations. Our simulations could thus be used to create design aids for passive micro-mixers used in nanoparticle synthesis.

Keywords: Lab-on-chip, micro-mixer, OpenFOAM, PDMS.
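
The mixing measure described in this abstract, the variance of concentration along a line across the channel width, can be sketched in a few lines. The sketch assumes concentrations normalised to the range 0 to 1. The `mixing_index` normalisation is an added assumption, not something the abstract specifies: it rescales the variance by that of a fully segregated profile with the same mean.

```python
from statistics import fmean, pvariance


def concentration_variance(profile):
    """Population variance of concentration samples along the line profile."""
    return pvariance(profile)


def mixing_index(profile):
    """1.0 for a uniform profile, 0.0 for a fully segregated one."""
    mean = fmean(profile)
    # Variance of a profile made only of 0s and 1s with the same mean.
    segregated = mean * (1.0 - mean)
    if segregated == 0.0:
        return 1.0  # all samples are 0 or all are 1: nothing left to mix
    return 1.0 - pvariance(profile) / segregated
```

Evaluating either function on profiles extracted at successive time steps gives a time series of mixing quality, which is the kind of quantity that can be compared against dye experiments.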

Downloads: 787
74 Computer Aided X-Ray Diffraction Intensity Analysis for Spinels: Hands-On Computing Experience

Authors: Ashish R. Tanna, Hiren H. Joshi

Abstract:

The mineral with the chemical formula MgAl2O4 is called “spinel”. Ferrites that crystallize in the spinel structure are known as spinel ferrites or ferro-spinels. The spinel structure has an fcc cage of oxygen ions, and the metallic cations are distributed among tetrahedral (A) and octahedral (B) interstitial voids (sites). The X-ray diffraction (XRD) intensity of each Bragg plane is sensitive to the distribution of cations in the interstitial voids of the spinel lattice. This leads to a method for determining the distribution of cations in spinel oxides through XRD intensity analysis. A computer program for XRD intensity analysis has been developed in the C language and tested on a real experimental case by synthesizing the spinel ferrite materials Mg0.6Zn0.4AlxFe2-xO4 and characterizing them by X-ray diffractometry. The compositions Mg0.6Zn0.4AlxFe2-xO4 (x = 0.0 to 0.6) were prepared by the ceramic method, and powder X-ray diffraction patterns were recorded. The authenticity of the program is thus checked by comparing the theoretically calculated data from computer simulation with the experimental data. Further, the deduced cation distributions were used to fit the magnetization data using the localized canting of spins approach to explain the “recovery” of collinear spin structure due to Al3+ substitution in Mg-Zn ferrites, which is the case of A-site magnetic dilution and non-collinear spin structure. Since the distribution of cations in spinel ferrites plays a very important role in their electrical and magnetic properties, it is essential to determine the cation distribution in the spinel lattice.

Keywords: Spinel ferrites, Localized canting of spins, X-ray diffraction, Programming in Borland C.

Downloads: 3805
73 Novel Hybrid Method for Gene Selection and Cancer Prediction

Authors: Liping Jing, Michael K. Ng, Tieyong Zeng

Abstract:

Microarray data profile gene expression on a whole-genome scale and therefore provide a good way to study associations between gene expression and the occurrence or progression of cancer. More and more researchers have realized that microarray data are helpful for predicting cancer samples. However, the dimension of the gene expression data is much larger than the sample size, which makes this task very difficult. Therefore, how to identify the significant genes causing cancer has become an urgent, popular and difficult research topic. Many feature selection algorithms proposed in the past focus on improving cancer predictive accuracy at the expense of ignoring the correlations between features. In this work, a novel framework (named SGS) is presented for stable gene selection and efficient cancer prediction. The proposed framework first applies a clustering algorithm to find gene groups in which the genes have high correlation coefficients, then selects the significant genes in each group with Bayesian Lasso and the important gene groups with group Lasso, and finally builds a prediction model on the shrunken gene space with an efficient classification algorithm (such as SVM, 1NN, or regression). Experimental results on real-world data show that the proposed framework often outperforms existing feature selection and prediction methods, such as SAM, IG and Lasso-type prediction models.

Keywords: Gene Selection, Cancer Prediction, Lasso, Clustering, Classification.

Downloads: 2041
72 Optimum Surface Roughness Prediction in Face Milling of High Silicon Stainless Steel

Authors: M. Farahnakian, M.R. Razfar, S. Elhami-Joosheghan

Abstract:

This paper presents an approach for determining the optimal cutting parameters (spindle speed, feed rate, depth of cut and engagement) that lead to minimum surface roughness in face milling of high silicon stainless steel by coupling a neural network (NN) with the Electromagnetism-like Algorithm (EM). In this regard, the advantages of statistical experimental design, experimental measurements, artificial neural networks, and the Electromagnetism-like optimization method are exploited in an integrated manner. To this end, numerous experiments on this stainless steel were conducted to obtain surface roughness values. A predictive model for surface roughness is created using a backpropagation neural network, and the optimization problem is then solved using EM optimization. Additional experiments were performed to validate the optimum surface roughness value predicted by the EM algorithm. Good agreement is observed between the values predicted by EM coupled with a feed-forward neural network and the experimental measurements. The obtained results show that the EM algorithm coupled with a backpropagation neural network is an efficient and accurate method for approaching the global minimum of surface roughness in face milling.

Keywords: cutting parameters, face milling, surface roughness, artificial neural network, Electromagnetism-like algorithm.

Downloads: 2585
71 Influence of the Granular Mixture Properties on the Rheological Properties of Concrete: Yield Stress Determination Using Modified Chateau et al. Model

Authors: Rachid Zentar, Mokrane Bala, Pascal Boustingorry

Abstract:

The prediction of the rheological behavior of concrete is at the center of current concerns of the concrete industry for several reasons. The shortage of good-quality standard materials, combined with the variable properties of available materials, makes it necessary to improve existing models so that these variations are taken into account at the design stage of concrete. The main reasons for improving the predictive models are, of course, to save time and cost at the design stage and to optimize concrete performance. In this study, we highlight the different properties of granular mixtures that affect the rheological properties of concrete. Our objective is to identify the intrinsic parameters of the aggregates that make it possible to predict the yield stress of concrete. The work was done using two types of grains: crushed and rolled aggregates. The experimental results show that the rheology of concrete is improved by increasing the packing density of the granular mixture using rolled aggregates. The experimental program made it possible to model the yield stress of concrete with a modified version of the model of Chateau et al., through a dimensionless parameter following the Krieger-Dougherty law. The modelling confirms that the yield stress of concrete depends not only on the properties of the cement paste but also on the packing density of the granular skeleton and the shape of the grains.

Keywords: Crushed aggregates, intrinsic viscosity, packing density, rolled aggregates, slump, yield stress of concrete.
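
For reference, the Krieger-Dougherty law named in this abstract is usually written for the relative viscosity of a suspension as shown below. The symbols are the conventional ones and are an assumption here, since the abstract gives no notation: \phi is the solid volume fraction, \phi_m the maximum packing fraction (the packing density), and [\eta] the intrinsic viscosity. How the authors adapt this form to the yield stress is not stated in the abstract.

```latex
\eta_r = \frac{\eta}{\eta_0} = \left(1 - \frac{\phi}{\phi_m}\right)^{-[\eta]\,\phi_m}
```

The expression diverges as \phi approaches \phi_m, which is consistent with the abstract's finding that a higher packing density improves the rheology at a given solid fraction.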

Downloads: 594
70 A Stochastic Diffusion Process Based on the Two-Parameters Weibull Density Function

Authors: Meriem Bahij, Ahmed Nafidi, Boujemâa Achchab, Sílvio M. A. Gama, José A. O. Matos

Abstract:

Stochastic modeling concerns the use of probability to model real-world situations in which uncertainty is present. Therefore, the purpose of stochastic modeling is to estimate the probability of outcomes within a forecast, i.e. to be able to predict what conditions or decisions might happen under different situations. In the present study, we present a model of a stochastic diffusion process based on the bi-Weibull distribution function (its trend is proportional to the bi-Weibull probability density function). In general, the Weibull distribution has the ability to assume the characteristics of many different types of distributions. This has made it very popular among engineers and quality practitioners, who have considered it the most commonly used distribution for studying problems such as modeling reliability data, accelerated life testing, and maintainability modeling and analysis. In this work, we start by obtaining the probabilistic characteristics of this model, such as the explicit expression of the process, its trends, and its distribution, by transforming the diffusion process into a Wiener process as shown in the Ricciardi theorem. Then, we develop the statistical inference of this model using the maximum likelihood methodology. Finally, we analyse with simulated data the computational problems associated with the parameters, an issue of great importance in its application to real data with the use of convergence analysis methods. Overall, the use of a stochastic model reflects only a pragmatic decision on the part of the modeler. According to the data that is available and the universe of models known to the modeler, this model represents the best currently available description of the phenomenon under consideration.

Keywords: Diffusion process, discrete sampling, likelihood estimation method, simulation, stochastic diffusion equation, trends functions, bi-parameters Weibull density function.

Downloads: 1966
69 Using Data Mining Techniques for Finding Cardiac Outlier Patients

Authors: Farhan Ismaeel Dakheel, Raoof Smko, K. Negrat, Abdelsalam Almarimi

Abstract:

In this paper we used data mining techniques to identify outlier patients who use large amounts of drugs over a long period of time. Any healthcare or health insurance system should deal with the quantities of drugs utilized by chronic disease patients. In the Kingdom of Bahrain, about 20% of the health budget is spent on medications. The managers of healthcare systems do not have enough information about how drugs are utilized by chronic disease patients, whether there is any misuse, or whether there are outlier patients. In this work, which was done in cooperation with the information department of the Bahrain Defence Force hospital, we selected the data for cardiac patients in the period from 1/1/2008 to 31/12/2008 as the data for the model. We used three techniques for analyzing drug utilization by cardiac patients. First we applied a clustering technique, followed by a measurement of clustering validity, and finally we applied a decision tree as the classification algorithm. The clustering results show that the 1603 patients, who received 15,806 prescriptions during this period, can be partitioned into three groups according to drug utilization, where 23 patients (2.59%) who received 1316 prescriptions (8.32%) are classified as outliers. The classification algorithm shows that the average drug utilization, the age, and the gender of the patient can be considered the main predictive factors in the induced model.

Keywords: Data Mining, Clustering, Classification, Drug Utilization.

Downloads: 1897
68 Movie Genre Preference Prediction Using Machine Learning for Customer-Based Information

Authors: Haifeng Wang, Haili Zhang

Abstract:

Most movie recommendation systems have been developed for customers to find items of interest. This work introduces a predictive model usable by small and medium-sized enterprises (SMEs) that need a data-based, analytical approach to stock suitable movies for local audiences and retain more customers. We used classification models to extract features from thousands of customers’ demographic, behavioral and social information to predict their movie genre preference. In the implementation, a Gaussian kernel support vector machine (SVM) classification model and a logistic regression model were established to extract features from sample data, and their in-sample test errors were compared. Out-of-sample errors were also compared under different Vapnik–Chervonenkis (VC) dimensions in the machine learning algorithm to find and prevent overfitting. The Gaussian kernel SVM prediction model can correctly predict movie genre preferences in 85% of positive cases. The accuracy of the algorithm increased to 93% with a smaller VC dimension and less overfitting. These findings advance our understanding of how to use a machine learning approach to predict customers’ preferences with a small data set and to design prediction tools for these enterprises.

Keywords: Computational social science, movie preference, machine learning, SVM.

Downloads 1648
67 Post Pandemic Mobility Analysis through Indexing and Sharding in MongoDB: Performance Optimization and Insights

Authors: Karan Vishavjit, Aakash Lakra, Shafaq Khan

Abstract:

The COVID-19 pandemic has pushed healthcare professionals to use big data analytics as a vital tool for tracking and evaluating the effects of contagious viruses. To effectively analyse huge datasets, efficient NoSQL databases are needed. The analysis of post-COVID-19 health and well-being outcomes and the evaluation of the effectiveness of government efforts during the pandemic is made possible by this research’s integration of several datasets, which cuts down on query processing time and creates predictive visual artifacts. We recommend applying sharding and indexing technologies to improve query effectiveness and scalability as the dataset expands. Effective data retrieval and analysis are made possible by spreading the datasets into a sharded database and doing indexing on individual shards. Analysis of connections between governmental activities, poverty levels, and post-pandemic wellbeing is the key goal. We want to evaluate the effectiveness of governmental initiatives to improve health and lower poverty levels. We will do this by utilising advanced data analysis and visualisations. The findings provide relevant data that support the advancement of UN sustainable objectives, future pandemic preparation, and evidence-based decision-making. This study shows how Big Data and NoSQL databases may be used to address problems with global health.

Keywords: COVID-19, big data, data analysis, indexing, NoSQL, sharding, scalability, poverty.

Downloads 66
66 Highly Optimized Novel High Speed Low Power Barrel Shifter at 22nm Hi K Metal Gate Strained Si Technology Node

Authors: Shobha Sharma, Amita Dev

Abstract:

This research paper presents a highly optimized barrel shifter at the 22nm Hi K metal gate strained Si technology node. This barrel shifter has a unique combination of static and dynamic body bias, which gives the lowest power delay product (PDP). This PDP is compared with that of the same circuit at the same technology node with static forward biasing at ‘supply/2’ and also with normal reverse substrate biasing, and it is still found to be the lowest. The PDP of this barrel shifter is 0.39362×10^-17 J, approximately 78% lower than that of the reference barrel shifter proposed at 32nm bulk CMOS technology. The PDP of the barrel shifter at 22nm Hi K metal gate technology with normal reverse substrate bias is 2.97186933×10^-17 J, which can be compared with this design’s PDP of 0.39362×10^-17 J. This design uses both static and dynamic substrate biasing and also has an approximately 96% lower PDP than the design that is only forward body biased at half of the supply voltage. The NMOS models used are the predictive technology models of Arizona State University, and the simulations are carried out using the HSPICE simulator.

Keywords: Dynamic body biasing, highly optimized barrel shifter, PDP, Static body biasing.

Downloads 1882
65 Evidence Theory Enabled Quickest Change Detection Using Big Time-Series Data from Internet of Things

Authors: Hossein Jafari, Xiangfang Li, Lijun Qian, Alexander Aved, Timothy Kroecker

Abstract:

Traditionally in sensor networks and recently in the Internet of Things, numerous heterogeneous sensors are deployed in a distributed manner to monitor a phenomenon that can often be modeled by an underlying stochastic process. The big time-series data collected by the sensors must be analyzed to detect a change in the stochastic process as quickly as possible with a tolerable false alarm rate. However, sensors may have different accuracy and sensitivity ranges, and they degrade over time. As a result, the big time-series data collected by the sensors will contain uncertainties, and sometimes the data are conflicting. In this study, we present a framework that takes advantage of the capabilities of Evidence Theory (a.k.a. Dempster-Shafer and Dezert-Smarandache Theories) for representing and managing uncertainty and conflict in order to achieve fast change detection and deal effectively with complementary hypotheses. Specifically, Kullback-Leibler divergence is used as the similarity metric to calculate the distances between the estimated current distribution and the pre- and post-change distributions. Then mass functions are calculated, and related combination rules are applied to combine the mass values among all sensors. Furthermore, we applied the method to estimate the minimum number of sensors that need to be combined, so that computational efficiency could be improved. A cumulative sum test is then applied to the ratio of pignistic probabilities to detect and declare the change for decision-making purposes. Simulation results using both synthetic data and real data from an experimental setup demonstrate the effectiveness of the presented schemes.
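As a rough illustration of two building blocks named in this abstract, the sketch below computes the Kullback-Leibler divergence between two Gaussians and runs a classical single-sensor CUSUM test. It is not the authors' evidence-theory scheme: the Gaussian pre- and post-change models, the log-likelihood-ratio form of CUSUM (rather than a ratio of pignistic probabilities), and the function names kl_gaussian and cusum_detect are all assumptions made for illustration.

```python
import math


def kl_gaussian(mu0, sigma0, mu1, sigma1):
    """KL divergence KL(N(mu0, sigma0^2) || N(mu1, sigma1^2)) in nats."""
    return (math.log(sigma1 / sigma0)
            + (sigma0 ** 2 + (mu0 - mu1) ** 2) / (2 * sigma1 ** 2)
            - 0.5)


def cusum_detect(samples, mu_pre, mu_post, sigma, threshold):
    """Classical CUSUM for a Gaussian mean shift with known common sigma.

    Returns the index of the first sample at which the statistic exceeds
    the threshold, or None if no change is declared.
    """
    stat = 0.0
    for i, x in enumerate(samples):
        # Log-likelihood ratio of the post-change vs pre-change model.
        llr = ((mu_post - mu_pre) / sigma ** 2) * (x - (mu_pre + mu_post) / 2)
        # Reset to zero whenever the evidence favors "no change".
        stat = max(0.0, stat + llr)
        if stat > threshold:
            return i
    return None


# Hypothetical noise-free series: the mean shifts from 0 to 1 at index 10.
data = [0.0] * 10 + [1.0] * 10
alarm = cusum_detect(data, mu_pre=0.0, mu_post=1.0, sigma=1.0, threshold=2.0)
```

A higher threshold lowers the false alarm rate at the cost of a longer detection delay, which is the trade-off the abstract refers to.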

Keywords: CUSUM, evidence theory, KL divergence, quickest change detection, time series data.

Downloads 993
64 Combining the Deep Neural Network with the K-Means for Traffic Accident Prediction

Authors: Celso L. Fernando, Toshio Yoshii, Takahiro Tsubota

Abstract:

Understanding the causes of road accidents and predicting their occurrence is key to preventing deaths and serious injuries from road accident events. Traditional statistical methods such as Poisson and logistic regression have been used to find the association of traffic environmental factors with accident occurrence; recently, the artificial neural network (ANN), a computational technique that learns from historical data to make more accurate predictions, has emerged. Despite its ability to make accurate predictions, the ANN has difficulty dealing with a highly unbalanced distribution of attribute patterns in the training dataset; in such circumstances, the ANN treats the minority group as noise. However, in real-world data, the minority group is often the group of interest; e.g., in road traffic accident data, the accident events are the group of interest. This study proposes a combination of k-means with the ANN to improve the predictive ability of the neural network model by alleviating the effect of the unbalanced distribution of the attribute patterns in the training dataset. The results show that the proposed method improves the ability of the neural network to make predictions on a dataset with highly unbalanced attribute patterns; however, on a dataset with evenly distributed attribute patterns, the proposed method performs almost like a standard neural network.

Keywords: Accident risks estimation, artificial neural network, deep learning, K-mean, road safety.

Downloads 968
63 Financing Decision and Productivity Growth for the Venture Capital Industry Using High-Order Fuzzy Time Series

Authors: Shang-En Yu

Abstract:

Human society faces many uncertainties, such as forecasting the economic growth rate during a financial crisis. Since Song and Chissom proposed the concept of fuzzy time series in 1993, many scholars have developed different models to deal with such problems. Previous studies, however, usually do not consider the selection of relevant variables, and they discretize the fuzzy semantics based solely on subjective opinion, so they cannot objectively reflect the characteristics of the data set. In addition, when forecasting, they often treat all fuzzy rules as equally important and fail to consider the importance of each fuzzy rule. For these reasons, this study performs variable selection (factor selection) through a self-organizing map (SOM) and proposes a high-order weighted multivariate fuzzy time series model based on a fuzzy neural network (Fuzzy-BPN), using the ordered weighted averaging (OWA) operator for weighted prediction. To verify the proposed method, the Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX) is used as the experimental forecast target, and the appropriate variables are filtered in the experiment. Finally, models from other recent studies are compared with this study, and the results show that the predictive ability of the proposed method is further improved.

Keywords: Heterogeneity, residential mortgage loans, foreclosure.

Downloads 1387
62 Neural Network Models for Actual Cost and Actual Duration Estimation in Construction Projects: Findings from Greece

Authors: Panagiotis Karadimos, Leonidas Anthopoulos

Abstract:

Predicting the actual cost and duration of construction projects is a persistent problem for the construction sector. This paper addresses this problem with modern methods and data available from past public construction projects. Thirty-nine bridge projects constructed in Greece, with a similar type of available data, were examined. Considering each project’s attributes together with the actual cost and the actual duration, correlation analysis is performed and the most appropriate predictive project variables are defined. Additionally, the most efficient subgroup of variables is selected with the use of the WEKA application, through its attribute selection function. The selected variables are used as input neurons for neural network models through correlation analysis. For constructing neural network models, the application FANN Tool is used. The optimum neural network model for predicting the actual cost produced a mean squared error of 3.84886e-05 and was based on the budgeted cost and the quantity of deck concrete. The optimum neural network model for predicting the actual duration produced a mean squared error of 5.89463e-05 and was also based on the budgeted cost and the amount of deck concrete.

Keywords: Actual cost and duration, attribute selection, bridge projects, neural networks, predicting models, FANN TOOL, WEKA.

Downloads 1225
61 Ordinal Regression with Fenton-Wilkinson Order Statistics: A Case Study of an Orienteering Race

Authors: Joonas Pääkkönen

Abstract:

In sports, individuals and teams are typically interested in final rankings. Final results, such as times or distances, dictate these rankings, also known as places. Places can be further associated with ordered random variables, commonly referred to as order statistics. In this work, we introduce a simple, yet accurate order statistical ordinal regression function that predicts relay race places with changeover-times. We call this function the Fenton-Wilkinson Order Statistics model. This model is built on the following educated assumption: individual leg-times follow log-normal distributions. Moreover, our key idea is to utilize Fenton-Wilkinson approximations of changeover-times alongside an estimator for the total number of teams as in the notorious German tank problem. This original place regression function is sigmoidal and thus correctly predicts the existence of a small number of elite teams that significantly outperform the rest of the teams. Our model also describes how place increases linearly with changeover-time at the inflection point of the log-normal distribution function. With real-world data from Jukola 2019, a massive orienteering relay race, the model is shown to be highly accurate even when the size of the training set is only 5% of the whole data set. Numerical results also show that our model exhibits smaller place prediction root-mean-square-errors than linear regression, mord regression and Gaussian process regression.
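The Fenton-Wilkinson approximation mentioned above replaces a sum of independent log-normal variables by a single log-normal with the same mean and variance. The sketch below shows that moment matching only; the function name fenton_wilkinson and the example leg parameters are hypothetical, and the place regression function and the German tank estimator from the paper are not reproduced.

```python
import math


def fenton_wilkinson(legs):
    """Moment-matched log-normal approximation of a sum of log-normals.

    legs: list of (mu, sigma) pairs, one per independent log-normal leg time.
    Returns (mu_sum, sigma_sum) of the approximating log-normal.
    """
    # Mean and variance of each log-normal, summed over independent legs.
    mean = sum(math.exp(mu + sigma ** 2 / 2) for mu, sigma in legs)
    var = sum((math.exp(sigma ** 2) - 1) * math.exp(2 * mu + sigma ** 2)
              for mu, sigma in legs)
    # Invert the log-normal moment formulas for the sum.
    sigma_sum_sq = math.log(1 + var / mean ** 2)
    mu_sum = math.log(mean) - sigma_sum_sq / 2
    return mu_sum, math.sqrt(sigma_sum_sq)


# Hypothetical changeover time after three legs with illustrative parameters.
legs = [(4.0, 0.2), (4.1, 0.25), (3.9, 0.3)]
mu_c, sigma_c = fenton_wilkinson(legs)
```

By construction, the approximating log-normal reproduces the exact mean and variance of the summed leg times; only the shape of the distribution is approximate.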

Keywords: Fenton-Wilkinson approximation, German tank problem, log-normal distribution, order statistics, ordinal regression, orienteering, sports analytics, sports modeling.

Downloads 832
60 Further Development in Predicting Post-Earthquake Fire Ignition Hazard

Authors: Pegah Farshadmanesh, Jamshid Mohammadi, Mehdi Modares

Abstract:

In nearly all earthquakes of the past century that resulted in moderate to significant damage, the occurrence of post-earthquake fire ignition (PEFI) has imposed a serious hazard and caused severe damage, especially in urban areas. In order to reduce the loss of life and property caused by post-earthquake fires, there is a crucial need for predictive models to estimate the PEFI risk. The parameters affecting PEFI risk can be categorized as: 1) factors influencing fire ignition in normal (non-earthquake) conditions, including floor area, building category, ignitability, type of appliance, and prevention devices, and 2) earthquake-related factors contributing to the PEFI risk, including building vulnerability and earthquake characteristics such as intensity, peak ground acceleration, and peak ground velocity. State-of-the-art statistical PEFI risk models are based solely on limited available earthquake data, and therefore they cannot predict the PEFI risk for areas with insufficient earthquake records, since such records are needed to estimate the PEFI model parameters. In this paper, the correlation between normal condition ignition risk, peak ground acceleration, and PEFI risk is examined in an effort to offer a means for predicting post-earthquake ignition events. An illustrative example is presented to demonstrate how such correlation can be employed in a seismic area to predict PEFI hazard.

Keywords: Fire risk, post-earthquake fire ignition (PEFI), risk management, seismicity.

Downloads 1360
59 Prediction of Compressive Strength of Concrete from Early Age Test Result Using Design of Experiments (RSM)

Authors: Salem Alsanusi, Loubna Bentaher

Abstract:

Response Surface Methods (RSM) provide statistically validated predictive models that can then be manipulated to find optimal process configurations. Variation transmitted to responses from poorly controlled process factors can be accounted for by the mathematical technique of propagation of error (POE), which facilitates ‘finding the flats’ on the surfaces generated by RSM. The dual response approach to RSM captures the standard deviation of the output as well as the average. It accounts for unknown sources of variation. Dual response plus propagation of error (POE) provides a more useful model of overall response variation. In our case, we implemented this technique to predict the compressive strength of concrete at 28 days of age. Waiting 28 days is quite time consuming, yet the result is important for ensuring quality control. This paper investigates the potential of using design of experiments (DOE-RSM) to predict the compressive strength of concrete at the 28th day. The data used for this study were obtained from experiments at the University of Benghazi, civil engineering department. A total of 114 sets of data were used. The ACI mix design method was utilized for the mix design. No admixtures were used; only the main concrete mix constituents, such as cement, coarse aggregate, fine aggregate and water, were utilized in all mixes. Different mix proportions of the ingredients and different water-cement ratios were used. The proposed mathematical models are capable of predicting the required compressive strength of concrete from early ages.

Keywords: Mix proportioning, response surface methodology, compressive strength, optimal design.

Downloads 2213
58 Ignition Delay Correlation for a Direct Injection Diesel Engine Fuelled with Automotive Diesel and Water Diesel Emulsion

Authors: K.Alkhulaifi, M. Hamdalla

Abstract:

Most ignition delay correlation studies have been developed in constant volume bombs, which cannot capture the dynamic variation in pressure and temperature during the ignition delay as in real engines. The Watson, Assanis et al., and Hardenberg and Hase correlations were developed based on experimental data from diesel engines. However, they showed limited ability to predict ignition delay when compared to experimental results. The objective of the study was to investigate the dependency of ignition delay time on engine brake power. An experimental investigation of the effect of automotive diesel and water diesel emulsion fuels on ignition delay under steady state conditions of a direct injection diesel engine was conducted. A four-cylinder, direct injection, naturally aspirated diesel engine was used in this experiment over a wide range of engine speeds and two engine loads. The experimental ignition delay data were compared with predictions of the Assanis et al. and Watson ignition delay correlations. The results of the experimental investigation were then used to develop a new ignition delay correlation. The newly developed ignition delay correlation has shown better agreement with the experimental data than the Assanis et al. and Watson correlations when using automotive diesel and water diesel emulsion fuels, especially at low to medium engine speeds at both loads. In addition, the second derivative of cylinder pressure, which is the most widely used method for determining the start of combustion, was investigated.

Keywords: Ignition delay correlation, water diesel emulsion, direct injection diesel engine

Downloads 5807
57 Spatial Mapping of Dengue Incidence: A Case Study in Hulu Langat District, Selangor, Malaysia

Authors: Er, A. C., Rosli, M. H., Asmahani A., Mohamad Naim M. R., Harsuzilawati M.

Abstract:

Dengue is a mosquito-borne infection whose incidence has risen to an alarming rate in recent decades. It is found in tropical and sub-tropical climates. In Malaysia, dengue has been declared one of the national health threats to the public. This study aimed to map the spatial distribution of dengue cases in the district of Hulu Langat, Selangor, via a combination of Geographic Information System (GIS) and spatial statistical tools. Data related to dengue were gathered from various government health agencies. The locations of dengue cases were geocoded using a handheld GPS Juno SB Trimble. A total of 197 dengue cases occurring in 2003 were used in this study. The data were then aggregated to the sub-district level and converted into GIS format. The study also used population or demographic data as well as the boundary of Hulu Langat. To assess the spatial distribution of dengue cases, three spatial statistical methods (Moran's I, average nearest neighbor (ANN) and kernel density estimation) were applied together with spatial analysis in the GIS environment. These three methods were used to analyze the spatial distribution and average distance of dengue incidence and to locate the hot spots of dengue cases. The results indicated that the dengue cases were clustered (p < 0.01) when analyzed using Moran's I, with a z-score of 5.03. The results from the ANN analysis showed that the average nearest neighbor ratio is 0.518755, which is less than 1 (p < 0.0001). From this result, we can expect the dengue cases in Hulu Langat district to exhibit a clustered pattern. The z-score for dengue incidence within the district is -13.0525 (p < 0.0001). It was also found that significant spatial autocorrelation of dengue incidence occurs at an average distance of 380.81 meters (p < 0.0001). Several locations, especially residential areas, were also identified as hot spots of dengue cases in the district.
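To make the average nearest neighbor statistic concrete, the following is a minimal sketch that assumes planar coordinates and the standard expectation under complete spatial randomness (expected distance 0.5 / sqrt(n / A), standard error 0.26136 / sqrt(n^2 / A)). The function name, the sample points and the study area are hypothetical; the study's own GIS tooling and data are not reproduced here.

```python
import math


def average_nearest_neighbor(points, area):
    """Return (ratio, z_score) of the average nearest neighbor statistic.

    points: list of (x, y) planar coordinates; area: study area in the
    same squared units. A ratio below 1 suggests clustering.
    """
    n = len(points)
    nearest = []
    for i, (xi, yi) in enumerate(points):
        # Distance from point i to its closest other point (brute force).
        nearest.append(min(math.hypot(xi - xj, yi - yj)
                           for j, (xj, yj) in enumerate(points) if j != i))
    observed = sum(nearest) / n
    # Expected mean distance and standard error under spatial randomness.
    expected = 0.5 / math.sqrt(n / area)
    std_err = 0.26136 / math.sqrt(n * n / area)
    return observed / expected, (observed - expected) / std_err


# Hypothetical example: four cases packed into one corner of a large area.
cases = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
ratio, z = average_nearest_neighbor(cases, area=10000.0)
```

A ratio below 1 together with a negative z-score indicates clustering, which is the same direction as the ratio and z-score reported in the abstract.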

Keywords: Dengue, geographic information system (GIS), spatial analysis, spatial statistics

Downloads 5365
56 Typical Day Prediction Model for Output Power and Energy Efficiency of a Grid-Connected Solar Photovoltaic System

Authors: Yan Su, L. C. Chan

Abstract:

A novel typical day prediction model has been built and validated with the measured data of a grid-connected solar photovoltaic (PV) system in Macau. Unlike the conventional statistical method used by previous studies on PV systems, which obtains results by averaging nearby continuous points, the present typical day statistical method obtains the value at every minute in a typical day by averaging discontinuous points at the same minute in different days. This typical day statistical method based on discontinuous point averaging makes it possible to obtain the Gaussian-shaped dynamical distributions for solar irradiance and output power in a yearly or monthly typical day. Based on the yearly typical day statistical analysis results, the maximum possible accumulated output energy in a year under on-site climate conditions and the corresponding optimal PV system running time are obtained. Periodic Gaussian shape prediction models for solar irradiance, output energy and system energy efficiency have been built, and their coefficients have been determined based on the yearly, maximum and minimum monthly typical day Gaussian distribution parameters, which are obtained from iterations for minimum Root Mean Squared Deviation (RMSD). With the present model, the dynamical effects due to time differences within a day are kept, and the day-to-day uncertainty due to changing weather is smoothed but still included. The periodic Gaussian shape correlations for solar irradiance, output power and system energy efficiency compare favorably with data from the PV system in Macau and prove to be an improvement over previous models.
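The discontinuous point averaging described above can be sketched in a few lines: for each minute of the day, average the readings taken at that same minute on different days. The function name typical_day and the tiny three-sample "days" are hypothetical placeholders; the Gaussian fitting and RMSD iteration of the paper are not shown.

```python
def typical_day(days):
    """Average readings taken at the same time slot across different days.

    days: list of equal-length lists, one reading per minute of each day.
    Returns one list holding the typical-day value for every minute.
    """
    n_slots = len(days[0])
    if any(len(day) != n_slots for day in days):
        raise ValueError("every day must have the same number of readings")
    # For each minute, average over days rather than over neighboring minutes.
    return [sum(day[m] for day in days) / len(days) for m in range(n_slots)]


# Hypothetical output power readings for two days, three time slots each.
measured = [
    [0.0, 2.0, 4.0],
    [2.0, 4.0, 6.0],
]
profile = typical_day(measured)
```

Because the average runs across days instead of along the time axis, the within-day shape of the curve is preserved while day-to-day weather variation is smoothed.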

Keywords: Grid Connected, RMSD, Solar PV System, Typical Day.

Downloads 1678