Search results for: destination prediction
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2692

Search results for: destination prediction

2182 Predicting Data Center Resource Usage Using Quantile Regression to Conserve Energy While Fulfilling the Service Level Agreement

Authors: Ahmed I. Alutabi, Naghmeh Dezhabad, Sudhakar Ganti

Abstract:

Data centers have been growing in size and dema nd continuously in the last two decades. Planning for the deployment of resources has been shallow and always resorted to over-provisioning. Data center operators try to maximize the availability of their services by allocating multiple of the needed resources. One resource that has been wasted, with little thought, has been energy. In recent years, programmable resource allocation has paved the way to allow for more efficient and robust data centers. In this work, we examine the predictability of resource usage in a data center environment. We use a number of models that cover a wide spectrum of machine learning categories. Then we establish a framework to guarantee the client service level agreement (SLA). Our results show that using prediction can cut energy loss by up to 55%.

Keywords: machine learning, artificial intelligence, prediction, data center, resource allocation, green computing

Procedia PDF Downloads 108
2181 Big Data: Appearance and Disappearance

Authors: James Moir

Abstract:

The mainstay of Big Data is prediction in that it allows practitioners, researchers, and policy analysts to predict trends based upon the analysis of large and varied sources of data. These can range from changing social and political opinions, patterns in crimes, and consumer behaviour. Big Data has therefore shifted the criterion of success in science from causal explanations to predictive modelling and simulation. The 19th-century science sought to capture phenomena and seek to show the appearance of it through causal mechanisms while 20th-century science attempted to save the appearance and relinquish causal explanations. Now 21st-century science in the form of Big Data is concerned with the prediction of appearances and nothing more. However, this pulls social science back in the direction of a more rule- or law-governed reality model of science and away from a consideration of the internal nature of rules in relation to various practices. In effect Big Data offers us no more than a world of surface appearance and in doing so it makes disappear any context-specific conceptual sensitivity.

Keywords: big data, appearance, disappearance, surface, epistemology

Procedia PDF Downloads 421
2180 Prediction of Childbearing Orientations According to Couples' Sexual Review Component

Authors: Razieh Rezaeekalantari

Abstract:

Objective: The purpose of this study was to investigate the prediction of parenting orientations in terms of the components of couples' sexual review. Methods: This was a descriptive correlational research method. The population consisted of 500 couples referring to Sari Health Center. Two hundred and fifteen (215) people were selected randomly by using Krejcie-Morgan-sample-size-table. For data collection, the childbearing orientations scale and the Multidimensional Sexual Self-Concept Questionnaire were used. Result: For data analysis, the mean and standard deviation were used and to analyze the research hypothesis regression correlation and inferential statistics were used. Conclusion: The findings indicate that there is not a significant relationship between the tendency to childbearing and the predictive value of sexual review (r = 0.84) with significant level (sig = 219.19) (P < 0.05). So, with 95% confidence, we conclude that there is not a meaningful relationship between sexual orientation and tendency to child-rearing.

Keywords: couples referring, health center, sexual review component, parenting orientations

Procedia PDF Downloads 219
2179 Sorghum Grains Grading for Food, Feed, and Fuel Using NIR Spectroscopy

Authors: Irsa Ejaz, Siyang He, Wei Li, Naiyue Hu, Chaochen Tang, Songbo Li, Meng Li, Boubacar Diallo, Guanghui Xie, Kang Yu

Abstract:

Background: Near-infrared spectroscopy (NIR) is a non-destructive, fast, and low-cost method to measure the grain quality of different cereals. Previously reported NIR model calibrations using the whole grain spectra had moderate accuracy. Improved predictions are achievable by using the spectra of whole grains, when compared with the use of spectra collected from the flour samples. However, the feasibility for determining the critical biochemicals, related to the classifications for food, feed, and fuel products are not adequately investigated. Objectives: To evaluate the feasibility of using NIRS and the influence of four sample types (whole grains, flours, hulled grain flours, and hull-less grain flours) on the prediction of chemical components to improve the grain sorting efficiency for human food, animal feed, and biofuel. Methods: NIR was applied in this study to determine the eight biochemicals in four types of sorghum samples: hulled grain flours, hull-less grain flours, whole grains, and grain flours. A total of 20 hybrids of sorghum grains were selected from the two locations in China. Followed by NIR spectral and wet-chemically measured biochemical data, partial least squares regression (PLSR) was used to construct the prediction models. Results: The results showed that sorghum grain morphology and sample format affected the prediction of biochemicals. Using NIR data of grain flours generally improved the prediction compared with the use of NIR data of whole grains. In addition, using the spectra of whole grains enabled comparable predictions, which are recommended when a non-destructive and rapid analysis is required. Compared with the hulled grain flours, hull-less grain flours allowed for improved predictions for tannin, cellulose, and hemicellulose using NIR data. Conclusion: The established PLSR models could enable food, feed, and fuel producers to efficiently evaluate a large number of samples by predicting the required biochemical components in sorghum grains without destruction.

Keywords: FT-NIR, sorghum grains, biochemical composition, food, feed, fuel, PLSR

Procedia PDF Downloads 69
2178 Analytical Study of Data Mining Techniques for Software Quality Assurance

Authors: Mariam Bibi, Rubab Mehboob, Mehreen Sirshar

Abstract:

Satisfying the customer requirements is the ultimate goal of producing or developing any product. The quality of the product is decided on the bases of the level of customer satisfaction. There are different techniques which have been reported during the survey which enhance the quality of the product through software defect prediction and by locating the missing software requirements. Some mining techniques were proposed to assess the individual performance indicators in collaborative environment to reduce errors at individual level. The basic intention is to produce a product with zero or few defects thereby producing a best product quality wise. In the analysis of survey the techniques like Genetic algorithm, artificial neural network, classification and clustering techniques and decision tree are studied. After analysis it has been discovered that these techniques contributed much to the improvement and enhancement of the quality of the product.

Keywords: data mining, defect prediction, missing requirements, software quality

Procedia PDF Downloads 468
2177 Cardiovascular Disease Prediction Using Machine Learning Approaches

Authors: P. Halder, A. Zaman

Abstract:

It is estimated that heart disease accounts for one in ten deaths worldwide. United States deaths due to heart disease are among the leading causes of death according to the World Health Organization. Cardiovascular diseases (CVDs) account for one in four U.S. deaths, according to the Centers for Disease Control and Prevention (CDC). According to statistics, women are more likely than men to die from heart disease as a result of strokes. A 50% increase in men's mortality was reported by the World Health Organization in 2009. The consequences of cardiovascular disease are severe. The causes of heart disease include diabetes, high blood pressure, high cholesterol, abnormal pulse rates, etc. Machine learning (ML) can be used to make predictions and decisions in the healthcare industry. Thus, scientists have turned to modern technologies like Machine Learning and Data Mining to predict diseases. The disease prediction is based on four algorithms. Compared to other boosts, the Ada boost is much more accurate.

Keywords: heart disease, cardiovascular disease, coronary artery disease, feature selection, random forest, AdaBoost, SVM, decision tree

Procedia PDF Downloads 153
2176 Quality of Service of Transportation Networks: A Hybrid Measurement of Travel Time and Reliability

Authors: Chin-Chia Jane

Abstract:

In a transportation network, travel time refers to the transmission time from source node to destination node, whereas reliability refers to the probability of a successful connection from source node to destination node. With an increasing emphasis on quality of service (QoS), both performance indexes are significant in the design and analysis of transportation systems. In this work, we extend the well-known flow network model for transportation networks so that travel time and reliability are integrated into the QoS measurement simultaneously. In the extended model, in addition to the general arc capacities, each intermediate node has a time weight which is the travel time for per unit of commodity going through the node. Meanwhile, arcs and nodes are treated as binary random variables that switch between operation and failure with associated probabilities. For pre-specified travel time limitation and demand requirement, the QoS of a transportation network is the probability that source can successfully transport the demand requirement to destination while the total transmission time is under the travel time limitation. This work is pioneering, since existing literatures that evaluate travel time reliability via a single optimization path, the proposed QoS focuses the performance of the whole network system. To compute the QoS of transportation networks, we first transfer the extended network model into an equivalent min-cost max-flow network model. In the transferred network, each arc has a new travel time weight which takes value 0. Each intermediate node is replaced by two nodes u and v, and an arc directed from u to v. The newly generated nodes u and v are perfect nodes. The new direct arc has three weights: travel time, capacity, and operation probability. Then the universal set of state vectors is recursively decomposed into disjoint subsets of reliable, unreliable, and stochastic vectors until no stochastic vector is left. The decomposition is made possible by applying existing efficient min-cost max-flow algorithm. Because the reliable subsets are disjoint, QoS can be obtained directly by summing the probabilities of these reliable subsets. Computational experiments are conducted on a benchmark network which has 11 nodes and 21 arcs. Five travel time limitations and five demand requirements are set to compute the QoS value. To make a comparison, we test the exhaustive complete enumeration method. Computational results reveal the proposed algorithm is much more efficient than the complete enumeration method. In this work, a transportation network is analyzed by an extended flow network model where each arc has a fixed capacity, each intermediate node has a time weight, and both arcs and nodes are independent binary random variables. The quality of service of the transportation network is an integration of customer demands, travel time, and the probability of connection. We present a decomposition algorithm to compute the QoS efficiently. Computational experiments conducted on a prototype network show that the proposed algorithm is superior to existing complete enumeration methods.

Keywords: quality of service, reliability, transportation network, travel time

Procedia PDF Downloads 221
2175 Prediction of Sepsis Illness from Patients Vital Signs Using Long Short-Term Memory Network and Dynamic Analysis

Authors: Marcio Freire Cruz, Naoaki Ono, Shigehiko Kanaya, Carlos Arthur Mattos Teixeira Cavalcante

Abstract:

The systems that record patient care information, known as Electronic Medical Record (EMR) and those that monitor vital signs of patients, such as heart rate, body temperature, and blood pressure have been extremely valuable for the effectiveness of the patient’s treatment. Several kinds of research have been using data from EMRs and vital signs of patients to predict illnesses. Among them, we highlight those that intend to predict, classify, or, at least identify patterns, of sepsis illness in patients under vital signs monitoring. Sepsis is an organic dysfunction caused by a dysregulated patient's response to an infection that affects millions of people worldwide. Early detection of sepsis is expected to provide a significant improvement in its treatment. Preceding works usually combined medical, statistical, mathematical and computational models to develop detection methods for early prediction, getting higher accuracies, and using the smallest number of variables. Among other techniques, we could find researches using survival analysis, specialist systems, machine learning and deep learning that reached great results. In our research, patients are modeled as points moving each hour in an n-dimensional space where n is the number of vital signs (variables). These points can reach a sepsis target point after some time. For now, the sepsis target point was calculated using the median of all patients’ variables on the sepsis onset. From these points, we calculate for each hour the position vector, the first derivative (velocity vector) and the second derivative (acceleration vector) of the variables to evaluate their behavior. And we construct a prediction model based on a Long Short-Term Memory (LSTM) Network, including these derivatives as explanatory variables. The accuracy of the prediction 6 hours before the time of sepsis, considering only the vital signs reached 83.24% and by including the vectors position, speed, and acceleration, we obtained 94.96%. The data are being collected from Medical Information Mart for Intensive Care (MIMIC) Database, a public database that contains vital signs, laboratory test results, observations, notes, and so on, from more than 60.000 patients.

Keywords: dynamic analysis, long short-term memory, prediction, sepsis

Procedia PDF Downloads 125
2174 Personalized Infectious Disease Risk Prediction System: A Knowledge Model

Authors: Retno A. Vinarti, Lucy M. Hederman

Abstract:

This research describes a knowledge model for a system which give personalized alert to users about infectious disease risks in the context of weather, location and time. The knowledge model is based on established epidemiological concepts augmented by information gleaned from infection-related data repositories. The existing disease risk prediction research has more focuses on utilizing raw historical data and yield seasonal patterns of infectious disease risk emergence. This research incorporates both data and epidemiological concepts gathered from Atlas of Human Infectious Disease (AHID) and Centre of Disease Control (CDC) as basic reasoning of infectious disease risk prediction. Using CommonKADS methodology, the disease risk prediction task is an assignment synthetic task, starting from knowledge identification through specification, refinement to implementation. First, knowledge is gathered from AHID primarily from the epidemiology and risk group chapters for each infectious disease. The result of this stage is five major elements (Person, Infectious Disease, Weather, Location and Time) and their properties. At the knowledge specification stage, the initial tree model of each element and detailed relationships are produced. This research also includes a validation step as part of knowledge refinement: on the basis that the best model is formed using the most common features, Frequency-based Selection (FBS) is applied. The portion of the Infectious Disease risk model relating to Person comes out strongest, with Location next, and Weather weaker. For Person attribute, Age is the strongest, Activity and Habits are moderate, and Blood type is weakest. At the Location attribute, General category (e.g. continents, region, country, and island) results much stronger than Specific category (i.e. terrain feature). For Weather attribute, Less Precise category (i.e. season) comes out stronger than Precise category (i.e. exact temperature or humidity interval). However, given that some infectious diseases are significantly more serious than others, a frequency based metric may not be appropriate. Future work will incorporate epidemiological measurements of disease seriousness (e.g. odds ratio, hazard ratio and fatality rate) into the validation metrics. This research is limited to modelling existing knowledge about epidemiology and chain of infection concepts. Further step, verification in knowledge refinement stage, might cause some minor changes on the shape of tree.

Keywords: epidemiology, knowledge modelling, infectious disease, prediction, risk

Procedia PDF Downloads 242
2173 Surface Roughness Prediction Using Numerical Scheme and Adaptive Control

Authors: Michael K.O. Ayomoh, Khaled A. Abou-El-Hossein., Sameh F.M. Ghobashy

Abstract:

This paper proposes a numerical modelling scheme for surface roughness prediction. The approach is premised on the use of 3D difference analysis method enhanced with the use of feedback control loop where a set of adaptive weights are generated. The surface roughness values utilized in this paper were adapted from [1]. Their experiments were carried out using S55C high carbon steel. A comparison was further carried out between the proposed technique and those utilized in [1]. The experimental design has three cutting parameters namely: depth of cut, feed rate and cutting speed with twenty-seven experimental sample-space. The simulation trials conducted using Matlab software is of two sub-classes namely: prediction of the surface roughness readings for the non-boundary cutting combinations (NBCC) with the aid of the known surface roughness readings of the boundary cutting combinations (BCC). The following simulation involved the use of the predicted outputs from the NBCC to recover the surface roughness readings for the boundary cutting combinations (BCC). The simulation trial for the NBCC attained a state of total stability in the 7th iteration i.e. a point where the actual and desired roughness readings are equal such that error is minimized to zero by using a set of dynamic weights generated in every following simulation trial. A comparative study among the three methods showed that the proposed difference analysis technique with adaptive weight from feedback control, produced a much accurate output as against the abductive and regression analysis techniques presented in this.

Keywords: Difference Analysis, Surface Roughness; Mesh- Analysis, Feedback control, Adaptive weight, Boundary Element

Procedia PDF Downloads 621
2172 The Design of a Vehicle Traffic Flow Prediction Model for a Gauteng Freeway Based on an Ensemble of Multi-Layer Perceptron

Authors: Tebogo Emma Makaba, Barnabas Ndlovu Gatsheni

Abstract:

The cities of Johannesburg and Pretoria both located in the Gauteng province are separated by a distance of 58 km. The traffic queues on the Ben Schoeman freeway which connects these two cities can stretch for almost 1.5 km. Vehicle traffic congestion impacts negatively on the business and the commuter’s quality of life. The goal of this paper is to identify variables that influence the flow of traffic and to design a vehicle traffic prediction model, which will predict the traffic flow pattern in advance. The model will unable motorist to be able to make appropriate travel decisions ahead of time. The data used was collected by Mikro’s Traffic Monitoring (MTM). Multi-Layer perceptron (MLP) was used individually to construct the model and the MLP was also combined with Bagging ensemble method to training the data. The cross—validation method was used for evaluating the models. The results obtained from the techniques were compared using predictive and prediction costs. The cost was computed using combination of the loss matrix and the confusion matrix. The predicted models designed shows that the status of the traffic flow on the freeway can be predicted using the following parameters travel time, average speed, traffic volume and day of month. The implications of this work is that commuters will be able to spend less time travelling on the route and spend time with their families. The logistics industry will save more than twice what they are currently spending.

Keywords: bagging ensemble methods, confusion matrix, multi-layer perceptron, vehicle traffic flow

Procedia PDF Downloads 344
2171 Springback Prediction for Sheet Metal Cold Stamping Using Convolutional Neural Networks

Authors: Lei Zhu, Nan Li

Abstract:

Cold stamping has been widely applied in the automotive industry for the mass production of a great range of automotive panels. Predicting the springback to ensure the dimensional accuracy of the cold-stamped components is a critical step. The main approaches for the prediction and compensation of springback in cold stamping include running Finite Element (FE) simulations and conducting experiments, which require forming process expertise and can be time-consuming and expensive for the design of cold stamping tools. Machine learning technologies have been proven and successfully applied in learning complex system behaviours using presentative samples. These technologies exhibit the promising potential to be used as supporting design tools for metal forming technologies. This study, for the first time, presents a novel application of a Convolutional Neural Network (CNN) based surrogate model to predict the springback fields for variable U-shape cold bending geometries. A dataset is created based on the U-shape cold bending geometries and the corresponding FE simulations results. The dataset is then applied to train the CNN surrogate model. The result shows that the surrogate model can achieve near indistinguishable full-field predictions in real-time when compared with the FE simulation results. The application of CNN in efficient springback prediction can be adopted in industrial settings to aid both conceptual and final component designs for designers without having manufacturing knowledge.

Keywords: springback, cold stamping, convolutional neural networks, machine learning

Procedia PDF Downloads 149
2170 Design and Burnback Analysis of Three Dimensional Modified Star Grain

Authors: Almostafa Abdelaziz, Liang Guozhu, Anwer Elsayed

Abstract:

The determination of grain geometry is an important and critical step in the design of solid propellant rocket motor. In this study, the design process involved parametric geometry modeling in CAD, MATLAB coding of performance prediction and 2D star grain ignition experiment. The 2D star grain burnback achieved by creating new surface via each web increment and calculating geometrical properties at each step. The 2D star grain is further modified to burn as a tapered 3D star grain. Zero dimensional method used to calculate the internal ballistic performance. Experimental and theoretical results were compared in order to validate the performance prediction of the solid rocket motor. The results show that the usage of 3D grain geometry will decrease the pressure inside the combustion chamber and enhance the volumetric loading ratio.

Keywords: burnback analysis, rocket motor, star grain, three dimensional grains

Procedia PDF Downloads 245
2169 Effects of Global Validity of Predictive Cues upon L2 Discourse Comprehension: Evidence from Self-paced Reading

Authors: Binger Lu

Abstract:

It remains unclear whether second language (L2) speakers could use discourse context cues to predict upcoming information as native speakers do during online comprehension. Some researchers propose that L2 learners may have a reduced ability to generate predictions during discourse processing. At the same time, there is evidence that discourse-level cues are weighed more heavily in L2 processing than in L1. Previous studies showed that L1 prediction is sensitive to the global validity of predictive cues. The current study aims to explore whether and to what extent L2 learners can dynamically and strategically adjust their prediction in accord with the global validity of predictive cues in L2 discourse comprehension as native speakers do. In a self-paced reading experiment, Chinese native speakers (N=128), C-E bilinguals (N=128), and English native speakers (N=128) read high-predictable (e.g., Jimmy felt thirsty after running. He wanted to get some water from the refrigerator.) and low-predictable (e.g., Jimmy felt sick this morning. He wanted to get some water from the refrigerator.) discourses in two-sentence frames. The global validity of predictive cues was manipulated by varying the ratio of predictable (e.g., Bill stood at the door. He opened it with the key.) and unpredictable fillers (e.g., Bill stood at the door. He opened it with the card.), such that across conditions, the predictability of the final word of the fillers ranged from 100% to 0%. The dependent variable was reading time on the critical region (the target word and the following word), analyzed with linear mixed-effects models in R. C-E bilinguals showed reliable prediction across all validity conditions (β = -35.6 ms, SE = 7.74, t = -4.601, p< .001), and Chinese native speakers showed significant effect (β = -93.5 ms, SE = 7.82, t = -11.956, p< .001) in two of the four validity conditions (namely, the High-validity and MedLow conditions, where fillers ended with predictable words in 100% and 25% cases respectively), whereas English native speakers didn’t predict at all (β = -2.78 ms, SE = 7.60, t = -.365, p = .715). There was neither main effect (χ^²(3) = .256, p = .968) nor interaction (Predictability: Background: Validity, χ^²(3) = 1.229, p = .746; Predictability: Validity, χ^²(3) = 2.520, p = .472; Background: Validity, χ^²(3) = 1.281, p = .734) of Validity with speaker groups. The results suggest that prediction occurs in L2 discourse processing but to a much less extent in L1, witha significant effect in some conditions of L1 Chinese and anull effect in L1 English processing, consistent with the view that L2 speakers are more sensitive to discourse cues compared with L1 speakers. Additionally, the pattern of L1 and L2 predictive processing was not affected by the global validity of predictive cues. C-E bilinguals’ predictive processing could be partly transferred from their L1, as prior research showed that discourse information played a more significant role in L1 Chinese processing.

Keywords: bilingualism, discourse processing, global validity, prediction, self-paced reading

Procedia PDF Downloads 138
2168 Predicting National Football League (NFL) Match with Score-Based System

Authors: Marcho Setiawan Handok, Samuel S. Lemma, Abdoulaye Fofana, Naseef Mansoor

Abstract:

This paper is proposing a method to predict the outcome of the National Football League match with data from 2019 to 2022 and compare it with other popular models. The model uses open-source statistical data of each team, such as passing yards, rushing yards, fumbles lost, and scoring. Each statistical data has offensive and defensive. For instance, a data set of anticipated values for a specific matchup is created by comparing the offensive passing yards obtained by one team to the defensive passing yards given by the opposition. We evaluated the model’s performance by contrasting its result with those of established prediction algorithms. This research is using a neural network to predict the score of a National Football League match and then predict the winner of the game.

Keywords: game prediction, NFL, football, artificial neural network

Procedia PDF Downloads 84
2167 Role of von Willebrand Factor Antigen as Non-Invasive Biomarker for the Prediction of Portal Hypertensive Gastropathy in Patients with Liver Cirrhosis

Authors: Mohamed El Horri, Amine Mouden, Reda Messaoudi, Mohamed Chekkal, Driss Benlaldj, Malika Baghdadi, Lahcene Benmahdi, Fatima Seghier

Abstract:

Background/aim: Recently, the Von Willebrand factor antigen (vWF-Ag)has been identified as a new marker of portal hypertension (PH) and its complications. Few studies talked about its role in the prediction of esophageal varices. VWF-Ag is considered a non-invasive approach, In order to avoid the endoscopic burden, cost, drawbacks, unpleasant and repeated examinations to the patients. In our study, we aimed to evaluate the ability of this marker in the prediction of another complication of portal hypertension, which is portal hypertensive gastropathy (PHG), the one that is diagnosed also by endoscopic tools. Patients and methods: It is about a prospective study, which include 124 cirrhotic patients with no history of bleeding who underwent screening endoscopy for PH-related complications like esophageal varices (EVs) and PHG. Routine biological tests were performed as well as the VWF-Ag testing by both ELFA and Immunoturbidimetric techniques. The diagnostic performance of our marker was assessed using sensitivity, specificity, positive predictive value, negative predictive value, accuracy, and receiver operating characteristic curves. Results: 124 patients were enrolled in this study, with a mean age of 58 years [CI: 55 – 60 years] and a sex ratio of 1.17. Viral etiologies were found in 50% of patients. Screening endoscopy revealed the presence of PHG in 20.2% of cases, while for EVsthey were found in 83.1% of cases. VWF-Ag levels, were significantly increased in patients with PHG compared to those who have not: 441% [CI: 375 – 506], versus 279% [CI: 253 – 304], respectively (p <0.0001). Using the area under the receiver operating characteristic curve (AUC), vWF-Ag was a good predictor for the presence of PHG. With a value higher than 320% and an AUC of 0.824, VWF-Ag had an 84% sensitivity, 74% specificity, 44.7% positive predictive value, 94.8% negative predictive value, and 75.8% diagnostic accuracy. Conclusion: VWF-Ag is a good non-invasive low coast marker for excluding the presence of PHG in patients with liver cirrhosis. Using this marker as part of a selective screening strategy might reduce the need for endoscopic screening and the coast of the management of these kinds of patients.

Keywords: von willebrand factor, portal hypertensive gastropathy, prediction, liver cirrhosis

Procedia PDF Downloads 205
2166 Stock Price Prediction with 'Earnings' Conference Call Sentiment

Authors: Sungzoon Cho, Hye Jin Lee, Sungwhan Jeon, Dongyoung Min, Sungwon Lyu

Abstract:

Major public corporations worldwide use conference calls to report their quarterly earnings. These 'earnings' conference calls allow for questions from stock analysts. We investigated if it is possible to identify sentiment from the call script and use it to predict stock price movement. We analyzed call scripts from six companies, two each from Korea, China and Indonesia during six years 2011Q1 – 2017Q2. Random forest with Frequency-based sentiment scores using Loughran MacDonald Dictionary did better than control model with only financial indicators. When the stock prices went up 20 days from earnings release, our model predicted correctly 77% of time. When the model predicted 'up,' actual stock prices went up 65% of time. This preliminary result encourages us to investigate advanced sentiment scoring methodologies such as topic modeling, auto-encoder, and word2vec variants.

Keywords: earnings call script, random forest, sentiment analysis, stock price prediction

Procedia PDF Downloads 292
2165 Forecasting Direct Normal Irradiation at Djibouti Using Artificial Neural Network

Authors: Ahmed Kayad Abdourazak, Abderafi Souad, Zejli Driss, Idriss Abdoulkader Ibrahim

Abstract:

In this paper Artificial Neural Network (ANN) is used to predict the solar irradiation in Djibouti for the first Time that is useful to the integration of Concentrating Solar Power (CSP) and sites selections for new or future solar plants as part of solar energy development. An ANN algorithm was developed to establish a forward/reverse correspondence between the latitude, longitude, altitude and monthly solar irradiation. For this purpose the German Aerospace Centre (DLR) data of eight Djibouti sites were used as training and testing in a standard three layers network with the back propagation algorithm of Lavenber-Marquardt. Results have shown a very good agreement for the solar irradiation prediction in Djibouti and proves that the proposed approach can be well used as an efficient tool for prediction of solar irradiation by providing so helpful information concerning sites selection, design and planning of solar plants.

Keywords: artificial neural network, solar irradiation, concentrated solar power, Lavenberg-Marquardt

Procedia PDF Downloads 354
2164 Applying the Regression Technique for ‎Prediction of the Acute Heart Attack ‎

Authors: Paria Soleimani, Arezoo Neshati

Abstract:

Myocardial infarction is one of the leading causes of ‎death in the world. Some of these deaths occur even before the patient ‎reaches the hospital. Myocardial infarction occurs as a result of ‎impaired blood supply. Because the most of these deaths are due to ‎coronary artery disease, hence the awareness of the warning signs of a ‎heart attack is essential. Some heart attacks are sudden and intense, but ‎most of them start slowly, with mild pain or discomfort, then early ‎detection and successful treatment of these symptoms is vital to save ‎them. Therefore, importance and usefulness of a system designing to ‎assist physicians in the early diagnosis of the acute heart attacks is ‎obvious.‎ The purpose of this study is to determine how well a predictive ‎model would perform based on the only patient-reportable clinical ‎history factors, without using diagnostic tests or physical exams. This ‎type of the prediction model might have application outside of the ‎hospital setting to give accurate advice to patients to influence them to ‎seek care in appropriate situations. For this purpose, the data were ‎collected on 711 heart patients in Iran hospitals. 28 attributes of clinical ‎factors can be reported by patients; were studied. Three logistic ‎regression models were made on the basis of the 28 features to predict ‎the risk of heart attacks. The best logistic regression model in terms of ‎performance had a C-index of 0.955 and with an accuracy of 94.9%. ‎The variables, severe chest pain, back pain, cold sweats, shortness of ‎breath, nausea, and vomiting were selected as the main features.‎

Keywords: Coronary heart disease, Acute heart attacks, Prediction, Logistic ‎regression‎

Procedia PDF Downloads 449
2163 A Convolution Neural Network PM-10 Prediction System Based on a Dense Measurement Sensor Network in Poland

Authors: Piotr A. Kowalski, Kasper Sapala, Wiktor Warchalowski

Abstract:

PM10 is a suspended dust that primarily has a negative effect on the respiratory system. PM10 is responsible for attacks of coughing and wheezing, asthma or acute, violent bronchitis. Indirectly, PM10 also negatively affects the rest of the body, including increasing the risk of heart attack and stroke. Unfortunately, Poland is a country that cannot boast of good air quality, in particular, due to large PM concentration levels. Therefore, based on the dense network of Airly sensors, it was decided to deal with the problem of prediction of suspended particulate matter concentration. Due to the very complicated nature of this issue, the Machine Learning approach was used. For this purpose, Convolution Neural Network (CNN) neural networks have been adopted, these currently being the leading information processing methods in the field of computational intelligence. The aim of this research is to show the influence of particular CNN network parameters on the quality of the obtained forecast. The forecast itself is made on the basis of parameters measured by Airly sensors and is carried out for the subsequent day, hour after hour. The evaluation of learning process for the investigated models was mostly based upon the mean square error criterion; however, during the model validation, a number of other methods of quantitative evaluation were taken into account. The presented model of pollution prediction has been verified by way of real weather and air pollution data taken from the Airly sensor network. The dense and distributed network of Airly measurement devices enables access to current and archival data on air pollution, temperature, suspended particulate matter PM1.0, PM2.5, and PM10, CAQI levels, as well as atmospheric pressure and air humidity. In this investigation, PM2.5, and PM10, temperature and wind information, as well as external forecasts of temperature and wind for next 24h served as inputted data. Due to the specificity of the CNN type network, this data is transformed into tensors and then processed. This network consists of an input layer, an output layer, and many hidden layers. In the hidden layers, convolutional and pooling operations are performed. The output of this system is a vector containing 24 elements that contain prediction of PM10 concentration for the upcoming 24 hour period. Over 1000 models based on CNN methodology were tested during the study. During the research, several were selected out that give the best results, and then a comparison was made with the other models based on linear regression. The numerical tests carried out fully confirmed the positive properties of the presented method. These were carried out using real ‘big’ data. Models based on the CNN technique allow prediction of PM10 dust concentration with a much smaller mean square error than currently used methods based on linear regression. What's more, the use of neural networks increased Pearson's correlation coefficient (R²) by about 5 percent compared to the linear model. During the simulation, the R² coefficient was 0.92, 0.76, 0.75, 0.73, and 0.73 for 1st, 6th, 12th, 18th, and 24th hour of prediction respectively.

Keywords: air pollution prediction (forecasting), machine learning, regression task, convolution neural networks

Procedia PDF Downloads 149
2162 A Machine Learning Model for Dynamic Prediction of Chronic Kidney Disease Risk Using Laboratory Data, Non-Laboratory Data, and Metabolic Indices

Authors: Amadou Wurry Jallow, Adama N. S. Bah, Karamo Bah, Shih-Ye Wang, Kuo-Chung Chu, Chien-Yeh Hsu

Abstract:

Chronic kidney disease (CKD) is a major public health challenge with high prevalence, rising incidence, and serious adverse consequences. Developing effective risk prediction models is a cost-effective approach to predicting and preventing complications of chronic kidney disease (CKD). This study aimed to develop an accurate machine learning model that can dynamically identify individuals at risk of CKD using various kinds of diagnostic data, with or without laboratory data, at different follow-up points. Creatinine is a key component used to predict CKD. These models will enable affordable and effective screening for CKD even with incomplete patient data, such as the absence of creatinine testing. This retrospective cohort study included data on 19,429 adults provided by a private research institute and screening laboratory in Taiwan, gathered between 2001 and 2015. Univariate Cox proportional hazard regression analyses were performed to determine the variables with high prognostic values for predicting CKD. We then identified interacting variables and grouped them according to diagnostic data categories. Our models used three types of data gathered at three points in time: non-laboratory, laboratory, and metabolic indices data. Next, we used subgroups of variables within each category to train two machine learning models (Random Forest and XGBoost). Our machine learning models can dynamically discriminate individuals at risk for developing CKD. All the models performed well using all three kinds of data, with or without laboratory data. Using only non-laboratory-based data (such as age, sex, body mass index (BMI), and waist circumference), both models predict chronic kidney disease as accurately as models using laboratory and metabolic indices data. Our machine learning models have demonstrated the use of different categories of diagnostic data for CKD prediction, with or without laboratory data. The machine learning models are simple to use and flexible because they work even with incomplete data and can be applied in any clinical setting, including settings where laboratory data is difficult to obtain.

Keywords: chronic kidney disease, glomerular filtration rate, creatinine, novel metabolic indices, machine learning, risk prediction

Procedia PDF Downloads 105
2161 Prediction of Dubai Financial Market Stocks Movement Using K-Nearest Neighbor and Support Vector Regression

Authors: Abdulla D. Alblooshi

Abstract:

The stock market is a representation of human behavior and psychology, such as fear, greed, and discipline. Those are manifested in the form of price movements during the trading sessions. Therefore, predicting the stock movement and prices is a challenging effort. However, those trading sessions produce a large amount of data that can be utilized to train an AI agent for the purpose of predicting the stock movement. Predicting the stock market price action will be advantageous. In this paper, the stock movement data of three DFM listed stocks are studied using historical price movements and technical indicators value and used to train an agent using KNN and SVM methods to predict the future price movement. MATLAB Toolbox and a simple script is written to process and classify the information and output the prediction. It will also compare the different learning methods and parameters s using metrics like RMSE, MAE, and R².

Keywords: KNN, ANN, style, SVM, stocks, technical indicators, RSI, MACD, moving averages, RMSE, MAE

Procedia PDF Downloads 171
2160 Neuronal Networks for the Study of the Effects of Cosmic Rays on Climate Variations

Authors: Jossitt Williams Vargas Cruz, Aura Jazmín Pérez Ríos

Abstract:

The variations of solar dynamics have become a relevant topic of study due to the effects of climate changes generated on the earth. One of the most disconcerting aspects is the variability that the sun has on the climate is the role played by sunspots (extra-atmospheric variable) in the modulation of the Cosmic Rays CR (extra-atmospheric variable). CRs influence the earth's climate by affecting cloud formation (atmospheric variable), and solar cycle influence is associated with the presence of solar storms, and the magnetic activity is greater, resulting in less CR entering the earth's atmosphere. The different methods of climate prediction in Colombia do not take into account the extra-atmospheric variables. Therefore, correlations between atmospheric and extra-atmospheric variables were studied in order to implement a Python code based on neural networks to make the prediction of the extra-atmospheric variable with the highest correlation.

Keywords: correlations, cosmic rays, sun, sunspots and variations.

Procedia PDF Downloads 74
2159 A Wall Law for Two-Phase Turbulent Boundary Layers

Authors: Dhahri Maher, Aouinet Hana

Abstract:

The presence of bubbles in the boundary layer introduces corrections into the log law, which must be taken into account. In this work, a logarithmic wall law was presented for bubbly two phase flows. The wall law presented in this work was based on the postulation of additional turbulent viscosity associated with bubble wakes in the boundary layer. The presented wall law contained empirical constant accounting both for shear induced turbulence interaction and for non-linearity of bubble. This constant was deduced from experimental data. The wall friction prediction achieved with the wall law was compared to the experimental data, in the case of a turbulent boundary layer developing on a vertical flat plate in the presence of millimetric bubbles. A very good agreement between experimental and numerical wall friction prediction was verified. The agreement was especially noticeable for the low void fraction when bubble induced turbulence plays a significant role.

Keywords: bubbly flows, log law, boundary layer, CFD

Procedia PDF Downloads 278
2158 Learning Dynamic Representations of Nodes in Temporally Variant Graphs

Authors: Sandra Mitrovic, Gaurav Singh

Abstract:

In many industries, including telecommunications, churn prediction has been a topic of active research. A lot of attention has been drawn on devising the most informative features, and this area of research has gained even more focus with spread of (social) network analytics. The call detail records (CDRs) have been used to construct customer networks and extract potentially useful features. However, to the best of our knowledge, no studies including network features have yet proposed a generic way of representing network information. Instead, ad-hoc and dataset dependent solutions have been suggested. In this work, we build upon a recently presented method (node2vec) to obtain representations for nodes in observed network. The proposed approach is generic and applicable to any network and domain. Unlike node2vec, which assumes a static network, we consider a dynamic and time-evolving network. To account for this, we propose an approach that constructs the feature representation of each node by generating its node2vec representations at different timestamps, concatenating them and finally compressing using an auto-encoder-like method in order to retain reasonably long and informative feature vectors. We test the proposed method on churn prediction task in telco domain. To predict churners at timestamp ts+1, we construct training and testing datasets consisting of feature vectors from time intervals [t1, ts-1] and [t2, ts] respectively, and use traditional supervised classification models like SVM and Logistic Regression. Observed results show the effectiveness of proposed approach as compared to ad-hoc feature selection based approaches and static node2vec.

Keywords: churn prediction, dynamic networks, node2vec, auto-encoders

Procedia PDF Downloads 314
2157 Artificial Intelligence Methods in Estimating the Minimum Miscibility Pressure Required for Gas Flooding

Authors: Emad A. Mohammed

Abstract:

Utilizing the capabilities of Data Mining and Artificial Intelligence in the prediction of the minimum miscibility pressure (MMP) required for multi-contact miscible (MCM) displacement of reservoir petroleum by hydrocarbon gas flooding using Fuzzy Logic models and Artificial Neural Network models will help a lot in giving accurate results. The factors affecting the (MMP) as it is proved from the literature and from the dataset are as follows: XC2-6: Intermediate composition in the oil-containing C2-6, CO2 and H2S, in mole %, XC1: Amount of methane in the oil (%),T: Temperature (°C), MwC7+: Molecular weight of C7+ (g/mol), YC2+: Mole percent of C2+ composition in injected gas (%), MwC2+: Molecular weight of C2+ in injected gas. Fuzzy Logic and Neural Networks have been used widely in prediction and classification, with relatively high accuracy, in different fields of study. It is well known that the Fuzzy Inference system can handle uncertainty within the inputs such as in our case. The results of this work showed that our proposed models perform better with higher performance indices than other emprical correlations.

Keywords: MMP, gas flooding, artificial intelligence, correlation

Procedia PDF Downloads 144
2156 Time Series Modelling and Prediction of River Runoff: Case Study of Karkheh River, Iran

Authors: Karim Hamidi Machekposhti, Hossein Sedghi, Abdolrasoul Telvari, Hossein Babazadeh

Abstract:

Rainfall and runoff phenomenon is a chaotic and complex outcome of nature which requires sophisticated modelling and simulation methods for explanation and use. Time Series modelling allows runoff data analysis and can be used as forecasting tool. In the paper attempt is made to model river runoff data and predict the future behavioural pattern of river based on annual past observations of annual river runoff. The river runoff analysis and predict are done using ARIMA model. For evaluating the efficiency of prediction to hydrological events such as rainfall, runoff and etc., we use the statistical formulae applicable. The good agreement between predicted and observation river runoff coefficient of determination (R2) display that the ARIMA (4,1,1) is the suitable model for predicting Karkheh River runoff at Iran.

Keywords: time series modelling, ARIMA model, river runoff, Karkheh River, CLS method

Procedia PDF Downloads 341
2155 Ensemble-Based SVM Classification Approach for miRNA Prediction

Authors: Sondos M. Hammad, Sherin M. ElGokhy, Mahmoud M. Fahmy, Elsayed A. Sallam

Abstract:

In this paper, an ensemble-based Support Vector Machine (SVM) classification approach is proposed. It is used for miRNA prediction. Three problems, commonly associated with previous approaches, are alleviated. These problems arise due to impose assumptions on the secondary structural of premiRNA, imbalance between the numbers of the laboratory checked miRNAs and the pseudo-hairpins, and finally using a training data set that does not consider all the varieties of samples in different species. We aggregate the predicted outputs of three well-known SVM classifiers; namely, Triplet-SVM, Virgo and Mirident, weighted by their variant features without any structural assumptions. An additional SVM layer is used in aggregating the final output. The proposed approach is trained and then tested with balanced data sets. The results of the proposed approach outperform the three base classifiers. Improved values for the metrics of 88.88% f-score, 92.73% accuracy, 90.64% precision, 96.64% specificity, 87.2% sensitivity, and the area under the ROC curve is 0.91 are achieved.

Keywords: MiRNAs, SVM classification, ensemble algorithm, assumption problem, imbalance data

Procedia PDF Downloads 349
2154 Study of the Use of Artificial Neural Networks in Islamic Finance

Authors: Kaoutar Abbahaddou, Mohammed Salah Chiadmi

Abstract:

The need to find a relevant way to predict the next-day price of a stock index is a real concern for many financial stakeholders and researchers. We have known across years the proliferation of several methods. Nevertheless, among all these methods, the most controversial one is a machine learning algorithm that claims to be reliable, namely neural networks. Thus, the purpose of this article is to study the prediction power of neural networks in the particular case of Islamic finance as it is an under-looked area. In this article, we will first briefly present a review of the literature regarding neural networks and Islamic finance. Next, we present the architecture and principles of artificial neural networks most commonly used in finance. Then, we will show its empirical application on two Islamic stock indexes. The accuracy rate would be used to measure the performance of the algorithm in predicting the right price the next day. As a result, we can conclude that artificial neural networks are a reliable method to predict the next-day price for Islamic indices as it is claimed for conventional ones.

Keywords: Islamic finance, stock price prediction, artificial neural networks, machine learning

Procedia PDF Downloads 237
2153 CD133 and CD44 - Stem Cell Markers for Prediction of Clinically Aggressive Form of Colorectal Cancer

Authors: Ognen Kostovski, Svetozar Antovic, Rubens Jovanovic, Irena Kostovska, Nikola Jankulovski

Abstract:

Introduction:Colorectal carcinoma (CRC) is one of the most common malignancies in the world. The cancer stem cell (CSC) markers are associated with aggressive cancer types and poor prognosis. The aim of study was to determine whether the expression of colorectal cancer stem cell markers CD133 and CD44 could be significant in prediction of clinically aggressive form of CRC. Materials and methods: Our study included ninety patients (n=90) with CRC. Patients were divided into two subgroups: with metatstatic CRC and non-metastatic CRC. Tumor samples were analyzed with standard histopathological methods, than was performed immunohistochemical analysis with monoclonal antibodies against CD133 and CD44 stem cell markers. Results: High coexpression of CD133 and CD44 was observed in 71.4% of patients with metastatic disease, compared to 37.9% in patients without metastases. Discordant expression of both markers was found in 8% of the subgroup with metastatic CRC, and in 13.4% of the subgroup without metastatic CRC. Statistical analyses showed a significant association of increased expression of CD133 and CD44 with the disease stage, T - category and N - nodal status. With multiple regression analysis the stage of disease was designate as a factor with the greatest statistically significant influence on expression of CD133 (p <0.0001) and CD44 (p <0.0001). Conclusion: Our results suggest that the coexpression of CD133 and CD44 have an important role in prediction of clinically aggressive form of CRC. Both stem cell markers can be routinely implemented in standard pathohistological diagnostics and can be useful markers for pre-therapeutic oncology screening.

Keywords: colorectal carcinoma, stem cells, CD133+, CD44+

Procedia PDF Downloads 150