Search results for: click prediction
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2301

1911 Big Data: Appearance and Disappearance

Authors: James Moir

Abstract:

The mainstay of Big Data is prediction, in that it allows practitioners, researchers, and policy analysts to predict trends based upon the analysis of large and varied sources of data. These range from changing social and political opinions to patterns in crime and consumer behaviour. Big Data has therefore shifted the criterion of success in science from causal explanations to predictive modelling and simulation. Nineteenth-century science sought to capture phenomena and to explain their appearance through causal mechanisms, while 20th-century science attempted to save the appearance and relinquish causal explanations. Now 21st-century science, in the form of Big Data, is concerned with the prediction of appearances and nothing more. However, this pulls social science back in the direction of a more rule- or law-governed reality model of science and away from a consideration of the internal nature of rules in relation to various practices. In effect, Big Data offers us no more than a world of surface appearance, and in doing so it makes any context-specific conceptual sensitivity disappear.

Keywords: big data, appearance, disappearance, surface, epistemology

Procedia PDF Downloads 419
1910 Addressing Security and Privacy Issues in a Smart Environment by Using Block-Chain as a Preemptive Technique

Authors: Shahbaz Pervez, Aljawharah Almuhana, Zahida Parveen, Samina Naz, Hira Tariq, Seyed Hosseini, Muhammad Awais Azam

Abstract:

With recent developments in cutting-edge technologies, the use of technology-oriented gadgets is increasing rapidly, and there is growing demand to carry out day-to-day tasks with their help. We are living in an era of technology where trends keep changing, and a race to introduce new gadgets has already begun. Smart cities are getting more popular with every passing day; city councils and governments are under enormous pressure to provide the latest services for their citizens and equip them with all the latest facilities. As a result, they are investing more in smart city infrastructure, providing services to their inhabitants with a single click from their smart devices. This trend is very exciting, but if a security breach occurs because of any weak link, the results could be catastrophic. This paper addresses potential security and privacy breaches and proposes a possible solution using blockchain technology in an IoT-enabled environment.

Keywords: blockchain, cybersecurity, DDOS, intrusion detection, IoT, RFID, smart devices security, smart services

Procedia PDF Downloads 117
1909 Prediction of Childbearing Orientations According to Couples' Sexual Review Component

Authors: Razieh Rezaeekalantari

Abstract:

Objective: The purpose of this study was to investigate the prediction of parenting orientations in terms of the components of couples' sexual review. Methods: This was a descriptive correlational study. The population consisted of 500 couples attending Sari Health Center. Two hundred and fifteen (215) people were selected randomly using the Krejcie-Morgan sample size table. For data collection, the childbearing orientations scale and the Multidimensional Sexual Self-Concept Questionnaire were used. Result: The mean and standard deviation were used for data analysis, and regression, correlation, and inferential statistics were used to test the research hypothesis. Conclusion: The findings indicate that there is not a significant relationship between the tendency to childbearing and the predictive value of sexual review (r = 0.84) with significant level (sig = 219.19) (P < 0.05). So, with 95% confidence, we conclude that there is not a meaningful relationship between sexual orientation and tendency to child-rearing.

Keywords: couples referring, health center, sexual review component, parenting orientations

Procedia PDF Downloads 218
1908 Sorghum Grains Grading for Food, Feed, and Fuel Using NIR Spectroscopy

Authors: Irsa Ejaz, Siyang He, Wei Li, Naiyue Hu, Chaochen Tang, Songbo Li, Meng Li, Boubacar Diallo, Guanghui Xie, Kang Yu

Abstract:

Background: Near-infrared spectroscopy (NIR) is a non-destructive, fast, and low-cost method to measure the grain quality of different cereals. Previously reported NIR model calibrations using the whole grain spectra had moderate accuracy. Improved predictions are achievable by using the spectra of whole grains, when compared with the use of spectra collected from the flour samples. However, the feasibility of determining the critical biochemicals related to the classification of food, feed, and fuel products has not been adequately investigated. Objectives: To evaluate the feasibility of using NIRS and the influence of four sample types (whole grains, flours, hulled grain flours, and hull-less grain flours) on the prediction of chemical components to improve the grain sorting efficiency for human food, animal feed, and biofuel. Methods: NIR was applied in this study to determine eight biochemicals in four types of sorghum samples: hulled grain flours, hull-less grain flours, whole grains, and grain flours. A total of 20 sorghum hybrids were selected from two locations in China. Once the NIR spectral data and the wet-chemically measured biochemical data had been collected, partial least squares regression (PLSR) was used to construct the prediction models. Results: The results showed that sorghum grain morphology and sample format affected the prediction of biochemicals. Using NIR data of grain flours generally improved the prediction compared with the use of NIR data of whole grains. In addition, using the spectra of whole grains enabled comparable predictions, which are recommended when a non-destructive and rapid analysis is required. Compared with the hulled grain flours, hull-less grain flours allowed for improved predictions for tannin, cellulose, and hemicellulose using NIR data.
Conclusion: The established PLSR models could enable food, feed, and fuel producers to efficiently evaluate a large number of samples by predicting the required biochemical components in sorghum grains without destruction.

Keywords: FT-NIR, sorghum grains, biochemical composition, food, feed, fuel, PLSR

Procedia PDF Downloads 66
1907 Analytical Study of Data Mining Techniques for Software Quality Assurance

Authors: Mariam Bibi, Rubab Mehboob, Mehreen Sirshar

Abstract:

Satisfying customer requirements is the ultimate goal of producing or developing any product. The quality of a product is decided on the basis of the level of customer satisfaction. The survey reports different techniques that enhance product quality through software defect prediction and by locating missing software requirements. Some mining techniques were proposed to assess individual performance indicators in a collaborative environment in order to reduce errors at the individual level. The basic intention is to produce a product with zero or few defects, and thereby the best possible product in terms of quality. The survey analysis covers techniques such as genetic algorithms, artificial neural networks, classification and clustering techniques, and decision trees. The analysis found that these techniques contributed much to the improvement and enhancement of product quality.

Keywords: data mining, defect prediction, missing requirements, software quality

Procedia PDF Downloads 463
1906 Frequent Itemset Mining Using Rough-Sets

Authors: Usman Qamar, Younus Javed

Abstract:

Frequent pattern mining is the process of finding a pattern (a set of items, subsequences, substructures, etc.) that occurs frequently in a data set. It was proposed in the context of frequent itemsets and association rule mining. Frequent pattern mining is used to find inherent regularities in data, answering questions such as which products are often purchased together. Its applications include basket data analysis, cross-marketing, catalog design, sale campaign analysis, Web log (click stream) analysis, and DNA sequence analysis. However, one of the bottlenecks of frequent itemset mining is that, as the data grow, the time and resources required to mine them increase at an exponential rate. In this investigation, a new algorithm is proposed which can be used as a pre-processor for frequent itemset mining. FASTER (FeAture SelecTion using Entropy and Rough sets) is a hybrid pre-processor algorithm which utilizes entropy and rough-sets to carry out record reduction and feature (attribute) selection respectively. FASTER for frequent itemset mining can produce a speed-up of 3.1 times compared with the original algorithm while maintaining an accuracy of 71%.
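As a rough illustration of what "frequent itemset" means in the abstract above, the following minimal sketch counts the support of every itemset up to a given size by brute force. It is not the FASTER algorithm and not the authors' code; the function name, the `max_size` cut-off, and the toy baskets are all assumptions made for illustration. Enumerating every combination is exactly the cost that grows quickly with the data, which is the bottleneck the abstract describes.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets with support >= min_support.

    Brute-force enumeration (hypothetical helper, for illustration only):
    every combination of up to `max_size` items in each basket is counted.
    """
    n = len(transactions)
    counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # de-duplicate and fix an item order
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {
        itemset: count / n
        for itemset, count in counts.items()
        if count / n >= min_support
    }


# Toy basket data (invented for this example).
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
result = frequent_itemsets(baskets, min_support=0.5)
```

Here each single item appears in three of the four baskets and each pair in two, so all six itemsets meet the 0.5 support threshold.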

Keywords: rough-sets, classification, feature selection, entropy, outliers, frequent itemset mining

Procedia PDF Downloads 435
1905 Cardiovascular Disease Prediction Using Machine Learning Approaches

Authors: P. Halder, A. Zaman

Abstract:

It is estimated that heart disease accounts for one in ten deaths worldwide. According to the World Health Organization, heart disease is among the leading causes of death in the United States. Cardiovascular diseases (CVDs) account for one in four U.S. deaths, according to the Centers for Disease Control and Prevention (CDC). According to statistics, women are more likely than men to die from heart disease as a result of strokes. A 50% increase in men's mortality was reported by the World Health Organization in 2009. The consequences of cardiovascular disease are severe. The causes of heart disease include diabetes, high blood pressure, high cholesterol, abnormal pulse rates, etc. Machine learning (ML) can be used to make predictions and decisions in the healthcare industry. Thus, scientists have turned to modern technologies like machine learning and data mining to predict diseases. The disease prediction is based on four algorithms. Compared to other boosting methods, AdaBoost is much more accurate.

Keywords: heart disease, cardiovascular disease, coronary artery disease, feature selection, random forest, AdaBoost, SVM, decision tree

Procedia PDF Downloads 151
1904 Prediction of Sepsis Illness from Patients Vital Signs Using Long Short-Term Memory Network and Dynamic Analysis

Authors: Marcio Freire Cruz, Naoaki Ono, Shigehiko Kanaya, Carlos Arthur Mattos Teixeira Cavalcante

Abstract:

Systems that record patient care information, known as Electronic Medical Records (EMRs), and those that monitor patients' vital signs, such as heart rate, body temperature, and blood pressure, have been extremely valuable for the effectiveness of patient treatment. Several studies have used data from EMRs and patients' vital signs to predict illnesses. Among them, we highlight those that aim to predict, classify, or at least identify patterns of sepsis in patients under vital signs monitoring. Sepsis is an organ dysfunction caused by a patient's dysregulated response to an infection, and it affects millions of people worldwide. Early detection of sepsis is expected to provide a significant improvement in its treatment. Previous works usually combined medical, statistical, mathematical, and computational models to develop detection methods for early prediction, aiming for higher accuracy with the smallest number of variables. Among other techniques, studies using survival analysis, specialist systems, machine learning, and deep learning have reported strong results. In our research, patients are modeled as points moving each hour in an n-dimensional space, where n is the number of vital signs (variables). These points can reach a sepsis target point after some time. For now, the sepsis target point was calculated using the median of all patients' variables at sepsis onset. From these points, we calculate for each hour the position vector, the first derivative (velocity vector), and the second derivative (acceleration vector) of the variables to evaluate their behavior. We then construct a prediction model based on a Long Short-Term Memory (LSTM) network, including these derivatives as explanatory variables. The accuracy of the prediction 6 hours before the time of sepsis reached 83.24% when considering only the vital signs, and 94.96% when the position, velocity, and acceleration vectors were included.
The data are being collected from the Medical Information Mart for Intensive Care (MIMIC) database, a public database that contains vital signs, laboratory test results, observations, notes, and so on, from more than 60,000 patients.
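The velocity and acceleration vectors described in this abstract can be approximated from hourly readings with simple finite differences. The sketch below is one plausible way to do that and is not taken from the paper: the one-hour step, the choice of plain successive differences, the function name, and the sample readings are all assumptions.

```python
def finite_differences(positions):
    """Approximate first and second derivatives of hourly vital-sign vectors.

    `positions` is a list of per-hour vectors (lists of floats), one entry
    per vital sign. With a one-hour step, the difference between successive
    vectors approximates velocity, and the difference between successive
    velocities approximates acceleration. (Hypothetical helper.)
    """
    velocities = [
        [curr_v - prev_v for prev_v, curr_v in zip(prev, curr)]
        for prev, curr in zip(positions, positions[1:])
    ]
    accelerations = [
        [curr_v - prev_v for prev_v, curr_v in zip(prev, curr)]
        for prev, curr in zip(velocities, velocities[1:])
    ]
    return velocities, accelerations


# Invented readings: heart rate (bpm), temperature (deg C), systolic BP (mmHg).
hourly = [
    [80.0, 37.0, 120.0],
    [84.0, 37.5, 118.0],
    [90.0, 38.5, 114.0],
]
vel, acc = finite_differences(hourly)
# vel holds one vector per hour-to-hour step; acc has one entry fewer.
```

In a pipeline like the one the abstract outlines, the position, velocity, and acceleration vectors for each hour would presumably be concatenated into the feature vector fed to the LSTM; that wiring is not shown here.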

Keywords: dynamic analysis, long short-term memory, prediction, sepsis

Procedia PDF Downloads 124
1903 Personalized Infectious Disease Risk Prediction System: A Knowledge Model

Authors: Retno A. Vinarti, Lucy M. Hederman

Abstract:

This research describes a knowledge model for a system which gives personalized alerts to users about infectious disease risks in the context of weather, location, and time. The knowledge model is based on established epidemiological concepts augmented by information gleaned from infection-related data repositories. Existing disease risk prediction research focuses more on using raw historical data and yields seasonal patterns of infectious disease risk emergence. This research incorporates both data and epidemiological concepts gathered from the Atlas of Human Infectious Disease (AHID) and the Centers for Disease Control and Prevention (CDC) as the basic reasoning for infectious disease risk prediction. Using the CommonKADS methodology, the disease risk prediction task is treated as an assignment synthetic task, proceeding from knowledge identification through specification and refinement to implementation. First, knowledge is gathered from AHID, primarily from the epidemiology and risk group chapters for each infectious disease. The result of this stage is five major elements (Person, Infectious Disease, Weather, Location, and Time) and their properties. At the knowledge specification stage, the initial tree model of each element and detailed relationships are produced. This research also includes a validation step as part of knowledge refinement: on the basis that the best model is formed using the most common features, Frequency-based Selection (FBS) is applied. The portion of the Infectious Disease risk model relating to Person comes out strongest, with Location next, and Weather weaker. For the Person attribute, Age is the strongest, Activity and Habits are moderate, and Blood type is weakest. For the Location attribute, the General category (e.g. continents, region, country, and island) comes out much stronger than the Specific category (i.e. terrain feature). For the Weather attribute, the Less Precise category (i.e. season) comes out stronger than the Precise category (i.e. exact temperature or humidity interval).
However, given that some infectious diseases are significantly more serious than others, a frequency-based metric may not be appropriate. Future work will incorporate epidemiological measurements of disease seriousness (e.g. odds ratio, hazard ratio, and fatality rate) into the validation metrics. This research is limited to modelling existing knowledge about epidemiology and chain of infection concepts. A further step, verification in the knowledge refinement stage, might cause minor changes to the shape of the tree.

Keywords: epidemiology, knowledge modelling, infectious disease, prediction, risk

Procedia PDF Downloads 241
1902 Surface Roughness Prediction Using Numerical Scheme and Adaptive Control

Authors: Michael K.O. Ayomoh, Khaled A. Abou-El-Hossein, Sameh F.M. Ghobashy

Abstract:

This paper proposes a numerical modelling scheme for surface roughness prediction. The approach is premised on a 3D difference analysis method enhanced with a feedback control loop in which a set of adaptive weights is generated. The surface roughness values utilized in this paper were adapted from [1]. Their experiments were carried out using S55C high carbon steel. A comparison was further carried out between the proposed technique and those utilized in [1]. The experimental design has three cutting parameters, namely depth of cut, feed rate, and cutting speed, with a sample space of twenty-seven experiments. The simulation trials, conducted in MATLAB, fall into two sub-classes. The first predicts the surface roughness readings for the non-boundary cutting combinations (NBCC) with the aid of the known surface roughness readings of the boundary cutting combinations (BCC). The second uses the predicted outputs from the NBCC to recover the surface roughness readings for the BCC. The simulation trial for the NBCC attained a state of total stability in the 7th iteration, i.e. a point where the actual and desired roughness readings are equal, such that the error is minimized to zero by using a set of dynamic weights generated in every subsequent simulation trial. A comparative study among the three methods showed that the proposed difference analysis technique, with adaptive weights from feedback control, produced a much more accurate output than the abductive and regression analysis techniques presented in this.

Keywords: difference analysis, surface roughness, mesh analysis, feedback control, adaptive weight, boundary element

Procedia PDF Downloads 620
1901 The Design of a Vehicle Traffic Flow Prediction Model for a Gauteng Freeway Based on an Ensemble of Multi-Layer Perceptron

Authors: Tebogo Emma Makaba, Barnabas Ndlovu Gatsheni

Abstract:

The cities of Johannesburg and Pretoria, both located in the Gauteng province, are separated by a distance of 58 km. The traffic queues on the Ben Schoeman freeway, which connects these two cities, can stretch for almost 1.5 km. Vehicle traffic congestion impacts negatively on business and on commuters' quality of life. The goal of this paper is to identify variables that influence the flow of traffic and to design a vehicle traffic prediction model which will predict the traffic flow pattern in advance. The model will enable motorists to make appropriate travel decisions ahead of time. The data used were collected by Mikro's Traffic Monitoring (MTM). A Multi-Layer Perceptron (MLP) was used individually to construct the model, and the MLP was also combined with the Bagging ensemble method to train on the data. The cross-validation method was used for evaluating the models. The results obtained from the techniques were compared using predictive and prediction costs. The cost was computed using a combination of the loss matrix and the confusion matrix. The prediction models designed show that the status of the traffic flow on the freeway can be predicted using the following parameters: travel time, average speed, traffic volume, and day of month. The implication of this work is that commuters will be able to spend less time travelling on the route and more time with their families. The logistics industry will save more than twice what it is currently spending.
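The abstract does not say exactly how the loss matrix and the confusion matrix were combined. One common choice, assumed here, is to multiply them element-wise and average over all predictions, giving an expected cost per prediction. The function name, class labels, and numbers in this sketch are invented for illustration.

```python
def expected_cost(confusion, loss):
    """Average misclassification cost per prediction.

    `confusion[i][j]` is the number of cases with actual class i predicted
    as class j; `loss[i][j]` is the cost assigned to that outcome. The
    element-wise product summed over all cells, divided by the number of
    cases, is an assumed (but common) way to combine the two matrices.
    """
    total_cases = sum(sum(row) for row in confusion)
    weighted = sum(
        count * cost
        for count_row, cost_row in zip(confusion, loss)
        for count, cost in zip(count_row, cost_row)
    )
    return weighted / total_cases


# Invented two-class example: rows are actual (free-flowing, congested),
# columns are predicted (free-flowing, congested).
confusion = [[50, 10],
             [5, 35]]
# Missing real congestion is assumed to cost four times a false alarm.
loss = [[0, 1],
        [4, 0]]
cost = expected_cost(confusion, loss)  # (10*1 + 5*4) / 100
```

Under this convention, two models with the same accuracy can still differ in cost if one of them makes more of the expensive kind of error.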

Keywords: bagging ensemble methods, confusion matrix, multi-layer perceptron, vehicle traffic flow

Procedia PDF Downloads 343
1900 Springback Prediction for Sheet Metal Cold Stamping Using Convolutional Neural Networks

Authors: Lei Zhu, Nan Li

Abstract:

Cold stamping has been widely applied in the automotive industry for the mass production of a great range of automotive panels. Predicting the springback to ensure the dimensional accuracy of the cold-stamped components is a critical step. The main approaches for the prediction and compensation of springback in cold stamping include running Finite Element (FE) simulations and conducting experiments, which require forming process expertise and can be time-consuming and expensive for the design of cold stamping tools. Machine learning technologies have been proven and successfully applied in learning complex system behaviours using representative samples. These technologies show promising potential as supporting design tools for metal forming technologies. This study, for the first time, presents a novel application of a Convolutional Neural Network (CNN) based surrogate model to predict the springback fields for variable U-shape cold bending geometries. A dataset is created based on the U-shape cold bending geometries and the corresponding FE simulation results. The dataset is then used to train the CNN surrogate model. The results show that the surrogate model can achieve nearly indistinguishable full-field predictions in real time when compared with the FE simulation results. The application of CNNs in efficient springback prediction can be adopted in industrial settings to aid both conceptual and final component designs for designers without manufacturing knowledge.

Keywords: springback, cold stamping, convolutional neural networks, machine learning

Procedia PDF Downloads 147
1899 Design and Burnback Analysis of Three Dimensional Modified Star Grain

Authors: Almostafa Abdelaziz, Liang Guozhu, Anwer Elsayed

Abstract:

The determination of grain geometry is an important and critical step in the design of a solid propellant rocket motor. In this study, the design process involved parametric geometry modeling in CAD, MATLAB coding of performance prediction, and a 2D star grain ignition experiment. The 2D star grain burnback is achieved by creating a new surface at each web increment and calculating the geometrical properties at each step. The 2D star grain is further modified to burn as a tapered 3D star grain. A zero-dimensional method is used to calculate the internal ballistic performance. Experimental and theoretical results were compared in order to validate the performance prediction of the solid rocket motor. The results show that the use of 3D grain geometry will decrease the pressure inside the combustion chamber and enhance the volumetric loading ratio.

Keywords: burnback analysis, rocket motor, star grain, three dimensional grains

Procedia PDF Downloads 239
1898 Effects of Global Validity of Predictive Cues upon L2 Discourse Comprehension: Evidence from Self-paced Reading

Authors: Binger Lu

Abstract:

It remains unclear whether second language (L2) speakers could use discourse context cues to predict upcoming information as native speakers do during online comprehension. Some researchers propose that L2 learners may have a reduced ability to generate predictions during discourse processing. At the same time, there is evidence that discourse-level cues are weighed more heavily in L2 processing than in L1. Previous studies showed that L1 prediction is sensitive to the global validity of predictive cues. The current study aims to explore whether and to what extent L2 learners can dynamically and strategically adjust their prediction in accord with the global validity of predictive cues in L2 discourse comprehension as native speakers do. In a self-paced reading experiment, Chinese native speakers (N=128), C-E bilinguals (N=128), and English native speakers (N=128) read high-predictable (e.g., Jimmy felt thirsty after running. He wanted to get some water from the refrigerator.) and low-predictable (e.g., Jimmy felt sick this morning. He wanted to get some water from the refrigerator.) discourses in two-sentence frames. The global validity of predictive cues was manipulated by varying the ratio of predictable (e.g., Bill stood at the door. He opened it with the key.) and unpredictable fillers (e.g., Bill stood at the door. He opened it with the card.), such that across conditions, the predictability of the final word of the fillers ranged from 100% to 0%. The dependent variable was reading time on the critical region (the target word and the following word), analyzed with linear mixed-effects models in R. 
C-E bilinguals showed reliable prediction across all validity conditions (β = -35.6 ms, SE = 7.74, t = -4.601, p < .001), and Chinese native speakers showed a significant effect (β = -93.5 ms, SE = 7.82, t = -11.956, p < .001) in two of the four validity conditions (namely, the High-validity and MedLow conditions, where fillers ended with predictable words in 100% and 25% of cases respectively), whereas English native speakers did not predict at all (β = -2.78 ms, SE = 7.60, t = -.365, p = .715). There was neither a main effect (χ²(3) = .256, p = .968) nor an interaction (Predictability: Background: Validity, χ²(3) = 1.229, p = .746; Predictability: Validity, χ²(3) = 2.520, p = .472; Background: Validity, χ²(3) = 1.281, p = .734) of Validity with speaker groups. The results suggest that prediction occurs in L2 discourse processing but to a much lesser extent in L1, with a significant effect in some conditions of L1 Chinese and a null effect in L1 English processing, consistent with the view that L2 speakers are more sensitive to discourse cues than L1 speakers. Additionally, the pattern of L1 and L2 predictive processing was not affected by the global validity of predictive cues. C-E bilinguals' predictive processing could be partly transferred from their L1, as prior research showed that discourse information played a more significant role in L1 Chinese processing.

Keywords: bilingualism, discourse processing, global validity, prediction, self-paced reading

Procedia PDF Downloads 138
1897 Analyzing the Significance of Online Purchase Behavior of Tourists for the Development of Online Travel Bookings

Authors: April C. Abalos, Marmie R. Poquiz, Paul Nigel S. Abalos

Abstract:

With the advent of the fourth industrial revolution, everything is becoming possible with just a single click through the internet. What is more exciting is that, through the power of technological advancements, options are readily available at one's fingertips. These advancements have greatly affected people's perspectives in almost all human endeavors, including their purchasing behavior. Hence, this study was conceptualized. It aims to identify the significance of the online purchase behavior of tourists for the development of travel bookings and to give sellers an understanding of the major factors behind that behavior. Social media applications used in online booking were also identified, as well as the profile of tourists and the marketing strategies influencing the behavior of individuals in an online travel booking. This study also sought to determine which behavioral intention should be given more attention, in order to know where to exert more effort in winning the hearts of consumers. This study used a descriptive-survey design with an online survey questionnaire to gather real-time responses from tourists visiting and/or planning to visit the scenic spots in the province of Pangasinan, responses considered highly reliable for formulating conclusions.

Keywords: behavior, online purchase, tourists, travel bookings

Procedia PDF Downloads 126
1896 Predicting National Football League (NFL) Match with Score-Based System

Authors: Marcho Setiawan Handok, Samuel S. Lemma, Abdoulaye Fofana, Naseef Mansoor

Abstract:

This paper proposes a method to predict the outcome of National Football League matches using data from 2019 to 2022 and compares it with other popular models. The model uses open-source statistical data for each team, such as passing yards, rushing yards, fumbles lost, and scoring. Each statistic has an offensive and a defensive component. For instance, a data set of anticipated values for a specific matchup is created by comparing the offensive passing yards obtained by one team to the defensive passing yards given up by the opposition. We evaluated the model's performance by contrasting its results with those of established prediction algorithms. This research uses a neural network to predict the score of a National Football League match and then predict the winner of the game.

Keywords: game prediction, NFL, football, artificial neural network

Procedia PDF Downloads 80
1895 Role of von Willebrand Factor Antigen as Non-Invasive Biomarker for the Prediction of Portal Hypertensive Gastropathy in Patients with Liver Cirrhosis

Authors: Mohamed El Horri, Amine Mouden, Reda Messaoudi, Mohamed Chekkal, Driss Benlaldj, Malika Baghdadi, Lahcene Benmahdi, Fatima Seghier

Abstract:

Background/aim: Recently, the von Willebrand factor antigen (vWF-Ag) has been identified as a new marker of portal hypertension (PH) and its complications. Few studies have addressed its role in the prediction of esophageal varices. vWF-Ag testing is considered a non-invasive approach that avoids the burden, cost, and drawbacks of endoscopy and spares patients unpleasant, repeated examinations. In our study, we aimed to evaluate the ability of this marker to predict another complication of portal hypertension, portal hypertensive gastropathy (PHG), which is also diagnosed endoscopically. Patients and methods: This is a prospective study that included 124 cirrhotic patients with no history of bleeding who underwent screening endoscopy for PH-related complications such as esophageal varices (EVs) and PHG. Routine biological tests were performed, as well as vWF-Ag testing by both ELFA and immunoturbidimetric techniques. The diagnostic performance of the marker was assessed using sensitivity, specificity, positive predictive value, negative predictive value, accuracy, and receiver operating characteristic curves. Results: 124 patients were enrolled in this study, with a mean age of 58 years [CI: 55 – 60 years] and a sex ratio of 1.17. Viral etiologies were found in 50% of patients. Screening endoscopy revealed the presence of PHG in 20.2% of cases, while EVs were found in 83.1% of cases. vWF-Ag levels were significantly higher in patients with PHG than in those without: 441% [CI: 375 – 506] versus 279% [CI: 253 – 304], respectively (p < 0.0001). Using the area under the receiver operating characteristic curve (AUC), vWF-Ag was a good predictor of the presence of PHG. With a cutoff value higher than 320% and an AUC of 0.824, vWF-Ag had an 84% sensitivity, 74% specificity, 44.7% positive predictive value, 94.8% negative predictive value, and 75.8% diagnostic accuracy.
Conclusion: vWF-Ag is a good non-invasive, low-cost marker for excluding the presence of PHG in patients with liver cirrhosis. Using this marker as part of a selective screening strategy might reduce the need for endoscopic screening and the cost of managing these patients.
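The diagnostic measures listed in this abstract all follow from a 2x2 table of test result against endoscopic diagnosis. The abstract does not report the table itself, so the counts below are a reconstruction and should be read as an assumption: they are chosen to be consistent with 124 patients, a PHG prevalence of 20.2% (25 cases), and the reported percentages. The function name is likewise hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),  # detected cases among all diseased
        "specificity": tn / (tn + fp),  # negatives among all non-diseased
        "ppv": tp / (tp + fp),          # diseased among test-positives
        "npv": tn / (tn + fn),          # non-diseased among test-negatives
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Assumed counts (not given in the abstract): 25 patients with PHG and
# 99 without, split so that the reported percentages are reproduced.
metrics = diagnostic_metrics(tp=21, fp=26, fn=4, tn=73)
percentages = {name: round(value * 100, 1) for name, value in metrics.items()}
```

With these counts the low positive predictive value next to the high negative predictive value reflects the low prevalence of PHG in the cohort, which fits the conclusion that the marker is better suited to ruling PHG out than to confirming it.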

Keywords: von willebrand factor, portal hypertensive gastropathy, prediction, liver cirrhosis

Procedia PDF Downloads 201
1894 Stock Price Prediction with 'Earnings' Conference Call Sentiment

Authors: Sungzoon Cho, Hye Jin Lee, Sungwhan Jeon, Dongyoung Min, Sungwon Lyu

Abstract:

Major public corporations worldwide use conference calls to report their quarterly earnings. These 'earnings' conference calls allow for questions from stock analysts. We investigated whether it is possible to identify sentiment from the call script and use it to predict stock price movement. We analyzed call scripts from six companies, two each from Korea, China, and Indonesia, during the six years 2011Q1 – 2017Q2. A random forest with frequency-based sentiment scores using the Loughran-McDonald dictionary did better than a control model with only financial indicators. When stock prices went up 20 days after the earnings release, our model predicted correctly 77% of the time. When the model predicted 'up,' actual stock prices went up 65% of the time. This preliminary result encourages us to investigate advanced sentiment scoring methodologies such as topic modeling, auto-encoders, and word2vec variants.
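A frequency-based sentiment score of the kind mentioned above can be as simple as counting dictionary hits in the call script. The sketch below is only illustrative: the word sets are tiny hypothetical stand-ins rather than the actual Loughran-McDonald lists, and the normalisation (positive minus negative hits, divided by total hits) is an assumption, since the abstract does not state the scoring formula.

```python
import re

# Hypothetical stand-ins for a finance sentiment dictionary.
POSITIVE = {"growth", "strong", "improved", "profit"}
NEGATIVE = {"loss", "decline", "weak", "impairment"}


def sentiment_score(text):
    """Return a score in [-1, 1] from counts of positive and negative words.

    Assumed formula: (positive hits - negative hits) / (total hits),
    with 0.0 when the text contains no dictionary words.
    """
    words = re.findall(r"[a-z]+", text.lower())
    positive_hits = sum(word in POSITIVE for word in words)
    negative_hits = sum(word in NEGATIVE for word in words)
    total_hits = positive_hits + negative_hits
    if total_hits == 0:
        return 0.0
    return (positive_hits - negative_hits) / total_hits


score = sentiment_score("Strong growth offset a small decline in margins.")
```

A score like this, computed per call, could then serve as one input feature alongside financial indicators in a classifier such as a random forest.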

Keywords: earnings call script, random forest, sentiment analysis, stock price prediction

Procedia PDF Downloads 292
1893 Forecasting Direct Normal Irradiation at Djibouti Using Artificial Neural Network

Authors: Ahmed Kayad Abdourazak, Abderafi Souad, Zejli Driss, Idriss Abdoulkader Ibrahim

Abstract:

In this paper, an Artificial Neural Network (ANN) is used for the first time to predict solar irradiation in Djibouti, which is useful for the integration of Concentrating Solar Power (CSP) and for site selection for new or future solar plants as part of solar energy development. An ANN algorithm was developed to establish a forward/reverse correspondence between latitude, longitude, altitude, and monthly solar irradiation. For this purpose, German Aerospace Centre (DLR) data for eight Djibouti sites were used for training and testing in a standard three-layer network with the Levenberg-Marquardt backpropagation algorithm. The results show very good agreement for solar irradiation prediction in Djibouti and indicate that the proposed approach can serve as an efficient tool for predicting solar irradiation, providing helpful information for site selection, design, and planning of solar plants.

Keywords: artificial neural network, solar irradiation, concentrated solar power, Levenberg-Marquardt

Procedia PDF Downloads 352
1892 Applying the Regression Technique for Prediction of the Acute Heart Attack

Authors: Paria Soleimani, Arezoo Neshati

Abstract:

Myocardial infarction is one of the leading causes of death in the world. Some of these deaths occur even before the patient reaches the hospital. Myocardial infarction occurs as a result of impaired blood supply. Because most of these deaths are due to coronary artery disease, awareness of the warning signs of a heart attack is essential. Some heart attacks are sudden and intense, but most start slowly, with mild pain or discomfort, so early detection and successful treatment of these symptoms is vital to saving patients. The importance and usefulness of a system designed to assist physicians in the early diagnosis of acute heart attacks is therefore clear. The purpose of this study is to determine how well a predictive model would perform based only on patient-reportable clinical history factors, without using diagnostic tests or physical exams. This type of prediction model might have application outside the hospital setting, giving accurate advice to patients to encourage them to seek care in appropriate situations. For this purpose, data were collected on 711 heart patients in hospitals in Iran. Twenty-eight clinical factors that can be reported by patients were studied. Three logistic regression models were built on the basis of the 28 features to predict the risk of heart attack. The best-performing logistic regression model had a C-index of 0.955 and an accuracy of 94.9%. The variables severe chest pain, back pain, cold sweats, shortness of breath, nausea, and vomiting were selected as the main features.

Keywords: Coronary heart disease, Acute heart attacks, Prediction, Logistic regression

Procedia PDF Downloads 447
1891 A Convolution Neural Network PM-10 Prediction System Based on a Dense Measurement Sensor Network in Poland

Authors: Piotr A. Kowalski, Kasper Sapala, Wiktor Warchalowski

Abstract:

PM10 is a suspended dust that primarily has a negative effect on the respiratory system. PM10 is responsible for attacks of coughing and wheezing, asthma, or acute, violent bronchitis. Indirectly, PM10 also negatively affects the rest of the body, including increasing the risk of heart attack and stroke. Unfortunately, Poland is a country that cannot boast of good air quality, in particular due to high PM concentration levels. Therefore, based on the dense network of Airly sensors, it was decided to address the problem of predicting suspended particulate matter concentration. Due to the very complicated nature of this issue, a machine learning approach was used. For this purpose, Convolutional Neural Networks (CNNs) were adopted, these currently being the leading information processing methods in the field of computational intelligence. The aim of this research is to show the influence of particular CNN parameters on the quality of the obtained forecast. The forecast itself is made on the basis of parameters measured by Airly sensors and is carried out for the subsequent day, hour by hour. The evaluation of the learning process for the investigated models was mostly based upon the mean square error criterion; however, during model validation, a number of other methods of quantitative evaluation were taken into account. The presented pollution prediction model has been verified using real weather and air pollution data taken from the Airly sensor network. The dense and distributed network of Airly measurement devices enables access to current and archival data on air pollution, temperature, suspended particulate matter (PM1.0, PM2.5, and PM10), CAQI levels, as well as atmospheric pressure and air humidity. In this investigation, PM2.5 and PM10, temperature and wind information, as well as external forecasts of temperature and wind for the next 24 h served as input data.
Due to the specificity of CNN-type networks, these data are transformed into tensors and then processed. The network consists of an input layer, an output layer, and many hidden layers. In the hidden layers, convolutional and pooling operations are performed. The output of this system is a vector of 24 elements containing the predicted PM10 concentration for each hour of the upcoming 24-hour period. Over 1000 models based on CNN methodology were tested during the study. Several that gave the best results were selected and then compared with other models based on linear regression. The numerical tests, carried out using real 'big' data, fully confirmed the positive properties of the presented method. Models based on the CNN technique allow prediction of PM10 dust concentration with a much smaller mean square error than currently used methods based on linear regression. Moreover, the use of neural networks increased Pearson's correlation coefficient (R²) by about 5 percent compared to the linear model. During the simulation, the R² coefficient was 0.92, 0.76, 0.75, 0.73, and 0.73 for the 1st, 6th, 12th, 18th, and 24th hour of prediction, respectively.

Keywords: air pollution prediction (forecasting), machine learning, regression task, convolution neural networks

Procedia PDF Downloads 148
1890 A Machine Learning Model for Dynamic Prediction of Chronic Kidney Disease Risk Using Laboratory Data, Non-Laboratory Data, and Metabolic Indices

Authors: Amadou Wurry Jallow, Adama N. S. Bah, Karamo Bah, Shih-Ye Wang, Kuo-Chung Chu, Chien-Yeh Hsu

Abstract:

Chronic kidney disease (CKD) is a major public health challenge with high prevalence, rising incidence, and serious adverse consequences. Developing effective risk prediction models is a cost-effective approach to predicting and preventing complications of chronic kidney disease (CKD). This study aimed to develop an accurate machine learning model that can dynamically identify individuals at risk of CKD using various kinds of diagnostic data, with or without laboratory data, at different follow-up points. Creatinine is a key component used to predict CKD. These models will enable affordable and effective screening for CKD even with incomplete patient data, such as the absence of creatinine testing. This retrospective cohort study included data on 19,429 adults provided by a private research institute and screening laboratory in Taiwan, gathered between 2001 and 2015. Univariate Cox proportional hazard regression analyses were performed to determine the variables with high prognostic values for predicting CKD. We then identified interacting variables and grouped them according to diagnostic data categories. Our models used three types of data gathered at three points in time: non-laboratory, laboratory, and metabolic indices data. Next, we used subgroups of variables within each category to train two machine learning models (Random Forest and XGBoost). Our machine learning models can dynamically discriminate individuals at risk for developing CKD. All the models performed well using all three kinds of data, with or without laboratory data. Using only non-laboratory-based data (such as age, sex, body mass index (BMI), and waist circumference), both models predict chronic kidney disease as accurately as models using laboratory and metabolic indices data. Our machine learning models have demonstrated the use of different categories of diagnostic data for CKD prediction, with or without laboratory data. 
The machine learning models are simple to use and flexible because they work even with incomplete data and can be applied in any clinical setting, including settings where laboratory data is difficult to obtain.

Keywords: chronic kidney disease, glomerular filtration rate, creatinine, novel metabolic indices, machine learning, risk prediction

Procedia PDF Downloads 105
1889 Prediction of Dubai Financial Market Stocks Movement Using K-Nearest Neighbor and Support Vector Regression

Authors: Abdulla D. Alblooshi

Abstract:

The stock market is a representation of human behavior and psychology, such as fear, greed, and discipline. These are manifested in the form of price movements during trading sessions, which makes predicting stock movements and prices a challenging effort. However, trading sessions produce a large amount of data that can be used to train an AI agent to predict stock movement, and predicting stock market price action would be advantageous. In this paper, the stock movement data of three DFM-listed stocks are studied using historical price movements and technical indicator values, and used to train an agent with KNN and SVM methods to predict future price movement. A MATLAB toolbox and a simple script are used to process and classify the information and output the prediction. The paper also compares the different learning methods and parameters using metrics such as RMSE, MAE, and R².
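
The three comparison metrics named in the abstract have standard definitions, sketched below. The paper itself works in MATLAB; this Python version is only an illustration, and the price values are made up.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute RMSE, MAE, and R² for paired actual and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"rmse": rmse, "mae": mae, "r2": r2}


# Made-up closing prices and model predictions, for illustration only.
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]
result = regression_metrics(actual, predicted)
print(result)
```

RMSE penalizes large misses more heavily than MAE, so reporting both gives a rough sense of whether errors are uniform or dominated by a few outliers.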

Keywords: KNN, ANN, style, SVM, stocks, technical indicators, RSI, MACD, moving averages, RMSE, MAE

Procedia PDF Downloads 168
1888 Neuronal Networks for the Study of the Effects of Cosmic Rays on Climate Variations

Authors: Jossitt Williams Vargas Cruz, Aura Jazmín Pérez Ríos

Abstract:

Variations in solar dynamics have become a relevant topic of study because of the climate changes they generate on Earth. One of the most disconcerting aspects of the sun's influence on climate variability is the role played by sunspots (an extra-atmospheric variable) in the modulation of cosmic rays (CRs, also an extra-atmospheric variable). CRs influence the Earth's climate by affecting cloud formation (an atmospheric variable). The influence of the solar cycle is associated with solar storms: when magnetic activity is greater, fewer CRs enter the Earth's atmosphere. The current methods of climate prediction in Colombia do not take extra-atmospheric variables into account. Therefore, correlations between atmospheric and extra-atmospheric variables were studied in order to implement Python code based on neural networks to predict the extra-atmospheric variable with the highest correlation.

Keywords: correlations, cosmic rays, sun, sunspots, variations

Procedia PDF Downloads 70
1887 A Wall Law for Two-Phase Turbulent Boundary Layers

Authors: Dhahri Maher, Aouinet Hana

Abstract:

The presence of bubbles in the boundary layer introduces corrections into the log law, which must be taken into account. In this work, a logarithmic wall law is presented for bubbly two-phase flows. The wall law is based on the postulation of an additional turbulent viscosity associated with bubble wakes in the boundary layer. It contains an empirical constant accounting both for shear-induced turbulence interaction and for bubble non-linearity. This constant was deduced from experimental data. The wall friction predicted with the wall law was compared to experimental data for a turbulent boundary layer developing on a vertical flat plate in the presence of millimetric bubbles. Very good agreement between experimental and numerical wall friction predictions was found. The agreement was especially noticeable at low void fraction, when bubble-induced turbulence plays a significant role.

Keywords: bubbly flows, log law, boundary layer, CFD

Procedia PDF Downloads 277
1886 Learning Dynamic Representations of Nodes in Temporally Variant Graphs

Authors: Sandra Mitrovic, Gaurav Singh

Abstract:

In many industries, including telecommunications, churn prediction has been a topic of active research. Much attention has been devoted to devising the most informative features, and this area of research has gained even more focus with the spread of (social) network analytics. Call detail records (CDRs) have been used to construct customer networks and extract potentially useful features. However, to the best of our knowledge, no studies including network features have yet proposed a generic way of representing network information. Instead, ad-hoc and dataset-dependent solutions have been suggested. In this work, we build upon a recently presented method (node2vec) to obtain representations for nodes in the observed network. The proposed approach is generic and applicable to any network and domain. Unlike node2vec, which assumes a static network, we consider a dynamic, time-evolving network. To account for this, we propose an approach that constructs the feature representation of each node by generating its node2vec representations at different timestamps, concatenating them, and finally compressing them using an auto-encoder-like method in order to retain reasonably long and informative feature vectors. We test the proposed method on a churn prediction task in the telco domain. To predict churners at timestamp ts+1, we construct training and testing datasets consisting of feature vectors from time intervals [t1, ts-1] and [t2, ts] respectively, and use traditional supervised classification models such as SVM and Logistic Regression. The observed results show the effectiveness of the proposed approach compared to approaches based on ad-hoc feature selection and to static node2vec.
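
The windowing and concatenation steps described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the embeddings are made-up two-dimensional vectors standing in for node2vec output, the function name is hypothetical, and the auto-encoder-like compression step is omitted.

```python
def window_features(embeddings_by_time, node, start, end):
    """Concatenate a node's per-timestamp embeddings for start..end inclusive."""
    features = []
    for t in range(start, end + 1):
        features.extend(embeddings_by_time[t][node])
    return features


# Made-up embeddings for one node "a" at timestamps 1..4 (so s = 4).
embeddings_by_time = {
    1: {"a": [0.1, 0.2]},
    2: {"a": [0.3, 0.4]},
    3: {"a": [0.5, 0.6]},
    4: {"a": [0.7, 0.8]},
}
s = 4
train_x = window_features(embeddings_by_time, "a", 1, s - 1)  # interval [t1, ts-1]
test_x = window_features(embeddings_by_time, "a", 2, s)       # interval [t2, ts]
print(train_x)
print(test_x)
```

Both windows span the same number of timestamps, so the training and testing vectors have equal length and can be passed to the same classifier.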

Keywords: churn prediction, dynamic networks, node2vec, auto-encoders

Procedia PDF Downloads 314
1885 Artificial Intelligence Methods in Estimating the Minimum Miscibility Pressure Required for Gas Flooding

Authors: Emad A. Mohammed

Abstract:

Using data mining and artificial intelligence, in the form of Fuzzy Logic models and Artificial Neural Network models, to predict the minimum miscibility pressure (MMP) required for multi-contact miscible (MCM) displacement of reservoir petroleum by hydrocarbon gas flooding can help produce accurate results. The factors affecting the MMP, as established in the literature and in the dataset, are as follows: XC2-6: intermediate composition in the oil, containing C2-6, CO2, and H2S (mole %); XC1: amount of methane in the oil (%); T: temperature (°C); MwC7+: molecular weight of C7+ (g/mol); YC2+: mole percent of C2+ composition in the injected gas (%); MwC2+: molecular weight of C2+ in the injected gas. Fuzzy Logic and Neural Networks have been used widely for prediction and classification, with relatively high accuracy, in different fields of study. It is well known that a Fuzzy Inference System can handle uncertainty in the inputs, as in our case. The results of this work show that our proposed models perform better, with higher performance indices, than other empirical correlations.

Keywords: MMP, gas flooding, artificial intelligence, correlation

Procedia PDF Downloads 143
1884 Time Series Modelling and Prediction of River Runoff: Case Study of Karkheh River, Iran

Authors: Karim Hamidi Machekposhti, Hossein Sedghi, Abdolrasoul Telvari, Hossein Babazadeh

Abstract:

The rainfall-runoff phenomenon is a chaotic and complex natural process that requires sophisticated modelling and simulation methods for explanation and use. Time series modelling allows runoff data to be analysed and can be used as a forecasting tool. In this paper, an attempt is made to model river runoff data and predict the future behavioural pattern of the river based on past annual observations of river runoff. The runoff analysis and prediction are done using an ARIMA model. To evaluate the efficiency of prediction of hydrological events such as rainfall and runoff, we use the applicable statistical formulae. The good agreement between predicted and observed river runoff, as measured by the coefficient of determination (R2), shows that ARIMA(4,1,1) is a suitable model for predicting Karkheh River runoff in Iran.
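
For reference, the general form of an ARIMA(4,1,1) model is written out below. The symbols are assumptions of this note rather than notation from the paper: y_t is annual runoff, B the backshift operator (B y_t = y_{t-1}), φ_1..φ_4 the autoregressive coefficients, θ_1 the moving-average coefficient, and ε_t white noise. Any constant term is omitted, and the fitted coefficient values are not given in the abstract.

```latex
\left(1 - \sum_{i=1}^{4} \phi_i B^{i}\right)(1 - B)\, y_t = \left(1 + \theta_1 B\right)\varepsilon_t
```

The factor (1 - B) is the single order of differencing (d = 1), so the model describes year-to-year changes in runoff rather than the runoff level itself.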

Keywords: time series modelling, ARIMA model, river runoff, Karkheh River, CLS method

Procedia PDF Downloads 337
1883 Ensemble-Based SVM Classification Approach for miRNA Prediction

Authors: Sondos M. Hammad, Sherin M. ElGokhy, Mahmoud M. Fahmy, Elsayed A. Sallam

Abstract:

In this paper, an ensemble-based Support Vector Machine (SVM) classification approach is proposed and used for miRNA prediction. Three problems commonly associated with previous approaches are alleviated. These problems arise from imposing assumptions on the secondary structure of pre-miRNA, from the imbalance between the numbers of laboratory-verified miRNAs and pseudo-hairpins, and from using a training data set that does not cover the variety of samples in different species. We aggregate the predicted outputs of three well-known SVM classifiers, namely Triplet-SVM, Virgo, and Mirident, weighted by their variant features without any structural assumptions. An additional SVM layer is used in aggregating the final output. The proposed approach is trained and then tested with balanced data sets. The proposed approach outperforms the three base classifiers, achieving an F-score of 88.88%, accuracy of 92.73%, precision of 90.64%, specificity of 96.64%, sensitivity of 87.2%, and an area under the ROC curve of 0.91.

Keywords: MiRNAs, SVM classification, ensemble algorithm, assumption problem, imbalance data

Procedia PDF Downloads 347
1882 The New Media and Their Economic and Socio-Political Imperatives for Africa: A Study of Nigeria

Authors: Chukwukelue Uzodinma Umenyilorah

Abstract:

The advent of the New Media, as enabled by information and communication technology from the 19th through the 21st century, has no doubt taken its toll on all fronts of human existence, especially in Africa. Apart from shortening the distance between all parts of the world, technology and the new media have also succeeded in making the world a global village. Hence, it is now easy to relay live audio and visual signals across the length and breadth of the world in real time. People now contract and execute business across countries, conferences are held, and ideas are shared with the simple push of a button. Likewise, political leaders and diplomats are now just a click away from reaching the important decisions that take their countries' fortunes to the next level. On the flip side, ICT and the New Media have also contributed in no small measure to aiding global terrorism and general insecurity around the world. More interesting is the fact that, as developing economies, African countries have massively embraced information technology, and this has helped them keep up with trends in the polity of other model democracies around the world. This paper is therefore designed to determine how much effect ICT and the New Media have exerted on the economic, social, and political lives of Africans. Nigeria is used as a case in point for the purpose of this paper.

Keywords: Africa, ICT, new media, Nigeria

Procedia PDF Downloads 252