Search results for: meteorological prediction data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25722

Search results for: meteorological prediction data

23172 Foundation of the Information Model for Connected-Cars

Authors: Hae-Won Seo, Yong-Gu Lee

Abstract:

Recent progress in the next generation of automobile technology is geared towards incorporating information technology into cars. Collectively called smart cars are bringing intelligence to cars that provides comfort, convenience and safety. A branch of smart cars is connected-car system. The key concept in connected-cars is the sharing of driving information among cars through decentralized manner enabling collective intelligence. This paper proposes a foundation of the information model that is necessary to define the driving information for smart-cars. Road conditions are modeled through a unique data structure that unambiguously represent the time variant traffics in the streets. Additionally, the modeled data structure is exemplified in a navigational scenario and usage using UML. Optimal driving route searching is also discussed using the proposed data structure in a dynamically changing road conditions.

Keywords: connected-car, data modeling, route planning, navigation system

Procedia PDF Downloads 369
23171 Allylation of Active Methylene Compounds with Cyclic Baylis-Hillman Alcohols: Why Is It Direct and Not Conjugate?

Authors: Karim Hrratha, Khaled Essalahb, Christophe Morellc, Henry Chermettec, Salima Boughdiria

Abstract:

Among the carbon-carbon bond formation types, allylation of active methylene compounds with cyclic Baylis-Hillman (BH) alcohols is a reliable and widely used method. This reaction is a very attractive tool in organic synthesis of biological and biodiesel compounds. Thus, in view of an insistent and peremptory request for an efficient and straightly method for synthesizing the desired product, a thorough analysis of various aspects of the reaction processes is an important task. The product afforded by the reaction of active methylene with BH alcohols depends largely on the experimental conditions, notably on the catalyst properties. All experiments reported that catalysis is needed for this reaction type because of the poor ability of alcohol hydroxyl group to be as a suitable leaving group. Within the catalysts, several transition- metal based have been used such as palladium in the presence of acid or base and have been considered as reliable methods. Furthemore, acid catalysts such as BF3.OEt2, BiX3 (X= Cl, Br, I, (OTf)3), InCl3, Yb(OTf)3, FeCl3, p-TsOH and H-montmorillonite have been employed to activate the C-C bond formation through the alkylation of active methylene compounds. Interestingly a report of a smoothly process for the ability of 4-imethyaminopyridine(DMAP) to catalyze the allylation reaction of active methylene compounds with cyclic Baylis-Hillman (BH) alcohol appeared recently. However, the reaction mechanism remains ambiguous, since the C- allylation process leads to an unexpected product (noted P1), corresponding to a direct allylation instead of conjugate allylation, which involves the most electrophilic center according to the electron withdrawing group CO effect. The main objective of the present theoretical study is to better understand the role of the DMAP catalytic activity as well as the process leading to the end- product (P1) for the catalytic reaction of a cyclic BH alcohol with active methylene compounds. For that purpose, we have carried out computations of a set of active methylene compounds varying by R1 and R2 toward the same alcohol, and we have attempted to rationalize the mechanisms thanks to the acid–base approach, and conceptual DFT tools such as chemical potential, hardness, Fukui functions, electrophilicity index and dual descriptor, as these approaches have shown a good prediction of reactions products.The present work is then organized as follows: In a first part some computational details will be given, introducing the reactivity indexes used in the present work, then Section 3 is dedicated to the discussion of the prediction of the selectivity and regioselectivity. The paper ends with some concluding remarks. In this work, we have shown, through DFT method at the B3LYP/6-311++G(d,p) level of theory that: The allylation of active methylene compounds with cyclic BH alcohol is governed by orbital control character. Hence the end- product denoted P1 is generated by direct allylation.

Keywords: DFT calculation, gas phase pKa, theoretical mechanism, orbital control, charge control, Fukui function, transition state

Procedia PDF Downloads 291
23170 Downregulation of Epidermal Growth Factor Receptor in Advanced Stage Laryngeal Squamous Cell Carcinoma

Authors: Sarocha Vivatvakin, Thanaporn Ratchataswan, Thiratest Leesutipornchai, Komkrit Ruangritchankul, Somboon Keelawat, Virachai Kerekhanjanarong, Patnarin Mahattanasakul, Saknan Bongsebandhu-Phubhakdi

Abstract:

In this globalization era, much attention has been drawn to various molecular biomarkers, which may have the potential to predict the progression of cancer. Epidermal growth factor receptor (EGFR) is the classic member of the ErbB family of membrane-associated intrinsic tyrosine kinase receptors. EGFR expression was found in several organs throughout the body as its roles involve in the regulation of cell proliferation, survival, and differentiation in normal physiologic conditions. However, anomalous expression, whether over- or under-expression is believed to be the underlying mechanism of pathologic conditions, including carcinogenesis. Even though numerous discussions regarding the EGFR as a prognostic tool in head and neck cancer have been established, the consensus has not yet been met. The aims of the present study are to assess the correlation between the level of EGFR expression and demographic data as well as clinicopathological features and to evaluate the ability of EGFR as a reliable prognostic marker. Furthermore, another aim of this study is to investigate the probable pathophysiology that explains the finding results. This retrospective study included 30 squamous cell laryngeal carcinoma patients treated at King Chulalongkorn Memorial Hospital from January 1, 2000, to December 31, 2004. EGFR expression level was observed to be significantly downregulated with the progression of the laryngeal cancer stage. (one way ANOVA, p = 0.001) A statistically significant lower EGFR expression in the late stage of the disease compared to the early stage was recorded. (unpaired t-test, p = 0.041) EGFR overexpression also showed the tendency to increase recurrence of cancer (unpaired t-test, p = 0.128). A significant downregulation of EGFR expression was documented in advanced stage laryngeal cancer. The results indicated that EGFR level correlates to prognosis in term of stage progression. Thus, EGFR expression might be used as a prevailing biomarker for laryngeal squamous cell carcinoma prognostic prediction.

Keywords: downregulation, epidermal growth factor receptor, immunohistochemistry, laryngeal squamous cell carcinoma

Procedia PDF Downloads 96
23169 A Supervised Learning Data Mining Approach for Object Recognition and Classification in High Resolution Satellite Data

Authors: Mais Nijim, Rama Devi Chennuboyina, Waseem Al Aqqad

Abstract:

Advances in spatial and spectral resolution of satellite images have led to tremendous growth in large image databases. The data we acquire through satellites, radars and sensors consists of important geographical information that can be used for remote sensing applications such as region planning, disaster management. Spatial data classification and object recognition are important tasks for many applications. However, classifying objects and identifying them manually from images is a difficult task. Object recognition is often considered as a classification problem, this task can be performed using machine-learning techniques. Despite of many machine-learning algorithms, the classification is done using supervised classifiers such as Support Vector Machines (SVM) as the area of interest is known. We proposed a classification method, which considers neighboring pixels in a region for feature extraction and it evaluates classifications precisely according to neighboring classes for semantic interpretation of region of interest (ROI). A dataset has been created for training and testing purpose; we generated the attributes by considering pixel intensity values and mean values of reflectance. We demonstrated the benefits of using knowledge discovery and data-mining techniques, which can be on image data for accurate information extraction and classification from high spatial resolution remote sensing imagery.

Keywords: remote sensing, object recognition, classification, data mining, waterbody identification, feature extraction

Procedia PDF Downloads 326
23168 Automated Multisensory Data Collection System for Continuous Monitoring of Refrigerating Appliances Recycling Plants

Authors: Georgii Emelianov, Mikhail Polikarpov, Fabian Hübner, Jochen Deuse, Jochen Schiemann

Abstract:

Recycling refrigerating appliances plays a major role in protecting the Earth's atmosphere from ozone depletion and emissions of greenhouse gases. The performance of refrigerator recycling plants in terms of material retention is the subject of strict environmental certifications and is reviewed periodically through specialized audits. The continuous collection of Refrigerator data required for the input-output analysis is still mostly manual, error-prone, and not digitalized. In this paper, we propose an automated data collection system for recycling plants in order to deduce expected material contents in individual end-of-life refrigerating appliances. The system utilizes laser scanner measurements and optical data to extract attributes of individual refrigerators by applying transfer learning with pre-trained vision models and optical character recognition. Based on Recognized features, the system automatically provides material categories and target values of contained material masses, especially foaming and cooling agents. The presented data collection system paves the way for continuous performance monitoring and efficient control of refrigerator recycling plants.

Keywords: automation, data collection, performance monitoring, recycling, refrigerators

Procedia PDF Downloads 154
23167 Sales Patterns Clustering Analysis on Seasonal Product Sales Data

Authors: Soojin Kim, Jiwon Yang, Sungzoon Cho

Abstract:

As a seasonal product is only in demand for a short time, inventory management is critical to profits. Both markdowns and stockouts decrease the return on perishable products; therefore, researchers have been interested in the distribution of seasonal products with the aim of maximizing profits. In this study, we propose a data-driven seasonal product sales pattern analysis method for individual retail outlets based on observed sales data clustering; the proposed method helps in determining distribution strategies.

Keywords: clustering, distribution, sales pattern, seasonal product

Procedia PDF Downloads 584
23166 Numerical Modeling of Structural Failure of a Ship During the Collision Event

Authors: Adjal Yassine, Semmani Amar

Abstract:

During the last decades, The risk of collision has been increased, especially in high maritime traffic. As the consequence, the demand is required for safety at sea and environmental protection. For this purpose, the consequences prediction of ship collisions is recommended in order to minimize structural failure. additionally, at the design stage of the ship, damage generated during the collision event must be taken into consideration. This structural failure, in some cases, can develop into the progressive collapse of other structural elements and generate catastrophic consequences. The present study investigates the progressive collapse of ships damaged by collisions using the Non -linear finite element method. The failure criteria are taken into account. The impacted area has a refined mesh in order to have more reliable results. Finally, a parametric study was conducted in this study to highlight the effect of the ship's speed, as well as the different impacted areas of double-bottom ships.

Keywords: collsion, strucural failure, ship, finite element analysis

Procedia PDF Downloads 93
23165 Probability Sampling in Matched Case-Control Study in Drug Abuse

Authors: Surya R. Niraula, Devendra B Chhetry, Girish K. Singh, S. Nagesh, Frederick A. Connell

Abstract:

Background: Although random sampling is generally considered to be the gold standard for population-based research, the majority of drug abuse research is based on non-random sampling despite the well-known limitations of this kind of sampling. Method: We compared the statistical properties of two surveys of drug abuse in the same community: one using snowball sampling of drug users who then identified “friend controls” and the other using a random sample of non-drug users (controls) who then identified “friend cases.” Models to predict drug abuse based on risk factors were developed for each data set using conditional logistic regression. We compared the precision of each model using bootstrapping method and the predictive properties of each model using receiver operating characteristics (ROC) curves. Results: Analysis of 100 random bootstrap samples drawn from the snowball-sample data set showed a wide variation in the standard errors of the beta coefficients of the predictive model, none of which achieved statistical significance. One the other hand, bootstrap analysis of the random-sample data set showed less variation, and did not change the significance of the predictors at the 5% level when compared to the non-bootstrap analysis. Comparison of the area under the ROC curves using the model derived from the random-sample data set was similar when fitted to either data set (0.93, for random-sample data vs. 0.91 for snowball-sample data, p=0.35); however, when the model derived from the snowball-sample data set was fitted to each of the data sets, the areas under the curve were significantly different (0.98 vs. 0.83, p < .001). Conclusion: The proposed method of random sampling of controls appears to be superior from a statistical perspective to snowball sampling and may represent a viable alternative to snowball sampling.

Keywords: drug abuse, matched case-control study, non-probability sampling, probability sampling

Procedia PDF Downloads 485
23164 Bioinformatics High Performance Computation and Big Data

Authors: Javed Mohammed

Abstract:

Right now, bio-medical infrastructure lags well behind the curve. Our healthcare system is dispersed and disjointed; medical records are a bit of a mess; and we do not yet have the capacity to store and process the crazy amounts of data coming our way from widespread whole-genome sequencing. And then there are privacy issues. Despite these infrastructure challenges, some researchers are plunging into bio medical Big Data now, in hopes of extracting new and actionable knowledge. They are doing delving into molecular-level data to discover bio markers that help classify patients based on their response to existing treatments; and pushing their results out to physicians in novel and creative ways. Computer scientists and bio medical researchers are able to transform data into models and simulations that will enable scientists for the first time to gain a profound under-standing of the deepest biological functions. Solving biological problems may require High-Performance Computing HPC due either to the massive parallel computation required to solve a particular problem or to algorithmic complexity that may range from difficult to intractable. Many problems involve seemingly well-behaved polynomial time algorithms (such as all-to-all comparisons) but have massive computational requirements due to the large data sets that must be analyzed. High-throughput techniques for DNA sequencing and analysis of gene expression have led to exponential growth in the amount of publicly available genomic data. With the increased availability of genomic data traditional database approaches are no longer sufficient for rapidly performing life science queries involving the fusion of data types. Computing systems are now so powerful it is possible for researchers to consider modeling the folding of a protein or even the simulation of an entire human body. This research paper emphasizes the computational biology's growing need for high-performance computing and Big Data. It illustrates this article’s indispensability in meeting the scientific and engineering challenges of the twenty-first century, and how Protein Folding (the structure and function of proteins) and Phylogeny Reconstruction (evolutionary history of a group of genes) can use HPC that provides sufficient capability for evaluating or solving more limited but meaningful instances. This article also indicates solutions to optimization problems, and benefits Big Data and Computational Biology. The article illustrates the Current State-of-the-Art and Future-Generation Biology of HPC Computing with Big Data.

Keywords: high performance, big data, parallel computation, molecular data, computational biology

Procedia PDF Downloads 357
23163 Evaluating the Effectiveness of Science Teacher Training Programme in National Colleges of Education: a Preliminary Study, Perceptions of Prospective Teachers

Authors: A. S. V Polgampala, F. Huang

Abstract:

This is an overview of what is entailed in an evaluation and issues to be aware of when class observation is being done. This study examined the effects of evaluating teaching practice of a 7-day ‘block teaching’ session in a pre -service science teacher training program at a reputed National College of Education in Sri Lanka. Effects were assessed in three areas: evaluation of the training process, evaluation of the training impact, and evaluation of the training procedure. Data for this study were collected by class observation of 18 teachers during 9th February to 16th of 2017. Prospective teachers of science teaching, the participants of the study were evaluated based on newly introduced format by the NIE. The data collected was analyzed qualitatively using the Miles and Huberman procedure for analyzing qualitative data: data reduction, data display and conclusion drawing/verification. It was observed that the trainees showed their confidence in teaching those competencies and skills. Teacher educators’ dissatisfaction has been a great impact on evaluation process.

Keywords: evaluation, perceptions & perspectives, pre-service, science teachering

Procedia PDF Downloads 305
23162 Overconfidence and Self-Attribution Bias: The Difference among Economic Students at Different Stage of the Study and Non-Economic Students

Authors: Vera Jancurova

Abstract:

People are, in general, exposed to behavioral biases, however, the degree and impact are affected by experience, knowledge, and other characteristics. The purpose of this article is to study two of defined behavioral biases, the overconfidence and self-attribution bias, and its impact on economic and non-economic students at different stage of the study. The research method used for the purpose of this study is a controlled field study that contains questions on perception of own confidence and self-attribution and estimation of limits to analyse actual abilities. The results of the research show that economic students seem to be more overconfident than their non–economic colleagues, which seems to be caused by the fact the questionnaire was asking for predicting economic indexes and own knowledge and abilities in financial environment. Surprisingly, the most overconfidence was detected by the students at the beginning of their study (1st-semester students). However, the estimations of real numbers do not point out, that economic students have better results by the prediction itself. The study confirmed the presence of self-attribution bias at all of the respondents.

Keywords: behavioral finance, overconfidence, self-attribution, heuristics and biases

Procedia PDF Downloads 251
23161 Detecting Venomous Files in IDS Using an Approach Based on Data Mining Algorithm

Authors: Sukhleen Kaur

Abstract:

In security groundwork, Intrusion Detection System (IDS) has become an important component. The IDS has received increasing attention in recent years. IDS is one of the effective way to detect different kinds of attacks and malicious codes in a network and help us to secure the network. Data mining techniques can be implemented to IDS, which analyses the large amount of data and gives better results. Data mining can contribute to improving intrusion detection by adding a level of focus to anomaly detection. So far the study has been carried out on finding the attacks but this paper detects the malicious files. Some intruders do not attack directly, but they hide some harmful code inside the files or may corrupt those file and attack the system. These files are detected according to some defined parameters which will form two lists of files as normal files and harmful files. After that data mining will be performed. In this paper a hybrid classifier has been used via Naive Bayes and Ripper classification methods. The results show how the uploaded file in the database will be tested against the parameters and then it is characterised as either normal or harmful file and after that the mining is performed. Moreover, when a user tries to mine on harmful file it will generate an exception that mining cannot be made on corrupted or harmful files.

Keywords: data mining, association, classification, clustering, decision tree, intrusion detection system, misuse detection, anomaly detection, naive Bayes, ripper

Procedia PDF Downloads 407
23160 Performance of High Density Genotyping in Sahiwal Cattle Breed

Authors: Hamid Mustafa, Huson J. Heather, Kim Eiusoo, Adeela Ajmal, Tad S. Sonstegard

Abstract:

The objective of this study was to evaluate the informativeness of Bovine high density SNPs genotyping in Sahiwal cattle population. This is a first attempt to assess the Bovine HD SNP genotyping array in any Pakistani indigenous cattle population. To evaluate these SNPs on genome wide scale, we considered 777,962 SNPs spanning the whole autosomal and X chromosomes in Sahiwal cattle population. Fifteen (15) non related gDNA samples were genotyped with the bovine HD infinium. Approximately 500,939 SNPs were found polymorphic (MAF > 0.05) in Sahiwal cattle population. The results of this study indicate potential application of Bovine High Density SNP genotyping in Pakistani indigenous cattle population. The information generated from this array can be applied in genetic prediction, characterization and genome wide association studies of Pakistani Sahiwal cattle population.

Keywords: Sahiwal cattle, polymorphic SNPs, genotyping, Pakistan

Procedia PDF Downloads 415
23159 Generalized Approach to Linear Data Transformation

Authors: Abhijith Asok

Abstract:

This paper presents a generalized approach for the simple linear data transformation, Y=bX, through an integration of multidimensional coordinate geometry, vector space theory and polygonal geometry. The scaling is performed by adding an additional ’Dummy Dimension’ to the n-dimensional data, which helps plot two dimensional component-wise straight lines on pairs of dimensions. The end result is a set of scaled extensions of observations in any of the 2n spatial divisions, where n is the total number of applicable dimensions/dataset variables, created by shifting the n-dimensional plane along the ’Dummy Axis’. The derived scaling factor was found to be dependent on the coordinates of the common point of origin for diverging straight lines and the plane of extension, chosen on and perpendicular to the ’Dummy Axis’, respectively. This result indicates the geometrical interpretation of a linear data transformation and hence, opportunities for a more informed choice of the factor ’b’, based on a better choice of these coordinate values. The paper follows on to identify the effect of this transformation on certain popular distance metrics, wherein for many, the distance metric retained the same scaling factor as that of the features.

Keywords: data transformation, dummy dimension, linear transformation, scaling

Procedia PDF Downloads 290
23158 Blockchain Platform Configuration for MyData Operator in Digital and Connected Health

Authors: Minna Pikkarainen, Yueqiang Xu

Abstract:

The integration of digital technology with existing healthcare processes has been painfully slow, a huge gap exists between the fields of strictly regulated official medical care and the quickly moving field of health and wellness technology. We claim that the promises of preventive healthcare can only be fulfilled when this gap is closed – health care and self-care becomes seamless continuum “correct information, in the correct hands, at the correct time allowing individuals and professionals to make better decisions” what we call connected health approach. Currently, the issues related to security, privacy, consumer consent and data sharing are hindering the implementation of this new paradigm of healthcare. This could be solved by following MyData principles stating that: Individuals should have the right and practical means to manage their data and privacy. MyData infrastructure enables decentralized management of personal data, improves interoperability, makes it easier for companies to comply with tightening data protection regulations, and allows individuals to change service providers without proprietary data lock-ins. This paper tackles today’s unprecedented challenges of enabling and stimulating multiple healthcare data providers and stakeholders to have more active participation in the digital health ecosystem. First, the paper systematically proposes the MyData approach for healthcare and preventive health data ecosystem. In this research, the work is targeted for health and wellness ecosystems. Each ecosystem consists of key actors, such as 1) individual (citizen or professional controlling/using the services) i.e. data subject, 2) services providing personal data (e.g. startups providing data collection apps or data collection devices), 3) health and wellness services utilizing aforementioned data and 4) services authorizing the access to this data under individual’s provided explicit consent. Second, the research extends the existing four archetypes of orchestrator-driven healthcare data business models for the healthcare industry and proposes the fifth type of healthcare data model, the MyData Blockchain Platform. This new architecture is developed by the Action Design Research approach, which is a prominent research methodology in the information system domain. The key novelty of the paper is to expand the health data value chain architecture and design from centralization and pseudo-decentralization to full decentralization, enabled by blockchain, thus the MyData blockchain platform. The study not only broadens the healthcare informatics literature but also contributes to the theoretical development of digital healthcare and blockchain research domains with a systemic approach.

Keywords: blockchain, health data, platform, action design

Procedia PDF Downloads 92
23157 Prediction-Based Midterm Operation Planning for Energy Management of Exhibition Hall

Authors: Doseong Eom, Jeongmin Kim, Kwang Ryel Ryu

Abstract:

Large exhibition halls require a lot of energy to maintain comfortable atmosphere for the visitors viewing inside. One way of reducing the energy cost is to have thermal energy storage systems installed so that the thermal energy can be stored in the middle of night when the energy price is low and then used later when the price is high. To minimize the overall energy cost, however, we should be able to decide how much energy to save during which time period exactly. If we can foresee future energy load and the corresponding cost, we will be able to make such decisions reasonably. In this paper, we use machine learning technique to obtain models for predicting weather conditions and the number of visitors on hourly basis for the next day. Based on the energy load thus predicted, we build a cost-optimal daily operation plan for the thermal energy storage systems and cooling and heating facilities through simulation-based optimization.

Keywords: building energy management, machine learning, operation planning, simulation-based optimization

Procedia PDF Downloads 314
23156 Impact of Religious Struggles on Life Satisfaction among Young Muslims: The Mediating Role of Psychological Wellbeing

Authors: Sarwat Sultan, Frasat Kanwal, Motasem Mirza

Abstract:

The impact of religiosity on people’s lives has always been found complex because some of them turn to religion to get comfort and relief from their fear, guilt, and illness, whereas some become away due to the perception that God is revengeful and distant for their conduct. The overarching aim of this study was to know whether the relationship between religious struggles (comfort/strain) and life satisfaction is mediated by psychological well-being. The participants of this study were 529 Muslim students who provided their responses on the measures of religious comfort/strain, psychological well-being, and life satisfaction. Results revealed that religious comfort predicted well-being and life satisfaction positively, while religious strain predicted negatively. Findings showed that psychological well-being mediated the prediction of religious comfort and strain for life satisfaction. These findings have implications for students’ mental health because their teachers and professionals can enhance their well-being by teaching them positive aspects of religion and God.

Keywords: attitude towards god, religious comfort, religious strain, life satisfaction, psychological wellbeing

Procedia PDF Downloads 43
23155 Using Learning Apps in the Classroom

Authors: Janet C. Read

Abstract:

UClan set collaboration with Lingokids to assess the Lingokids learning app's impact on learning outcomes in classrooms in the UK for children with ages ranging from 3 to 5 years. Data gathered during the controlled study with 69 children includes attitudinal data, engagement, and learning scores. Data shows that children enjoyment while learning was higher among those children using the game-based app compared to those children using other traditional methods. It’s worth pointing out that engagement when using the learning app was significantly higher than other traditional methods among older children. According to existing literature, there is a direct correlation between engagement, motivation, and learning. Therefore, this study provides relevant data points to conclude that Lingokids learning app serves its purpose of encouraging learning through playful and interactive content. That being said, we believe that learning outcomes should be assessed with a wider range of methods in further studies. Likewise, it would be beneficial to assess the level of usability and playability of the app in order to evaluate the learning app from other angles.

Keywords: learning app, learning outcomes, rapid test activity, Smileyometer, early childhood education, innovative pedagogy

Procedia PDF Downloads 60
23154 Road Safety in the Great Britain: An Exploratory Data Analysis

Authors: Jatin Kumar Choudhary, Naren Rayala, Abbas Eslami Kiasari, Fahimeh Jafari

Abstract:

The Great Britain has one of the safest road networks in the world. However, the consequences of any death or serious injury are devastating for loved ones, as well as for those who help the severely injured. This paper aims to analyse the Great Britain's road safety situation and show the response measures for areas where the total damage caused by accidents can be significantly and quickly reduced. In this paper, we do an exploratory data analysis using STATS19 data. For the past 30 years, the UK has had a good record in reducing fatalities. The UK ranked third based on the number of road deaths per million inhabitants. There were around 165,000 accidents reported in the Great Britain in 2009 and it has been decreasing every year until 2019 which is under 120,000. The government continues to scale back road deaths empowering responsible road users by identifying and prosecuting the parameters that make the roads less safe.

Keywords: road safety, data analysis, openstreetmap, feature expanding.

Procedia PDF Downloads 123
23153 Predictive Analysis of the Stock Price Market Trends with Deep Learning

Authors: Suraj Mehrotra

Abstract:

The stock market is a volatile, bustling marketplace that is a cornerstone of economics. It defines whether companies are successful or in spiral. A thorough understanding of it is important - many companies have whole divisions dedicated to analysis of both their stock and of rivaling companies. Linking the world of finance and artificial intelligence (AI), especially the stock market, has been a relatively recent development. Predicting how stocks will do considering all external factors and previous data has always been a human task. With the help of AI, however, machine learning models can help us make more complete predictions in financial trends. Taking a look at the stock market specifically, predicting the open, closing, high, and low prices for the next day is very hard to do. Machine learning makes this task a lot easier. A model that builds upon itself that takes in external factors as weights can predict trends far into the future. When used effectively, new doors can be opened up in the business and finance world, and companies can make better and more complete decisions. This paper explores the various techniques used in the prediction of stock prices, from traditional statistical methods to deep learning and neural networks based approaches, among other methods. It provides a detailed analysis of the techniques and also explores the challenges in predictive analysis. For the accuracy of the testing set, taking a look at four different models - linear regression, neural network, decision tree, and naïve Bayes - on the different stocks, Apple, Google, Tesla, Amazon, United Healthcare, Exxon Mobil, J.P. Morgan & Chase, and Johnson & Johnson, the naïve Bayes model and linear regression models worked best. For the testing set, the naïve Bayes model had the highest accuracy along with the linear regression model, followed by the neural network model and then the decision tree model. The training set had similar results except for the fact that the decision tree model was perfect with complete accuracy in its predictions, which makes sense. This means that the decision tree model likely overfitted the training set when used for the testing set.

Keywords: machine learning, testing set, artificial intelligence, stock analysis

Procedia PDF Downloads 85
23152 Intrusion Detection System Using Linear Discriminant Analysis

Authors: Zyad Elkhadir, Khalid Chougdali, Mohammed Benattou

Abstract:

Most of the existing intrusion detection systems works on quantitative network traffic data with many irrelevant and redundant features, which makes detection process more time’s consuming and inaccurate. A several feature extraction methods, such as linear discriminant analysis (LDA), have been proposed. However, LDA suffers from the small sample size (SSS) problem which occurs when the number of the training samples is small compared with the samples dimension. Hence, classical LDA cannot be applied directly for high dimensional data such as network traffic data. In this paper, we propose two solutions to solve SSS problem for LDA and apply them to a network IDS. The first method, reduce the original dimension data using principal component analysis (PCA) and then apply LDA. In the second solution, we propose to use the pseudo inverse to avoid singularity of within-class scatter matrix due to SSS problem. After that, the KNN algorithm is used for classification process. We have chosen two known datasets KDDcup99 and NSLKDD for testing the proposed approaches. Results showed that the classification accuracy of (PCA+LDA) method outperforms clearly the pseudo inverse LDA method when we have large training data.

Keywords: LDA, Pseudoinverse, PCA, IDS, NSL-KDD, KDDcup99

Procedia PDF Downloads 219
23151 Estimation of Location and Scale Parameters of Extended Exponential Distribution Based on Record Statistics

Authors: E. Krishna

Abstract:

An Extended form of exponential distribution using Marshall and Olkin method is introduced.The location scale family of these distributions is considered. For location scale free family, exact expressions for single and product moments of upper record statistics are derived. The mean, variance and covariance of record values are computed for various values of the shape parameter. Using these the BLUE's of location and scale parameters are derived.The variances and covariance of estimates are obtained.Through Monte Carlo simulation the con dence intervals for location and scale parameters are constructed.The Best liner unbiased Predictor (BLUP) of future records are also discussed.

Keywords: BLUE, BLUP, con dence interval, Marshall-Olkin distribution, Monte Carlo simulation, prediction of future records, record statistics

Procedia PDF Downloads 411
23150 Studies of Rule Induction by STRIM from the Decision Table with Contaminated Attribute Values from Missing Data and Noise — in the Case of Critical Dataset Size —

Authors: Tetsuro Saeki, Yuichi Kato, Shoutarou Mizuno

Abstract:

STRIM (Statistical Test Rule Induction Method) has been proposed as a method to effectively induct if-then rules from the decision table which is considered as a sample set obtained from the population of interest. Its usefulness has been confirmed by simulation experiments specifying rules in advance, and by comparison with conventional methods. However, scope for future development remains before STRIM can be applied to the analysis of real-world data sets. The first requirement is to determine the size of the dataset needed for inducting true rules, since finding statistically significant rules is the core of the method. The second is to examine the capacity of rule induction from datasets with contaminated attribute values created by missing data and noise, since real-world datasets usually contain such contaminated data. This paper examines the first problem theoretically, in connection with the rule length. The second problem is then examined in a simulation experiment, utilizing the critical size of dataset derived from the first step. The experimental results show that STRIM is highly robust in the analysis of datasets with contaminated attribute values, and hence is applicable to realworld data.

Keywords: rule induction, decision table, missing data, noise

Procedia PDF Downloads 388
23149 Interest Rate Prediction with Taylor Rule

Authors: T. Bouchabchoub, A. Bendahmane, A. Haouriqui, N. Attou

Abstract:

This paper presents simulation results of Forex predicting model equations in order to give approximately a prevision of interest rates. First, Hall-Taylor (HT) equations have been used with Taylor rule (TR) to adapt them to European and American Forex Markets. Indeed, initial Taylor Rule equation is conceived for all Forex transactions in every States: It includes only one equation and six parameters. Here, the model has been used with Hall-Taylor equations, initially including twelve equations which have been reduced to only three equations. Analysis has been developed on the following base macroeconomic variables: Real change rate, investment wages, anticipated inflation, realized inflation, real production, interest rates, gap production and potential production. This model has been used to specifically study the impact of an inflation shock on macroeconomic director interest rates.

Keywords: interest rate, Forex, Taylor rule, production, European Central Bank (ECB), Federal Reserve System (FED).

Procedia PDF Downloads 517
23148 Machine Learning Strategies for Data Extraction from Unstructured Documents in Financial Services

Authors: Delphine Vendryes, Dushyanth Sekhar, Baojia Tong, Matthew Theisen, Chester Curme

Abstract:

Much of the data that inform the decisions of governments, corporations and individuals are harvested from unstructured documents. Data extraction is defined here as a process that turns non-machine-readable information into a machine-readable format that can be stored, for instance, in a database. In financial services, introducing more automation in data extraction pipelines is a major challenge. Information sought by financial data consumers is often buried within vast bodies of unstructured documents, which have historically required thorough manual extraction. Automated solutions provide faster access to non-machine-readable datasets, in a context where untimely information quickly becomes irrelevant. Data quality standards cannot be compromised, so automation requires high data integrity. This multifaceted task is broken down into smaller steps: ingestion, table parsing (detection and structure recognition), text analysis (entity detection and disambiguation), schema-based record extraction, user feedback incorporation. Selected intermediary steps are phrased as machine learning problems. Solutions leveraging cutting-edge approaches from the fields of computer vision (e.g. table detection) and natural language processing (e.g. entity detection and disambiguation) are proposed.

Keywords: computer vision, entity recognition, finance, information retrieval, machine learning, natural language processing

Procedia PDF Downloads 100
23147 Regression Approach for Optimal Purchase of Hosts Cluster in Fixed Fund for Hadoop Big Data Platform

Authors: Haitao Yang, Jianming Lv, Fei Xu, Xintong Wang, Yilin Huang, Lanting Xia, Xuewu Zhu

Abstract:

Given a fixed fund, purchasing fewer hosts of higher capability or inversely more of lower capability is a must-be-made trade-off in practices for building a Hadoop big data platform. An exploratory study is presented for a Housing Big Data Platform project (HBDP), where typical big data computing is with SQL queries of aggregate, join, and space-time condition selections executed upon massive data from more than 10 million housing units. In HBDP, an empirical formula was introduced to predict the performance of host clusters potential for the intended typical big data computing, and it was shaped via a regression approach. With this empirical formula, it is easy to suggest an optimal cluster configuration. The investigation was based on a typical Hadoop computing ecosystem HDFS+Hive+Spark. A proper metric was raised to measure the performance of Hadoop clusters in HBDP, which was tested and compared with its predicted counterpart, on executing three kinds of typical SQL query tasks. Tests were conducted with respect to factors of CPU benchmark, memory size, virtual host division, and the number of element physical host in cluster. The research has been applied to practical cluster procurement for housing big data computing.

Keywords: Hadoop platform planning, optimal cluster scheme at fixed-fund, performance predicting formula, typical SQL query tasks

Procedia PDF Downloads 222
23146 Model Predictive Controller for Pasteurization Process

Authors: Tesfaye Alamirew Dessie

Abstract:

Our study focuses on developing a Model Predictive Controller (MPC) and evaluating it against a traditional PID for a pasteurization process. Utilizing system identification from the experimental data, the dynamics of the pasteurization process were calculated. Using best fit with data validation, residual, and stability analysis, the quality of several model architectures was evaluated. The validation data fit the auto-regressive with exogenous input (ARX322) model of the pasteurization process by roughly 80.37 percent. The ARX322 model structure was used to create MPC and PID control techniques. After comparing controller performance based on settling time, overshoot percentage, and stability analysis, it was found that MPC controllers outperform PID for those parameters.

Keywords: MPC, PID, ARX, pasteurization

Procedia PDF Downloads 148
23145 Point Estimation for the Type II Generalized Logistic Distribution Based on Progressively Censored Data

Authors: Rana Rimawi, Ayman Baklizi

Abstract:

Skewed distributions are important models that are frequently used in applications. Generalized distributions form a class of skewed distributions and gain widespread use in applications because of their flexibility in data analysis. More specifically, the Generalized Logistic Distribution with its different types has received considerable attention recently. In this study, based on progressively type-II censored data, we will consider point estimation in type II Generalized Logistic Distribution (Type II GLD). We will develop several estimators for its unknown parameters, including maximum likelihood estimators (MLE), Bayes estimators and linear estimators (BLUE). The estimators will be compared using simulation based on the criteria of bias and Mean square error (MSE). An illustrative example of a real data set will be given.

Keywords: point estimation, type II generalized logistic distribution, progressive censoring, maximum likelihood estimation

Procedia PDF Downloads 189
23144 Omni: Data Science Platform for Evaluate Performance of a LoRaWAN Network

Authors: Emanuele A. Solagna, Ricardo S, Tozetto, Roberto dos S. Rabello

Abstract:

Nowadays, physical processes are becoming digitized by the evolution of communication, sensing and storage technologies which promote the development of smart cities. The evolution of this technology has generated multiple challenges related to the generation of big data and the active participation of electronic devices in society. Thus, devices can send information that is captured and processed over large areas, but there is no guarantee that all the obtained data amount will be effectively stored and correctly persisted. Because, depending on the technology which is used, there are parameters that has huge influence on the full delivery of information. This article aims to characterize the project, currently under development, of a platform that based on data science will perform a performance and effectiveness evaluation of an industrial network that implements LoRaWAN technology considering its main parameters configuration relating these parameters to the information loss.

Keywords: Internet of Things, LoRa, LoRaWAN, smart cities

Procedia PDF Downloads 136
23143 Cybervetting and Online Privacy in Job Recruitment – Perspectives on the Current and Future Legislative Framework Within the EU

Authors: Nicole Christiansen, Hanne Marie Motzfeldt

Abstract:

In recent years, more and more HR professionals have been using cyber-vetting in job recruitment in an effort to find the perfect match for the company. These practices are growing rapidly, accessing a vast amount of data from social networks, some of which is privileged and protected information. Thus, there is a risk that the right to privacy is becoming a duty to manage your private data. This paper investigates to which degree a job applicant's fundamental rights are protected adequately in current and future legislation in the EU. This paper argues that current data protection regulations and forthcoming regulations on the use of AI ensure sufficient protection. However, even though the regulation on paper protects employees within the EU, the recruitment sector may not pay sufficient attention to the regulation as it not specifically targeting this area. Therefore, the lack of specific labor and employment regulation is a concern that the social partners should attend to.

Keywords: AI, cyber vetting, data protection, job recruitment, online privacy

Procedia PDF Downloads 75