Search results for: data fitting
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24604

Search results for: data fitting

23884 Road Accidents Bigdata Mining and Visualization Using Support Vector Machines

Authors: Usha Lokala, Srinivas Nowduri, Prabhakar K. Sharma

Abstract:

Useful information has been extracted from the road accident data in United Kingdom (UK), using data analytics method, for avoiding possible accidents in rural and urban areas. This analysis make use of several methodologies such as data integration, support vector machines (SVM), correlation machines and multinomial goodness. The entire datasets have been imported from the traffic department of UK with due permission. The information extracted from these huge datasets forms a basis for several predictions, which in turn avoid unnecessary memory lapses. Since data is expected to grow continuously over a period of time, this work primarily proposes a new framework model which can be trained and adapt itself to new data and make accurate predictions. This work also throws some light on use of SVM’s methodology for text classifiers from the obtained traffic data. Finally, it emphasizes the uniqueness and adaptability of SVMs methodology appropriate for this kind of research work.

Keywords: support vector mechanism (SVM), machine learning (ML), support vector machines (SVM), department of transportation (DFT)

Procedia PDF Downloads 254
23883 A Relational Data Base for Radiation Therapy

Authors: Raffaele Danilo Esposito, Domingo Planes Meseguer, Maria Del Pilar Dorado Rodriguez

Abstract:

As far as we know, it is still unavailable a commercial solution which would allow to manage, openly and configurable up to user needs, the huge amount of data generated in a modern Radiation Oncology Department. Currently, available information management systems are mainly focused on Record & Verify and clinical data, and only to a small extent on physical data. Thus, results in a partial and limited use of the actually available information. In the present work we describe the implementation at our department of a centralized information management system based on a web server. Our system manages both information generated during patient planning and treatment, and information of general interest for the whole department (i.e. treatment protocols, quality assurance protocols etc.). Our objective it to be able to analyze in a simple and efficient way all the available data and thus to obtain quantitative evaluations of our treatments. This would allow us to improve our work flow and protocols. To this end we have implemented a relational data base which would allow us to use in a practical and efficient way all the available information. As always we only use license free software.

Keywords: information management system, radiation oncology, medical physics, free software

Procedia PDF Downloads 223
23882 A Study of Safety of Data Storage Devices of Graduate Students at Suan Sunandha Rajabhat University

Authors: Komol Phaisarn, Natcha Wattanaprapa

Abstract:

This research is a survey research with an objective to study the safety of data storage devices of graduate students of academic year 2013, Suan Sunandha Rajabhat University. Data were collected by questionnaire on the safety of data storage devices according to CIA principle. A sample size of 81 was drawn from population by purposive sampling method. The results show that most of the graduate students of academic year 2013 at Suan Sunandha Rajabhat University use handy drive to store their data and the safety level of the devices is at good level.

Keywords: security, safety, storage devices, graduate students

Procedia PDF Downloads 337
23881 Spectroscopic Study of Tb³⁺ Doped Calcium Aluminozincate Phosphor for Display and Solid-State Lighting Applications

Authors: Sumandeep Kaur, Allam Srinivasa Rao, Mula Jayasimhadri

Abstract:

In recent years, rare earth (RE) ions doped inorganic luminescent materials are seeking great attention due to their excellent physical and chemical properties. These materials offer high thermal and chemical stability and exhibit good luminescence properties due to the presence of RE ions. The luminescent properties of these materials are attributed to their intra-configurational f-f transitions in RE ions. A series of Tb³⁺ doped calcium aluminozincate has been synthesized via sol-gel method. The structural and morphological studies have been carried out by recording X-ray diffraction patterns and SEM image. The luminescent spectra have been recorded for a comprehensive study of their luminescence properties. The XRD profile reveals the single-phase orthorhombic crystal structure with an average crystallite size of 65 nm as calculated by using DebyeScherrer equation. The SEM image exhibits completely random, irregular morphology of micron size particles of the prepared samples. The optimization of luminescence has been carried out by varying the dopant Tb³⁺ concentration within the range from 0.5 to 2.0 mol%. The as-synthesized phosphors exhibit intense emission at 544 nm pumped at 478 nm excitation wavelength. The optimized Tb³⁺ concentration has been found to be 1.0 mol% in the present host lattice. The decay curves show bi-exponential fitting for the as-synthesized phosphor. The colorimetric studies show green emission with CIE coordinates (0.334, 0.647) lying in green region for the optimized Tb³⁺ concentration. This report reveals the potential utility of Tb³⁺ doped calcium aluminozincate phosphors for display and solid-state lighting devices.

Keywords: concentration quenching, phosphor, photoluminescence, XRD

Procedia PDF Downloads 132
23880 Simulation of a Cost Model Response Requests for Replication in Data Grid Environment

Authors: Kaddi Mohammed, A. Benatiallah, D. Benatiallah

Abstract:

Data grid is a technology that has full emergence of new challenges, such as the heterogeneity and availability of various resources and geographically distributed, fast data access, minimizing latency and fault tolerance. Researchers interested in this technology address the problems of the various systems related to the industry such as task scheduling, load balancing and replication. The latter is an effective solution to achieve good performance in terms of data access and grid resources and better availability of data cost. In a system with duplication, a coherence protocol is used to impose some degree of synchronization between the various copies and impose some order on updates. In this project, we present an approach for placing replicas to minimize the cost of response of requests to read or write, and we implement our model in a simulation environment. The placement techniques are based on a cost model which depends on several factors, such as bandwidth, data size and storage nodes.

Keywords: response time, query, consistency, bandwidth, storage capacity, CERN

Procedia PDF Downloads 254
23879 Effect of Extrusion Processing Parameters on Protein in Banana Flour Extrudates: Characterisation Using Fourier-Transform Infrared Spectroscopy

Authors: Surabhi Pandey, Pavuluri Srinivasa Rao

Abstract:

Extrusion processing is a high-temperature short time (HTST) treatment which can improve protein quality and digestibility together with retaining active nutrients. In-vitro protein digestibility of plant protein-based foods is generally enhanced by extrusion. The current study aimed to investigate the effect of extrusion cooking on in-vitro protein digestibility (IVPD) and conformational modification of protein in green banana flour extrudates. Green banana flour was extruded through a co-rotating twin-screw extruder varying the moisture content, barrel temperature, screw speed in the range of 10-20 %, 60-80 °C, 200-300 rpm, respectively, at constant feed rate. Response surface methodology was used to optimise the result for IVPD. Fourier-transform infrared spectroscopy (FTIR) analysis provided a convenient and powerful means to monitor interactions and changes in functional and conformational properties of extrudates. Results showed that protein digestibility was highest in extrudate produced at 80°C, 250 rpm and 15% feed moisture. FTIR analysis was done for the optimised sample having highest IVPD. FTIR analysis showed that there were no changes in primary structure of protein while the secondary protein structure changed. In order to explain this behaviour, infrared spectroscopy analysis was carried out, mainly in the amide I and II regions. Moreover, curve fitting analysis showed the conformational changes produced in the flour due to protein denaturation. The quantitative analysis of the changes in the amide I and II regions provided information about the modifications produced in banana flour extrudates.

Keywords: extrusion, FTIR, protein conformation, raw banana flour, SDS-PAGE method

Procedia PDF Downloads 143
23878 Comparison of Different Methods to Produce Fuzzy Tolerance Relations for Rainfall Data Classification in the Region of Central Greece

Authors: N. Samarinas, C. Evangelides, C. Vrekos

Abstract:

The aim of this paper is the comparison of three different methods, in order to produce fuzzy tolerance relations for rainfall data classification. More specifically, the three methods are correlation coefficient, cosine amplitude and max-min method. The data were obtained from seven rainfall stations in the region of central Greece and refers to 20-year time series of monthly rainfall height average. Three methods were used to express these data as a fuzzy relation. This specific fuzzy tolerance relation is reformed into an equivalence relation with max-min composition for all three methods. From the equivalence relation, the rainfall stations were categorized and classified according to the degree of confidence. The classification shows the similarities among the rainfall stations. Stations with high similarity can be utilized in water resource management scenarios interchangeably or to augment data from one to another. Due to the complexity of calculations, it is important to find out which of the methods is computationally simpler and needs fewer compositions in order to give reliable results.

Keywords: classification, fuzzy logic, tolerance relations, rainfall data

Procedia PDF Downloads 298
23877 Evaluation of Digital Marketing Strategies by Behavioral Economics

Authors: Sajjad Esmaeili Aghdam

Abstract:

Economics typically conceptualizes individual behavior as the consequence of external states, for example, budgets and prices (or respective beliefs) and choices. As the main goal, we focus on the influence of a range of Behavioral Economics factors on Strategies of Digital Marketing, evaluation of strategies and deformation of it into highly prospective marketing strategies. The different forms of behavioral prospects all lead to the succeeding two main results. First, the steadiness of the economic dynamics in a currency union be contingent fatefully on the level of economic incorporation. More economic incorporation leads to more steady economic dynamics. Electronic word-of-mouth (eWOM) is “all casual communications focused at consumers through Internet-based technology connected to the usage or characteristics of specific properties and services or their venders.” eWOM can take many methods, the most significant one being online analyses. Writing this paper, 72 articles have been gathered, focusing on the title and the aim of the article from research search engines like Google Scholar, Web of Science, and PubMed. Recent research in strategic management and marketing proposes that markets should not be viewed as a given and deterministic setting, exogenous to the firm. Instead, firms are progressively abstracted as dynamic inventors of market prospects. The use of new technologies touches all spheres of the modern lifestyle. Social and economic life becomes unbearable without fast, applicable, first-class and fitting material. Psychology and economics (together known as behavioral economics) are two protruding disciplines underlying many theories in marketing. The wide marketing works papers consumers’ none balanced behavior even though behavioral biases might not continuously be steadily called or officially labeled.

Keywords: behavioral economics, digital marketing, marketing strategy, high impact strategies

Procedia PDF Downloads 165
23876 Customer Satisfaction and Effective HRM Policies: Customer and Employee Satisfaction

Authors: S. Anastasiou, C. Nathanailides

Abstract:

The purpose of this study is to examine the possible link between employee and customer satisfaction. The service provided by employees, help to build a good relationship with customers and can help at increasing their loyalty. Published data for job satisfaction and indicators of customer services were gathered from relevant published works which included data from five different countries. The reviewed data indicate a significant correlation between indicators of customer and employee satisfaction in the Banking sector. There was a significant correlation between the two parameters (Pearson correlation R2=0.52 P<0.05) The reviewed data provide evidence that there is some practical evidence which links these two parameters.

Keywords: job satisfaction, job performance, customer’ service, banks, human resources management

Procedia PDF Downloads 306
23875 Evaluation of Australian Open Banking Regulation: Balancing Customer Data Privacy and Innovation

Authors: Suman Podder

Abstract:

As Australian ‘Open Banking’ allows customers to share their financial data with accredited Third-Party Providers (‘TPPs’), it is necessary to evaluate whether the regulators have achieved the balance between protecting customer data privacy and promoting data-related innovation. Recognising the need to increase customers’ influence on their own data, and the benefits of data-related innovation, the Australian Government introduced ‘Consumer Data Right’ (‘CDR’) to the banking sector through Open Banking regulation. Under Open Banking, TPPs can access customers’ banking data that allows the TPPs to tailor their products and services to meet customer needs at a more competitive price. This facilitated access and use of customer data will promote innovation by providing opportunities for new products and business models to emerge and grow. However, the success of Open Banking depends on the willingness of the customers to share their data, so the regulators have augmented the protection of data by introducing new privacy safeguards to instill confidence and trust in the system. The dilemma in policymaking is that, on the one hand, lenient data privacy laws will help the flow of information, but at the risk of individuals’ loss of privacy, on the other hand, stringent laws that adequately protect privacy may dissuade innovation. Using theoretical and doctrinal methods, this paper examines whether the privacy safeguards under Open Banking will add to the compliance burden of the participating financial institutions, resulting in the undesirable effect of stifling other policy objectives such as innovation. The contribution of this research is three-fold. In the emerging field of customer data sharing, this research is one of the few academic studies on the objectives and impact of Open Banking in the Australian context. Additionally, Open Banking is still in the early stages of implementation, so this research traces the evolution of Open Banking through policy debates regarding the desirability of customer data-sharing. Finally, the research focuses not only on the customers’ data privacy and juxtaposes it with another important objective of promoting innovation, but it also highlights the critical issues facing the data-sharing regime. This paper argues that while it is challenging to develop a regulatory framework for protecting data privacy without impeding innovation and jeopardising yet unknown opportunities, data privacy and innovation promote different aspects of customer welfare. This paper concludes that if a regulation is appropriately designed and implemented, the benefits of data-sharing will outweigh the cost of compliance with the CDR.

Keywords: consumer data right, innovation, open banking, privacy safeguards

Procedia PDF Downloads 128
23874 Generation of Automated Alarms for Plantwide Process Monitoring

Authors: Hyun-Woo Cho

Abstract:

Earlier detection of incipient abnormal operations in terms of plant-wide process management is quite necessary in order to improve product quality and process safety. And generating warning signals or alarms for operating personnel plays an important role in process automation and intelligent plant health monitoring. Various methodologies have been developed and utilized in this area such as expert systems, mathematical model-based approaches, multivariate statistical approaches, and so on. This work presents a nonlinear empirical monitoring methodology based on the real-time analysis of massive process data. Unfortunately, the big data includes measurement noises and unwanted variations unrelated to true process behavior. Thus the elimination of such unnecessary patterns of the data is executed in data processing step to enhance detection speed and accuracy. The performance of the methodology was demonstrated using simulated process data. The case study showed that the detection speed and performance was improved significantly irrespective of the size and the location of abnormal events.

Keywords: detection, monitoring, process data, noise

Procedia PDF Downloads 233
23873 Meanings and Concepts of Standardization in Systems Medicine

Authors: Imme Petersen, Wiebke Sick, Regine Kollek

Abstract:

In systems medicine, high-throughput technologies produce large amounts of data on different biological and pathological processes, including (disturbed) gene expressions, metabolic pathways and signaling. The large volume of data of different types, stored in separate databases and often located at different geographical sites have posed new challenges regarding data handling and processing. Tools based on bioinformatics have been developed to resolve the upcoming problems of systematizing, standardizing and integrating the various data. However, the heterogeneity of data gathered at different levels of biological complexity is still a major challenge in data analysis. To build multilayer disease modules, large and heterogeneous data of disease-related information (e.g., genotype, phenotype, environmental factors) are correlated. Therefore, a great deal of attention in systems medicine has been put on data standardization, primarily to retrieve and combine large, heterogeneous datasets into standardized and incorporated forms and structures. However, this data-centred concept of standardization in systems medicine is contrary to the debate in science and technology studies (STS) on standardization that rather emphasizes the dynamics, contexts and negotiations of standard operating procedures. Based on empirical work on research consortia that explore the molecular profile of diseases to establish systems medical approaches in the clinic in Germany, we trace how standardized data are processed and shaped by bioinformatics tools, how scientists using such data in research perceive such standard operating procedures and which consequences for knowledge production (e.g. modeling) arise from it. Hence, different concepts and meanings of standardization are explored to get a deeper insight into standard operating procedures not only in systems medicine, but also beyond.

Keywords: data, science and technology studies (STS), standardization, systems medicine

Procedia PDF Downloads 323
23872 Integrated On-Board Diagnostic-II and Direct Controller Area Network Access for Vehicle Monitoring System

Authors: Kavian Khosravinia, Mohd Khair Hassan, Ribhan Zafira Abdul Rahman, Syed Abdul Rahman Al-Haddad

Abstract:

The CAN (controller area network) bus is introduced as a multi-master, message broadcast system. The messages sent on the CAN are used to communicate state information, referred as a signal between different ECUs, which provides data consistency in every node of the system. OBD-II Dongles that are based on request and response method is the wide-spread solution for extracting sensor data from cars among researchers. Unfortunately, most of the past researches do not consider resolution and quantity of their input data extracted through OBD-II technology. The maximum feasible scan rate is only 9 queries per second which provide 8 data points per second with using ELM327 as well-known OBD-II dongle. This study aims to develop and design a programmable, and latency-sensitive vehicle data acquisition system that improves the modularity and flexibility to extract exact, trustworthy, and fresh car sensor data with higher frequency rates. Furthermore, the researcher must break apart, thoroughly inspect, and observe the internal network of the vehicle, which may cause severe damages to the expensive ECUs of the vehicle due to intrinsic vulnerabilities of the CAN bus during initial research. Desired sensors data were collected from various vehicles utilizing Raspberry Pi3 as computing and processing unit with using OBD (request-response) and direct CAN method at the same time. Two types of data were collected for this study. The first, CAN bus frame data that illustrates data collected for each line of hex data sent from an ECU and the second type is the OBD data that represents some limited data that is requested from ECU under standard condition. The proposed system is reconfigurable, human-readable and multi-task telematics device that can be fitted into any vehicle with minimum effort and minimum time lag in the data extraction process. The standard operational procedure experimental vehicle network test bench is developed and can be used for future vehicle network testing experiment.

Keywords: CAN bus, OBD-II, vehicle data acquisition, connected cars, telemetry, Raspberry Pi3

Procedia PDF Downloads 183
23871 Big Data in Construction Project Management: The Colombian Northeast Case

Authors: Sergio Zabala-Vargas, Miguel Jiménez-Barrera, Luz VArgas-Sánchez

Abstract:

In recent years, information related to project management in organizations has been increasing exponentially. Performance data, management statistics, indicator results have forced the collection, analysis, traceability, and dissemination of project managers to be essential. In this sense, there are current trends to facilitate efficient decision-making in emerging technology projects, such as: Machine Learning, Data Analytics, Data Mining, and Big Data. The latter is the most interesting in this project. This research is part of the thematic line Construction methods and project management. Many authors present the relevance that the use of emerging technologies, such as Big Data, has taken in recent years in project management in the construction sector. The main focus is the optimization of time, scope, budget, and in general mitigating risks. This research was developed in the northeastern region of Colombia-South America. The first phase was aimed at diagnosing the use of emerging technologies (Big-Data) in the construction sector. In Colombia, the construction sector represents more than 50% of the productive system, and more than 2 million people participate in this economic segment. The quantitative approach was used. A survey was applied to a sample of 91 companies in the construction sector. Preliminary results indicate that the use of Big Data and other emerging technologies is very low and also that there is interest in modernizing project management. There is evidence of a correlation between the interest in using new data management technologies and the incorporation of Building Information Modeling BIM. The next phase of the research will allow the generation of guidelines and strategies for the incorporation of technological tools in the construction sector in Colombia.

Keywords: big data, building information modeling, tecnology, project manamegent

Procedia PDF Downloads 114
23870 Minimum Data of a Speech Signal as Special Indicators of Identification in Phonoscopy

Authors: Nazaket Gazieva

Abstract:

Voice biometric data associated with physiological, psychological and other factors are widely used in forensic phonoscopy. There are various methods for identifying and verifying a person by voice. This article explores the minimum speech signal data as individual parameters of a speech signal. Monozygotic twins are believed to be genetically identical. Using the minimum data of the speech signal, we came to the conclusion that the voice imprint of monozygotic twins is individual. According to the conclusion of the experiment, we can conclude that the minimum indicators of the speech signal are more stable and reliable for phonoscopic examinations.

Keywords: phonogram, speech signal, temporal characteristics, fundamental frequency, biometric fingerprints

Procedia PDF Downloads 122
23869 A Non-parametric Clustering Approach for Multivariate Geostatistical Data

Authors: Francky Fouedjio

Abstract:

Multivariate geostatistical data have become omnipresent in the geosciences and pose substantial analysis challenges. One of them is the grouping of data locations into spatially contiguous clusters so that data locations within the same cluster are more similar while clusters are different from each other, in some sense. Spatially contiguous clusters can significantly improve the interpretation that turns the resulting clusters into meaningful geographical subregions. In this paper, we develop an agglomerative hierarchical clustering approach that takes into account the spatial dependency between observations. It relies on a dissimilarity matrix built from a non-parametric kernel estimator of the spatial dependence structure of data. It integrates existing methods to find the optimal cluster number and to evaluate the contribution of variables to the clustering. The capability of the proposed approach to provide spatially compact, connected and meaningful clusters is assessed using bivariate synthetic dataset and multivariate geochemical dataset. The proposed clustering method gives satisfactory results compared to other similar geostatistical clustering methods.

Keywords: clustering, geostatistics, multivariate data, non-parametric

Procedia PDF Downloads 465
23868 Guidelines for Sustainable Urban Mobility in Historic Districts from International Experiences

Authors: Tamer ElSerafi

Abstract:

In recent approaches to heritage conservation, the whole context of historic areas becomes as important as the single historic building. This makes the provision of infrastructure and network of mobility an effective element in the urban conservation. Sustainable urban conservation projects consider the high density of activities, the need for a good quality access system to the transit system, and the importance of the configuration of the mobility network by identifying the best way to connect the different districts of the urban area through a complex unique system that helps the synergic development to achieve a sustainable mobility system. A sustainable urban mobility is a key factor in maintaining the integrity between socio-cultural aspects and functional aspects. This paper illustrates the mobility aspects, mobility problems in historic districts, and the needs of the mobility systems in the first part. The second part is a practical analysis for different mobility plans. It is challenging to find innovative and creative conservation solutions fitting modern uses and needs without risking the loss of inherited built resources. Urban mobility management is becoming an essential and challenging issue in the urban conservation projects. Depending on literature review and practical analysis, this paper tries to define and clarify the guidelines for mobility management in historic districts as a key element in sustainability of urban conservation and development projects. Such rules and principles could control the conflict between the socio–cultural and economic activities, and the different needs for mobility in these districts in a sustainable way. The practical analysis includes a comparison between mobility plans which have been implemented in four different cities; Freiburg in Germany, Zurich in Switzerland and Bray Town in Ireland. This paper concludes with a matrix of guidelines that considers both principles of sustainability and livability factors in urban historic districts.

Keywords: sustainable mobility, urban mobility, mobility management, historic districts

Procedia PDF Downloads 142
23867 A Data Mining Approach for Analysing and Predicting the Bank's Asset Liability Management Based on Basel III Norms

Authors: Nidhin Dani Abraham, T. K. Sri Shilpa

Abstract:

Asset liability management is an important aspect in banking business. Moreover, the today’s banking is based on BASEL III which strictly regulates on the counterparty default. This paper focuses on prediction and analysis of counter party default risk, which is a type of risk occurs when the customers fail to repay the amount back to the lender (bank or any financial institutions). This paper proposes an approach to reduce the counterparty risk occurring in the financial institutions using an appropriate data mining technique and thus predicts the occurrence of NPA. It also helps in asset building and restructuring quality. Liability management is very important to carry out banking business. To know and analyze the depth of liability of bank, a suitable technique is required. For that a data mining technique is being used to predict the dormant behaviour of various deposit bank customers. Various models are implemented and the results are analyzed of saving bank deposit customers. All these data are cleaned using data cleansing approach from the bank data warehouse.

Keywords: data mining, asset liability management, BASEL III, banking

Procedia PDF Downloads 535
23866 A Dynamic Ensemble Learning Approach for Online Anomaly Detection in Alibaba Datacenters

Authors: Wanyi Zhu, Xia Ming, Huafeng Wang, Junda Chen, Lu Liu, Jiangwei Jiang, Guohua Liu

Abstract:

Anomaly detection is a first and imperative step needed to respond to unexpected problems and to assure high performance and security in large data center management. This paper presents an online anomaly detection system through an innovative approach of ensemble machine learning and adaptive differentiation algorithms, and applies them to performance data collected from a continuous monitoring system for multi-tier web applications running in Alibaba data centers. We evaluate the effectiveness and efficiency of this algorithm with production traffic data and compare with the traditional anomaly detection approaches such as a static threshold and other deviation-based detection techniques. The experiment results show that our algorithm correctly identifies the unexpected performance variances of any running application, with an acceptable false positive rate. This proposed approach has already been deployed in real-time production environments to enhance the efficiency and stability in daily data center operations.

Keywords: Alibaba data centers, anomaly detection, big data computation, dynamic ensemble learning

Procedia PDF Downloads 180
23865 Unsupervised Text Mining Approach to Early Warning System

Authors: Ichihan Tai, Bill Olson, Paul Blessner

Abstract:

Traditional early warning systems that alarm against crisis are generally based on structured or numerical data; therefore, a system that can make predictions based on unstructured textual data, an uncorrelated data source, is a great complement to the traditional early warning systems. The Chicago Board Options Exchange (CBOE) Volatility Index (VIX), commonly referred to as the fear index, measures the cost of insurance against market crash, and spikes in the event of crisis. In this study, news data is consumed for prediction of whether there will be a market-wide crisis by predicting the movement of the fear index, and the historical references to similar events are presented in an unsupervised manner. Topic modeling-based prediction and representation are made based on daily news data between 1990 and 2015 from The Wall Street Journal against VIX index data from CBOE.

Keywords: early warning system, knowledge management, market prediction, topic modeling.

Procedia PDF Downloads 322
23864 The Role of Synthetic Data in Aerial Object Detection

Authors: Ava Dodd, Jonathan Adams

Abstract:

The purpose of this study is to explore the characteristics of developing a machine learning application using synthetic data. The study is structured to develop the application for the purpose of deploying the computer vision model. The findings discuss the realities of attempting to develop a computer vision model for practical purpose, and detail the processes, tools, and techniques that were used to meet accuracy requirements. The research reveals that synthetic data represents another variable that can be adjusted to improve the performance of a computer vision model. Further, a suite of tools and tuning recommendations are provided.

Keywords: computer vision, machine learning, synthetic data, YOLOv4

Procedia PDF Downloads 208
23863 Perception-Oriented Model Driven Development for Designing Data Acquisition Process in Wireless Sensor Networks

Authors: K. Indra Gandhi

Abstract:

Wireless Sensor Networks (WSNs) have always been characterized for application-specific sensing, relaying and collection of information for further analysis. However, software development was not considered as a separate entity in this process of data collection which has posed severe limitations on the software development for WSN. Software development for WSN is a complex process since the components involved are data-driven, network-driven and application-driven in nature. This implies that there is a tremendous need for the separation of concern from the software development perspective. A layered approach for developing data acquisition design based on Model Driven Development (MDD) has been proposed as the sensed data collection process itself varies depending upon the application taken into consideration. This work focuses on the layered view of the data acquisition process so as to ease the software point of development. A metamodel has been proposed that enables reusability and realization of the software development as an adaptable component for WSN systems. Further, observing users perception indicates that proposed model helps in improving the programmer's productivity by realizing the collaborative system involved.

Keywords: data acquisition, model-driven development, separation of concern, wireless sensor networks

Procedia PDF Downloads 418
23862 Comparative Analysis of Data Gathering Protocols with Multiple Mobile Elements for Wireless Sensor Network

Authors: Bhat Geetalaxmi Jairam, D. V. Ashoka

Abstract:

Wireless Sensor Networks are used in many applications to collect sensed data from different sources. Sensed data has to be delivered through sensors wireless interface using multi-hop communication towards the sink. The data collection in wireless sensor networks consumes energy. Energy consumption is the major constraints in WSN .Reducing the energy consumption while increasing the amount of generated data is a great challenge. In this paper, we have implemented two data gathering protocols with multiple mobile sinks/elements to collect data from sensor nodes. First, is Energy-Efficient Data Gathering with Tour Length-Constrained Mobile Elements in Wireless Sensor Networks (EEDG), in which mobile sinks uses vehicle routing protocol to collect data. Second is An Intelligent Agent-based Routing Structure for Mobile Sinks in WSNs (IAR), in which mobile sinks uses prim’s algorithm to collect data. Authors have implemented concepts which are common to both protocols like deployment of mobile sinks, generating visiting schedule, collecting data from the cluster member. Authors have compared the performance of both protocols by taking statistics based on performance parameters like Delay, Packet Drop, Packet Delivery Ratio, Energy Available, Control Overhead. Authors have concluded this paper by proving EEDG is more efficient than IAR protocol but with few limitations which include unaddressed issues likes Redundancy removal, Idle listening, Mobile Sink’s pause/wait state at the node. In future work, we plan to concentrate more on these limitations to avail a new energy efficient protocol which will help in improving the life time of the WSN.

Keywords: aggregation, consumption, data gathering, efficiency

Procedia PDF Downloads 476
23861 Status and Results from EXO-200

Authors: Ryan Maclellan

Abstract:

EXO-200 has provided one of the most sensitive searches for neutrinoless double-beta decay utilizing 175 kg of enriched liquid xenon in an ultra-low background time projection chamber. This detector has demonstrated excellent energy resolution and background rejection capabilities. Using the first two years of data, EXO-200 has set a limit of 1.1x10^25 years at 90% C.L. on the neutrinoless double-beta decay half-life of Xe-136. The experiment has experienced a brief hiatus in data taking during a temporary shutdown of its host facility: the Waste Isolation Pilot Plant. EXO-200 expects to resume data taking in earnest this fall with upgraded detector electronics. Results from the analysis of EXO-200 data and an update on the current status of EXO-200 will be presented.

Keywords: double-beta, Majorana, neutrino, neutrinoless

Procedia PDF Downloads 398
23860 Remaining Useful Life (RUL) Assessment Using Progressive Bearing Degradation Data and ANN Model

Authors: Amit R. Bhende, G. K. Awari

Abstract:

Remaining useful life (RUL) prediction is one of key technologies to realize prognostics and health management that is being widely applied in many industrial systems to ensure high system availability over their life cycles. The present work proposes a data-driven method of RUL prediction based on multiple health state assessment for rolling element bearings. Bearing degradation data at three different conditions from run to failure is used. A RUL prediction model is separately built in each condition. Feed forward back propagation neural network models are developed for prediction modeling.

Keywords: bearing degradation data, remaining useful life (RUL), back propagation, prognosis

Procedia PDF Downloads 417
23859 Spatio-Temporal Data Mining with Association Rules for Lake Van

Authors: Tolga Aydin, M. Fatih Alaeddinoğlu

Abstract:

People, throughout the history, have made estimates and inferences about the future by using their past experiences. Developing information technologies and the improvements in the database management systems make it possible to extract useful information from knowledge in hand for the strategic decisions. Therefore, different methods have been developed. Data mining by association rules learning is one of such methods. Apriori algorithm, one of the well-known association rules learning algorithms, is not commonly used in spatio-temporal data sets. However, it is possible to embed time and space features into the data sets and make Apriori algorithm a suitable data mining technique for learning spatio-temporal association rules. Lake Van, the largest lake of Turkey, is a closed basin. This feature causes the volume of the lake to increase or decrease as a result of change in water amount it holds. In this study, evaporation, humidity, lake altitude, amount of rainfall and temperature parameters recorded in Lake Van region throughout the years are used by the Apriori algorithm and a spatio-temporal data mining application is developed to identify overflows and newly-formed soil regions (underflows) occurring in the coastal parts of Lake Van. Identifying possible reasons of overflows and underflows may be used to alert the experts to take precautions and make the necessary investments.

Keywords: apriori algorithm, association rules, data mining, spatio-temporal data

Procedia PDF Downloads 356
23858 Building Data Infrastructure for Public Use and Informed Decision Making in Developing Countries-Nigeria

Authors: Busayo Fashoto, Abdulhakeem Shaibu, Justice Agbadu, Samuel Aiyeoribe

Abstract:

Data has gone from just rows and columns to being an infrastructure itself. The traditional medium of data infrastructure has been managed by individuals in different industries and saved on personal work tools; one of such is the laptop. This hinders data sharing and Sustainable Development Goal (SDG) 9 for infrastructure sustainability across all countries and regions. However, there has been a constant demand for data across different agencies and ministries by investors and decision-makers. The rapid development and adoption of open-source technologies that promote the collection and processing of data in new ways and in ever-increasing volumes are creating new data infrastructure in sectors such as lands and health, among others. This paper examines the process of developing data infrastructure and, by extension, a data portal to provide baseline data for sustainable development and decision making in Nigeria. This paper employs the FAIR principle (Findable, Accessible, Interoperable, and Reusable) of data management using open-source technology tools to develop data portals for public use. eHealth Africa, an organization that uses technology to drive public health interventions in Nigeria, developed a data portal which is a typical data infrastructure that serves as a repository for various datasets on administrative boundaries, points of interest, settlements, social infrastructure, amenities, and others. This portal makes it possible for users to have access to datasets of interest at any point in time at no cost. A skeletal infrastructure of this data portal encompasses the use of open-source technology such as Postgres database, GeoServer, GeoNetwork, and CKan. These tools made the infrastructure sustainable, thus promoting the achievement of SDG 9 (Industries, Innovation, and Infrastructure). As of 6th August 2021, a wider cross-section of 8192 users had been created, 2262 datasets had been downloaded, and 817 maps had been created from the platform. This paper shows the use of rapid development and adoption of technologies that facilitates data collection, processing, and publishing in new ways and in ever-increasing volumes. In addition, the paper is explicit on new data infrastructure in sectors such as health, social amenities, and agriculture. Furthermore, this paper reveals the importance of cross-sectional data infrastructures for planning and decision making, which in turn can form a central data repository for sustainable development across developing countries.

Keywords: data portal, data infrastructure, open source, sustainability

Procedia PDF Downloads 77
23857 Process Data-Driven Representation of Abnormalities for Efficient Process Control

Authors: Hyun-Woo Cho

Abstract:

Unexpected operational events or abnormalities of industrial processes have a serious impact on the quality of final product of interest. In terms of statistical process control, fault detection and diagnosis of processes is one of the essential tasks needed to run the process safely. In this work, nonlinear representation of process measurement data is presented and evaluated using a simulation process. The effect of using different representation methods on the diagnosis performance is tested in terms of computational efficiency and data handling. The results have shown that the nonlinear representation technique produced more reliable diagnosis results and outperforms linear methods. The use of data filtering step improved computational speed and diagnosis performance for test data sets. The presented scheme is different from existing ones in that it attempts to extract the fault pattern in the reduced space, not in the original process variable space. Thus this scheme helps to reduce the sensitivity of empirical models to noise.

Keywords: fault diagnosis, nonlinear technique, process data, reduced spaces

Procedia PDF Downloads 231
23856 Text-to-Speech in Azerbaijani Language via Transfer Learning in a Low Resource Environment

Authors: Dzhavidan Zeinalov, Bugra Sen, Firangiz Aslanova

Abstract:

Most text-to-speech models cannot operate well in low-resource languages and require a great amount of high-quality training data to be considered good enough. Yet, with the improvements made in ASR systems, it is now much easier than ever to collect data for the design of custom text-to-speech models. In this work, our work on using the ASR model to collect data to build a viable text-to-speech system for one of the leading financial institutions of Azerbaijan will be outlined. NVIDIA’s implementation of the Tacotron 2 model was utilized along with the HiFiGAN vocoder. As for the training, the model was first trained with high-quality audio data collected from the Internet, then fine-tuned on the bank’s single speaker call center data. The results were then evaluated by 50 different listeners and got a mean opinion score of 4.17, displaying that our method is indeed viable. With this, we have successfully designed the first text-to-speech model in Azerbaijani and publicly shared 12 hours of audiobook data for everyone to use.

Keywords: Azerbaijani language, HiFiGAN, Tacotron 2, text-to-speech, transfer learning, whisper

Procedia PDF Downloads 22
23855 An Empirical Evaluation of Performance of Machine Learning Techniques on Imbalanced Software Quality Data

Authors: Ruchika Malhotra, Megha Khanna

Abstract:

The development of change prediction models can help the software practitioners in planning testing and inspection resources at early phases of software development. However, a major challenge faced during the training process of any classification model is the imbalanced nature of the software quality data. A data with very few minority outcome categories leads to inefficient learning process and a classification model developed from the imbalanced data generally does not predict these minority categories correctly. Thus, for a given dataset, a minority of classes may be change prone whereas a majority of classes may be non-change prone. This study explores various alternatives for adeptly handling the imbalanced software quality data using different sampling methods and effective MetaCost learners. The study also analyzes and justifies the use of different performance metrics while dealing with the imbalanced data. In order to empirically validate different alternatives, the study uses change data from three application packages of open-source Android data set and evaluates the performance of six different machine learning techniques. The results of the study indicate extensive improvement in the performance of the classification models when using resampling method and robust performance measures.

Keywords: change proneness, empirical validation, imbalanced learning, machine learning techniques, object-oriented metrics

Procedia PDF Downloads 403