Search results for: consumer data right
24705 Road Accidents Bigdata Mining and Visualization Using Support Vector Machines
Authors: Usha Lokala, Srinivas Nowduri, Prabhakar K. Sharma
Abstract:
Useful information has been extracted from road accident data in the United Kingdom (UK) using data analytics methods, with the aim of avoiding possible accidents in rural and urban areas. This analysis makes use of several methodologies such as data integration, support vector machines (SVM), correlation machines and multinomial goodness. The datasets were obtained from the UK traffic department with due permission. The information extracted from these huge datasets forms a basis for several predictions, which in turn avoid unnecessary memory lapses. Since the data is expected to grow continuously over time, this work primarily proposes a new framework model which can be trained, adapt itself to new data and make accurate predictions. This work also throws some light on the use of SVM methodology for text classifiers built from the obtained traffic data. Finally, it emphasizes the uniqueness and adaptability of SVM methodology for this kind of research work.
Keywords: support vector mechanism (SVM), machine learning (ML), support vector machines (SVM), department of transportation (DFT)
Procedia PDF Downloads 274
24704 A Relational Data Base for Radiation Therapy
Authors: Raffaele Danilo Esposito, Domingo Planes Meseguer, Maria Del Pilar Dorado Rodriguez
Abstract:
As far as we know, there is still no commercial solution that allows the huge amount of data generated in a modern Radiation Oncology Department to be managed in an open way that is configurable to user needs. Currently available information management systems are mainly focused on Record & Verify and clinical data, and only to a small extent on physical data. This results in a partial and limited use of the available information. In the present work we describe the implementation at our department of a centralized information management system based on a web server. Our system manages both information generated during patient planning and treatment and information of general interest for the whole department (e.g. treatment protocols, quality assurance protocols, etc.). Our objective is to be able to analyze all the available data in a simple and efficient way and thus to obtain quantitative evaluations of our treatments. This would allow us to improve our workflow and protocols. To this end we have implemented a relational database which allows us to use all the available information in a practical and efficient way. As always, we use only license-free software.
Keywords: information management system, radiation oncology, medical physics, free software
Procedia PDF Downloads 241
24703 A Study of Safety of Data Storage Devices of Graduate Students at Suan Sunandha Rajabhat University
Authors: Komol Phaisarn, Natcha Wattanaprapa
Abstract:
This survey research aims to study the safety of data storage devices of graduate students in academic year 2013 at Suan Sunandha Rajabhat University. Data were collected by a questionnaire on the safety of data storage devices according to the CIA principle. A sample of 81 was drawn from the population by purposive sampling. The results show that most of the graduate students of academic year 2013 at Suan Sunandha Rajabhat University use handy drives to store their data and that the safety level of the devices is good.
Keywords: security, safety, storage devices, graduate students
Procedia PDF Downloads 353
24702 Simulation of a Cost Model Response Requests for Replication in Data Grid Environment
Authors: Kaddi Mohammed, A. Benatiallah, D. Benatiallah
Abstract:
Data grid is a technology that raises new challenges, such as the heterogeneity and availability of various geographically distributed resources, fast data access, latency minimization and fault tolerance. Researchers interested in this technology address problems of the various systems related to the industry, such as task scheduling, load balancing and replication. The latter is an effective solution for achieving good performance in terms of data access and grid resource usage, and better data availability. In a system with replication, a coherence protocol is used to impose some degree of synchronization between the various copies and some order on updates. In this project, we present an approach for placing replicas to minimize the response cost of read and write requests, and we implement our model in a simulation environment. The placement techniques are based on a cost model which depends on several factors, such as bandwidth, data size and storage nodes.
Keywords: response time, query, consistency, bandwidth, storage capacity, CERN
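To make the idea of cost-based replica placement concrete, here is a minimal sketch. It is not the authors' model: the cost formula (data size divided by link bandwidth), the rule that reads go to the cheapest replica while writes must reach every replica, and all names (`transfer_cost`, `placement_cost`, `best_placement`) are assumptions chosen purely for illustration.

```python
from itertools import combinations


def transfer_cost(size_mb, bandwidth_mbps):
    """Hypothetical cost of moving one data item over one link."""
    return size_mb / bandwidth_mbps


def placement_cost(replicas, bandwidth, reads, writes, size_mb):
    """Total request cost for a given set of replica nodes.

    Assumed model: a read is served by the cheapest replica, a write
    must be propagated to every replica, and local access costs nothing.
    """
    total = 0.0
    for node in range(len(bandwidth)):
        link_costs = [
            0.0 if r == node else transfer_cost(size_mb, bandwidth[node][r])
            for r in replicas
        ]
        total += reads[node] * min(link_costs)
        total += writes[node] * sum(link_costs)
    return total


def best_placement(bandwidth, reads, writes, size_mb, k):
    """Exhaustively search all k-node placements (fine for small grids)."""
    nodes = range(len(bandwidth))
    return min(
        combinations(nodes, k),
        key=lambda reps: placement_cost(reps, bandwidth, reads, writes, size_mb),
    )
```

The exhaustive search is only practical for a handful of storage nodes; a simulator for a realistic grid would likely need a greedy or heuristic search instead.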
Procedia PDF Downloads 271
24701 Prompt Design for Code Generation in Data Analysis Using Large Language Models
Authors: Lu Song Ma Li Zhi
Abstract:
With the rapid advancement of artificial intelligence technology, large language models (LLMs) have become a milestone in the field of natural language processing, demonstrating remarkable capabilities in semantic understanding, intelligent question answering, and text generation. These models are gradually penetrating various industries, particularly showcasing significant application potential in the data analysis domain. However, retraining or fine-tuning these models requires substantial computational resources and ample downstream task datasets, which poses a significant challenge for many enterprises and research institutions. Without modifying the internal parameters of the large models, prompt engineering techniques can rapidly adapt these models to new domains. This paper proposes a prompt design strategy aimed at leveraging the capabilities of large language models to automate the generation of data analysis code. By carefully designing prompts, data analysis requirements can be described in natural language, which the large language model can then understand and convert into executable data analysis code, thereby greatly enhancing the efficiency and convenience of data analysis. This strategy not only lowers the threshold for using large models but also significantly improves the accuracy and efficiency of data analysis. Our approach includes requirements for the precision of natural language descriptions, coverage of diverse data analysis needs, and mechanisms for immediate feedback and adjustment. Experimental results show that with this prompt design strategy, large language models perform exceptionally well in multiple data analysis tasks, generating high-quality code and significantly shortening the data analysis cycle. 
This method provides an efficient and convenient tool for the data analysis field and demonstrates the enormous potential of large language models in practical applications.
Keywords: large language models, prompt design, data analysis, code generation
Procedia PDF Downloads 39
24700 Comparison of Different Methods to Produce Fuzzy Tolerance Relations for Rainfall Data Classification in the Region of Central Greece
Authors: N. Samarinas, C. Evangelides, C. Vrekos
Abstract:
The aim of this paper is to compare three different methods for producing fuzzy tolerance relations for rainfall data classification. More specifically, the three methods are the correlation coefficient, cosine amplitude and max-min methods. The data were obtained from seven rainfall stations in the region of central Greece and refer to 20-year time series of average monthly rainfall height. The three methods were used to express these data as a fuzzy relation. For all three methods, the resulting fuzzy tolerance relation is transformed into an equivalence relation by max-min composition. From the equivalence relation, the rainfall stations were categorized and classified according to the degree of confidence. The classification shows the similarities among the rainfall stations. Stations with high similarity can be used interchangeably in water resource management scenarios, or to augment data from one station with data from another. Due to the complexity of the calculations, it is important to find out which of the methods is computationally simpler and needs fewer compositions in order to give reliable results.
Keywords: classification, fuzzy logic, tolerance relations, rainfall data
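The pipeline described above (similarity measure, max-min composition, then classification by degree of confidence) can be sketched as follows. This is an illustrative sketch rather than the authors' code: it shows only the cosine amplitude method, the function names are hypothetical, and it assumes every station's series is non-zero.

```python
import math


def cosine_amplitude(data):
    """Fuzzy tolerance relation from rows of data (one row per station)."""
    n = len(data)
    rel = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a station is fully similar to itself
            dot = sum(a * b for a, b in zip(data[i], data[j]))
            norm = math.sqrt(
                sum(a * a for a in data[i]) * sum(b * b for b in data[j])
            )
            rel[i][j] = min(1.0, abs(dot) / norm)  # clamp rounding error
    return rel


def max_min_compose(r, s):
    """Max-min composition of two square fuzzy relations."""
    n = len(r)
    return [
        [max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def to_equivalence(rel):
    """Compose the relation with itself until it stops changing.

    Returns the transitive (equivalence) relation and the number of
    compositions that changed it, a rough measure of effort.
    """
    compositions = 0
    while True:
        nxt = max_min_compose(rel, rel)
        if nxt == rel:
            return rel, compositions
        rel = nxt
        compositions += 1


def lambda_cut_classes(rel, lam):
    """Group items whose membership is at least the confidence level lam."""
    classes, assigned = [], set()
    for i in range(len(rel)):
        if i in assigned:
            continue
        group = [j for j in range(len(rel)) if rel[i][j] >= lam]
        assigned.update(group)
        classes.append(group)
    return classes
```

Counting compositions in `to_equivalence` gives one simple way to compare how much work each similarity method needs before the relation becomes transitive.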
Procedia PDF Downloads 314
24699 Using Axiomatic Design for Developing a Framework of Manufacturing Cloud Service Composition in the Equilibrium State
Authors: Ehsan Vaziri Goodarzi, Mahmood Houshmand, Omid Fatahi Valilai, Vahidreza Ghezavati, Shahrooz Bamdad
Abstract:
One important paradigm of Industry 4.0 is Cloud Manufacturing (CM). In CM everything is considered a service; therefore, the CM platform should consider all service providers' capabilities and try to integrate services in an equilibrium state. This research develops a framework for implementing manufacturing cloud service composition in the equilibrium state. The developed framework uses two well-known tools, axiomatic design (AD) and game theory. The research has investigated the factors that form equilibrium for measures of the manufacturing cloud service composition. Functional requirements (FRs) represent the measures of manufacturing cloud service composition in the equilibrium state. These FRs are satisfied by related Design Parameters (DPs). The FRs and DPs are defined by considering game theory, QoS, consumer needs, and parallel and cooperative services. Ultimately, four FRs and DPs represent the framework. To ensure the validity of the framework, the authors have used the first axiom of AD, the independence axiom.
Keywords: axiomatic design, manufacturing cloud service composition, cloud manufacturing, industry 4.0
Procedia PDF Downloads 173
24698 Customer Satisfaction and Effective HRM Policies: Customer and Employee Satisfaction
Authors: S. Anastasiou, C. Nathanailides
Abstract:
The purpose of this study is to examine the possible link between employee and customer satisfaction. The service provided by employees helps to build a good relationship with customers and can help increase their loyalty. Published data on job satisfaction and indicators of customer service were gathered from relevant published works, which included data from five different countries. The reviewed data indicate a significant correlation between indicators of customer and employee satisfaction in the banking sector (Pearson correlation R2 = 0.52, P < 0.05). The reviewed data thus provide some practical evidence linking these two parameters.
Keywords: job satisfaction, job performance, customer service, banks, human resources management
Procedia PDF Downloads 321
24697 Generation of Automated Alarms for Plantwide Process Monitoring
Authors: Hyun-Woo Cho
Abstract:
Earlier detection of incipient abnormal operations in plant-wide process management is necessary in order to improve product quality and process safety. Generating warning signals or alarms for operating personnel plays an important role in process automation and intelligent plant health monitoring. Various methodologies have been developed and used in this area, such as expert systems, mathematical model-based approaches and multivariate statistical approaches. This work presents a nonlinear empirical monitoring methodology based on the real-time analysis of massive process data. Unfortunately, such big data includes measurement noise and unwanted variations unrelated to true process behavior. These unnecessary patterns are therefore eliminated in the data processing step to enhance detection speed and accuracy. The performance of the methodology was demonstrated using simulated process data. The case study showed that detection speed and performance were improved significantly, irrespective of the size and location of abnormal events.
Keywords: detection, monitoring, process data, noise
Procedia PDF Downloads 252
24696 The Search of New Laws for a Gluten Kingdom
Authors: Mohammed Saleem Tariq
Abstract:
The enthusiasm for gluten avoidance in a growing market is met by improvements in sensitive detection methods for analysing gluten content. Paradoxically, manufacturers employ no such systems in the production process but continue to market their products as gluten free, posing a significant risk to an undetermined coeliac population. The paper relates an immunological response that causes gastrointestinal scarring and villous atrophy to the conventional description of personal injury. It is argued, however, that the current developing regime in the UK has avoided creating specific rules to provide an adequate level of protection for this type of vulnerable ‘characteristic’. Because of the difficulty of identifying an appropriate cause of action, this paper analyses whether a claim brought in misrepresentation, in negligence and/or under the Consumer Protection Act 1987 could be sustained. A necessary comparison is then made with the approach adopted by the Americans with Disabilities Act 1990, which recognises this chronic disease as a disability. The ongoing failure to introduce a level of protection which matches that afforded to those who fall into any one of the ‘protected characteristics’ under the Equality Act 2010 is inconceivable given the outstanding level of legal vulnerability.
Keywords: coeliac, litigation, misrepresentation, negligence
Procedia PDF Downloads 365
24695 Meanings and Concepts of Standardization in Systems Medicine
Authors: Imme Petersen, Wiebke Sick, Regine Kollek
Abstract:
In systems medicine, high-throughput technologies produce large amounts of data on different biological and pathological processes, including (disturbed) gene expressions, metabolic pathways and signaling. The large volume of data of different types, stored in separate databases and often located at different geographical sites, has posed new challenges regarding data handling and processing. Tools based on bioinformatics have been developed to resolve the upcoming problems of systematizing, standardizing and integrating the various data. However, the heterogeneity of data gathered at different levels of biological complexity is still a major challenge in data analysis. To build multilayer disease modules, large and heterogeneous data of disease-related information (e.g., genotype, phenotype, environmental factors) are correlated. Therefore, a great deal of attention in systems medicine has been put on data standardization, primarily to retrieve and combine large, heterogeneous datasets into standardized and incorporated forms and structures. However, this data-centred concept of standardization in systems medicine is contrary to the debate in science and technology studies (STS) on standardization, which rather emphasizes the dynamics, contexts and negotiations of standard operating procedures. Based on empirical work on research consortia that explore the molecular profile of diseases to establish systems medical approaches in the clinic in Germany, we trace how standardized data are processed and shaped by bioinformatics tools, how scientists using such data in research perceive such standard operating procedures, and which consequences for knowledge production (e.g. modeling) arise from them. Hence, different concepts and meanings of standardization are explored to get a deeper insight into standard operating procedures not only in systems medicine, but also beyond.
Keywords: data, science and technology studies (STS), standardization, systems medicine
Procedia PDF Downloads 341
24694 Clean Energy and Free Trade: Redefining 'Like Products' to Account for Climate Change
Authors: M. Barsa
Abstract:
This paper argues that current jurisprudence under the Dormant Commerce Clause of the United States Constitution and the WTO should be altered to allow states to more freely foster clean energy production. In particular, free trade regimes typically prevent states from discriminating against 'like' products, and whether these products are considered 'like' is typically measured by how they appear to the consumer. This makes it challenging for states to discriminate in favor of clean energy, such as low-carbon fuels. However, this paper points out that certain courts in the US, and decisions of the WTO, have already begun taking into account how a product is manufactured in order to determine whether a state may discriminate against it. There are also compelling reasons for states to discriminate against energy sources with high carbon footprints in order to allow those states to protect themselves against climate change. In other words, fuel sources with high and low carbon footprints are not, in fact, 'like' products, and courts should more freely recognize this in order to foster clean energy production.
Keywords: clean energy, climate change, discrimination, free trade
Procedia PDF Downloads 121
24693 Integrated On-Board Diagnostic-II and Direct Controller Area Network Access for Vehicle Monitoring System
Authors: Kavian Khosravinia, Mohd Khair Hassan, Ribhan Zafira Abdul Rahman, Syed Abdul Rahman Al-Haddad
Abstract:
The CAN (controller area network) bus is introduced as a multi-master, message broadcast system. The messages sent on the CAN bus are used to communicate state information, referred to as signals, between different ECUs, which provides data consistency in every node of the system. OBD-II dongles, which are based on a request-response method, are the most widespread solution among researchers for extracting sensor data from cars. Unfortunately, most past research does not consider the resolution and quantity of the input data extracted through OBD-II technology. With the well-known ELM327 OBD-II dongle, the maximum feasible scan rate is only 9 queries per second, which provides 8 data points per second. This study aims to design and develop a programmable, latency-sensitive vehicle data acquisition system that improves modularity and flexibility in order to extract exact, trustworthy, and fresh car sensor data at higher frequency rates. Furthermore, the researcher must break apart, thoroughly inspect, and observe the internal network of the vehicle, which may cause severe damage to the expensive ECUs of the vehicle due to intrinsic vulnerabilities of the CAN bus during initial research. The desired sensor data were collected from various vehicles using a Raspberry Pi3 as the computing and processing unit, with the OBD (request-response) and direct CAN methods used at the same time. Two types of data were collected for this study. The first is CAN bus frame data, which comprises each line of hex data sent from an ECU; the second is OBD data, which represents a limited set of data requested from the ECU under standard conditions. The proposed system is a reconfigurable, human-readable and multi-task telematics device that can be fitted into any vehicle with minimum effort and minimum time lag in the data extraction process. An experimental vehicle network test bench with a standard operational procedure has been developed and can be used for future vehicle network testing experiments.
Keywords: CAN bus, OBD-II, vehicle data acquisition, connected cars, telemetry, Raspberry Pi3
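To illustrate what the OBD request-response data looks like once it reaches the processing unit, the sketch below decodes two standard mode 01 responses as an ELM327-style dongle typically prints them (space-separated hex bytes). It is an assumption-laden illustration, not the authors' software: the function name `decode_obd_response` and the returned labels are hypothetical, and only the engine RPM (PID 0x0C) and vehicle speed (PID 0x0D) formulas from the standard OBD-II PID table are covered.

```python
def decode_obd_response(line):
    """Decode one mode 01 OBD-II response such as '41 0C 1A F8'.

    Byte 0 is 0x41 (the reply to a mode 01 request), byte 1 echoes the
    requested PID, and the remaining bytes carry the value.
    """
    data = bytes.fromhex(line.replace(" ", ""))
    if len(data) < 3 or data[0] != 0x41:
        raise ValueError(f"not a mode 01 response: {line!r}")

    pid, payload = data[1], data[2:]
    if pid == 0x0C:
        # Engine speed: two bytes A and B, in quarter-rpm units.
        if len(payload) < 2:
            raise ValueError("engine RPM needs two data bytes")
        return "engine_rpm", (256 * payload[0] + payload[1]) / 4
    if pid == 0x0D:
        # Vehicle speed: one byte, already in km/h.
        return "vehicle_speed_kmh", payload[0]
    raise ValueError(f"PID 0x{pid:02X} is not handled in this sketch")
```

Each value costs one request and one reply, which is why the query rate of the dongle bounds the number of data points per second; reading frames directly from the CAN bus avoids that round trip.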
Procedia PDF Downloads 203
24692 Big Data in Construction Project Management: The Colombian Northeast Case
Authors: Sergio Zabala-Vargas, Miguel Jiménez-Barrera, Luz Vargas-Sánchez
Abstract:
In recent years, information related to project management in organizations has been increasing exponentially. Performance data, management statistics and indicator results have made collection, analysis, traceability, and dissemination essential for project managers. In this sense, there are current trends toward using emerging technologies, such as Machine Learning, Data Analytics, Data Mining, and Big Data, to facilitate efficient decision-making in projects. The last of these is the main interest of this project. This research is part of the thematic line Construction methods and project management. Many authors point to the relevance that emerging technologies, such as Big Data, have gained in recent years in project management in the construction sector. The main focus is the optimization of time, scope and budget and, in general, the mitigation of risks. This research was carried out in the northeastern region of Colombia, South America. The first phase was aimed at diagnosing the use of emerging technologies (Big Data) in the construction sector. In Colombia, the construction sector represents more than 50% of the productive system, and more than 2 million people participate in this economic segment. A quantitative approach was used. A survey was applied to a sample of 91 companies in the construction sector. Preliminary results indicate that the use of Big Data and other emerging technologies is very low and that there is interest in modernizing project management. There is evidence of a correlation between the interest in using new data management technologies and the incorporation of Building Information Modeling (BIM). The next phase of the research will allow the generation of guidelines and strategies for the incorporation of technological tools in the construction sector in Colombia.
Keywords: big data, building information modeling, technology, project management
Procedia PDF Downloads 128
24691 Trade Policy Incentives and Economic Growth in Nigeria
Authors: Emmanuel Dele Balogun
Abstract:
This paper uses descriptive statistics and econometric analysis of data spanning the period 1981 to 2014 to gauge the effects of trade policy incentives on economic growth in Nigeria. It argues that the incentives provided penalized economic growth during the pre-liberalization eras but stimulated a rapid increase in total factor productivity during the post-liberalization period of 2000 to 2014. The trend analysis shows that Nigeria maintained high tariff walls in the economic regulation eras, which became low in the post-liberalization era. The protections favored infant industries, which were mainly appendages of multinationals, and worked against imports of competing food and finished consumer products. The trade openness index confirms the undue exposure of Nigeria’s economy to the vagaries of international market shocks, while banking sector recapitalization and new listings of telecommunications companies deepened the financial markets in the post-liberalization era. The structure of economic incentives was biased in favor of construction, trade and services, but against the real sector despite protectionist policies. Total Factor Productivity (TFP) estimates show that the Nigerian economy suffered stagnation in the pre-liberalization eras but experienced rapid growth rates in the post-liberalization eras. The regression results relating trade policy incentives to the TFP growth rate yielded a significant but negative intercept, suggesting that a non-interventionist policy could be detrimental to economic progress, while a protective tariff which limits imports of competing products could spur productivity gains in domestic import substitutes beyond factor growth with market liberalization. The main constraint on the effectiveness of trade policy incentives is the failure of benefiting industries to leverage the domestic factor endowments of the nation. This paper concludes that there is a need to review the current economic transformation strategies urgently, with a view to providing policymakers with a better understanding of the most viable options that could make for rapid success.
Keywords: economic growth, macroeconomic incentives, total factor productivity, trade policies
Procedia PDF Downloads 322
24690 Minimum Data of a Speech Signal as Special Indicators of Identification in Phonoscopy
Authors: Nazaket Gazieva
Abstract:
Voice biometric data associated with physiological, psychological and other factors are widely used in forensic phonoscopy. There are various methods for identifying and verifying a person by voice. This article explores the minimum speech signal data as individual parameters of a speech signal. Monozygotic twins are believed to be genetically identical. Using the minimum data of the speech signal, we came to the conclusion that the voice imprint of monozygotic twins is individual. From the experiment, we conclude that the minimum indicators of the speech signal are more stable and reliable for phonoscopic examinations.
Keywords: phonogram, speech signal, temporal characteristics, fundamental frequency, biometric fingerprints
Procedia PDF Downloads 144
24689 Disaggregating and Forecasting the Total Energy Consumption of a Building: A Case Study of a High Cooling Demand Facility
Authors: Juliana Barcelos Cordeiro, Khashayar Mahani, Farbod Farzan, Mohsen A. Jafari
Abstract:
Energy disaggregation has been a focus of many energy companies, since energy efficiency can be achieved when the breakdown of energy consumption is known. Companies have been investing in technologies to come up with software and/or hardware solutions that can provide this type of information to the consumer. On the other hand, not everyone can afford these technologies. Therefore, in this paper, we present a methodology for breaking down the aggregate consumption and identifying the high-demand end-use profiles. These energy profiles will be used to build the forecast model for optimal control purposes. A facility with a high cooling load is used as an illustrative case study to demonstrate the results of the proposed methodology. We apply a high-level energy disaggregation through a pattern recognition approach in order to extract the consumption profile of its rooftop packaged units (RTUs) and present a forecast model for the energy consumption.
Keywords: energy consumption forecasting, energy efficiency, load disaggregation, pattern recognition approach
Procedia PDF Downloads 277
24688 Voltage and Current Control of Microgrid in Grid Connected and Islanded Modes
Authors: Megha Chavda, Parth Thummar, Rahul Ghetia
Abstract:
This paper presents the voltage and current control of a microgrid, accompanied by the synchronization of the microgrid with the main utility grid, in both islanded and grid-connected modes. Distributed Energy Resources (DERs) satisfy the widespread power demand of consumers by behaving as micro sources for a low voltage (LV) grid or microgrid. Synchronization of the microgrid with the main utility grid is done using a phase locked loop (PLL), and a pulse width modulation (PWM) gate pulse generation technique is used for the Voltage Source Converter. The Potential Function method achieves the voltage and current control of this microgrid in both islanded and grid-connected modes. A low voltage grid consisting of three distributed generators (DG) is considered for the study and is simulated in the time domain using PSCAD/EMTDC software. The simulation results show the appropriateness of the voltage and current control of the microgrid and of its synchronization with the medium voltage (MV) grid.
Keywords: microgrid, distributed energy resources, voltage and current control, voltage source converter, pulse width modulation, phase locked loop
Procedia PDF Downloads 414
24687 A Non-parametric Clustering Approach for Multivariate Geostatistical Data
Authors: Francky Fouedjio
Abstract:
Multivariate geostatistical data have become omnipresent in the geosciences and pose substantial analysis challenges. One of them is the grouping of data locations into spatially contiguous clusters so that data locations within the same cluster are similar to each other while the clusters differ from each other, in some sense. Spatially contiguous clusters can significantly improve interpretation, since the resulting clusters can be read as meaningful geographical subregions. In this paper, we develop an agglomerative hierarchical clustering approach that takes into account the spatial dependency between observations. It relies on a dissimilarity matrix built from a non-parametric kernel estimator of the spatial dependence structure of the data. It integrates existing methods to find the optimal number of clusters and to evaluate the contribution of variables to the clustering. The capability of the proposed approach to provide spatially compact, connected and meaningful clusters is assessed using a bivariate synthetic dataset and a multivariate geochemical dataset. The proposed clustering method gives satisfactory results compared to other similar geostatistical clustering methods.
Keywords: clustering, geostatistics, multivariate data, non-parametric
Procedia PDF Downloads 477
24686 Power of Doubling: Population Growth and Resource Consumption
Authors: Sarika Bahadure
Abstract:
Sustainability starts with conserving resources for future generations. Humans have been consuming natural resources for as long as they have existed on earth. The pace of resource consumption in the past was very slow, but industrialization in the 18th century changed the human lifestyle. New inventions and discoveries shifted work from human labor to machines. The mass manufacture of goods provided easy access to products. In the last few decades, globalization and technological change have brought a consumer-oriented market. The consumption of resources has increased on a very large scale. This overconsumption pattern brought an economic boom and provided multiple opportunities, but it also put stress on natural resources. This paper tries to put forth the facts and figures of population growth and resource consumption, with examples. These are explained with the help of the mathematical expression of doubling known as exponential growth. The paper compares the carrying capacity of the earth with the resource consumption of humans, i.e. bio-capacity and ecological footprint. Further, it presents the need to conserve natural resources and to re-examine the sustainable resource use approach for sustainability.
Keywords: consumption, exponential growth, population, resources, sustainability
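As a reminder of the arithmetic behind "doubling", the standard exponential growth relation is sketched below. The symbols $N_0$ (initial quantity), $r$ (constant growth rate per year) and $T_d$ (doubling time) are notation introduced here for illustration, and the 2% rate is an arbitrary example, not a figure from the paper.

```latex
% Exponential growth at a constant rate r per year
N(t) = N_0 \, e^{r t}

% Doubling time: solve N(T_d) = 2 N_0
T_d = \frac{\ln 2}{r} \approx \frac{70}{r\,[\%]}

% Example: r = 2\% per year
T_d = \frac{\ln 2}{0.02} \approx 34.7 \text{ years}
```

Under this assumption, a quantity growing steadily at 2% per year doubles roughly every 35 years, so consumption over each doubling period roughly equals all consumption before it.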
Procedia PDF Downloads 229
24685 Big Data in Telecom Industry: Effective Predictive Techniques on Call Detail Records
Authors: Sara ElElimy, Samir Moustafa
Abstract:
Mobile network operators are starting to face many challenges in the digital era, especially with high demands from customers. Mobile network operators are a source of big data, and traditional techniques are not effective in the new era of big data, the Internet of Things (IoT) and 5G; as a result, handling different big datasets effectively becomes a vital task for operators as data continue to grow and networks move from long term evolution (LTE) to 5G. There is therefore an urgent need for effective big data analytics to predict future demands, traffic, and network performance in order to fulfill the requirements of the fifth generation of mobile network technology. In this paper, we introduce data science techniques using machine learning and deep learning algorithms: the autoregressive integrated moving average (ARIMA), Bayesian-based curve fitting, and recurrent neural network (RNN) are employed for a data-driven application to mobile network operators. The framework for each model comprises identification of the model's parameters, estimation, prediction, and a final data-driven application of the prediction to business and network performance. These models are applied to the Telecom Italia Big Data Challenge call detail record (CDR) datasets. Evaluating the models with well-known evaluation criteria shows that ARIMA (a machine learning-based model) is a more accurate predictive model on such a dataset than the RNN (a deep learning model).
Keywords: big data analytics, machine learning, CDRs, 5G
Procedia PDF Downloads 139
24684 A Data Mining Approach for Analysing and Predicting the Bank's Asset Liability Management Based on Basel III Norms
Authors: Nidhin Dani Abraham, T. K. Sri Shilpa
Abstract:
Asset liability management is an important aspect of the banking business. Moreover, today's banking is based on Basel III, which strictly regulates counterparty default. This paper focuses on the prediction and analysis of counterparty default risk, the risk that arises when customers fail to repay the amount owed to the lender (a bank or other financial institution). It proposes an approach to reduce counterparty risk in financial institutions using an appropriate data mining technique, and thus predicts the occurrence of non-performing assets (NPA). It also helps in asset building and restructuring quality. Liability management is equally important to the banking business. Understanding and analyzing the depth of a bank's liabilities requires a suitable technique. For this purpose, a data mining technique is used to predict the dormant behaviour of various deposit bank customers. Various models are implemented, and the results are analyzed for savings bank deposit customers. All the data are taken from the bank's data warehouse and cleaned using a data cleansing approach.
Keywords: data mining, asset liability management, BASEL III, banking
Procedia PDF Downloads 552
24683 Parallel Coordinates on a Spiral Surface for Visualizing High-Dimensional Data
Authors: Chris Suma, Yingcai Xiao
Abstract:
This paper presents Parallel Coordinates on a Spiral Surface (PCoSS), a parallel coordinate based interactive visualization method for high-dimensional data, and a test implementation of the method. Plots generated by the test system are compared with those generated by XDAT, a software tool implementing traditional parallel coordinates. Traditional parallel coordinate plots can be cluttered when the number of data points is large or when the dimensionality of the data is high. PCoSS plots display multivariate data on a 3D spiral surface and allow users to see the whole picture of high-dimensional data with less cluttering. Taking advantage of the 3D display environment in PCoSS, users can further reduce cluttering by zooming into an axis of interest for a closer view or by moving vantage points and by reorienting the viewing angle to obtain a desired view of the plots.
Keywords: human computer interaction, parallel coordinates, spiral surface, visualization
Procedia PDF Downloads 11
24682 A Dynamic Ensemble Learning Approach for Online Anomaly Detection in Alibaba Datacenters
Authors: Wanyi Zhu, Xia Ming, Huafeng Wang, Junda Chen, Lu Liu, Jiangwei Jiang, Guohua Liu
Abstract:
Anomaly detection is a first and imperative step needed to respond to unexpected problems and to assure high performance and security in large data center management. This paper presents an online anomaly detection system built on an innovative combination of ensemble machine learning and adaptive differentiation algorithms, and applies it to performance data collected from a continuous monitoring system for multi-tier web applications running in Alibaba data centers. We evaluate the effectiveness and efficiency of this algorithm with production traffic data and compare it with traditional anomaly detection approaches such as a static threshold and other deviation-based detection techniques. The experimental results show that our algorithm correctly identifies the unexpected performance variances of any running application, with an acceptable false positive rate. The proposed approach has already been deployed in real-time production environments to enhance the efficiency and stability of daily data center operations.
Keywords: Alibaba data centers, anomaly detection, big data computation, dynamic ensemble learning
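The two traditional baselines named in this abstract, a static threshold and a deviation-based detector, can be sketched as follows. This is only an illustration of the baselines, not the authors' ensemble method; the function names, the window size, the multiplier `k`, and the sample latency values are all assumptions made for the example.

```python
from collections import deque
from statistics import mean, stdev


def static_threshold(values, limit):
    """Flag the indices of samples that exceed a fixed limit."""
    return [i for i, v in enumerate(values) if v > limit]


def rolling_deviation(values, window=5, k=3.0):
    """Flag samples more than k standard deviations from the mean
    of the previous `window` samples."""
    history = deque(maxlen=window)
    flagged = []
    for i, v in enumerate(values):
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            if sigma > 0 and abs(v - mu) > k * sigma:
                flagged.append(i)
        history.append(v)
    return flagged


# Hypothetical response-time samples in milliseconds, with one spike.
latency_ms = [100, 102, 99, 101, 100, 103, 180, 101, 100, 102]
print(static_threshold(latency_ms, 150))  # [6]
print(rolling_deviation(latency_ms))      # [6]
```

A static threshold needs a limit chosen in advance for every metric, while the rolling detector adapts to the recent level of the series; both are simple reference points against which a learned detector can be compared.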
Procedia PDF Downloads 200
24681 Audio-Visual Co-Data Processing Pipeline
Authors: Rita Chattopadhyay, Vivek Anand Thoutam
Abstract:
Speech is the most widely accepted means of communication, letting us exchange feelings and thoughts quickly. Quite often, people can communicate orally but cannot interact or work with computers or devices. It is easier and quicker to give speech commands than to type commands, and likewise easier to listen to audio played from a device than to extract output from computers or devices. With robotics an emerging market with applications in warehouses, the hospitality industry, consumer electronics, assistive technology, etc., speech-based human-machine interaction is emerging as a lucrative feature for robot manufacturers. Considering this factor, the objective of this paper is to design the “Audio-Visual Co-Data Processing Pipeline.” This pipeline integrates automatic speech recognition, a natural language model for text understanding, object detection, and text-to-speech modules. There are many deep learning models for each of these module types, but OpenVINO Model Zoo models are used because the OpenVINO toolkit covers both computer vision and non-computer vision workloads across Intel hardware, maximizes performance, and accelerates application development. A speech command is given as input; it carries information about the target objects to be detected and the start and end times used to extract the required interval from the video. Speech is converted to text using the QuartzNet automatic speech recognition model. A summary is extracted from the text using the natural language model Generative Pre-Trained Transformer-3 (GPT-3). Based on the summary, essential frames are extracted from the video, and the You Only Look Once (YOLO) object detection model detects objects in these frames. The numbers of the frames that contain target objects (the objects specified in the speech command) are saved as text. Finally, this text (the frame numbers) is converted to speech using a text-to-speech model and played from the device. The project is developed for 80 YOLO labels, and the user can extract frames based on only one or two target labels. The pipeline can easily be extended to more than two target labels by making appropriate changes in the object detection module. The project supports four speech command formats, implemented by including sample examples in the prompt used by the GPT-3 model. Based on user preference, a new speech command format can be added by including some examples of that format in the prompt. The pipeline can be used in many projects such as human-machine interfaces, human-robot interaction, and surveillance through speech commands. All object detection projects can be upgraded using this pipeline so that one can give speech commands and the output is played from the device.
Keywords: OpenVINO, automatic speech recognition, natural language processing, object detection, text to speech
Procedia PDF Downloads 80
24680 Unsupervised Text Mining Approach to Early Warning System
Authors: Ichihan Tai, Bill Olson, Paul Blessner
Abstract:
Traditional early warning systems that raise alarms about crises are generally based on structured or numerical data; a system that can make predictions from unstructured textual data, an uncorrelated data source, is therefore a valuable complement to them. The Chicago Board Options Exchange (CBOE) Volatility Index (VIX), commonly referred to as the fear index, measures the cost of insurance against a market crash and spikes in the event of a crisis. In this study, news data is used to predict whether there will be a market-wide crisis by predicting the movement of the fear index, and historical references to similar events are presented in an unsupervised manner. Topic modeling-based prediction and representation are built on daily news data from The Wall Street Journal between 1990 and 2015, set against VIX index data from CBOE.
Keywords: early warning system, knowledge management, market prediction, topic modeling
Procedia PDF Downloads 338
24679 The Role of Synthetic Data in Aerial Object Detection
Authors: Ava Dodd, Jonathan Adams
Abstract:
The purpose of this study is to explore the characteristics of developing a machine learning application using synthetic data. The study is structured to develop the application for the purpose of deploying the computer vision model. The findings discuss the realities of attempting to develop a computer vision model for practical purpose, and detail the processes, tools, and techniques that were used to meet accuracy requirements. The research reveals that synthetic data represents another variable that can be adjusted to improve the performance of a computer vision model. Further, a suite of tools and tuning recommendations are provided.
Keywords: computer vision, machine learning, synthetic data, YOLOv4
Procedia PDF Downloads 225
24678 Chemical Durability of Textured Glass-coat Suitable for Building Application
Authors: Adejo Andrew Ojonugwa, Jomboh Jeff Kator, Garkida Adele Dzikwi
Abstract:
This study investigates how a textured glass coat responds to chemical reactions upon application. Samples of textured glass coat developed from mixed post-consumer glass were subjected to a pH test (ASTM D5464), chemical resistance tests (ASTM D3260 and D1308), an adhesion test (ASTM D3359), and an abrasion test (ASTM D4060). The results show a pH of 8.50; a chemical resistance of 5% flick rate when reacted with sodium hydroxide (NaOH); 3%, 5%, 10%, and 15% discolouration when reacted with magnesium hydroxide (Mg(OH)₂), hydrogen fluoride (HF), potassium hydroxide (KOH), and NaOH respectively; an adhesion rating of 4A; and an abrasion of 0.2 g. The results confirm that the developed textured glass coat falls within the standard pH range of 8-9, is resistant to acids and bases except for HF, NaOH, and Mg(OH)₂, and has good adhesion and abrasion properties, making the coat resistant to chemical degradation and a good engineering material.
Keywords: chemical durability, glass-coat, building, recycling
Procedia PDF Downloads 113
24677 Perception-Oriented Model Driven Development for Designing Data Acquisition Process in Wireless Sensor Networks
Authors: K. Indra Gandhi
Abstract:
Wireless Sensor Networks (WSNs) have always been characterized by application-specific sensing, relaying, and collection of information for further analysis. However, software development has not been treated as a separate entity in this data collection process, which has severely limited software development for WSNs. Software development for WSNs is a complex process because the components involved are data-driven, network-driven, and application-driven in nature. This implies a strong need for separation of concerns from the software development perspective. A layered approach to data acquisition design based on Model Driven Development (MDD) is proposed, since the process of collecting sensed data itself varies with the application under consideration. This work focuses on a layered view of the data acquisition process so as to ease development from the software point of view. A metamodel is proposed that enables reusability and realizes software development as an adaptable component for WSN systems. Further, observation of users' perceptions indicates that the proposed model helps improve programmer productivity by realizing the collaborative system involved.
Keywords: data acquisition, model-driven development, separation of concern, wireless sensor networks
Procedia PDF Downloads 434
24676 Comparative Analysis of Data Gathering Protocols with Multiple Mobile Elements for Wireless Sensor Network
Authors: Bhat Geetalaxmi Jairam, D. V. Ashoka
Abstract:
Wireless Sensor Networks (WSNs) are used in many applications to collect sensed data from different sources. Sensed data has to be delivered through the sensors' wireless interfaces using multi-hop communication towards the sink. Data collection in wireless sensor networks consumes energy, and energy consumption is the major constraint in WSNs. Reducing energy consumption while increasing the amount of generated data is a great challenge. In this paper, we have implemented two data gathering protocols with multiple mobile sinks/elements to collect data from sensor nodes. The first is Energy-Efficient Data Gathering with Tour Length-Constrained Mobile Elements in Wireless Sensor Networks (EEDG), in which mobile sinks use a vehicle routing protocol to collect data. The second is An Intelligent Agent-based Routing Structure for Mobile Sinks in WSNs (IAR), in which mobile sinks use Prim's algorithm to collect data. We have implemented the concepts common to both protocols, such as deployment of mobile sinks, generation of the visiting schedule, and collection of data from cluster members. We have compared the performance of both protocols using statistics on performance parameters such as delay, packet drop, packet delivery ratio, energy available, and control overhead. We conclude by showing that EEDG is more efficient than the IAR protocol, though with a few limitations, including unaddressed issues such as redundancy removal, idle listening, and the mobile sink's pause/wait state at the node. In future work, we plan to concentrate on these limitations to arrive at a new energy-efficient protocol that will help improve the lifetime of the WSN.
Keywords: aggregation, consumption, data gathering, efficiency
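Prim's algorithm, which this abstract names as the basis of the IAR routing structure, can be sketched in general form as below. This is a textbook minimum spanning tree over node coordinates and is not the IAR protocol itself; the function name, the use of Euclidean distance as edge weight, and the sample node positions are assumptions made for the example.

```python
import heapq
import math


def prim_mst(points):
    """Return minimum spanning tree edges as (parent, node, distance),
    built with Prim's algorithm over Euclidean distances."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    tree_edges = []
    # Each heap entry is (distance, parent, node); start from node 0.
    frontier = [(0.0, 0, 0)]
    while frontier:
        dist, parent, node = heapq.heappop(frontier)
        if in_tree[node]:
            continue  # a cheaper edge already reached this node
        in_tree[node] = True
        if node != parent:
            tree_edges.append((parent, node, dist))
        for other in range(n):
            if not in_tree[other]:
                weight = math.dist(points[node], points[other])
                heapq.heappush(frontier, (weight, node, other))
    return tree_edges


# Hypothetical sensor node positions at the corners of a 4 x 3 rectangle.
nodes = [(0, 0), (0, 3), (4, 0), (4, 3)]
edges = prim_mst(nodes)
print(len(edges))                    # 3
print(sum(d for _, _, d in edges))   # 10.0
```

The tree connects every node with the smallest total link length, which is one reason a spanning tree is a natural starting point for a data collection structure.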
Procedia PDF Downloads 497