Search results for: missing data
25025 Association of Social Data as a Tool to Support Government Decision Making
Authors: Diego Rodrigues, Marcelo Lisboa, Elismar Batista, Marcos Dias
Abstract:
Based on data on child labor, this work arises questions about how to understand and locate the factors that make up the child labor rates, and which properties are important to analyze these cases. Using data mining techniques to discover valid patterns on Brazilian social databases were evaluated data of child labor in the State of Tocantins (located north of Brazil with a territory of 277000 km2 and comprises 139 counties). This work aims to detect factors that are deterministic for the practice of child labor and their relationships with financial indicators, educational, regional and social, generating information that is not explicit in the government database, thus enabling better monitoring and updating policies for this purpose.Keywords: social data, government decision making, association of social data, data mining
Procedia PDF Downloads 37025024 A Particle Filter-Based Data Assimilation Method for Discrete Event Simulation
Authors: Zhi Zhu, Boquan Zhang, Tian Jing, Jingjing Li, Tao Wang
Abstract:
Data assimilation is a model and data hybrid-driven method that dynamically fuses new observation data with a numerical model to iteratively approach the real system state. It is widely used in state prediction and parameter inference of continuous systems. Because of the discrete event system’s non-linearity and non-Gaussianity, traditional Kalman Filter based on linear and Gaussian assumptions cannot perform data assimilation for such systems, so particle filter has gradually become a technical approach for discrete event simulation data assimilation. Hence, we proposed a particle filter-based discrete event simulation data assimilation method and took the unmanned aerial vehicle (UAV) maintenance service system as a proof of concept to conduct simulation experiments. The experimental results showed that the filtered state data is closer to the real state of the system, which verifies the effectiveness of the proposed method. This research can provide a reference framework for the data assimilation process of other complex nonlinear systems, such as discrete-time and agent simulation.Keywords: discrete event simulation, data assimilation, particle filter, model and data-driven
Procedia PDF Downloads 2025023 Outlier Detection in Stock Market Data using Tukey Method and Wavelet Transform
Authors: Sadam Alwadi
Abstract:
Outlier values become a problem that frequently occurs in the data observation or recording process. Thus, the need for data imputation has become an essential matter. In this work, it will make use of the methods described in the prior work to detect the outlier values based on a collection of stock market data. In order to implement the detection and find some solutions that maybe helpful for investors, real closed price data were obtained from the Amman Stock Exchange (ASE). Tukey and Maximum Overlapping Discrete Wavelet Transform (MODWT) methods will be used to impute the detect the outlier values.Keywords: outlier values, imputation, stock market data, detecting, estimation
Procedia PDF Downloads 8225022 PEINS: A Generic Compression Scheme Using Probabilistic Encoding and Irrational Number Storage
Authors: P. Jayashree, S. Rajkumar
Abstract:
With social networks and smart devices generating a multitude of data, effective data management is the need of the hour for networks and cloud applications. Some applications need effective storage while some other applications need effective communication over networks and data reduction comes as a handy solution to meet out both requirements. Most of the data compression techniques are based on data statistics and may result in either lossy or lossless data reductions. Though lossy reductions produce better compression ratios compared to lossless methods, many applications require data accuracy and miniature details to be preserved. A variety of data compression algorithms does exist in the literature for different forms of data like text, image, and multimedia data. In the proposed work, a generic progressive compression algorithm, based on probabilistic encoding, called PEINS is projected as an enhancement over irrational number stored coding technique to cater to storage issues of increasing data volumes as a cost effective solution, which also offers data security as a secondary outcome to some extent. The proposed work reveals cost effectiveness in terms of better compression ratio with no deterioration in compression time.Keywords: compression ratio, generic compression, irrational number storage, probabilistic encoding
Procedia PDF Downloads 29625021 Iot Device Cost Effective Storage Architecture and Real-Time Data Analysis/Data Privacy Framework
Authors: Femi Elegbeleye, Omobayo Esan, Muienge Mbodila, Patrick Bowe
Abstract:
This paper focused on cost effective storage architecture using fog and cloud data storage gateway and presented the design of the framework for the data privacy model and data analytics framework on a real-time analysis when using machine learning method. The paper began with the system analysis, system architecture and its component design, as well as the overall system operations. The several results obtained from this study on data privacy model shows that when two or more data privacy model is combined we tend to have a more stronger privacy to our data, and when fog storage gateway have several advantages over using the traditional cloud storage, from our result shows fog has reduced latency/delay, low bandwidth consumption, and energy usage when been compare with cloud storage, therefore, fog storage will help to lessen excessive cost. This paper dwelt more on the system descriptions, the researchers focused on the research design and framework design for the data privacy model, data storage, and real-time analytics. This paper also shows the major system components and their framework specification. And lastly, the overall research system architecture was shown, its structure, and its interrelationships.Keywords: IoT, fog, cloud, data analysis, data privacy
Procedia PDF Downloads 10025020 Development and Testing of an Instrument to Measure Beliefs about Cervical Cancer Screening among Women in Botswana
Authors: Ditsapelo M. McFarland
Abstract:
Background: Despite the availability of the Pap smear services in urban areas in Botswana, most women in such areas do not seem to screen regular for prevention of the cervical cancer disease. Reasons for non-use of the available Pap smear services are not well understood. Beliefs about cancer may influence participation in cancer screening in these women. The purpose of this study was to develop an instrument to measure beliefs about cervical cancer and Pap smear screening among Black women in Botswana, and evaluate the psychometric properties of the instrument. Significance: Instruments that are designed to measure beliefs about cervical cancer and screening among black women in Botswana, as well as in the surrounding region, are presently not available. Valid and reliable instruments are needed for exploration of the women’s beliefs about cervical cancer. Conceptual Framework: The Health Belief Model (HBM) provided a conceptual framework for the study. Methodology: The study was done in four phases: Phase 1: item generation: 15 items were generated from literature review and qualitative data for each of four conceptually defined HBM constructs: Perceived susceptibility, severity, benefits, and barriers (Version 1). Phase 2: content validity: Four experts who were advanced practice nurses of African descent and were familiar with the content and the HBM evaluated the content. Experts rated the items on a 4-point Likert scale ranging from: 1=not relevant, 2=somewhat relevant, 3=relevant and 4=very relevant. Fifty-five items were retained for instrument development: perceived susceptibility - 11, severity - 14, benefits - 15 and barriers - 15, all measuring on a 4-point Likert scale ranging from strongly disagree (1) to strongly agree (4). (Version 2). Phase 3: pilot testing: The instrument was pilot tested on a convenient sample of 30 women in Botswana and revised as needed. Phase 4: reliability: the revised instrument (Version 3) was submitted to a larger sample of women in Botswana (n=300) for reliability testing. The sample included women who were Batswana by birth and decent, were aged 30 years and above and could complete an English questionnaire. Data were collected with the assistance of trained research assistants. Major findings: confirmatory factor analysis of the 55 items found that a number of items did not adequately load in a four-factor solution. Items that exhibited reasonable reliability and had low frequency of missing values (n=36) were retained: perceived barriers (14 items), perceived benefits (8 items), perceived severity (4 items), and perceived susceptibility (10 items). confirmatory factor analysis (principle components) for a four factor solution using varimax rotation demonstrated that these four factors explained 43% of the variation in these 36 items. Conclusion: reliability analysis using Cronbach’s Alpha gave generally satisfactory results with values from 0.53 to 0.89.Keywords: cervical cancer, factor analysis, psychometric evaluation, varimax rotation
Procedia PDF Downloads 12725019 Comparison of Selected Pier-Scour Equations for Wide Piers Using Field Data
Authors: Nordila Ahmad, Thamer Mohammad, Bruce W. Melville, Zuliziana Suif
Abstract:
Current methods for predicting local scour at wide bridge piers, were developed on the basis of laboratory studies and very limited scour prediction were tested with field data. Laboratory wide pier scour equation from previous findings with field data were presented. A wide range of field data were used and it consists of both live-bed and clear-water scour. A method for assessing the quality of the data was developed and applied to the data set. Three other wide pier-scour equations from the literature were used to compare the performance of each predictive method. The best-performing scour equation were analyzed using statistical analysis. Comparisons of computed and observed scour depths indicate that the equation from the previous publication produced the smallest discrepancy ratio and RMSE value when compared with the large amount of laboratory and field data.Keywords: field data, local scour, scour equation, wide piers
Procedia PDF Downloads 41525018 The Maximum Throughput Analysis of UAV Datalink 802.11b Protocol
Authors: Inkyu Kim, SangMan Moon
Abstract:
This IEEE 802.11b protocol provides up to 11Mbps data rate, whereas aerospace industry wants to seek higher data rate COTS data link system in the UAV. The Total Maximum Throughput (TMT) and delay time are studied on many researchers in the past years This paper provides theoretical data throughput performance of UAV formation flight data link using the existing 802.11b performance theory. We operate the UAV formation flight with more than 30 quad copters with 802.11b protocol. We may be predicting that UAV formation flight numbers have to bound data link protocol performance limitations.Keywords: UAV datalink, UAV formation flight datalink, UAV WLAN datalink application, UAV IEEE 802.11b datalink application
Procedia PDF Downloads 39325017 Router 1X3 - RTL Design and Verification
Authors: Nidhi Gopal
Abstract:
Routing is the process of moving a packet of data from source to destination and enables messages to pass from one computer to another and eventually reach the target machine. A router is a networking device that forwards data packets between computer networks. It is connected to two or more data lines from different networks (as opposed to a network switch, which connects data lines from one single network). This paper mainly emphasizes upon the study of router device, its top level architecture, and how various sub-modules of router i.e. Register, FIFO, FSM and Synchronizer are synthesized, and simulated and finally connected to its top module.Keywords: data packets, networking, router, routing
Procedia PDF Downloads 81525016 Visual Preferences of Elementary School Children with Autism Spectrum Disorder: An Experimental Study
Authors: Larissa Pliska, Isabel Neitzel, Michael Buschermöhle, Olga Kunina-Habenicht, Ute Ritterfeld
Abstract:
Visual preferences, which can be assessed using eye tracking technologies, are considered one of the defining hallmarks of Autism Spectrum Disorder (ASD). Specifically, children with ASD show a decreased preference for social images rather than geometric images compared to typically developed (TD) children. Such differences are already prevalent at a very early age and indicate the severity of the disorder: toddlers with ASD who preferred geometric images when confronted with social and geometric images showed higher ASD symptom severity than toddlers with ASD who showed higher social attention. Furthermore, the complexity of social pictures (one child playing vs. two children playing together) as well as the mode of stimulus presentation (video or image), are not decisive for the marker. The average age of diagnosis for ASD in Germany is 6.5 years, and visual preference data on this age group is missing. In the present study, we therefore investigated whether visual preferences persist into school age. We examined the visual preferences of 16 boys aged 6 to 11 with ASD and unimpaired cognition as well as TD children (1:1 matching based on children's age and the parent's level of education) within an experimental setting. Different stimulus presentation formats (images vs. videos) and different levels of stimulus complexity were included. Children with and without ASD received pairs of social and non-social images and video stimuli on a screen while eye movements (i.e., eye position and gaze direction) were recorded. For this specific use case, KIZMO GmbH developed a customized, native iOS app (KIZMO Face-Analyzer) for use on iPads. Neither the format of stimulus presentation nor the complexity of the social images had a significant effect on the visual preference of children with and without ASD in this study. Despite the tendency for a difference between the groups for the video stimuli, there were no significant differences. Overall, no statistical differences in visual preference occurred between boys with and without ASD, suggesting that gaze preference in these groups is similar at primary school age. One limitation is that the children with ASD were already receiving Autism-specific intervention. The potential of a visual preference task as an indicator of ASD can be emphasized. The article discusses the clinical relevance of this marker in elementary school children.Keywords: autism spectrum disorder, eye tracking, hallmark, visual preference
Procedia PDF Downloads 6025015 Noise Reduction in Web Data: A Learning Approach Based on Dynamic User Interests
Authors: Julius Onyancha, Valentina Plekhanova
Abstract:
One of the significant issues facing web users is the amount of noise in web data which hinders the process of finding useful information in relation to their dynamic interests. Current research works consider noise as any data that does not form part of the main web page and propose noise web data reduction tools which mainly focus on eliminating noise in relation to the content and layout of web data. This paper argues that not all data that form part of the main web page is of a user interest and not all noise data is actually noise to a given user. Therefore, learning of noise web data allocated to the user requests ensures not only reduction of noisiness level in a web user profile, but also a decrease in the loss of useful information hence improves the quality of a web user profile. Noise Web Data Learning (NWDL) tool/algorithm capable of learning noise web data in web user profile is proposed. The proposed work considers elimination of noise data in relation to dynamic user interest. In order to validate the performance of the proposed work, an experimental design setup is presented. The results obtained are compared with the current algorithms applied in noise web data reduction process. The experimental results show that the proposed work considers the dynamic change of user interest prior to elimination of noise data. The proposed work contributes towards improving the quality of a web user profile by reducing the amount of useful information eliminated as noise.Keywords: web log data, web user profile, user interest, noise web data learning, machine learning
Procedia PDF Downloads 26525014 A New Obesity Index Derived from Waist Circumference and Hip Circumference Well-Matched with Other Indices in Children with Obesity
Authors: Mustafa M. Donma, Orkide Donma
Abstract:
Anthropometric obesity indices such as waist circumference (WC), indices derived from anthropometric measurements such as waist-to-hip ratio (WHR), and indices created from body fat mass composition such as trunk-to-leg fat ratio (TLFR) are commonly used for the evaluation of mild or severe forms of obesity. Their clinical utilities are being compared using body mass index (BMI) percentiles to classify obesity groups. The best of them is still being investigated to make a clear-cut discrimination between healthy normal individuals (N-BMI) and overweight or obese (OB) or morbid obese patients. The aim of this study is to derive a new index, which best suits the purpose for the discrimination of children with N-BMI from OB children. A total of eighty-three children participated in the study. Two groups were constituted. The first group comprised 42 children with N-BMI, and the second group was composed of 41 OB children, whose age- and sex- adjusted BMI percentile values vary between 95 and 99. The corresponding values for the first group were between 15 and 85. This classification was based upon the tables created by World Health Organization. The institutional ethics committee approved the study protocol. Informed consent forms were filled by the parents of the participants. Anthropometric measurements were taken and recorded following a detailed physical examination. Within this context, weight, height (Ht), WC, hip C (HC), neck C (NC) values were taken. Body mass index, WHR, (WC+HC)/2, WC/Ht, (WC/HC)/Ht, WC*NC were calculated. Bioelectrical impedance analysis was performed to obtain body’s fat compartments in terms of total fat, trunk fat, leg fat, arm fat masses. Trunk-to-leg fat ratio, trunk-to-appendicular fat ratio (TAFR), (trunk fat+leg fat)/2 ((TF+LF)/2) were calculated. Fat mass index (FMI) and diagnostic obesity notation model assessment-II (D2I) index values were calculated. Statistical analysis of the data was performed. Significantly increased values of (WC+HC)/2, (TF+LF)/2, D2I, and FMI were observed in OB group in comparison with those of N-BMI group. Significant correlations were calculated between BMI and WC, (WC+HC)/2, (TF+LF)/2, TLFR, TAFR, D2I as well as FMI both in N-BMI and OB groups. The same correlations were obtained for WC. (WC+HC)/2 was correlated with TLFR, TAFR, (TF+LF)/2, D2I, and FMI in N-BMI group. In OB group, the correlations were the same except those with TLFR and TAFR. These correlations were not present with WHR. Correlations were observed between TLFR and BMI, WC, (WC+HC)/2, (TF+LF)/2, D2I as well as FMI in N-BMI group. Same correlations were observed also with TAFR. In OB group, correlations between TLFR or TAFR and BMI, WC as well as (WC+HC)/2 were missing. None was noted with WHR. From these findings, it was concluded that (WC+HC)/2, but not WHR, was much more suitable as an anthropometric obesity index. The only correlation valid in both groups was that exists between (WC+HC)/2 and (TF+LF)/2. This index was suggested as a link between anthropometric and fat-based indices.Keywords: children, hip circumference, obesity, waist circumference
Procedia PDF Downloads 16825013 Data Mining and Knowledge Management Application to Enhance Business Operations: An Exploratory Study
Authors: Zeba Mahmood
Abstract:
The modern business organizations are adopting technological advancement to achieve competitive edge and satisfy their consumer. The development in the field of Information technology systems has changed the way of conducting business today. Business operations today rely more on the data they obtained and this data is continuously increasing in volume. The data stored in different locations is difficult to find and use without the effective implementation of Data mining and Knowledge management techniques. Organizations who smartly identify, obtain and then convert data in useful formats for their decision making and operational improvements create additional value for their customers and enhance their operational capabilities. Marketers and Customer relationship departments of firm use Data mining techniques to make relevant decisions, this paper emphasizes on the identification of different data mining and Knowledge management techniques that are applied to different business industries. The challenges and issues of execution of these techniques are also discussed and critically analyzed in this paper.Keywords: knowledge, knowledge management, knowledge discovery in databases, business, operational, information, data mining
Procedia PDF Downloads 53825012 Combined Safety and Cybersecurity Risk Assessment for Intelligent Distributed Grids
Authors: Anders Thorsén, Behrooz Sangchoolie, Peter Folkesson, Ted Strandberg
Abstract:
As more parts of the power grid become connected to the internet, the risk of cyberattacks increases. To identify the cybersecurity threats and subsequently reduce vulnerabilities, the common practice is to carry out a cybersecurity risk assessment. For safety classified systems and products, there is also a need for safety risk assessments in addition to the cybersecurity risk assessment in order to identify and reduce safety risks. These two risk assessments are usually done separately, but since cybersecurity and functional safety are often related, a more comprehensive method covering both aspects is needed. Some work addressing this has been done for specific domains like the automotive domain, but more general methods suitable for, e.g., intelligent distributed grids, are still missing. One such method from the automotive domain is the Security-Aware Hazard Analysis and Risk Assessment (SAHARA) method that combines safety and cybersecurity risk assessments. This paper presents an approach where the SAHARA method has been modified in order to be more suitable for larger distributed systems. The adapted SAHARA method has a more general risk assessment approach than the original SAHARA. The proposed method has been successfully applied on two use cases of an intelligent distributed grid.Keywords: intelligent distribution grids, threat analysis, risk assessment, safety, cybersecurity
Procedia PDF Downloads 15325011 Indexing and Incremental Approach Using Map Reduce Bipartite Graph (MRBG) for Mining Evolving Big Data
Authors: Adarsh Shroff
Abstract:
Big data is a collection of dataset so large and complex that it becomes difficult to process using data base management tools. To perform operations like search, analysis, visualization on big data by using data mining; which is the process of extraction of patterns or knowledge from large data set. In recent years, the data mining applications become stale and obsolete over time. Incremental processing is a promising approach to refreshing mining results. It utilizes previously saved states to avoid the expense of re-computation from scratch. This project uses i2MapReduce, an incremental processing extension to Map Reduce, the most widely used framework for mining big data. I2MapReduce performs key-value pair level incremental processing rather than task level re-computation, supports not only one-step computation but also more sophisticated iterative computation, which is widely used in data mining applications, and incorporates a set of novel techniques to reduce I/O overhead for accessing preserved fine-grain computation states. To optimize the mining results, evaluate i2MapReduce using a one-step algorithm and three iterative algorithms with diverse computation characteristics for efficient mining.Keywords: big data, map reduce, incremental processing, iterative computation
Procedia PDF Downloads 35325010 Analyzing Large Scale Recurrent Event Data with a Divide-And-Conquer Approach
Authors: Jerry Q. Cheng
Abstract:
Currently, in analyzing large-scale recurrent event data, there are many challenges such as memory limitations, unscalable computing time, etc. In this research, a divide-and-conquer method is proposed using parametric frailty models. Specifically, the data is randomly divided into many subsets, and the maximum likelihood estimator from each individual data set is obtained. Then a weighted method is proposed to combine these individual estimators as the final estimator. It is shown that this divide-and-conquer estimator is asymptotically equivalent to the estimator based on the full data. Simulation studies are conducted to demonstrate the performance of this proposed method. This approach is applied to a large real dataset of repeated heart failure hospitalizations.Keywords: big data analytics, divide-and-conquer, recurrent event data, statistical computing
Procedia PDF Downloads 16725009 Burnout and Salivary Cortisol Among Laboratory Personnel in Klang Valley, Malaysia During COVID-19 Pandemic
Authors: Maznieda Mahjom, Rohaida Ismail, Masita Arip, Mohd Shaiful Azlan, Nor’Ashikin Othman, Hafizah Abdullah, nor Zahrin Hasran, Joshita Jothimanickam, Syaqilah Shawaluddin, Nadia Mohamad, Raheel Nazakat, Tuan Mohd Amin, Mizanurfakhri Ghazali, Rosmanajihah Mat Lazim
Abstract:
COVID-19 outbreak is particularly detrimental to the mental health of everyone as well as leaving a long devastating crisis in the healthcare sector. Daily increment of COVID-19 cases and close contact, necessitating the testing of a large number of samples, thus increasing the workload and burden to laboratory personnel. This study aims to determine the prevalence of personal-, work- and client-related burnout as well as to measure the concentration of salivary cortisol among laboratory personnel in the main laboratories in Klang Valley, Malaysia. This cross-sectional study was conducted in late 2021 and recruited a total of 404 respondents from three laboratories in Klang Valley, Malaysia. The level of burnout was assessed using Copenhagen Burnout Inventory (CBI) comprising three sub-dimensions of personal-, work- and client-related burnout. The cut-off score of 50% and above indicated possible burnout. Meanwhile, salivary cortisol was measured using a competitive enzyme immunoassay kit (Salimetrics, State College, PA, USA). Normal levels of salivary cortisol concentration in adults are within 0.094 to 1.551 μg/dl (morning) and can be none detected to 0.359 μg/dl (evening). The prevalence of personal-, work- and client-related burnout among laboratory personnel were 36.1%, 17.8% and 7.2% respectively. Meanwhile, the abnormal morning and evening cortisol concentration recorded were 29.5% and 21.8% excluding 6.9%-7.4% missing data. While the IgA level is normal for most of the respondents, which recorded at 95.53%. Laboratory personnel were at risk of suffering burnout during the COVID-19 pandemic. Thus, mental health programs need to be addressed at the department and hospital level by regularly screening healthcare workers and designing an intervention program. It is also vital to improve the coping skills of laboratory personnel by increasing the awareness of good coping skill techniques. The training must be in an innovative way to ensure that the lab personnel can internalise the technique and practise it in real life.Keywords: burnout, COVID-19, laborotary personnel, salivary cortisol
Procedia PDF Downloads 7025008 A Review of Literature on Theories of Construction Accident Causation Models
Authors: Samuel Opeyemi Williams, Razali Bin Adul Hamid, M. S. Misnan, Taki Eddine Seghier, D. I. Ajayi
Abstract:
Construction sites are characterized with occupational risks. Review of literature on construction accidents reveals that a lot of theories have been propounded over the years by different theorists, coupled with multifarious models developed by different proponents at different times. Accidents are unplanned events that are prominent in construction sites, involving materials, objects and people with attendant damages, loses and injuries. Models were developed to investigate the causations of accident with the aim of preventing its occurrence. Though, some of these theories were criticized, most especially, the Heinrich Domino theory, being mostly faulted for placing much blame on operatives rather than the management. The purpose of this paper is to unravel the significant construction accident causation theories and models for the benefit of understanding of the theories, and consequently enabling construction stakeholders identify the possible potential hazards on construction sites, as all stakeholders have significant roles to play in preventing accident. Accidents are preventable; hence, understanding the risk factors of accident and the causation theories paves way for its prevention. However, findings reveal that still some gaps missing in the existing models, while it is recommended that further research can be made in order to develop more models in order to maintain zero accident on construction sites.Keywords: domino theory, construction site, site safety, accident causation model
Procedia PDF Downloads 30425007 Adoption of Big Data by Global Chemical Industries
Authors: Ashiff Khan, A. Seetharaman, Abhijit Dasgupta
Abstract:
The new era of big data (BD) is influencing chemical industries tremendously, providing several opportunities to reshape the way they operate and help them shift towards intelligent manufacturing. Given the availability of free software and the large amount of real-time data generated and stored in process plants, chemical industries are still in the early stages of big data adoption. The industry is just starting to realize the importance of the large amount of data it owns to make the right decisions and support its strategies. This article explores the importance of professional competencies and data science that influence BD in chemical industries to help it move towards intelligent manufacturing fast and reliable. This article utilizes a literature review and identifies potential applications in the chemical industry to move from conventional methods to a data-driven approach. The scope of this document is limited to the adoption of BD in chemical industries and the variables identified in this article. To achieve this objective, government, academia, and industry must work together to overcome all present and future challenges.Keywords: chemical engineering, big data analytics, industrial revolution, professional competence, data science
Procedia PDF Downloads 8625006 Secure Multiparty Computations for Privacy Preserving Classifiers
Authors: M. Sumana, K. S. Hareesha
Abstract:
Secure computations are essential while performing privacy preserving data mining. Distributed privacy preserving data mining involve two to more sites that cannot pool in their data to a third party due to the violation of law regarding the individual. Hence in order to model the private data without compromising privacy and information loss, secure multiparty computations are used. Secure computations of product, mean, variance, dot product, sigmoid function using the additive and multiplicative homomorphic property is discussed. The computations are performed on vertically partitioned data with a single site holding the class value.Keywords: homomorphic property, secure product, secure mean and variance, secure dot product, vertically partitioned data
Procedia PDF Downloads 41225005 Image Inpainting Model with Small-Sample Size Based on Generative Adversary Network and Genetic Algorithm
Authors: Jiawen Wang, Qijun Chen
Abstract:
The performance of most machine-learning methods for image inpainting depends on the quantity and quality of the training samples. However, it is very expensive or even impossible to obtain a great number of training samples in many scenarios. In this paper, an image inpainting model based on a generative adversary network (GAN) is constructed for the cases when the number of training samples is small. Firstly, a feature extraction network (F-net) is incorporated into the GAN network to utilize the available information of the inpainting image. The weighted sum of the extracted feature and the random noise acts as the input to the generative network (G-net). The proposed network can be trained well even when the sample size is very small. Secondly, in the phase of the completion for each damaged image, a genetic algorithm is designed to search an optimized noise input for G-net; based on this optimized input, the parameters of the G-net and F-net are further learned (Once the completion for a certain damaged image ends, the parameters restore to its original values obtained in the training phase) to generate an image patch that not only can fill the missing part of the damaged image smoothly but also has visual semantics.Keywords: image inpainting, generative adversary nets, genetic algorithm, small-sample size
Procedia PDF Downloads 13025004 Policies and Politics of Infrastructure Provisioning in Nigeria
Authors: Olufemi Adedamola Oyedele
Abstract:
Infrastructure provision in Nigeria is now at its lowest ebb in spite of its being critical to the socio-economic and political development of any nation. This is partly because the policy that will ensure its adequate provisioning is missing and partly because politics is affecting its provision. Policy is the basic principles by which a government is guided. Infrastructural development is the basis for measuring the performance of governments and it is the foundation of good governance. Demand for infrastructural development is higher and resources used in its provision are limited. Ethnic-interest agitation and lobbying for infrastructure provision are common things in multi-ethnic state like Nigeria. Most infrastructures are now decayed and need repair or replacement. Government is the system that organizes, control and sensitizes the people in a society in other for all to have an acceptable level of living. Governments have the power to put in place all measures that they deem fit will make an environment conducive for living for everybody. Infrastructure development in any environment requires needs assessment, feasibility and viability studies and carrying out physical development of the project. The challenge in Nigeria is largely carrying out development where they are not needed but where the people are loyal. There are numerous abandoned projects because they were started due to politics and not because they are feasible. Policies and politics greatly affect infrastructure provisioning in Nigeria and this is the premise of this paper.Keywords: infrastructure challenges, infrastructure development, policy making, politics, project finance
Procedia PDF Downloads 28125003 Cross Project Software Fault Prediction at Design Phase
Authors: Pradeep Singh, Shrish Verma
Abstract:
Software fault prediction models are created by using the source code, processed metrics from the same or previous version of code and related fault data. Some company do not store and keep track of all artifacts which are required for software fault prediction. To construct fault prediction model for such company, the training data from the other projects can be one potential solution. The earlier we predict the fault the less cost it requires to correct. The training data consists of metrics data and related fault data at function/module level. This paper investigates fault predictions at early stage using the cross-project data focusing on the design metrics. In this study, empirical analysis is carried out to validate design metrics for cross project fault prediction. The machine learning techniques used for evaluation is Naïve Bayes. The design phase metrics of other projects can be used as initial guideline for the projects where no previous fault data is available. We analyze seven data sets from NASA Metrics Data Program which offer design as well as code metrics. Overall, the results of cross project is comparable to the within company data learning.Keywords: software metrics, fault prediction, cross project, within project.
Procedia PDF Downloads 34425002 Comparing Emotion Recognition from Voice and Facial Data Using Time Invariant Features
Authors: Vesna Kirandziska, Nevena Ackovska, Ana Madevska Bogdanova
Abstract:
The problem of emotion recognition is a challenging problem. It is still an open problem from the aspect of both intelligent systems and psychology. In this paper, both voice features and facial features are used for building an emotion recognition system. A Support Vector Machine classifiers are built by using raw data from video recordings. In this paper, the results obtained for the emotion recognition are given, and a discussion about the validity and the expressiveness of different emotions is presented. A comparison between the classifiers build from facial data only, voice data only and from the combination of both data is made here. The need for a better combination of the information from facial expression and voice data is argued.Keywords: emotion recognition, facial recognition, signal processing, machine learning
Procedia PDF Downloads 31725001 Cryptosystems in Asymmetric Cryptography for Securing Data on Cloud at Various Critical Levels
Authors: Sartaj Singh, Amar Singh, Ashok Sharma, Sandeep Kaur
Abstract:
With upcoming threats in a digital world, we need to work continuously in the area of security in all aspects, from hardware to software as well as data modelling. The rise in social media activities and hunger for data by various entities leads to cybercrime and more attack on the privacy and security of persons. Cryptography has always been employed to avoid access to important data by using many processes. Symmetric key and asymmetric key cryptography have been used for keeping data secrets at rest as well in transmission mode. Various cryptosystems have evolved from time to time to make the data more secure. In this research article, we are studying various cryptosystems in asymmetric cryptography and their application with usefulness, and much emphasis is given to Elliptic curve cryptography involving algebraic mathematics.Keywords: cryptography, symmetric key cryptography, asymmetric key cryptography
Procedia PDF Downloads 12425000 Data Recording for Remote Monitoring of Autonomous Vehicles
Authors: Rong-Terng Juang
Abstract:
Autonomous vehicles offer the possibility of significant benefits to social welfare. However, fully automated cars might not be going to happen in the near further. To speed the adoption of the self-driving technologies, many governments worldwide are passing laws requiring data recorders for the testing of autonomous vehicles. Currently, the self-driving vehicle, (e.g., shuttle bus) has to be monitored from a remote control center. When an autonomous vehicle encounters an unexpected driving environment, such as road construction or an obstruction, it should request assistance from a remote operator. Nevertheless, large amounts of data, including images, radar and lidar data, etc., have to be transmitted from the vehicle to the remote center. Therefore, this paper proposes a data compression method of in-vehicle networks for remote monitoring of autonomous vehicles. Firstly, the time-series data are rearranged into a multi-dimensional signal space. Upon the arrival, for controller area networks (CAN), the new data are mapped onto a time-data two-dimensional space associated with the specific CAN identity. Secondly, the data are sampled based on differential sampling. Finally, the whole set of data are encoded using existing algorithms such as Huffman, arithmetic and codebook encoding methods. To evaluate system performance, the proposed method was deployed on an in-house built autonomous vehicle. The testing results show that the amount of data can be reduced as much as 1/7 compared to the raw data.Keywords: autonomous vehicle, data compression, remote monitoring, controller area networks (CAN), Lidar
Procedia PDF Downloads 16424999 Multimedia Data Fusion for Event Detection in Twitter by Using Dempster-Shafer Evidence Theory
Authors: Samar M. Alqhtani, Suhuai Luo, Brian Regan
Abstract:
Data fusion technology can be the best way to extract useful information from multiple sources of data. It has been widely applied in various applications. This paper presents a data fusion approach in multimedia data for event detection in twitter by using Dempster-Shafer evidence theory. The methodology applies a mining algorithm to detect the event. There are two types of data in the fusion. The first is features extracted from text by using the bag-ofwords method which is calculated using the term frequency-inverse document frequency (TF-IDF). The second is the visual features extracted by applying scale-invariant feature transform (SIFT). The Dempster - Shafer theory of evidence is applied in order to fuse the information from these two sources. Our experiments have indicated that comparing to the approaches using individual data source, the proposed data fusion approach can increase the prediction accuracy for event detection. The experimental result showed that the proposed method achieved a high accuracy of 0.97, comparing with 0.93 with texts only, and 0.86 with images only.Keywords: data fusion, Dempster-Shafer theory, data mining, event detection
Procedia PDF Downloads 41124998 Legal Issues of Collecting and Processing Big Health Data in the Light of European Regulation 679/2016
Authors: Ioannis Iglezakis, Theodoros D. Trokanas, Panagiota Kiortsi
Abstract:
This paper aims to explore major legal issues arising from the collection and processing of Health Big Data in the light of the new European secondary legislation for the protection of personal data of natural persons, placing emphasis on the General Data Protection Regulation 679/2016. Whether Big Health Data can be characterised as ‘personal data’ or not is really the crux of the matter. The legal ambiguity is compounded by the fact that, even though the processing of Big Health Data is premised on the de-identification of the data subject, the possibility of a combination of Big Health Data with other data circulating freely on the web or from other data files cannot be excluded. Another key point is that the application of some provisions of GPDR to Big Health Data may both absolve the data controller of his legal obligations and deprive the data subject of his rights (e.g., the right to be informed), ultimately undermining the fundamental right to the protection of personal data of natural persons. Moreover, data subject’s rights (e.g., the right not to be subject to a decision based solely on automated processing) are heavily impacted by the use of AI, algorithms, and technologies that reclaim health data for further use, resulting in sometimes ambiguous results that have a substantial impact on individuals. On the other hand, as the COVID-19 pandemic has revealed, Big Data analytics can offer crucial sources of information. In this respect, this paper identifies and systematises the legal provisions concerned, offering interpretative solutions that tackle dangers concerning data subject’s rights while embracing the opportunities that Big Health Data has to offer. In addition, particular attention is attached to the scope of ‘consent’ as a legal basis in the collection and processing of Big Health Data, as the application of data analytics in Big Health Data signals the construction of new data and subject’s profiles. Finally, the paper addresses the knotty problem of role assignment (i.e., distinguishing between controller and processor/joint controllers and joint processors) in an era of extensive Big Health data sharing. The findings are the fruit of a current research project conducted by a three-member research team at the Faculty of Law of the Aristotle University of Thessaloniki and funded by the Greek Ministry of Education and Religious Affairs.Keywords: big health data, data subject rights, GDPR, pandemic
Procedia PDF Downloads 12924997 Adaptive Data Approximations Codec (ADAC) for AI/ML-based Cyber-Physical Systems
Authors: Yong-Kyu Jung
Abstract:
The fast growth in information technology has led to de-mands to access/process data. CPSs heavily depend on the time of hardware/software operations and communication over the network (i.e., real-time/parallel operations in CPSs (e.g., autonomous vehicles). Since data processing is an im-portant means to overcome the issue confronting data management, reducing the gap between the technological-growth and the data-complexity and channel-bandwidth. An adaptive perpetual data approximation method is intro-duced to manage the actual entropy of the digital spectrum. An ADAC implemented as an accelerator and/or apps for servers/smart-connected devices adaptively rescales digital contents (avg.62.8%), data processing/access time/energy, encryption/decryption overheads in AI/ML applications (facial ID/recognition).Keywords: adaptive codec, AI, ML, HPC, cyber-physical, cybersecurity
Procedia PDF Downloads 7924996 Real-Time Visualization Using GPU-Accelerated Filtering of LiDAR Data
Authors: Sašo Pečnik, Borut Žalik
Abstract:
This paper presents a real-time visualization technique and filtering of classified LiDAR point clouds. The visualization is capable of displaying filtered information organized in layers by the classification attribute saved within LiDAR data sets. We explain the used data structure and data management, which enables real-time presentation of layered LiDAR data. Real-time visualization is achieved with LOD optimization based on the distance from the observer without loss of quality. The filtering process is done in two steps and is entirely executed on the GPU and implemented using programmable shaders.Keywords: filtering, graphics, level-of-details, LiDAR, real-time visualization
Procedia PDF Downloads 309