Search results for: incomplete data
24465 MapReduce Algorithm for Geometric and Topological Information Extraction from 3D CAD Models
Authors: Ahmed Fradi
Abstract:
In a digital world in perpetual evolution and acceleration, data more and more voluminous, rich and varied, the new software solutions emerged with the Big Data phenomenon offer new opportunities to the company enabling it not only to optimize its business and to evolve its production model, but also to reorganize itself to increase competitiveness and to identify new strategic axes. Design and manufacturing industrial companies, like the others, face these challenges, data represent a major asset, provided that they know how to capture, refine, combine and analyze them. The objective of our paper is to propose a solution allowing geometric and topological information extraction from 3D CAD model (precisely STEP files) databases, with specific algorithm based on the programming paradigm MapReduce. Our proposal is the first step of our future approach to 3D CAD object retrieval.Keywords: Big Data, MapReduce, 3D object retrieval, CAD, STEP format
Procedia PDF Downloads 54124464 First Attempts Using High-Throughput Sequencing in Senecio from the Andes
Authors: L. Salomon, P. Sklenar
Abstract:
The Andes hold the highest plant species diversity in the world. How this occurred is one of the most intriguing questions in studies addressing the origin and patterning of plant diversity worldwide. Recently, the explosive adaptive radiations found in high Andean groups have been pointed as triggers to this spectacular diversity. The Andes is the species-richest area for the biggest genus from the Asteraceae family: Senecio. There, the genus presents an incredible diversity of species, striking growth form variation, and large niche span. Even when some studies tried to disentangle the evolutionary story for some Andean species in Senecio, they obtained partially resolved and low supported phylogenies, as expected for recently radiated groups. The high-throughput sequencing (HTS) approaches have proved to be a powerful tool answering phylogenetic questions in those groups whose evolutionary stories are recent and traditional techniques like Sanger sequencing are not informative enough. Although these tools have been used to understand the evolution of an increasing number of Andean groups, nowadays, their scope has not been applied for Senecio. This project aims to contribute to a better knowledge of the mechanisms shaping the hyper diversity of Senecio in the Andean region, using HTS focusing on Senecio ser. Culcitium (Asteraceae), recently recircumscribed. Firstly, reconstructing a highly resolved and supported phylogeny, and after assessing the role of allopatric differentiation, hybridization, and genome duplication in the diversification of the group. Using the Hyb-Seq approach, combining target enrichment using Asteraceae COS loci baits and genome skimming, more than 100 new accessions were generated. HybPhyloMaker and HybPiper pipelines were used for the phylogenetic analyses, and another pipeline in development (Paralogue Wizard) was used to deal with paralogues. RAxML was used to generate gene trees and Astral for species tree reconstruction. Phyparts were used to explore as first step of gene tree discordance along the clades. Fully resolved with moderated supported trees were obtained, showing Senecio ser. Culcitium as monophyletic. Within the group, some species formed well-supported clades with morphologically related species, while some species would not have exclusive ancestry, in concordance with previous studies using amplified fragment length polymorphism (AFLP) showing geographical differentiation. Discordance between gene trees was detected. Paralogues were detected for many loci, indicating possible genome duplications; ploidy level estimation using flow cytometry will be carried out during the next months in order to identify the role of this process in the diversification of the group. Likewise, TreeSetViz package for Mesquite, hierarchical likelihood ratio congruence test using Concaterpillar, and Procrustean Approach to Cophylogeny (PACo), will be used to evaluate the congruence among different inheritance patterns. In order to evaluate the influence of hybridization and Incomplete Lineage Sorting (ILS) in each resultant clade from the phylogeny, Joly et al.'s 2009 method in a coalescent scenario and Paterson’s D-statistic will be performed. Even when the main discordance sources between gene trees were not explored in detail yet, the data show that at least to some degree, processes such as genome duplication, hybridization, and/or ILS could be involved in the evolution of the group.Keywords: adaptive radiations, Andes, genome duplication, hybridization, Senecio
Procedia PDF Downloads 14024463 Data Hiding in Gray Image Using ASCII Value and Scanning Technique
Authors: R. K. Pateriya, Jyoti Bharti
Abstract:
This paper presents an approach for data hiding methods which provides a secret communication between sender and receiver. The data is hidden in gray-scale images and the boundary of gray-scale image is used to store the mapping information. In this an approach data is in ASCII format and the mapping is in between ASCII value of hidden message and pixel value of cover image, since pixel value of an image as well as ASCII value is in range of 0 to 255 and this mapping information is occupying only 1 bit per character of hidden message as compared to 8 bit per character thus maintaining good quality of stego image.Keywords: ASCII value, cover image, PSNR, pixel value, stego image, secret message
Procedia PDF Downloads 41624462 DCASH: Dynamic Cache Synchronization Algorithm for Heterogeneous Reverse Y Synchronizing Mobile Database Systems
Authors: Gunasekaran Raja, Kottilingam Kottursamy, Rajakumar Arul, Ramkumar Jayaraman, Krithika Sairam, Lakshmi Ravi
Abstract:
The synchronization server maintains a dynamically changing cache, which contains the data items which were requested and collected by the mobile node from the server. The order and presence of tuples in the cache changes dynamically according to the frequency of updates performed on the data, by the server and client. To synchronize, the data which has been modified by client and the server at an instant are collected, batched together by the type of modification (insert/ update/ delete), and sorted according to their update frequencies. This ensures that the DCASH (Dynamic Cache Synchronization Algorithm for Heterogeneous Reverse Y synchronizing Mobile Database Systems) gives priority to the frequently accessed data with high usage. The optimal memory management algorithm is proposed to manage data items according to their frequency, theorems were written to show the current mobile data activity is reverse Y in nature and the experiments were tested with 2g and 3g networks for various mobile devices to show the reduced response time and energy consumption.Keywords: mobile databases, synchronization, cache, response time
Procedia PDF Downloads 40624461 Unified Structured Process for Health Analytics
Authors: Supunmali Ahangama, Danny Chiang Choon Poo
Abstract:
Health analytics (HA) is used in healthcare systems for effective decision-making, management, and planning of healthcare and related activities. However, user resistance, the unique position of medical data content, and structure (including heterogeneous and unstructured data) and impromptu HA projects have held up the progress in HA applications. Notably, the accuracy of outcomes depends on the skills and the domain knowledge of the data analyst working on the healthcare data. The success of HA depends on having a sound process model, effective project management and availability of supporting tools. Thus, to overcome these challenges through an effective process model, we propose an HA process model with features from the rational unified process (RUP) model and agile methodology.Keywords: agile methodology, health analytics, unified process model, UML
Procedia PDF Downloads 50624460 Use of Life Cycle Data for State-Oriented Maintenance
Authors: Maximilian Winkens, Matthias Goerke
Abstract:
The state-oriented maintenance enables the preventive intervention before the failure of a component and guarantees avoidance of expensive breakdowns. Because the timing of the maintenance is defined by the component’s state, the remaining service life can be exhausted to the limit. The basic requirement for the state-oriented maintenance is the ability to define the component’s state. New potential for this is offered by gentelligent components. They are developed at the Corporative Research Centre 653 of the German Research Foundation (DFG). Because of their sensory ability they enable the registration of stresses during the component’s use. The data is gathered and evaluated. The methodology developed determines the current state of the gentelligent component based on the gathered data. This article presents this methodology as well as current research. The main focus of the current scientific work is to improve the quality of the state determination based on the life-cycle data analysis. The methodology developed until now evaluates the data of the usage phase and based on it predicts the timing of the gentelligent component’s failure. The real failure timing though, deviate from the predicted one because the effects from the production phase aren’t considered. The goal of the current research is to develop a methodology for state determination which considers both production and usage data.Keywords: state-oriented maintenance, life-cycle data, gentelligent component, preventive intervention
Procedia PDF Downloads 49524459 A Hybrid System for Boreholes Soil Sample
Authors: Ali Ulvi Uzer
Abstract:
Data reduction is an important topic in the field of pattern recognition applications. The basic concept is the reduction of multitudinous amounts of data down to the meaningful parts. The Principal Component Analysis (PCA) method is frequently used for data reduction. The Support Vector Machine (SVM) method is a discriminative classifier formally defined by a separating hyperplane. In other words, given labeled training data, the algorithm outputs an optimal hyperplane which categorizes new examples. This study offers a hybrid approach that uses the PCA for data reduction and Support Vector Machines (SVM) for classification. In order to detect the accuracy of the suggested system, two boreholes taken from the soil sample was used. The classification accuracies for this dataset were obtained through using ten-fold cross-validation method. As the results suggest, this system, which is performed through size reduction, is a feasible system for faster recognition of dataset so our study result appears to be very promising.Keywords: feature selection, sequential forward selection, support vector machines, soil sample
Procedia PDF Downloads 45524458 Predicting Customer Purchasing Behaviour in Retail Marketing: A Research for a Supermarket Chain
Authors: Sabri Serkan Güllüoğlu
Abstract:
Analysis can be defined as the process of gathering, recording and researching data related to products and services, in order to learn something. But for marketers, analyses are not only used for learning but also an essential and critical part of the business, because this allows companies to offer products or services which are focused and well targeted. Market analysis also identify market trends, demographics, customer’s buying habits and important information on the competition. Data mining is used instead of traditional research, because it extracts predictive information about customer and sales from large databases. In contrast to traditional research, data mining relies on information that is already available. Simply the goal is to improve the efficiency of supermarkets. In this study, the purpose is to find dependency on products. For instance, which items are bought together, using association rules in data mining. Moreover, this information will be used for improving the profitability of customers such as increasing shopping time and sales of fewer sold items.Keywords: data mining, association rule mining, market basket analysis, purchasing
Procedia PDF Downloads 48324457 Predicting Medical Check-Up Patient Re-Coming Using Sequential Pattern Mining and Association Rules
Authors: Rizka Aisha Rahmi Hariadi, Chao Ou-Yang, Han-Cheng Wang, Rajesri Govindaraju
Abstract:
As the increasing of medical check-up popularity, there are a huge number of medical check-up data stored in database and have not been useful. These data actually can be very useful for future strategic planning if we mine it correctly. In other side, a lot of patients come with unpredictable coming and also limited available facilities make medical check-up service offered by hospital not maximal. To solve that problem, this study used those medical check-up data to predict patient re-coming. Sequential pattern mining (SPM) and association rules method were chosen because these methods are suitable for predicting patient re-coming using sequential data. First, based on patient personal information the data was grouped into … groups then discriminant analysis was done to check significant of the grouping. Second, for each group some frequent patterns were generated using SPM method. Third, based on frequent patterns of each group, pairs of variable can be extracted using association rules to get general pattern of re-coming patient. Last, discussion and conclusion was done to give some implications of the results.Keywords: patient re-coming, medical check-up, health examination, data mining, sequential pattern mining, association rules, discriminant analysis
Procedia PDF Downloads 64024456 Continual Learning Using Data Generation for Hyperspectral Remote Sensing Scene Classification
Authors: Samiah Alammari, Nassim Ammour
Abstract:
When providing a massive number of tasks successively to a deep learning process, a good performance of the model requires preserving the previous tasks data to retrain the model for each upcoming classification. Otherwise, the model performs poorly due to the catastrophic forgetting phenomenon. To overcome this shortcoming, we developed a successful continual learning deep model for remote sensing hyperspectral image regions classification. The proposed neural network architecture encapsulates two trainable subnetworks. The first module adapts its weights by minimizing the discrimination error between the land-cover classes during the new task learning, and the second module tries to learn how to replicate the data of the previous tasks by discovering the latent data structure of the new task dataset. We conduct experiments on HSI dataset Indian Pines. The results confirm the capability of the proposed method.Keywords: continual learning, data reconstruction, remote sensing, hyperspectral image segmentation
Procedia PDF Downloads 26624455 Local Differential Privacy-Based Data-Sharing Scheme for Smart Utilities
Authors: Veniamin Boiarkin, Bruno Bogaz Zarpelão, Muttukrishnan Rajarajan
Abstract:
The manufacturing sector is a vital component of most economies, which leads to a large number of cyberattacks on organisations, whereas disruption in operation may lead to significant economic consequences. Adversaries aim to disrupt the production processes of manufacturing companies, gain financial advantages, and steal intellectual property by getting unauthorised access to sensitive data. Access to sensitive data helps organisations to enhance the production and management processes. However, the majority of the existing data-sharing mechanisms are either susceptible to different cyber attacks or heavy in terms of computation overhead. In this paper, a privacy-preserving data-sharing scheme for smart utilities is proposed. First, a customer’s privacy adjustment mechanism is proposed to make sure that end-users have control over their privacy, which is required by the latest government regulations, such as the General Data Protection Regulation. Secondly, a local differential privacy-based mechanism is proposed to ensure the privacy of the end-users by hiding real data based on the end-user preferences. The proposed scheme may be applied to different industrial control systems, whereas in this study, it is validated for energy utility use cases consisting of smart, intelligent devices. The results show that the proposed scheme may guarantee the required level of privacy with an expected relative error in utility.Keywords: data-sharing, local differential privacy, manufacturing, privacy-preserving mechanism, smart utility
Procedia PDF Downloads 7624454 Changes in the Subjective Interpretation of Poverty Due to COVID-19: The Case of a Peripheral County of Hungary
Authors: Eszter Siposne Nandori
Abstract:
The paper describes how the subjective interpretation of poverty changed during the COVID-19 pandemic. The results of data collection at the end of 2020 are compared to the results of a similar survey from 2019. The methods of systematic data collection are used to collect data about the beliefs of the population about poverty. The analysis is carried out in Borsod-Abaúj-Zemplén County, one of the most backward areas in Hungary. The paper concludes that poverty is mainly linked to material values, and it did not change from 2019 to 2020. Some slight changes, however, highlight the effect of the pandemic: poverty is increasingly seen as a generational problem in 2020, and another important change is that isolation became more closely related to poverty.Keywords: Hungary, interpretation of poverty, pandemic, systematic data collection, subjective poverty
Procedia PDF Downloads 12624453 An Encapsulation of a Navigable Tree Position: Theory, Specification, and Verification
Authors: Nicodemus M. J. Mbwambo, Yu-Shan Sun, Murali Sitaraman, Joan Krone
Abstract:
This paper presents a generic data abstraction that captures a navigable tree position. The mathematical modeling of the abstraction encapsulates the current tree position, which can be used to navigate and modify the tree. The encapsulation of the tree position in the data abstraction specification avoids the use of explicit references and aliasing, thereby simplifying verification of (imperative) client code that uses the data abstraction. To ease the tasks of such specification and verification, a general tree theory, rich with mathematical notations and results, has been developed. The paper contains an example to illustrate automated verification ramifications. With sufficient tree theory development, automated proving seems plausible even in the absence of a special-purpose tree solver.Keywords: automation, data abstraction, maps, specification, tree, verification
Procedia PDF Downloads 16624452 Accurate Position Electromagnetic Sensor Using Data Acquisition System
Authors: Z. Ezzouine, A. Nakheli
Abstract:
This paper presents a high position electromagnetic sensor system (HPESS) that is applicable for moving object detection. The authors have developed a high-performance position sensor prototype dedicated to students’ laboratory. The challenge was to obtain a highly accurate and real-time sensor that is able to calculate position, length or displacement. An electromagnetic solution based on a two coil induction principal was adopted. The HPESS converts mechanical motion to electric energy with direct contact. The output signal can then be fed to an electronic circuit. The voltage output change from the sensor is captured by data acquisition system using LabVIEW software. The displacement of the moving object is determined. The measured data are transmitted to a PC in real-time via a DAQ (NI USB -6281). This paper also describes the data acquisition analysis and the conditioning card developed specially for sensor signal monitoring. The data is then recorded and viewed using a user interface written using National Instrument LabVIEW software. On-line displays of time and voltage of the sensor signal provide a user-friendly data acquisition interface. The sensor provides an uncomplicated, accurate, reliable, inexpensive transducer for highly sophisticated control systems.Keywords: electromagnetic sensor, accurately, data acquisition, position measurement
Procedia PDF Downloads 28524451 The Quality of the Presentation Influence the Audience Perceptions
Authors: Gilang Maulana, Dhika Rahma Qomariah, Yasin Fadil
Abstract:
Purpose: This research meant to measure the magnitude of the influence of the quality of the presentation to the targeted audience perception in catching information presentation. Design/Methodology/Approach: This research uses a quantitative research method. The kind of data that uses in this research is the primary data. The population in this research are students the economics faculty of Semarang State University. The sampling techniques uses in this research is purposive sampling. The retrieving data uses questionnaire on 30 respondents. The data analysis uses descriptive analysis. Result: The quality of presentation influential positive against perception of the audience. This proved that the more qualified presentation will increase the perception of the audience. Limitation: Respondents were limited to only 30 people.Keywords: quality of presentation, presentation, audience, perception, semarang state university
Procedia PDF Downloads 39224450 Managing Data from One Hundred Thousand Internet of Things Devices Globally for Mining Insights
Authors: Julian Wise
Abstract:
Newcrest Mining is one of the world’s top five gold and rare earth mining organizations by production, reserves and market capitalization in the world. This paper elaborates on the data acquisition processes employed by Newcrest in collaboration with Fortune 500 listed organization, Insight Enterprises, to standardize machine learning solutions which process data from over a hundred thousand distributed Internet of Things (IoT) devices located at mine sites globally. Through the utilization of software architecture cloud technologies and edge computing, the technological developments enable for standardized processes of machine learning applications to influence the strategic optimization of mineral processing. Target objectives of the machine learning optimizations include time savings on mineral processing, production efficiencies, risk identification, and increased production throughput. The data acquired and utilized for predictive modelling is processed through edge computing by resources collectively stored within a data lake. Being involved in the digital transformation has necessitated the standardization software architecture to manage the machine learning models submitted by vendors, to ensure effective automation and continuous improvements to the mineral process models. Operating at scale, the system processes hundreds of gigabytes of data per day from distributed mine sites across the globe, for the purposes of increased improved worker safety, and production efficiency through big data applications.Keywords: mineral technology, big data, machine learning operations, data lake
Procedia PDF Downloads 11224449 Examining Statistical Monitoring Approach against Traditional Monitoring Techniques in Detecting Data Anomalies during Conduct of Clinical Trials
Authors: Sheikh Omar Sillah
Abstract:
Introduction: Monitoring is an important means of ensuring the smooth implementation and quality of clinical trials. For many years, traditional site monitoring approaches have been critical in detecting data errors but not optimal in identifying fabricated and implanted data as well as non-random data distributions that may significantly invalidate study results. The objective of this paper was to provide recommendations based on best statistical monitoring practices for detecting data-integrity issues suggestive of fabrication and implantation early in the study conduct to allow implementation of meaningful corrective and preventive actions. Methodology: Electronic bibliographic databases (Medline, Embase, PubMed, Scopus, and Web of Science) were used for the literature search, and both qualitative and quantitative studies were sought. Search results were uploaded into Eppi-Reviewer Software, and only publications written in the English language from 2012 were included in the review. Gray literature not considered to present reproducible methods was excluded. Results: A total of 18 peer-reviewed publications were included in the review. The publications demonstrated that traditional site monitoring techniques are not efficient in detecting data anomalies. By specifying project-specific parameters such as laboratory reference range values, visit schedules, etc., with appropriate interactive data monitoring, statistical monitoring can offer early signals of data anomalies to study teams. The review further revealed that statistical monitoring is useful to identify unusual data patterns that might be revealing issues that could impact data integrity or may potentially impact study participants' safety. However, subjective measures may not be good candidates for statistical monitoring. Conclusion: The statistical monitoring approach requires a combination of education, training, and experience sufficient to implement its principles in detecting data anomalies for the statistical aspects of a clinical trial.Keywords: statistical monitoring, data anomalies, clinical trials, traditional monitoring
Procedia PDF Downloads 7724448 An ALM Matrix Completion Algorithm for Recovering Weather Monitoring Data
Authors: Yuqing Chen, Ying Xu, Renfa Li
Abstract:
The development of matrix completion theory provides new approaches for data gathering in Wireless Sensor Networks (WSN). The existing matrix completion algorithms for WSN mainly consider how to reduce the sampling number without considering the real-time performance when recovering the data matrix. In order to guarantee the recovery accuracy and reduce the recovery time consumed simultaneously, we propose a new ALM algorithm to recover the weather monitoring data. A lot of experiments have been carried out to investigate the performance of the proposed ALM algorithm by using different parameter settings, different sampling rates and sampling models. In addition, we compare the proposed ALM algorithm with some existing algorithms in the literature. Experimental results show that the ALM algorithm can obtain better overall recovery accuracy with less computing time, which demonstrate that the ALM algorithm is an effective and efficient approach for recovering the real world weather monitoring data in WSN.Keywords: wireless sensor network, matrix completion, singular value thresholding, augmented Lagrange multiplier
Procedia PDF Downloads 38424447 Field Production Data Collection, Analysis and Reporting Using Automated System
Authors: Amir AlAmeeri, Mohamed Ibrahim
Abstract:
Various data points are constantly being measured in the production system, and due to the nature of the wells, these data points, such as pressure, temperature, water cut, etc.., fluctuations are constant, which requires high frequency monitoring and collection. It is a very difficult task to analyze these parameters manually using spreadsheets and email. An automated system greatly enhances efficiency, reduce errors, the need for constant emails which take up disk space, and frees up time for the operator to perform other critical tasks. Various production data is being recorded in an oil field, and this huge volume of data can be seen as irrelevant to some, especially when viewed on its own with no context. In order to fully utilize all this information, it needs to be properly collected, verified and stored in one common place and analyzed for surveillance and monitoring purposes. This paper describes how data is recorded by different parties and departments in the field, and verified numerous times as it is being loaded into a repository. Once it is loaded, a final check is done before being entered into a production monitoring system. Once all this is collected, various calculations are performed to report allocated production. Calculated production data is used to report field production automatically. It is also used to monitor well and surface facility performance. Engineers can use this for their studies and analyses to ensure field is performing as it should be, predict and forecast production, and monitor any changes in wells that could affect field performance.Keywords: automation, oil production, Cheleken, exploration and production (E&P), Caspian Sea, allocation, forecast
Procedia PDF Downloads 15624446 Time Series Regression with Meta-Clusters
Authors: Monika Chuchro
Abstract:
This paper presents a preliminary attempt to apply classification of time series using meta-clusters in order to improve the quality of regression models. In this case, clustering was performed as a method to obtain a subgroups of time series data with normal distribution from inflow into waste water treatment plant data which Composed of several groups differing by mean value. Two simple algorithms: K-mean and EM were chosen as a clustering method. The rand index was used to measure the similarity. After simple meta-clustering, regression model was performed for each subgroups. The final model was a sum of subgroups models. The quality of obtained model was compared with the regression model made using the same explanatory variables but with no clustering of data. Results were compared by determination coefficient (R2), measure of prediction accuracy mean absolute percentage error (MAPE) and comparison on linear chart. Preliminary results allows to foresee the potential of the presented technique.Keywords: clustering, data analysis, data mining, predictive models
Procedia PDF Downloads 46624445 Python Implementation for S1000D Applicability Depended Processing Model - SALERNO
Authors: Theresia El Khoury, Georges Badr, Amir Hajjam El Hassani, Stéphane N’Guyen Van Ky
Abstract:
The widespread adoption of machine learning and artificial intelligence across different domains can be attributed to the digitization of data over several decades, resulting in vast amounts of data, types, and structures. Thus, data processing and preparation turn out to be a crucial stage. However, applying these techniques to S1000D standard-based data poses a challenge due to its complexity and the need to preserve logical information. This paper describes SALERNO, an S1000d AppLicability dEpended pRocessiNg mOdel. This python-based model analyzes and converts the XML S1000D-based files into an easier data format that can be used in machine learning techniques while preserving the different logic and relationships in files. The model parses the files in the given folder, filters them, and extracts the required information to be saved in appropriate data frames and Excel sheets. Its main idea is to group the extracted information by applicability. In addition, it extracts the full text by replacing internal and external references while maintaining the relationships between files, as well as the necessary requirements. The resulting files can then be saved in databases and used in different models. Documents in both English and French languages were tested, and special characters were decoded. Updates on the technical manuals were taken into consideration as well. The model was tested on different versions of the S1000D, and the results demonstrated its ability to effectively handle the applicability, requirements, references, and relationships across all files and on different levels.Keywords: aeronautics, big data, data processing, machine learning, S1000D
Procedia PDF Downloads 15724444 Life Prediction Method of Lithium-Ion Battery Based on Grey Support Vector Machines
Authors: Xiaogang Li, Jieqiong Miao
Abstract:
As for the problem of the grey forecasting model prediction accuracy is low, an improved grey prediction model is put forward. Firstly, use trigonometric function transform the original data sequence in order to improve the smoothness of data , this model called SGM( smoothness of grey prediction model), then combine the improved grey model with support vector machine , and put forward the grey support vector machine model (SGM - SVM).Before the establishment of the model, we use trigonometric functions and accumulation generation operation preprocessing data in order to enhance the smoothness of the data and weaken the randomness of the data, then use support vector machine (SVM) to establish a prediction model for pre-processed data and select model parameters using genetic algorithms to obtain the optimum value of the global search. Finally, restore data through the "regressive generate" operation to get forecasting data. In order to prove that the SGM-SVM model is superior to other models, we select the battery life data from calce. The presented model is used to predict life of battery and the predicted result was compared with that of grey model and support vector machines.For a more intuitive comparison of the three models, this paper presents root mean square error of this three different models .The results show that the effect of grey support vector machine (SGM-SVM) to predict life is optimal, and the root mean square error is only 3.18%. Keywords: grey forecasting model, trigonometric function, support vector machine, genetic algorithms, root mean square errorKeywords: Grey prediction model, trigonometric functions, support vector machines, genetic algorithms, root mean square error
Procedia PDF Downloads 46124443 Development of a Data Security Model Using Steganography
Authors: Terungwa Simon Yange, Agana Moses A.
Abstract:
This paper studied steganography and designed a simplistic approach to a steganographic tool for hiding information in image files with the view of addressing the security challenges with data by hiding data from unauthorized users to improve its security. The Structured Systems Analysis and Design Method (SSADM) was used in this work. The system was developed using Java Development Kit (JDK) 1.7.0_10 and MySQL Server as its backend. The system was tested with some hypothetical health records which proved the possibility of protecting data from unauthorized users by making it secret so that its existence cannot be easily recognized by fraudulent users. It further strengthens the confidentiality of patient records kept by medical practitioners in the health setting. In conclusion, this work was able to produce a user friendly steganography software that is very fast to install and easy to operate to ensure privacy and secrecy of sensitive data. It also produced an exact copy of the original image and the one carrying the secret message when compared with each.Keywords: steganography, cryptography, encryption, decryption, secrecy
Procedia PDF Downloads 26624442 Analysis of Citation Rate and Data Reuse for Openly Accessible Biodiversity Datasets on Global Biodiversity Information Facility
Authors: Nushrat Khan, Mike Thelwall, Kayvan Kousha
Abstract:
Making research data openly accessible has been mandated by most funders over the last 5 years as it promotes reproducibility in science and reduces duplication of effort to collect the same data. There are evidence that articles that publicly share research data have higher citation rates in biological and social sciences. However, how and whether shared data is being reused is not always intuitive as such information is not easily accessible from the majority of research data repositories. This study aims to understand the practice of data citation and how data is being reused over the years focusing on biodiversity since research data is frequently reused in this field. Metadata of 38,878 datasets including citation counts were collected through the Global Biodiversity Information Facility (GBIF) API for this purpose. GBIF was used as a data source since it provides citation count for datasets, not a commonly available feature for most repositories. Analysis of dataset types, citation counts, creation and update time of datasets suggests that citation rate varies for different types of datasets, where occurrence datasets that have more granular information have higher citation rates than checklist and metadata-only datasets. Another finding is that biodiversity datasets on GBIF are frequently updated, which is unique to this field. Majority of the datasets from the earliest year of 2007 were updated after 11 years, with no dataset that was not updated since creation. For each year between 2007 and 2017, we compared the correlations between update time and citation rate of four different types of datasets. While recent datasets do not show any correlations, 3 to 4 years old datasets show weak correlation where datasets that were updated more recently received high citations. The results are suggestive that it takes several years to cumulate citations for research datasets. However, this investigation found that when searched on Google Scholar or Scopus databases for the same datasets, the number of citations is often not the same as GBIF. Hence future aim is to further explore the citation count system adopted by GBIF to evaluate its reliability and whether it can be applicable to other fields of studies as well.Keywords: data citation, data reuse, research data sharing, webometrics
Procedia PDF Downloads 17824441 Significance of Transient Data and Its Applications in Turbine Generators
Authors: Chandra Gupt Porwal, Preeti C. Porwal
Abstract:
Transient data reveals much about the machine's condition that steady-state data cannot. New technologies make this information much more available for evaluating the mechanical integrity of a machine train. Recent surveys at various stations indicate that simplicity is preferred over completeness in machine audits throughout the power generation industry. This is most clearly shown by the number of rotating machinery predictive maintenance programs in which only steady-state vibration amplitude is trended while important transient vibration data is not even acquired. Efforts have been made to explain what transient data is, its importance, the types of plots used for its display, and its effective utilization for analysis. In order to demonstrate the value of measuring transient data and its practical application in rotating machinery for resolving complex and persistent issues with turbine generators, the author presents a few case studies that highlight the presence of rotor instabilities due to the shaft moving towards the bearing centre in a 100 MM LMZ unit located in the Northern Capital Region (NCR), heavy misalignment noticed—especially after 2993 rpm—caused by loose coupling bolts, which prevented the machine from being synchronized for more than four months in a 250 MW KWU unit in the Western Region (WR), and heavy preload noticed at Intermediate pressure turbine (IPT) bearing near HP- IP coupling, caused by high points on coupling faces at a 500 MW KWU unit in the Northern region (NR), experienced at Indian power plants.Keywords: transient data, steady-state-data, intermediate -pressure-turbine, high-points
Procedia PDF Downloads 6924440 Geographic Information System for District Level Energy Performance Simulations
Authors: Avichal Malhotra, Jerome Frisch, Christoph van Treeck
Abstract:
The utilization of semantic, cadastral and topological data from geographic information systems (GIS) has exponentially increased for building and urban-scale energy performance simulations. Urban planners, simulation scientists, and researchers use virtual 3D city models for energy analysis, algorithms and simulation tools. For dynamic energy simulations at city and district level, this paper provides an overview of the available GIS data models and their levels of detail. Adhering to different norms and standards, these models also intend to describe building and construction industry data. For further investigations, CityGML data models are considered for simulations. Though geographical information modelling has considerably many different implementations, extensions of virtual city data can also be made for domain specific applications. Highlighting the use of the extended CityGML models for energy researches, a brief introduction to the Energy Application Domain Extension (ADE) along with its significance is made. Consequently, addressing specific input simulation data, a workflow using Modelica underlining the usage of GIS information and the quantification of its significance over annual heating energy demand is presented in this paper.Keywords: CityGML, EnergyADE, energy performance simulation, GIS
Procedia PDF Downloads 16924439 Visual Analytics in K 12 Education: Emerging Dimensions of Complexity
Authors: Linnea Stenliden
Abstract:
The aim of this paper is to understand emerging learning conditions, when a visual analytics is implemented and used in K 12 (education). To date, little attention has been paid to the role visual analytics (digital media and technology that highlight visual data communication in order to support analytical tasks) can play in education, and to the extent to which these tools can process actionable data for young students. This study was conducted in three public K 12 schools, in four social science classes with students aged 10 to 13 years, over a period of two to four weeks at each school. Empirical data were generated using video observations and analyzed with help of metaphors by Latour. The learning conditions are found to be distinguished by broad complexity characterized by four dimensions. These emerge from the actors’ deeply intertwined relations in the activities. The paper argues in relation to the found dimensions that novel approaches to teaching and learning could benefit students’ knowledge building as they work with visual analytics, analyzing visualized data.Keywords: analytical reasoning, complexity, data use, problem space, visual analytics, visual storytelling, translation
Procedia PDF Downloads 37624438 Achieving High Renewable Energy Penetration in Western Australia Using Data Digitisation and Machine Learning
Authors: A. D. Tayal
Abstract:
The energy industry is undergoing significant disruption. This research outlines that, whilst challenging; this disruption is also an emerging opportunity for electricity utilities. One such opportunity is leveraging the developments in data analytics and machine learning. As the uptake of renewable energy technologies and complimentary control systems increases, electricity grids will likely transform towards dense microgrids with high penetration of renewable generation sources, rich in network and customer data, and linked through intelligent, wireless communications. Data digitisation and analytics have already impacted numerous industries, and its influence on the energy sector is growing, as computational capabilities increase to manage big data, and as machines develop algorithms to solve the energy challenges of the future. The objective of this paper is to address how far the uptake of renewable technologies can go given the constraints of existing grid infrastructure and provides a qualitative assessment of how higher levels of renewable energy penetration can be facilitated by incorporating even broader technological advances in the fields of data analytics and machine learning. Western Australia is used as a contextualised case study, given its abundance and diverse renewable resources (solar, wind, biomass, and wave) and isolated networks, making a high penetration of renewables a feasible target for policy makers over coming decades.Keywords: data, innovation, renewable, solar
Procedia PDF Downloads 36524437 A New Paradigm to Make Cloud Computing Greener
Authors: Apurva Saxena, Sunita Gond
Abstract:
Demand of computation, data storage in large amount are rapidly increases day by day. Cloud computing technology fulfill the demand of today’s computation but this will lead to high power consumption in cloud data centers. Initiative for Green IT try to reduce power consumption and its adverse environmental impacts. Paper also focus on various green computing techniques, proposed models and efficient way to make cloud greener.Keywords: virtualization, cloud computing, green computing, data center
Procedia PDF Downloads 55524436 Physiological Action of Anthraquinone-Containing Preparations
Authors: Dmitry Yu. Korulkin, Raissa A. Muzychkina, Evgenii N. Kojaev
Abstract:
In review the generalized data about biological activity of anthraquinone-containing plants and specimens on their basis is presented. Data of traditional medicine, results of bioscreening and clinical researches of specimens are analyzed.Keywords: anthraquinones, physiologically active substances, phytopreparation, Ramon
Procedia PDF Downloads 376