Search results for: maximal data sets
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7764

Search results for: maximal data sets

7074 Use of Bayesian Network in Information Extraction from Unstructured Data Sources

Authors: Quratulain N. Rajput, Sajjad Haider

Abstract:

This paper applies Bayesian Networks to support information extraction from unstructured, ungrammatical, and incoherent data sources for semantic annotation. A tool has been developed that combines ontologies, machine learning, and information extraction and probabilistic reasoning techniques to support the extraction process. Data acquisition is performed with the aid of knowledge specified in the form of ontology. Due to the variable size of information available on different data sources, it is often the case that the extracted data contains missing values for certain variables of interest. It is desirable in such situations to predict the missing values. The methodology, presented in this paper, first learns a Bayesian network from the training data and then uses it to predict missing data and to resolve conflicts. Experiments have been conducted to analyze the performance of the presented methodology. The results look promising as the methodology achieves high degree of precision and recall for information extraction and reasonably good accuracy for predicting missing values.

Keywords: Information Extraction, Bayesian Network, ontology, Machine Learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2202
7073 Data Acquisition from Cell Phone using Logical Approach

Authors: Keonwoo Kim, Dowon Hong, Kyoil Chung, Jae-Cheol Ryou

Abstract:

Cell phone forensics to acquire and analyze data in the cellular phone is nowadays being used in a national investigation organization and a private company. In order to collect cellular phone flash memory data, we have two methods. Firstly, it is a logical method which acquires files and directories from the file system of the cell phone flash memory. Secondly, we can get all data from bit-by-bit copy of entire physical memory using a low level access method. In this paper, we describe a forensic tool to acquire cell phone flash memory data using a logical level approach. By our tool, we can get EFS file system and peek memory data with an arbitrary region from Korea CDMA cell phone.

Keywords: Forensics, logical method, acquisition, cell phone, flash memory.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4090
7072 Data Migration Methodology from Relational to NoSQL Databases

Authors: Mohamed Hanine, Abdesadik Bendarag, Omar Boutkhoum

Abstract:

Currently, the field of data migration is very topical. As the number of applications developed rapidly, the ever-increasing volume of data collected has driven the architectural migration from Relational Database Management System (RDBMS) to NoSQL (Not Only SQL) database. This very recent technology is important enough in the field of database management. The main aim of this paper is to present a methodology for data migration from RDBMS to NoSQL database. To illustrate this methodology, we implement a software prototype using MySQL as a RDBMS and MongoDB as a NoSQL database. Although this is a hard engineering work, our results show that the proposed methodology can successfully accomplish the goal of this study.

Keywords: Data Migration, MySQL, RDBMS, NoSQL, MongoDB.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4337
7071 An Implicit Region-Based Deformable Model with Local Segmentation Applied to Weld Defects Extraction

Authors: Y. Boutiche, N. Ramou, M. Ben Gharsallah

Abstract:

This paper is devoted to present and discuss a model that allows a local segmentation by using statistical information of a given image. It is based on Chan-Vese model, curve evolution, partial differential equations and binary level sets method. The proposed model uses the piecewise constant approximation of Chan-Vese model to compute Signed Pressure Force (SPF) function, this one attracts the curve to the true object(s)-s boundaries. The implemented model is used to extract weld defects from weld radiographic images in the aim to calculate the perimeter and surfaces of those weld defects; encouraged resultants are obtained on synthetic and real radiographic images.

Keywords: Active contour, Chan-Vese Model, local segmentation, weld radiographic images.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1483
7070 Bayesian Deep Learning Algorithms for Classifying COVID-19 Images

Authors: I. Oloyede

Abstract:

The study investigates the accuracy and loss of deep learning algorithms with the set of coronavirus (COVID-19) images dataset by comparing Bayesian convolutional neural network and traditional convolutional neural network in low dimensional dataset. 50 sets of X-ray images out of which 25 were COVID-19 and the remaining 20 were normal, twenty images were set as training while five were set as validation that were used to ascertained the accuracy of the model. The study found out that Bayesian convolution neural network outperformed conventional neural network at low dimensional dataset that could have exhibited under fitting. The study therefore recommended Bayesian Convolutional neural network (BCNN) for android apps in computer vision for image detection.

Keywords: BCNN, CNN, Images, COVID-19, Deep Learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 832
7069 Data Hiding by Vector Quantization in Color Image

Authors: Yung-Gi Wu

Abstract:

With the growing of computer and network, digital data can be spread to anywhere in the world quickly. In addition, digital data can also be copied or tampered easily so that the security issue becomes an important topic in the protection of digital data. Digital watermark is a method to protect the ownership of digital data. Embedding the watermark will influence the quality certainly. In this paper, Vector Quantization (VQ) is used to embed the watermark into the image to fulfill the goal of data hiding. This kind of watermarking is invisible which means that the users will not conscious the existing of embedded watermark even though the embedded image has tiny difference compared to the original image. Meanwhile, VQ needs a lot of computation burden so that we adopt a fast VQ encoding scheme by partial distortion searching (PDS) and mean approximation scheme to speed up the data hiding process. The watermarks we hide to the image could be gray, bi-level and color images. Texts are also can be regarded as watermark to embed. In order to test the robustness of the system, we adopt Photoshop to fulfill sharpen, cropping and altering to check if the extracted watermark is still recognizable. Experimental results demonstrate that the proposed system can resist the above three kinds of tampering in general cases.

Keywords: Data hiding, vector quantization, watermark.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1758
7068 Approximate Range-Sum Queries over Data Cubes Using Cosine Transform

Authors: Wen-Chi Hou, Cheng Luo, Zhewei Jiang, Feng Yan

Abstract:

In this research, we propose to use the discrete cosine transform to approximate the cumulative distributions of data cube cells- values. The cosine transform is known to have a good energy compaction property and thus can approximate data distribution functions easily with small number of coefficients. The derived estimator is accurate and easy to update. We perform experiments to compare its performance with a well-known technique - the (Haar) wavelet. The experimental results show that the cosine transform performs much better than the wavelet in estimation accuracy, speed, space efficiency, and update easiness.

Keywords: DCT, Data Cube

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1932
7067 System Survivability in Networks in the Context of Defense/Attack Strategies: The Large Scale

Authors: A. Ben Yaghlane, M. N. Azaiez, M. Mrad

Abstract:

We investigate the large scale of networks in the context of network survivability under attack. We use appropriate techniques to evaluate and the attacker-based- and the defenderbased- network survivability. The attacker is unaware of the operated links by the defender. Each attacked link has some pre-specified probability to be disconnected. The defender choice is so that to maximize the chance of successfully sending the flow to the destination node. The attacker however will select the cut-set with the highest chance to be disabled in order to partition the network. Moreover, we extend the problem to the case of selecting the best p paths to operate by the defender and the best k cut-sets to target by the attacker, for arbitrary integers p,k>1. We investigate some variations of the problem and suggest polynomial-time solutions.

Keywords: Defense/attack strategies, large scale, networks, partitioning a network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1464
7066 Digital filters for Hot-Mix Asphalt Complex Modulus Test Data Using Genetic Algorithm Strategies

Authors: Madhav V. Chitturi, Anshu Manik, Kasthurirangan Gopalakrishnan

Abstract:

The dynamic or complex modulus test is considered to be a mechanistically based laboratory test to reliably characterize the strength and load-resistance of Hot-Mix Asphalt (HMA) mixes used in the construction of roads. The most common observation is that the data collected from these tests are often noisy and somewhat non-sinusoidal. This hampers accurate analysis of the data to obtain engineering insight. The goal of the work presented in this paper is to develop and compare automated evolutionary computational techniques to filter test noise in the collection of data for the HMA complex modulus test. The results showed that the Covariance Matrix Adaptation-Evolutionary Strategy (CMA-ES) approach is computationally efficient for filtering data obtained from the HMA complex modulus test.

Keywords: HMA, dynamic modulus, GA, evolutionarycomputation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1553
7065 A Case Study on the Efficacy of Technical Laboratory Safety in Polytechnic

Authors: Zulhisyam Salleh, Erita M. Mazlan, Saiful A. Mazlan, Norzainariah A. Hassan, Fizatul A. Patakor

Abstract:

Technical laboratories are typically considered as highly hazardous places in the polytechnic institution when addressing the problems of high incidences and fatality rates. In conjunction with several topics covered in the technical curricular, safety and health precaution should be highlighted in order to connect to few key ideas of being safe. Therefore the assessment of safety awareness in terms of safety and health about hazardous and risks at laboratories is needed and has to be incorporated with technical education and other training programmes. The purpose of this study was to determine the efficacy of technical laboratory safety in one of the polytechnics in northern region. The study examined three related issues that were; the availability of safety material and equipment, safety practice adopted by technical teachers and administrator-s safety attitudes in enforcing safety to the students. A model of efficacy technical laboratory was developed to test the linear relationship between existing safety material and equipment, teachers- safety practice and administrators- attitude in enforcing safety and to identify which of technical laboratory safety issues was the most pertinent factor to realize safety in technical laboratory. This was done by analyzing survey-based data sets particularly those obtained from samples of 210 students in the polytechnic. The Pearson Correlation was used to measure the association between the variables and to test the research hypotheses. The result of the study has found that there was a significant correlation between existing safety material and equipment, safety practice adopted by teacher and administrator-s attitude. There was also a significant relationship between technical laboratory safety and safety practice adopted by teacher and between technical laboratory safety and administrator attitude. Hence, safety practice adopted by teacher and administrator attitude is vital in realizing technical laboratory safety.

Keywords: Polytechnic, Safety attitudes, Safety practices, Technical laboratory

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2401
7064 The Feasibility of Augmenting an Augmented Reality Image Card on a Quick Response Code

Authors: Alfred Chen, Shr Yu Lu, Cong Seng Hong, Yur-June Wang

Abstract:

This research attempts to study the feasibility of augmenting an augmented reality (AR) image card on a Quick Response (QR) code. The authors have developed a new visual tag, which contains a QR code and an augmented AR image card. The new visual tag has features of reading both of the revealed data of the QR code and the instant data from the AR image card. Furthermore, a handheld communicating device is used to read and decode the new visual tag, and then the concealed data of the new visual tag can be revealed and read through its visual display. In general, the QR code is designed to store the corresponding data or, as a key, to access the corresponding data from the server through internet. Those reveled data from the QR code are represented in text. Normally, the AR image card is designed to store the corresponding data in 3-Dimensional or animation/video forms. By using QR code's property of high fault tolerant rate, the new visual tag can access those two different types of data by using a handheld communicating device. The new visual tag has an advantage of carrying much more data than independent QR code or AR image card. The major findings of this research are: 1) the most efficient area for the designed augmented AR card augmenting on the QR code is 9% coverage area out of the total new visual tag-s area, and 2) the best location for the augmented AR image card augmenting on the QR code is located in the bottom-right corner of the new visual tag.

Keywords: Augmented reality, QR code, Visual tag, Handheldcommunicating device

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1532
7063 Evaluating the Validity of Computational Fluid Dynamics Model of Dispersion in a Complex Urban Geometry Using Two Sets of Experimental Measurements

Authors: Mohammad R. Kavian Nezhad, Carlos F. Lange, Brian A. Fleck

Abstract:

This research presents the validation study of a computational fluid dynamics (CFD) model developed to simulate the scalar dispersion emitted from rooftop sources around the buildings at the University of Alberta North Campus. The ANSYS CFX code was used to perform the numerical simulation of the wind regime and pollutant dispersion by solving the 3D steady Reynolds-averaged Navier-Stokes (RANS) equations on a building-scale high-resolution grid. The validation study was performed in two steps. First, the CFD model performance in 24 cases (eight wind directions and three wind speeds) was evaluated by comparing the predicted flow fields with the available data from the previous measurement campaign designed at the North Campus, using the standard deviation method (SDM), while the estimated results of the numerical model showed maximum average percent errors of approximately 53% and 37% for wind incidents from the North and Northwest, respectively. Good agreement with the measurements was observed for the other six directions, with an average error of less than 30%. In the second step, the reliability of the implemented turbulence model, numerical algorithm, modeling techniques, and the grid generation scheme was further evaluated using the Mock Urban Setting Test (MUST) dispersion dataset. Different statistical measures, including the fractional bias (FB), the mean geometric bias (MG), and the normalized mean square error (NMSE), were used to assess the accuracy of the predicted dispersion field. Our CFD results are in very good agreement with the field measurements.

Keywords: CFD, plume dispersion, complex urban geometry, validation study, wind flow.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 321
7062 A Competitive Replica Placement Methodology for Ad Hoc Networks

Authors: Samee Ullah Khan, C. Ardil

Abstract:

In this paper, a mathematical model for data object replication in ad hoc networks is formulated. The derived model is general, flexible and adaptable to cater for various applications in ad hoc networks. We propose a game theoretical technique in which players (mobile hosts) continuously compete in a non-cooperative environment to improve data accessibility by replicating data objects. The technique incorporates the access frequency from mobile hosts to each data object, the status of the network connectivity, and communication costs. The proposed technique is extensively evaluated against four well-known ad hoc network replica allocation methods. The experimental results reveal that the proposed approach outperforms the four techniques in both the execution time and solution quality

Keywords: Data replication, auctions, static allocation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1386
7061 Multidimensional Data Mining by Means of Randomly Travelling Hyper-Ellipsoids

Authors: Pavel Y. Tabakov, Kevin Duffy

Abstract:

The present study presents a new approach to automatic data clustering and classification problems in large and complex databases and, at the same time, derives specific types of explicit rules describing each cluster. The method works well in both sparse and dense multidimensional data spaces. The members of the data space can be of the same nature or represent different classes. A number of N-dimensional ellipsoids are used for enclosing the data clouds. Due to the geometry of an ellipsoid and its free rotation in space the detection of clusters becomes very efficient. The method is based on genetic algorithms that are used for the optimization of location, orientation and geometric characteristics of the hyper-ellipsoids. The proposed approach can serve as a basis for the development of general knowledge systems for discovering hidden knowledge and unexpected patterns and rules in various large databases.

Keywords: Classification, clustering, data minig, genetic algorithms.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1752
7060 Predictions Using Data Mining and Case-based Reasoning: A Case Study for Retinopathy

Authors: Vimala Balakrishnan, Mohammad R. Shakouri, Hooman Hoodeh, Loo, Huck-Soo

Abstract:

Diabetes is one of the high prevalence diseases worldwide with increased number of complications, with retinopathy as one of the most common one. This paper describes how data mining and case-based reasoning were integrated to predict retinopathy prevalence among diabetes patients in Malaysia. The knowledge base required was built after literature reviews and interviews with medical experts. A total of 140 diabetes patients- data were used to train the prediction system. A voting mechanism selects the best prediction results from the two techniques used. It has been successfully proven that both data mining and case-based reasoning can be used for retinopathy prediction with an improved accuracy of 85%.

Keywords: Case-Based Reasoning, Data Mining, Prediction, Retinopathy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2990
7059 Neutrosophic Multiple Criteria Decision Making Analysis Method for Selecting Stealth Fighter Aircraft

Authors: C. Ardil

Abstract:

In this paper, a neutrosophic multiple criteria decision analysis method is proposed to select stealth fighter aircraft. Neutrosophic multiple criteria decision analysis methods are used to analyze the neutrosophic environment and give results under uncertainty and incompleteness. Neutrosophic numbers are used to evaluate alternatives over a set of evaluation criteria in decision making problems. Finally, the proposed model is applied to a practical decision problem for selecting stealth fighter aircraft.

Keywords: neutrosophic sets, multiple criteria decision making analysis, stealth fighter aircraft, aircraft selection, MCDMA, SVNNs

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 562
7058 Li-Fi Technology: Data Transmission through Visible Light

Authors: Shahzad Hassan, Kamran Saeed

Abstract:

People are always in search of Wi-Fi hotspots because Internet is a major demand nowadays. But like all other technologies, there is still room for improvement in the Wi-Fi technology with regards to the speed and quality of connectivity. In order to address these aspects, Harald Haas, a professor at the University of Edinburgh, proposed what we know as the Li-Fi (Light Fidelity). Li-Fi is a new technology in the field of wireless communication to provide connectivity within a network environment. It is a two-way mode of wireless communication using light. Basically, the data is transmitted through Light Emitting Diodes which can vary the intensity of light very fast, even faster than the blink of an eye. From the research and experiments conducted so far, it can be said that Li-Fi can increase the speed and reliability of the transfer of data. This paper pays particular attention on the assessment of the performance of this technology. In other words, it is a 5G technology which uses LED as the medium of data transfer. For coverage within the buildings, Wi-Fi is good but Li-Fi can be considered favorable in situations where large amounts of data are to be transferred in areas with electromagnetic interferences. It brings a lot of data related qualities such as efficiency, security as well as large throughputs to the table of wireless communication. All in all, it can be said that Li-Fi is going to be a future phenomenon where the presence of light will mean access to the Internet as well as speedy data transfer.

Keywords: Communication, LED, Li-Fi, Wi-Fi.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2128
7057 Entomological Origin of Honey Discriminated by NMR Chloroform Extracts in Ecuadorian Honey

Authors: P. Vit, J. Uddin, V. Zuccato, F. Maza, E. Schievano

Abstract:

Honeys are produced by Apis mellifera and stingless bees (Meliponini) in Ecuador. We studied honey produced in beeswax combs by Apis mellifera, and honey produced in pots by Geotrigona and Scaptotrigona bees. Chloroform extracts of honey were obtained for fast NMR spectra. The 1D spectra were acquired at 298 K, with a 600 MHz NMR Bruker instrument, using a modified double pulsed field gradient spin echoes (DPFGSE) sequence. Signals of 1H NMR spectra were integrated and used as inputs for PCA, PLS-DA analysis, and labelled sets of classes were successfully identified, enhancing the separation between the three groups of honey according to the entomological origin: A. mellifera, Geotrigona and Scaptotrigona. This procedure is therefore recommended for authenticity test of honey in Ecuador.

Keywords: Apis mellifera, honey, 1H NMR, entomological origin, Meliponini.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3112
7056 A Hamiltonian Decomposition of 5-star

Authors: Walter Hussak, Heiko Schröder

Abstract:

Star graphs are Cayley graphs of symmetric groups of permutations, with transpositions as the generating sets. A star graph is a preferred interconnection network topology to a hypercube for its ability to connect a greater number of nodes with lower degree. However, an attractive property of the hypercube is that it has a Hamiltonian decomposition, i.e. its edges can be partitioned into disjoint Hamiltonian cycles, and therefore a simple routing can be found in the case of an edge failure. The existence of Hamiltonian cycles in Cayley graphs has been known for some time. So far, there are no published results on the much stronger condition of the existence of Hamiltonian decompositions. In this paper, we give a construction of a Hamiltonian decomposition of the star graph 5-star of degree 4, by defining an automorphism for 5-star and a Hamiltonian cycle which is edge-disjoint with its image under the automorphism.

Keywords: interconnection networks, paths and cycles, graphs andgroups.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1725
7055 Business Rules for Data Warehouse

Authors: Rajeev Kaula

Abstract:

Business rules and data warehouse are concepts and technologies that impact a wide variety of organizational tasks. In general, each area has evolved independently, impacting application development and decision-making. Generating knowledge from data warehouse is a complex process. This paper outlines an approach to ease import of information and knowledge from a data warehouse star schema through an inference class of business rules. The paper utilizes the Oracle database for illustrating the working of the concepts. The star schema structure and the business rules are stored within a relational database. The approach is explained through a prototype in Oracle-s PL/SQL Server Pages.

Keywords: Business Rules, Data warehouse, PL/SQL ServerPages, Relational model, Web Application.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2958
7054 Degradation of Endosulfan in Different Soils by Indigenous and Adapted Microorganisms

Authors: A. Özyer, N. G. Turan, Y. Ardalı

Abstract:

The environmental fate of organic contaminants in soils is influenced significantly by the pH, texture of soil, water content and also presence of organic matter. In this study, biodegradation of endosulfan isomers was studied in two different soils (Soil A and Soil B) that have contrasting properties in terms of their texture, pH, organic content, etc. Two Nocardia sp., which were isolated from soil, were used for degradation of endosulfan. Soils were contaminated with commercial endosulfan. Six sets were maintained from two different soils, contaminated with different endosulfan concentrations for degradation experiments. Inoculated and uninoculated mineral media with Nocardia isolates were added to the soils and mixed. Soils were incubated at a certain temperature (30 °C) during ten weeks. Residue endosulfan and its metabolites’ concentrations were determined weekly during the incubation period. The changes of the soil microorganisms were investigated weekly.

Keywords: Endosulfan, biodegradation, Nocardia sp., soil, organochlorine pesticide.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1380
7053 Authorization of Commercial Communication Satellite Grounds for Promoting Turkish Data Relay System

Authors: Celal Dudak, Aslı Utku, Burak Yağlioğlu

Abstract:

Uninterrupted and continuous satellite communication through the whole orbit time is becoming more indispensable every day. Data relay systems are developed and built for various high/low data rate information exchanges like TDRSS of USA and EDRSS of Europe. In these missions, a couple of task-dedicated communication satellites exist. In this regard, for Turkey a data relay system is attempted to be defined exchanging low data rate information (i.e. TTC) for Earth-observing LEO satellites appointing commercial GEO communication satellites all over the world. First, justification of this attempt is given, demonstrating duration enhancements in the link. Discussion of preference of RF communication is, also, given instead of laser communication. Then, preferred communication GEOs – including TURKSAT4A already belonging to Turkey- are given, together with the coverage enhancements through STK simulations and the corresponding link budget. Also, a block diagram of the communication system is given on the LEO satellite.

Keywords: Communication, satellite, data relay system, coverage.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1398
7052 An Efficient Approach to Mining Frequent Itemsets on Data Streams

Authors: Sara Ansari, Mohammad Hadi Sadreddini

Abstract:

The increasing importance of data stream arising in a wide range of advanced applications has led to the extensive study of mining frequent patterns. Mining data streams poses many new challenges amongst which are the one-scan nature, the unbounded memory requirement and the high arrival rate of data streams. In this paper, we propose a new approach for mining itemsets on data stream. Our approach SFIDS has been developed based on FIDS algorithm. The main attempts were to keep some advantages of the previous approach and resolve some of its drawbacks, and consequently to improve run time and memory consumption. Our approach has the following advantages: using a data structure similar to lattice for keeping frequent itemsets, separating regions from each other with deleting common nodes that results in a decrease in search space, memory consumption and run time; and Finally, considering CPU constraint, with increasing arrival rate of data that result in overloading system, SFIDS automatically detect this situation and discard some of unprocessing data. We guarantee that error of results is bounded to user pre-specified threshold, based on a probability technique. Final results show that SFIDS algorithm could attain about 50% run time improvement than FIDS approach.

Keywords: Data stream, frequent itemset, stream mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1397
7051 Anomaly Detection in a Data Center with a Reconstruction Method Using a Multi-Autoencoders Model

Authors: Victor Breux, Jérôme Boutet, Alain Goret, Viviane Cattin

Abstract:

Early detection of anomalies in data centers is important to reduce downtimes and the costs of periodic maintenance. However, there is little research on this topic and even fewer on the fusion of sensor data for the detection of abnormal events. The goal of this paper is to propose a method for anomaly detection in data centers by combining sensor data (temperature, humidity, power) and deep learning models. The model described in the paper uses one autoencoder per sensor to reconstruct the inputs. The auto-encoders contain Long-Short Term Memory (LSTM) layers and are trained using the normal samples of the relevant sensors selected by correlation analysis. The difference signal between the input and its reconstruction is then used to classify the samples using feature extraction and a random forest classifier. The data measured by the sensors of a data center between January 2019 and May 2020 are used to train the model, while the data between June 2020 and May 2021 are used to assess it. Performances of the model are assessed a posteriori through F1-score by comparing detected anomalies with the data center’s history. The proposed model outperforms the state-of-the-art reconstruction method, which uses only one autoencoder taking multivariate sequences and detects an anomaly with a threshold on the reconstruction error, with an F1-score of 83.60% compared to 24.16%.

Keywords: Anomaly detection, autoencoder, data centers, deep learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 703
7050 AnQL: A Query Language for Annotation Documents

Authors: Neerja Bhatnagar, Ben A. Juliano, Renee S. Renner

Abstract:

This paper presents data annotation models at five levels of granularity (database, relation, column, tuple, and cell) of relational data to address the problem of unsuitability of most relational databases to express annotations. These models do not require any structural and schematic changes to the underlying database. These models are also flexible, extensible, customizable, database-neutral, and platform-independent. This paper also presents an SQL-like query language, named Annotation Query Language (AnQL), to query annotation documents. AnQL is simple to understand and exploits the already-existent wide knowledge and skill set of SQL.

Keywords: Annotation query language, data annotations, data annotation models, semantic data annotations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1788
7049 Machine Learning-Enabled Classification of Climbing Using Small Data

Authors: Nicholas Milburn, Yu Liang, Dalei Wu

Abstract:

Athlete performance scoring within the climbing domain presents interesting challenges as the sport does not have an objective way to assign skill. Assessing skill levels within any sport is valuable as it can be used to mark progress while training, and it can help an athlete choose appropriate climbs to attempt. Machine learning-based methods are popular for complex problems like this. The dataset available was composed of dynamic force data recorded during climbing; however, this dataset came with challenges such as data scarcity, imbalance, and it was temporally heterogeneous. Investigated solutions to these challenges include data augmentation, temporal normalization, conversion of time series to the spectral domain, and cross validation strategies. The investigated solutions to the classification problem included light weight machine classifiers KNN and SVM as well as the deep learning with CNN. The best performing model had an 80% accuracy. In conclusion, there seems to be enough information within climbing force data to accurately categorize climbers by skill.

Keywords: Classification, climbing, data imbalance, data scarcity, machine learning, time sequence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 523
7048 Body Mass Index, Components of Metabolic Syndrome and Hyperuricemia among Women in Postmenopausal Period

Authors: Vladyslav Povoroznyuk, Galina Dubetska, Roksolana Povoroznyuk

Abstract:

In recent years, the problem of hyperuricemia is getting a particular importance due to its increased incidence in the world population. The aim of this study was to determine uriс acid level in blood serum, incidence of hyperuricemia among women in postmenopausal period and their association with body mass index and some components of metabolic syndrome (triglyceride, cholesterol, systolic and diastolic pressure). We examined 412 women in postmenopausal period. They were divided in to the following groups: I group (BMI = 18,5-24,9), II group (BMI = 25,0-29,9), III group (BMI = 30,0-34,9), IV group (BMI > 35). We determined uric acid level among women during postmenopausal period depending on their body mass index. The higher level of uric acid was found in patients with the maximal body mass index (BMI > 35). In the I group it was 277,52 ± 8,40; in the II group – 286,81 ± 7,79; in the III group – 291,81 ± 7,56; in the IV group – 327,17 ± 12,17. Incidence of hyperuricemia among women in the I group was 10,2%, in the II group – 15,9%; in the III group – 21,2%, in the IV group – 34,2%. We found an interdependence between an uric acid level and BMI in the examined women (r = 0,21, p < 0,05). We determined that the highest level of triglyceride (F = 18,62, p < 0,05), cholesterol (F = 3,64, p < 0,05), atherogenic coefficient (F = 22,64, p < 0,05), systolic (F = 10,5, p < 0,05) and diastolic pressure (F = 4,30, p < 0,05) was among women with hyperuricemia. It was an interdependence between an uric acid level and triglyceride (r = 0,26, p < 0,05), atherogenic coefficient (r = 0,24, p < 0,05) among women in postmenopausal period.

Keywords: Hyperuricemia, uric acid, body mass index, metabolic syndrome, triglyceride, cholesterol, atherogenic coefficient, systolic and diastolic pressure, women.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 423
7047 Optimization of Real Time Measured Data Transmission, Given the Amount of Data Transmitted

Authors: Michal Kopcek, Tomas Skulavik, Michal Kebisek, Gabriela Krizanova

Abstract:

The operation of nuclear power plants involves continuous monitoring of the environment in their area. This monitoring is performed using a complex data acquisition system, which collects status information about the system itself and values of many important physical variables e.g. temperature, humidity, dose rate etc. This paper describes a proposal and optimization of communication that takes place in teledosimetric system between the central control server responsible for the data processing and storing and the decentralized measuring stations, which are measuring the physical variables. Analyzes of ongoing communication were performed and consequently the optimization of the system architecture and communication was done.

Keywords: Communication protocol, transmission optimization, data acquisition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1792
7046 Dignity and Suffering: Reading of Human Rights in Untouchable by Anand

Authors: Norah A. Elgibreen

Abstract:

Cultural stories are political. They register cultural phenomena and their relations with the world and society in term of their existence, function, characteristics by using different context. This paper will provide a new way of rethinking which will help us to rethink the relationship between fiction and politics. It discusses the theme of human rights and it shows the relevance between art and politics by studying the civil society through a literary framework. Reasons to establish a relationship between fiction and politics are the relevant themes and universal issues among the two disciplines. Both disciplines are sets of views and ideas formulated by the human mind to explain political or cultural phenomenon. Other reasons are the complexity and depth of the author-s vision, and the need to explain the violations of human rights in a more active structure which can relate to emotional and social existence.

Keywords: dignity, human rights, politics and literature, Untouchable.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3280
7045 Empirical Process Monitoring Via Chemometric Analysis of Partially Unbalanced Data

Authors: Hyun-Woo Cho

Abstract:

Real-time or in-line process monitoring frameworks are designed to give early warnings for a fault along with meaningful identification of its assignable causes. In artificial intelligence and machine learning fields of pattern recognition various promising approaches have been proposed such as kernel-based nonlinear machine learning techniques. This work presents a kernel-based empirical monitoring scheme for batch type production processes with small sample size problem of partially unbalanced data. Measurement data of normal operations are easy to collect whilst special events or faults data are difficult to collect. In such situations, noise filtering techniques can be helpful in enhancing process monitoring performance. Furthermore, preprocessing of raw process data is used to get rid of unwanted variation of data. The performance of the monitoring scheme was demonstrated using three-dimensional batch data. The results showed that the monitoring performance was improved significantly in terms of detection success rate of process fault.

Keywords: Process Monitoring, kernel methods, multivariate filtering, data-driven techniques, quality improvement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1728