Search results for: general data protection regulation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 9030

Search results for: general data protection regulation

8100 Robust Regression and its Application in Financial Data Analysis

Authors: Mansoor Momeni, Mahmoud Dehghan Nayeri, Ali Faal Ghayoumi, Hoda Ghorbani

Abstract:

This research is aimed to describe the application of robust regression and its advantages over the least square regression method in analyzing financial data. To do this, relationship between earning per share, book value of equity per share and share price as price model and earning per share, annual change of earning per share and return of stock as return model is discussed using both robust and least square regressions, and finally the outcomes are compared. Comparing the results from the robust regression and the least square regression shows that the former can provide the possibility of a better and more realistic analysis owing to eliminating or reducing the contribution of outliers and influential data. Therefore, robust regression is recommended for getting more precise results in financial data analysis.

Keywords: Financial data analysis, Influential data, Outliers, Robust regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1921
8099 An On-chip LDO Voltage Regulator with Improved Current Buffer Compensation

Authors: Lv Xiaopeng, Bian Qiang, Yue Suge

Abstract:

A fully on-chip low drop-out (LDO) voltage regulator with 100pF output load capacitor is presented. A novel frequency compensation scheme using current buffer is adopted to realize single dominant pole within the unit gain frequency of the regulation loop, the phase margin (PM) is at least 50 degree under the full range of the load current, and the power supply rejection (PSR) character is improved compared with conventional Miller compensation. Besides, the differentiator provides a high speed path during the load current transient. Implemented in 0.18μm CMOS technology, the LDO voltage regulator provides 100mA load current with a stable 1.8V output voltage consuming 80μA quiescent current.

Keywords: capacitor-less LDO, frequency compensation, transient response, power supply rejection

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4659
8098 Hierarchical Checkpoint Protocol in Data Grids

Authors: Rahma Souli-Jbali, Minyar Sassi Hidri, Rahma Ben Ayed

Abstract:

Grid of computing nodes has emerged as a representative means of connecting distributed computers or resources scattered all over the world for the purpose of computing and distributed storage. Since fault tolerance becomes complex due to the availability of resources in decentralized grid environment, it can be used in connection with replication in data grids. The objective of our work is to present fault tolerance in data grids with data replication-driven model based on clustering. The performance of the protocol is evaluated with Omnet++ simulator. The computational results show the efficiency of our protocol in terms of recovery time and the number of process in rollbacks.

Keywords: Data grids, fault tolerance, chandy-lamport, clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 935
8097 Fuzzy Based Problem-Solution Data Structureas a Data Oriented Model for ABS Controlling

Authors: Ahmad Habibizad Navin, Mehdi Naghian Fesharaki, Mohamad Teshnelab, Ehsan Shahamatnia

Abstract:

The anti-lock braking systems installed on vehicles for safe and effective braking, are high-order nonlinear and timevariant. Using fuzzy logic controllers increase efficiency of such systems, but impose a high computational complexity as well. The main concept introduced by this paper is reducing computational complexity of fuzzy controllers by deploying problem-solution data structure. Unlike conventional methods that are based on calculations, this approach is based on data oriented modeling.

Keywords: ABS, Fuzzy controller, PSDS, Time-Memory tradeoff, Data oriented modeling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1720
8096 Class Outliers Mining: Distance-Based Approach

Authors: Nabil M. Hewahi, Motaz K. Saad

Abstract:

In large datasets, identifying exceptional or rare cases with respect to a group of similar cases is considered very significant problem. The traditional problem (Outlier Mining) is to find exception or rare cases in a dataset irrespective of the class label of these cases, they are considered rare events with respect to the whole dataset. In this research, we pose the problem that is Class Outliers Mining and a method to find out those outliers. The general definition of this problem is “given a set of observations with class labels, find those that arouse suspicions, taking into account the class labels". We introduce a novel definition of Outlier that is Class Outlier, and propose the Class Outlier Factor (COF) which measures the degree of being a Class Outlier for a data object. Our work includes a proposal of a new algorithm towards mining of the Class Outliers, presenting experimental results applied on various domains of real world datasets and finally a comparison study with other related methods is performed.

Keywords: Class Outliers, Distance-Based Approach, Outliers Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3380
8095 Design and Application of NFC-Based Identity and Access Management in Cloud Services

Authors: Shin-Jer Yang, Kai-Tai Yang

Abstract:

In response to a changing world and the fast growth of the Internet, more and more enterprises are replacing web-based services with cloud-based ones. Multi-tenancy technology is becoming more important especially with Software as a Service (SaaS). This in turn leads to a greater focus on the application of Identity and Access Management (IAM). Conventional Near-Field Communication (NFC) based verification relies on a computer browser and a card reader to access an NFC tag. This type of verification does not support mobile device login and user-based access management functions. This study designs an NFC-based third-party cloud identity and access management scheme (NFC-IAM) addressing this shortcoming. Data from simulation tests analyzed with Key Performance Indicators (KPIs) suggest that the NFC-IAM not only takes less time in identity identification but also cuts time by 80% in terms of two-factor authentication and improves verification accuracy to 99.9% or better. In functional performance analyses, NFC-IAM performed better in salability and portability. The NFC-IAM App (Application Software) and back-end system to be developed and deployed in mobile device are to support IAM features and also offers users a more user-friendly experience and stronger security protection. In the future, our NFC-IAM can be employed to different environments including identification for mobile payment systems, permission management for remote equipment monitoring, among other applications.

Keywords: Cloud service, multi-tenancy, NFC, IAM, mobile device.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1109
8094 Use of Bayesian Network in Information Extraction from Unstructured Data Sources

Authors: Quratulain N. Rajput, Sajjad Haider

Abstract:

This paper applies Bayesian Networks to support information extraction from unstructured, ungrammatical, and incoherent data sources for semantic annotation. A tool has been developed that combines ontologies, machine learning, and information extraction and probabilistic reasoning techniques to support the extraction process. Data acquisition is performed with the aid of knowledge specified in the form of ontology. Due to the variable size of information available on different data sources, it is often the case that the extracted data contains missing values for certain variables of interest. It is desirable in such situations to predict the missing values. The methodology, presented in this paper, first learns a Bayesian network from the training data and then uses it to predict missing data and to resolve conflicts. Experiments have been conducted to analyze the performance of the presented methodology. The results look promising as the methodology achieves high degree of precision and recall for information extraction and reasonably good accuracy for predicting missing values.

Keywords: Information Extraction, Bayesian Network, ontology, Machine Learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2219
8093 The Development of Online-Class Scheduling Management System Conducted by the Case Study of Department of Social Science: Faculty of Humanities and Social Sciences Suan Sunandha Rajabhat University

Authors: Wipada Chaiwchan, Patcharee Klinhom

Abstract:

This research is aimed to develop the online-class scheduling management system and improve as a complex problem solution, this must take into consideration in various conditions and factors. In addition to the number of courses, the number of students and a timetable to study, the physical characteristics of each class room and regulations used in the class scheduling must also be taken into consideration. This system is developed to assist management in the class scheduling for convenience and efficiency. It can provide several instructors to schedule simultaneously. Both lecturers and students can check and publish a timetable and other documents associated with the system online immediately. It is developed in a web-based application. PHP is used as a developing tool. The database management system was MySQL. The tool that is used for efficiency testing of the system is questionnaire. The system was evaluated by using a Black-Box testing. The sample was composed of 2 groups: 5 experts and 100 general users. The average and the standard deviation of results from the experts were 3.50 and 0.67. The average and the standard deviation of results from the general users were 3.54 and 0.54. In summary, the results from the research indicated that the satisfaction of users were in a good level. Therefore, this system could be implemented in an actual workplace and satisfy the users’ requirement effectively.

Keywords: Timetable, schedule, management system, online.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6385
8092 Data Acquisition from Cell Phone using Logical Approach

Authors: Keonwoo Kim, Dowon Hong, Kyoil Chung, Jae-Cheol Ryou

Abstract:

Cell phone forensics to acquire and analyze data in the cellular phone is nowadays being used in a national investigation organization and a private company. In order to collect cellular phone flash memory data, we have two methods. Firstly, it is a logical method which acquires files and directories from the file system of the cell phone flash memory. Secondly, we can get all data from bit-by-bit copy of entire physical memory using a low level access method. In this paper, we describe a forensic tool to acquire cell phone flash memory data using a logical level approach. By our tool, we can get EFS file system and peek memory data with an arbitrary region from Korea CDMA cell phone.

Keywords: Forensics, logical method, acquisition, cell phone, flash memory.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4100
8091 Accounting for SMEs – How Important is Size in Choosing between Global and Local Standards?

Authors: Cătălin Nicolae Albu, Nadia Albu, Maria Mădălina Gîrbină

Abstract:

There is limited evidence from various countries about the possible impact of various criteria to be used to determine the scope of the IFRS for SMEs issued in 2009 and, research is needed in this area. We provide evidence from Romania, an emerging economy member of the European Union. The aim of this paper is to analyze in a local setting if size is a relevant factor for deciding between local and global standards for SMEs. Our results indicate that size is a moderate indicator of the existence of possible users interested in financial statements and that there is a difference between the scopes of the standard determined on various criteria.. Also, we suggest that the international exposure is quite reduced in the case of SMEs, but is sufficient to suggest that at least some SMEs would benefit from international comparability of financial statements

Keywords: SMEs, IFRS for SMEs, accounting regulation, entity's size.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1848
8090 New Approach for Constructing a Secure Biometric Database

Authors: A. Kebbeb, M. Mostefai, F. Benmerzoug, Y. Chahir

Abstract:

The multimodal biometric identification is the combination of several biometric systems; the challenge of this combination is to reduce some limitations of systems based on a single modality while significantly improving performance. In this paper, we propose a new approach to the construction and the protection of a multimodal biometric database dedicated to an identification system. We use a topological watermarking to hide the relation between face image and the registered descriptors extracted from other modalities of the same person for more secure user identification.

Keywords: Biometric databases, Multimodal biometrics, security authentication, Digital watermarking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2074
8089 Data Migration Methodology from Relational to NoSQL Databases

Authors: Mohamed Hanine, Abdesadik Bendarag, Omar Boutkhoum

Abstract:

Currently, the field of data migration is very topical. As the number of applications developed rapidly, the ever-increasing volume of data collected has driven the architectural migration from Relational Database Management System (RDBMS) to NoSQL (Not Only SQL) database. This very recent technology is important enough in the field of database management. The main aim of this paper is to present a methodology for data migration from RDBMS to NoSQL database. To illustrate this methodology, we implement a software prototype using MySQL as a RDBMS and MongoDB as a NoSQL database. Although this is a hard engineering work, our results show that the proposed methodology can successfully accomplish the goal of this study.

Keywords: Data Migration, MySQL, RDBMS, NoSQL, MongoDB.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4348
8088 Advanced Polymorphic Techniques

Authors: Philippe Beaucamps

Abstract:

Nowadays viruses use polymorphic techniques to mutate their code on each replication, thus evading detection by antiviruses. However detection by emulation can defeat simple polymorphism: thus metamorphic techniques are used which thoroughly change the viral code, even after decryption. We briefly detail this evolution of virus protection techniques against detection and then study the METAPHOR virus, today's most advanced metamorphic virus.

Keywords: Computer virus, Viral mutation, Polymorphism, Meta¬morphism, MetaPHOR, Virus history, Obfuscation, Viral genetic techniques.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2471
8087 Performance Comparison of Particle Swarm Optimization with Traditional Clustering Algorithms used in Self-Organizing Map

Authors: Anurag Sharma, Christian W. Omlin

Abstract:

Self-organizing map (SOM) is a well known data reduction technique used in data mining. It can reveal structure in data sets through data visualization that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOM, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of an adaptive heuristic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOM. The application of our method to several standard data sets demonstrates its feasibility. PSO algorithm utilizes a so-called U-matrix of SOM to determine cluster boundaries; the results of this novel automatic method compare very favorably to boundary detection through traditional algorithms namely k-means and hierarchical based approach which are normally used to interpret the output of SOM.

Keywords: cluster boundaries, clustering, code vectors, data mining, particle swarm optimization, self-organizing maps, U-matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1898
8086 Approximate Range-Sum Queries over Data Cubes Using Cosine Transform

Authors: Wen-Chi Hou, Cheng Luo, Zhewei Jiang, Feng Yan

Abstract:

In this research, we propose to use the discrete cosine transform to approximate the cumulative distributions of data cube cells- values. The cosine transform is known to have a good energy compaction property and thus can approximate data distribution functions easily with small number of coefficients. The derived estimator is accurate and easy to update. We perform experiments to compare its performance with a well-known technique - the (Haar) wavelet. The experimental results show that the cosine transform performs much better than the wavelet in estimation accuracy, speed, space efficiency, and update easiness.

Keywords: DCT, Data Cube

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1949
8085 Digital filters for Hot-Mix Asphalt Complex Modulus Test Data Using Genetic Algorithm Strategies

Authors: Madhav V. Chitturi, Anshu Manik, Kasthurirangan Gopalakrishnan

Abstract:

The dynamic or complex modulus test is considered to be a mechanistically based laboratory test to reliably characterize the strength and load-resistance of Hot-Mix Asphalt (HMA) mixes used in the construction of roads. The most common observation is that the data collected from these tests are often noisy and somewhat non-sinusoidal. This hampers accurate analysis of the data to obtain engineering insight. The goal of the work presented in this paper is to develop and compare automated evolutionary computational techniques to filter test noise in the collection of data for the HMA complex modulus test. The results showed that the Covariance Matrix Adaptation-Evolutionary Strategy (CMA-ES) approach is computationally efficient for filtering data obtained from the HMA complex modulus test.

Keywords: HMA, dynamic modulus, GA, evolutionarycomputation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1562
8084 Emotions Triggered by Children’s Literature Images

Authors: A. Breda, C. Cruz

Abstract:

The role of images/illustrations in communicating meanings and triggering emotions assumes an increasingly relevant role in contemporary texts, regardless of the age group for which they are intended or the nature of the texts that host them. It is no coincidence that children's books are full of illustrations and that the image/text ratio decreases as the age group grows. The vast majority of children's books can be considered as multimodal texts containing text and images/illustrations, interacting with each other, to provide the young reader with a broader and more creative understanding of the book's narrative. This interaction is very diverse, ranging from images/illustrations that are not essential for understanding the storytelling to those that contribute significantly to the meaning of the story. Usually, these books are also read by adults, namely by parents, educators, and teachers who act as mediators between the book and the children, explaining aspects that are or seem to be too complex for the child's context. It should be noted that there are books labeled as children's books, that are clearly intended for both children and adults. In this work, following a qualitative and interpretative methodology based on written productions, participant observation, and field notes, we will describe the perceptions of future teachers of the 1st cycle of basic education, attending a master’s degree at a Portuguese university, about the role of the image in literary and non-literary texts, namely in mathematical texts, and how these can constitute precious resources for emotional regulation and for the design of creative didactic situations. The analysis of the collected data allowed us to obtain evidence regarding the evolution of the participants' perception regarding the crucial role of images in children's literature, not only as an emotional regulator for young readers but also as a creative source for the design of meaningful didactical situations, crossing other scientific areas, other than the mother tongue, namely mathematics.

Keywords: Children’s literature, emotions, multimodal texts, soft skills.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 166
8083 A New Nonlinear Excitation Controller for Transient Stability Enhancement in Power Systems

Authors: M. Ouassaid, A. Nejmi, M. Cherkaoui, M. Maaroufi

Abstract:

The very nonlinear nature of the generator and system behaviour following a severe disturbance precludes the use of classical linear control technique. In this paper, a new approach of nonlinear control is proposed for transient and steady state stability analysis of a synchronous generator. The control law of the generator excitation is derived from the basis of Lyapunov stability criterion. The overall stability of the system is shown using Lyapunov technique. The application of the proposed controller to simulated generator excitation control under a large sudden fault and wide range of operating conditions demonstrates that the new control strategy is superior to conventional automatic voltage regulator (AVR), and show very promising results.

Keywords: Excitation control, Lyapunov technique, non linearcontrol, synchronous generator, transient stability, voltage regulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2603
8082 Dynamic Simulation of a Hybrid Wind Farm with Wind Turbines and Distributed Compressed Air Energy Storage System

Authors: Eronini Umez-Eronini

Abstract:

Compressed air energy storage (CAES) coupled with wind farms have gained attention as a means to address the intermittency and variability of wind power. However, most existing studies and implementations focus on bulk or centralized CAES plants. This study presents a dynamic model of a hybrid wind farm with distributed CAES, using air storage tanks and compressor and expander trains at each wind turbine station. It introduces the concept of a distributed CAES with linked air cooling and heating, and presents an approach to scheduling and regulating the production of compressed air and power in such a system. Mathematical models of the dynamic components of this hybrid wind farm system, including a simple transient wake field model, were developed and simulated using MATLAB, with real wind data and Transmission System Operator (TSO) absolute power reference signals as inputs. The simulation results demonstrate that the proposed ad hoc supervisory controller is able to track the minute-scale power demand signal within an error band size comparable to the electrical power rating of a single expander. This suggests that combining the global distributed CAES control with power regulation for individual wind turbines could further improve the system’s performance. The round trip electrical storage efficiency computed for the distributed CAES was also in the range of reported round trip storage electrical efficiencies for improved bulk CAES. These findings contribute to the enhancement of efficiency of wind farms without access to large-scale storage or underground caverns.

Keywords: Distributed CAES, compressed air, energy storage, hybrid wind farm, wind turbines, dynamic simulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21
8081 Predictions Using Data Mining and Case-based Reasoning: A Case Study for Retinopathy

Authors: Vimala Balakrishnan, Mohammad R. Shakouri, Hooman Hoodeh, Loo, Huck-Soo

Abstract:

Diabetes is one of the high prevalence diseases worldwide with increased number of complications, with retinopathy as one of the most common one. This paper describes how data mining and case-based reasoning were integrated to predict retinopathy prevalence among diabetes patients in Malaysia. The knowledge base required was built after literature reviews and interviews with medical experts. A total of 140 diabetes patients- data were used to train the prediction system. A voting mechanism selects the best prediction results from the two techniques used. It has been successfully proven that both data mining and case-based reasoning can be used for retinopathy prediction with an improved accuracy of 85%.

Keywords: Case-Based Reasoning, Data Mining, Prediction, Retinopathy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3000
8080 Zero Truncated Strict Arcsine Model

Authors: Y. N. Phang, E. F. Loh

Abstract:

The zero truncated model is usually used in modeling count data without zero. It is the opposite of zero inflated model. Zero truncated Poisson and zero truncated negative binomial models are discussed and used by some researchers in analyzing the abundance of rare species and hospital stay. Zero truncated models are used as the base in developing hurdle models. In this study, we developed a new model, the zero truncated strict arcsine model, which can be used as an alternative model in modeling count data without zero and with extra variation. Two simulated and one real life data sets are used and fitted into this developed model. The results show that the model provides a good fit to the data. Maximum likelihood estimation method is used in estimating the parameters.

Keywords: Hurdle models, maximum likelihood estimation method, positive count data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1843
8079 Li-Fi Technology: Data Transmission through Visible Light

Authors: Shahzad Hassan, Kamran Saeed

Abstract:

People are always in search of Wi-Fi hotspots because Internet is a major demand nowadays. But like all other technologies, there is still room for improvement in the Wi-Fi technology with regards to the speed and quality of connectivity. In order to address these aspects, Harald Haas, a professor at the University of Edinburgh, proposed what we know as the Li-Fi (Light Fidelity). Li-Fi is a new technology in the field of wireless communication to provide connectivity within a network environment. It is a two-way mode of wireless communication using light. Basically, the data is transmitted through Light Emitting Diodes which can vary the intensity of light very fast, even faster than the blink of an eye. From the research and experiments conducted so far, it can be said that Li-Fi can increase the speed and reliability of the transfer of data. This paper pays particular attention on the assessment of the performance of this technology. In other words, it is a 5G technology which uses LED as the medium of data transfer. For coverage within the buildings, Wi-Fi is good but Li-Fi can be considered favorable in situations where large amounts of data are to be transferred in areas with electromagnetic interferences. It brings a lot of data related qualities such as efficiency, security as well as large throughputs to the table of wireless communication. All in all, it can be said that Li-Fi is going to be a future phenomenon where the presence of light will mean access to the Internet as well as speedy data transfer.

Keywords: Communication, LED, Li-Fi, Wi-Fi.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2147
8078 Authorization of Commercial Communication Satellite Grounds for Promoting Turkish Data Relay System

Authors: Celal Dudak, Aslı Utku, Burak Yağlioğlu

Abstract:

Uninterrupted and continuous satellite communication through the whole orbit time is becoming more indispensable every day. Data relay systems are developed and built for various high/low data rate information exchanges like TDRSS of USA and EDRSS of Europe. In these missions, a couple of task-dedicated communication satellites exist. In this regard, for Turkey a data relay system is attempted to be defined exchanging low data rate information (i.e. TTC) for Earth-observing LEO satellites appointing commercial GEO communication satellites all over the world. First, justification of this attempt is given, demonstrating duration enhancements in the link. Discussion of preference of RF communication is, also, given instead of laser communication. Then, preferred communication GEOs – including TURKSAT4A already belonging to Turkey- are given, together with the coverage enhancements through STK simulations and the corresponding link budget. Also, a block diagram of the communication system is given on the LEO satellite.

Keywords: Communication, satellite, data relay system, coverage.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1407
8077 An Efficient Approach to Mining Frequent Itemsets on Data Streams

Authors: Sara Ansari, Mohammad Hadi Sadreddini

Abstract:

The increasing importance of data stream arising in a wide range of advanced applications has led to the extensive study of mining frequent patterns. Mining data streams poses many new challenges amongst which are the one-scan nature, the unbounded memory requirement and the high arrival rate of data streams. In this paper, we propose a new approach for mining itemsets on data stream. Our approach SFIDS has been developed based on FIDS algorithm. The main attempts were to keep some advantages of the previous approach and resolve some of its drawbacks, and consequently to improve run time and memory consumption. Our approach has the following advantages: using a data structure similar to lattice for keeping frequent itemsets, separating regions from each other with deleting common nodes that results in a decrease in search space, memory consumption and run time; and Finally, considering CPU constraint, with increasing arrival rate of data that result in overloading system, SFIDS automatically detect this situation and discard some of unprocessing data. We guarantee that error of results is bounded to user pre-specified threshold, based on a probability technique. Final results show that SFIDS algorithm could attain about 50% run time improvement than FIDS approach.

Keywords: Data stream, frequent itemset, stream mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1405
8076 Anomaly Detection in a Data Center with a Reconstruction Method Using a Multi-Autoencoders Model

Authors: Victor Breux, Jérôme Boutet, Alain Goret, Viviane Cattin

Abstract:

Early detection of anomalies in data centers is important to reduce downtimes and the costs of periodic maintenance. However, there is little research on this topic and even fewer on the fusion of sensor data for the detection of abnormal events. The goal of this paper is to propose a method for anomaly detection in data centers by combining sensor data (temperature, humidity, power) and deep learning models. The model described in the paper uses one autoencoder per sensor to reconstruct the inputs. The auto-encoders contain Long-Short Term Memory (LSTM) layers and are trained using the normal samples of the relevant sensors selected by correlation analysis. The difference signal between the input and its reconstruction is then used to classify the samples using feature extraction and a random forest classifier. The data measured by the sensors of a data center between January 2019 and May 2020 are used to train the model, while the data between June 2020 and May 2021 are used to assess it. Performances of the model are assessed a posteriori through F1-score by comparing detected anomalies with the data center’s history. The proposed model outperforms the state-of-the-art reconstruction method, which uses only one autoencoder taking multivariate sequences and detects an anomaly with a threshold on the reconstruction error, with an F1-score of 83.60% compared to 24.16%.

Keywords: Anomaly detection, autoencoder, data centers, deep learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 715
8075 AnQL: A Query Language for Annotation Documents

Authors: Neerja Bhatnagar, Ben A. Juliano, Renee S. Renner

Abstract:

This paper presents data annotation models at five levels of granularity (database, relation, column, tuple, and cell) of relational data to address the problem of unsuitability of most relational databases to express annotations. These models do not require any structural and schematic changes to the underlying database. These models are also flexible, extensible, customizable, database-neutral, and platform-independent. This paper also presents an SQL-like query language, named Annotation Query Language (AnQL), to query annotation documents. AnQL is simple to understand and exploits the already-existent wide knowledge and skill set of SQL.

Keywords: Annotation query language, data annotations, data annotation models, semantic data annotations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1801
8074 Using Interval Constrained Petri Nets for the Fuzzy Regulation of Quality: Case of Assembly Process Mechanics

Authors: Nabli L., Dhouibi H., Collart Dutilleul S., Craye E.

Abstract:

The indistinctness of the manufacturing processes makes that a parts cannot be realized in an absolutely exact way towards the specifications on the dimensions. It is thus necessary to assume that the effectively realized product has to belong in a very strict way to compatible intervals with a correct functioning of the parts. In this paper we present an approach based on mixing tow different characteristics theories, the fuzzy system and Petri net system. This tool has been proposed to model and control the quality in an assembly system. A robust command of a mechanical assembly process is presented as an application. This command will then have to maintain the specifications interval of parts in front of the variations. It also illustrates how the technique reacts when the product quality is high, medium, or low.

Keywords: Petri nets, production rate, performance evaluation, tolerant system, fuzzy sets.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1268
8073 Machine Learning-Enabled Classification of Climbing Using Small Data

Authors: Nicholas Milburn, Yu Liang, Dalei Wu

Abstract:

Athlete performance scoring within the climbing domain presents interesting challenges as the sport does not have an objective way to assign skill. Assessing skill levels within any sport is valuable as it can be used to mark progress while training, and it can help an athlete choose appropriate climbs to attempt. Machine learning-based methods are popular for complex problems like this. The dataset available was composed of dynamic force data recorded during climbing; however, this dataset came with challenges such as data scarcity, imbalance, and it was temporally heterogeneous. Investigated solutions to these challenges include data augmentation, temporal normalization, conversion of time series to the spectral domain, and cross validation strategies. The investigated solutions to the classification problem included light weight machine classifiers KNN and SVM as well as the deep learning with CNN. The best performing model had an 80% accuracy. In conclusion, there seems to be enough information within climbing force data to accurately categorize climbers by skill.

Keywords: Classification, climbing, data imbalance, data scarcity, machine learning, time sequence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 544
8072 Dynamic Modeling of Wind Farms in the Jeju Power System

Authors: Dae-Hee Son, Sang-Hee Kang, Soon-Ryul Nam

Abstract:

In this paper, we develop a dynamic modeling of wind farms in the Jeju power system. The dynamic model of wind farms is developed to study their dynamic effects on the Jeju power system. PSS/E is used to develop the dynamic model of a wind farm composed of 1.5-MW doubly fed induction generators. The output of a wind farm is regulated based on pitch angle control, in which the two controllable parameters are speed and power references. The simulation results confirm that the pitch angle is successfully controlled, regardless of the variation in wind speed and output regulation.

Keywords: Dynamic model, Jeju power system, pitch angle control, PSS/E, wind farm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1757
8071 FCNN-MR: A Parallel Instance Selection Method Based on Fast Condensed Nearest Neighbor Rule

Authors: Lu Si, Jie Yu, Shasha Li, Jun Ma, Lei Luo, Qingbo Wu, Yongqi Ma, Zhengji Liu

Abstract:

Instance selection (IS) technique is used to reduce the data size to improve the performance of data mining methods. Recently, to process very large data set, several proposed methods divide the training set into some disjoint subsets and apply IS algorithms independently to each subset. In this paper, we analyze the limitation of these methods and give our viewpoint about how to divide and conquer in IS procedure. Then, based on fast condensed nearest neighbor (FCNN) rule, we propose a large data sets instance selection method with MapReduce framework. Besides ensuring the prediction accuracy and reduction rate, it has two desirable properties: First, it reduces the work load in the aggregation node; Second and most important, it produces the same result with the sequential version, which other parallel methods cannot achieve. We evaluate the performance of FCNN-MR on one small data set and two large data sets. The experimental results show that it is effective and practical.

Keywords: Instance selection, data reduction, MapReduce, kNN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1005