Search results for: monitoring data

7305 Methods and Algorithms of Ensuring Data Privacy in AI-Based Healthcare Systems and Technologies

Authors: Omar Farshad Jeelani, Makaire Njie, Viktoriia M. Korzhuk

Abstract:

Recently, the application of AI-powered algorithms in healthcare continues to flourish. Particularly, access to healthcare information, including patient health history, diagnostic data, and PII (Personally Identifiable Information) is paramount in the delivery of efficient patient outcomes. However, as the exchange of healthcare information between patients and healthcare providers through AI-powered solutions increases, protecting a person’s information and their privacy has become even more important. Arguably, the increased adoption of healthcare AI has resulted in a significant concentration on the security risks and protection measures to the security and privacy of healthcare data, leading to escalated analyses and enforcement. Since these challenges are brought by the use of AI-based healthcare solutions to manage healthcare data, AI-based data protection measures are used to resolve the underlying problems. Consequently, these projects propose AI-powered safeguards and policies/laws to protect the privacy of healthcare data. The project present the best-in-school techniques used to preserve data privacy of AI-powered healthcare applications. Popular privacy-protecting methods like Federated learning, cryptography techniques, differential privacy methods, and hybrid methods are discussed together with potential cyber threats, data security concerns, and prospects. Also, the project discusses some of the relevant data security acts/laws that govern the collection, storage, and processing of healthcare data to guarantee owners’ privacy is preserved. This inquiry discusses various gaps and uncertainties associated with healthcare AI data collection procedures, and identifies potential correction/mitigation measures.

Keywords: Data privacy, artificial intelligence, healthcare AI, data sharing, healthcare organizations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 120

7304 Use of Bayesian Network in Information Extraction from Unstructured Data Sources

Authors: Quratulain N. Rajput, Sajjad Haider

Abstract:

This paper applies Bayesian Networks to support information extraction from unstructured, ungrammatical, and incoherent data sources for semantic annotation. A tool has been developed that combines ontologies, machine learning, and information extraction and probabilistic reasoning techniques to support the extraction process. Data acquisition is performed with the aid of knowledge specified in the form of ontology. Due to the variable size of information available on different data sources, it is often the case that the extracted data contains missing values for certain variables of interest. It is desirable in such situations to predict the missing values. The methodology, presented in this paper, first learns a Bayesian network from the training data and then uses it to predict missing data and to resolve conflicts. Experiments have been conducted to analyze the performance of the presented methodology. The results look promising as the methodology achieves high degree of precision and recall for information extraction and reasonably good accuracy for predicting missing values.

Keywords: Information Extraction, Bayesian Network, ontology, Machine Learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2232

7303 Data Acquisition from Cell Phone using Logical Approach

Authors: Keonwoo Kim, Dowon Hong, Kyoil Chung, Jae-Cheol Ryou

Abstract:

Cell phone forensics to acquire and analyze data in the cellular phone is nowadays being used in a national investigation organization and a private company. In order to collect cellular phone flash memory data, we have two methods. Firstly, it is a logical method which acquires files and directories from the file system of the cell phone flash memory. Secondly, we can get all data from bit-by-bit copy of entire physical memory using a low level access method. In this paper, we describe a forensic tool to acquire cell phone flash memory data using a logical level approach. By our tool, we can get EFS file system and peek memory data with an arbitrary region from Korea CDMA cell phone.

Keywords: Forensics, logical method, acquisition, cell phone, flash memory.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4123

7302 Data Migration Methodology from Relational to NoSQL Databases

Authors: Mohamed Hanine, Abdesadik Bendarag, Omar Boutkhoum

Abstract:

Currently, the field of data migration is very topical. As the number of applications developed rapidly, the ever-increasing volume of data collected has driven the architectural migration from Relational Database Management System (RDBMS) to NoSQL (Not Only SQL) database. This very recent technology is important enough in the field of database management. The main aim of this paper is to present a methodology for data migration from RDBMS to NoSQL database. To illustrate this methodology, we implement a software prototype using MySQL as a RDBMS and MongoDB as a NoSQL database. Although this is a hard engineering work, our results show that the proposed methodology can successfully accomplish the goal of this study.

Keywords: Data Migration, MySQL, RDBMS, NoSQL, MongoDB.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4367

7301 Performance Comparison of Particle Swarm Optimization with Traditional Clustering Algorithms used in Self-Organizing Map

Authors: Anurag Sharma, Christian W. Omlin

Abstract:

Self-organizing map (SOM) is a well known data reduction technique used in data mining. It can reveal structure in data sets through data visualization that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOM, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of an adaptive heuristic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOM. The application of our method to several standard data sets demonstrates its feasibility. PSO algorithm utilizes a so-called U-matrix of SOM to determine cluster boundaries; the results of this novel automatic method compare very favorably to boundary detection through traditional algorithms namely k-means and hierarchical based approach which are normally used to interpret the output of SOM.

Keywords: cluster boundaries, clustering, code vectors, data mining, particle swarm optimization, self-organizing maps, U-matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1910

7300 Data Hiding by Vector Quantization in Color Image

Authors: Yung-Gi Wu

Abstract:

With the growing of computer and network, digital data can be spread to anywhere in the world quickly. In addition, digital data can also be copied or tampered easily so that the security issue becomes an important topic in the protection of digital data. Digital watermark is a method to protect the ownership of digital data. Embedding the watermark will influence the quality certainly. In this paper, Vector Quantization (VQ) is used to embed the watermark into the image to fulfill the goal of data hiding. This kind of watermarking is invisible which means that the users will not conscious the existing of embedded watermark even though the embedded image has tiny difference compared to the original image. Meanwhile, VQ needs a lot of computation burden so that we adopt a fast VQ encoding scheme by partial distortion searching (PDS) and mean approximation scheme to speed up the data hiding process. The watermarks we hide to the image could be gray, bi-level and color images. Texts are also can be regarded as watermark to embed. In order to test the robustness of the system, we adopt Photoshop to fulfill sharpen, cropping and altering to check if the extracted watermark is still recognizable. Experimental results demonstrate that the proposed system can resist the above three kinds of tampering in general cases.

Keywords: Data hiding, vector quantization, watermark.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1777

7299 Use of Treated Municipal Wastewater on Artichoke Crop

Authors: Disciglio G., Gatta G., Libutti A., Tarantino A., Frabboni L., Tarantino E.

Abstract:

Results of a field study carried out at Trinitapoli (Puglia region, southern Italy) on the irrigation of an artichoke crop with three types of water (secondary-treated wastewater, SW; tertiary-treated wastewater, TW; and freshwater, FW) are reported. Physical, chemical and microbiological analyses were performed on the irrigation water, and on soil and yield samples.

The levels of most of the chemical parameters, such as electrical conductivity, total suspended solids, Na⁺, Ca²⁺, Mg⁺², K⁺, sodium adsorption ratio, chemical oxygen demand, biological oxygen demand over 5 days, NO₃ –N, total N, CO₃², HCO₃, phenols and chlorides of the applied irrigation water were significantly higher in SW compared to GW and TW. No differences were found for Mg²⁺, PO₄-P, K⁺ only between SW and TW. Although the chemical parameters of the three irrigation water sources were different, few effects on the soil were observed. Even though monitoring of Escherichia coli showed high SW levels, which were above the limits allowed under Italian law (DM 152/2006), contamination of the soil and the marketable yield were never observed. Moreover, no Salmonella spp. were detected in these irrigation waters; consequently, they were absent in the plants. Finally, the data on the quantitative-qualitative parameters of the artichoke yield with the various treatments show no significant differences between the three irrigation water sources. Therefore, if adequately treated, municipal wastewater can be used for irrigation and represents a sound alternative to conventional water resources.

Keywords: Artichoke, soil chemical characteristics, fecal indicators, treated municipal wastewater, water recycling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1866

7298 Approximate Range-Sum Queries over Data Cubes Using Cosine Transform

Authors: Wen-Chi Hou, Cheng Luo, Zhewei Jiang, Feng Yan

Abstract:

In this research, we propose to use the discrete cosine transform to approximate the cumulative distributions of data cube cells- values. The cosine transform is known to have a good energy compaction property and thus can approximate data distribution functions easily with small number of coefficients. The derived estimator is accurate and easy to update. We perform experiments to compare its performance with a well-known technique - the (Haar) wavelet. The experimental results show that the cosine transform performs much better than the wavelet in estimation accuracy, speed, space efficiency, and update easiness.

Keywords: DCT, Data Cube

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1963

7297 Digital filters for Hot-Mix Asphalt Complex Modulus Test Data Using Genetic Algorithm Strategies

Authors: Madhav V. Chitturi, Anshu Manik, Kasthurirangan Gopalakrishnan

Abstract:

The dynamic or complex modulus test is considered to be a mechanistically based laboratory test to reliably characterize the strength and load-resistance of Hot-Mix Asphalt (HMA) mixes used in the construction of roads. The most common observation is that the data collected from these tests are often noisy and somewhat non-sinusoidal. This hampers accurate analysis of the data to obtain engineering insight. The goal of the work presented in this paper is to develop and compare automated evolutionary computational techniques to filter test noise in the collection of data for the HMA complex modulus test. The results showed that the Covariance Matrix Adaptation-Evolutionary Strategy (CMA-ES) approach is computationally efficient for filtering data obtained from the HMA complex modulus test.

Keywords: HMA, dynamic modulus, GA, evolutionarycomputation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1571

7296 Web-Based Control and Notification for Home Automation Alarm Systems

Authors: Helder Adão, Rui Antunes, Frederico Grilo

Abstract:

This paper describes the project and development of a very low-cost and small electronic prototype, especially designed for monitoring and controlling existing home automation alarm systems (intruder, smoke, gas, flood, etc.), via TCP/IP, with a typical web browser. Its use will allow home owners to be immediately alerted and aware when an alarm event occurs, and being also able to interact with their home automation alarm system, disarming, arming and watching event alerts, with a personal wireless Wi-Fi PDA or smartphone logged on to a dedicated predefined web page, and using also a PC or Laptop.

Keywords: Alarm Systems, Home Automation, Web-Server, TCP/IP.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3202

7295 School Emergency Drills Evaluation through E-PreS Monitoring System

Authors: A. Kourou, A. Ioakeimidou, V. Avramea

Abstract:

Planning for natural disasters and emergencies is something every school or educational institution must consider, regardless of its size or location. Preparedness is the key to save lives if a disaster strikes. School disaster management mirrors individual and family disaster prevention, and wider community disaster prevention efforts. This paper presents the usage of E-PreS System as a helpful, managerial tool during the school earthquake drill, in order to support schools in developing effective disaster and emergency plans specific to their local needs. The project comes up with a holistic methodology using real-time evaluation involving different categories of actors, districts, steps and metrics. The main outcomes of E-PreS project are the development of E-PreS web platform that host the needed data of school emergency planning; the development of E-PreS System; the implementation of disaster drills using E-PreS System in educational premises and local schools; and the evaluation of E-PreS System. Taking into consideration that every disaster drill aims to test and valid school plan and procedures; clarify and train personnel in roles and responsibilities; improve interagency coordination; identify gaps in resources; improve individual performance; and identify opportunities for improvement, E-PreS Project was submitted and approved by the European Commission (EC).

Keywords: Disaster drills, earthquake preparedness, E-PreS system, school emergency plans.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1129

7294 The Feasibility of Augmenting an Augmented Reality Image Card on a Quick Response Code

Authors: Alfred Chen, Shr Yu Lu, Cong Seng Hong, Yur-June Wang

Abstract:

This research attempts to study the feasibility of augmenting an augmented reality (AR) image card on a Quick Response (QR) code. The authors have developed a new visual tag, which contains a QR code and an augmented AR image card. The new visual tag has features of reading both of the revealed data of the QR code and the instant data from the AR image card. Furthermore, a handheld communicating device is used to read and decode the new visual tag, and then the concealed data of the new visual tag can be revealed and read through its visual display. In general, the QR code is designed to store the corresponding data or, as a key, to access the corresponding data from the server through internet. Those reveled data from the QR code are represented in text. Normally, the AR image card is designed to store the corresponding data in 3-Dimensional or animation/video forms. By using QR code's property of high fault tolerant rate, the new visual tag can access those two different types of data by using a handheld communicating device. The new visual tag has an advantage of carrying much more data than independent QR code or AR image card. The major findings of this research are: 1) the most efficient area for the designed augmented AR card augmenting on the QR code is 9% coverage area out of the total new visual tag-s area, and 2) the best location for the augmented AR image card augmenting on the QR code is located in the bottom-right corner of the new visual tag.

Keywords: Augmented reality, QR code, Visual tag, Handheldcommunicating device

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1556

7293 Design and Application of NFC-Based Identity and Access Management in Cloud Services

Authors: Shin-Jer Yang, Kai-Tai Yang

Abstract:

In response to a changing world and the fast growth of the Internet, more and more enterprises are replacing web-based services with cloud-based ones. Multi-tenancy technology is becoming more important especially with Software as a Service (SaaS). This in turn leads to a greater focus on the application of Identity and Access Management (IAM). Conventional Near-Field Communication (NFC) based verification relies on a computer browser and a card reader to access an NFC tag. This type of verification does not support mobile device login and user-based access management functions. This study designs an NFC-based third-party cloud identity and access management scheme (NFC-IAM) addressing this shortcoming. Data from simulation tests analyzed with Key Performance Indicators (KPIs) suggest that the NFC-IAM not only takes less time in identity identification but also cuts time by 80% in terms of two-factor authentication and improves verification accuracy to 99.9% or better. In functional performance analyses, NFC-IAM performed better in salability and portability. The NFC-IAM App (Application Software) and back-end system to be developed and deployed in mobile device are to support IAM features and also offers users a more user-friendly experience and stronger security protection. In the future, our NFC-IAM can be employed to different environments including identification for mobile payment systems, permission management for remote equipment monitoring, among other applications.

Keywords: Cloud service, multi-tenancy, NFC, IAM, mobile device.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1118

7292 A Competitive Replica Placement Methodology for Ad Hoc Networks

Authors: Samee Ullah Khan, C. Ardil

Abstract:

In this paper, a mathematical model for data object replication in ad hoc networks is formulated. The derived model is general, flexible and adaptable to cater for various applications in ad hoc networks. We propose a game theoretical technique in which players (mobile hosts) continuously compete in a non-cooperative environment to improve data accessibility by replicating data objects. The technique incorporates the access frequency from mobile hosts to each data object, the status of the network connectivity, and communication costs. The proposed technique is extensively evaluated against four well-known ad hoc network replica allocation methods. The experimental results reveal that the proposed approach outperforms the four techniques in both the execution time and solution quality

Keywords: Data replication, auctions, static allocation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1402

7291 Multidimensional Data Mining by Means of Randomly Travelling Hyper-Ellipsoids

Authors: Pavel Y. Tabakov, Kevin Duffy

Abstract:

The present study presents a new approach to automatic data clustering and classification problems in large and complex databases and, at the same time, derives specific types of explicit rules describing each cluster. The method works well in both sparse and dense multidimensional data spaces. The members of the data space can be of the same nature or represent different classes. A number of N-dimensional ellipsoids are used for enclosing the data clouds. Due to the geometry of an ellipsoid and its free rotation in space the detection of clusters becomes very efficient. The method is based on genetic algorithms that are used for the optimization of location, orientation and geometric characteristics of the hyper-ellipsoids. The proposed approach can serve as a basis for the development of general knowledge systems for discovering hidden knowledge and unexpected patterns and rules in various large databases.

Keywords: Classification, clustering, data minig, genetic algorithms.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1773

7290 Predictions Using Data Mining and Case-based Reasoning: A Case Study for Retinopathy

Authors: Vimala Balakrishnan, Mohammad R. Shakouri, Hooman Hoodeh, Loo, Huck-Soo

Abstract:

Diabetes is one of the high prevalence diseases worldwide with increased number of complications, with retinopathy as one of the most common one. This paper describes how data mining and case-based reasoning were integrated to predict retinopathy prevalence among diabetes patients in Malaysia. The knowledge base required was built after literature reviews and interviews with medical experts. A total of 140 diabetes patients- data were used to train the prediction system. A voting mechanism selects the best prediction results from the two techniques used. It has been successfully proven that both data mining and case-based reasoning can be used for retinopathy prediction with an improved accuracy of 85%.

Keywords: Case-Based Reasoning, Data Mining, Prediction, Retinopathy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3022

7289 English Language Learning Strategies Used by University Students: A Case Study of English and Business English Major at Suan Sunandha Rajabhat in Bangkok

Authors: Pranee Pathomchaiwat

Abstract:

The purposes of this research are 1) to study English language learning strategies used by the fourth-year students majoring in English and Business English, 2) to study the English language learning strategies which have an affect on English learning achievement, and 3) to compare the English language learning strategies used by the students majoring in English and Business English. The population and sampling comprise of 139 university students of the Suan Sunandha Rajabhat University. Research instruments are language learning strategies questionnaire which was constructed by the researcher and improved on by three experts and the transcripts that show the results of English learning achievement. The questionnaire includes 1) Language Practice Strategy 2)Memory Strategy 3) Communication Strategy 4)Making an Intelligent Guess or Compensation Strategy 5) Self-discipline in Learning Management Strategy 6) Affective Strategy 7)Self-Monitoring Strategy 8) Self-studySkill Strategy. Statistics used in the study are mean, standard deviation, T-test and One Way ANOVA, Pearson product moment correlation coefficient and Regression Analysis. The results of the findings reveal that the English language learning strategies most frequently used by the students are affective strategy, making an intelligent guess or compensation strategy, self-studyskill strategy and self-monitoring strategy respectively. The aspect of making an intelligent guess or compensation strategy had the most significant affect on English learning achievement. It is found that the English language learning strategies mostly used by the Business English major students and moderately used by the English major students. Their language practice strategies uses were significantly different at the 0.05 level and their communication strategies uses were significantly different at the 0.01 level. In addition, it is found that the poor students and the fair ones most frequently used affective strategy while the good ones most frequently used making an intelligent guess or compensation strategy. KeywordsEnglish language, language learning strategies, English learning achievement, and students majoring in English, Business English. Pranee Pathomchaiwat is an Assistant Professor in Business English Program, Suan Sunandha Rajabhat University, Bangkok, Thailand (e-mail: [email protected]).

Keywords: English language, language learning strategies, English learning achievement, students majoring in English, Business English

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3823

7288 Zero Truncated Strict Arcsine Model

Authors: Y. N. Phang, E. F. Loh

Abstract:

The zero truncated model is usually used in modeling count data without zero. It is the opposite of zero inflated model. Zero truncated Poisson and zero truncated negative binomial models are discussed and used by some researchers in analyzing the abundance of rare species and hospital stay. Zero truncated models are used as the base in developing hurdle models. In this study, we developed a new model, the zero truncated strict arcsine model, which can be used as an alternative model in modeling count data without zero and with extra variation. Two simulated and one real life data sets are used and fitted into this developed model. The results show that the model provides a good fit to the data. Maximum likelihood estimation method is used in estimating the parameters.

Keywords: Hurdle models, maximum likelihood estimation method, positive count data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1857

7287 Li-Fi Technology: Data Transmission through Visible Light

Authors: Shahzad Hassan, Kamran Saeed

Abstract:

People are always in search of Wi-Fi hotspots because Internet is a major demand nowadays. But like all other technologies, there is still room for improvement in the Wi-Fi technology with regards to the speed and quality of connectivity. In order to address these aspects, Harald Haas, a professor at the University of Edinburgh, proposed what we know as the Li-Fi (Light Fidelity). Li-Fi is a new technology in the field of wireless communication to provide connectivity within a network environment. It is a two-way mode of wireless communication using light. Basically, the data is transmitted through Light Emitting Diodes which can vary the intensity of light very fast, even faster than the blink of an eye. From the research and experiments conducted so far, it can be said that Li-Fi can increase the speed and reliability of the transfer of data. This paper pays particular attention on the assessment of the performance of this technology. In other words, it is a 5G technology which uses LED as the medium of data transfer. For coverage within the buildings, Wi-Fi is good but Li-Fi can be considered favorable in situations where large amounts of data are to be transferred in areas with electromagnetic interferences. It brings a lot of data related qualities such as efficiency, security as well as large throughputs to the table of wireless communication. All in all, it can be said that Li-Fi is going to be a future phenomenon where the presence of light will mean access to the Internet as well as speedy data transfer.

Keywords: Communication, LED, Li-Fi, Wi-Fi.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2171

7286 Business Rules for Data Warehouse

Authors: Rajeev Kaula

Abstract:

Business rules and data warehouse are concepts and technologies that impact a wide variety of organizational tasks. In general, each area has evolved independently, impacting application development and decision-making. Generating knowledge from data warehouse is a complex process. This paper outlines an approach to ease import of information and knowledge from a data warehouse star schema through an inference class of business rules. The paper utilizes the Oracle database for illustrating the working of the concepts. The star schema structure and the business rules are stored within a relational database. The approach is explained through a prototype in Oracle-s PL/SQL Server Pages.

Keywords: Business Rules, Data warehouse, PL/SQL ServerPages, Relational model, Web Application.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2985

7285 Authorization of Commercial Communication Satellite Grounds for Promoting Turkish Data Relay System

Authors: Celal Dudak, Aslı Utku, Burak Yağlioğlu

Abstract:

Uninterrupted and continuous satellite communication through the whole orbit time is becoming more indispensable every day. Data relay systems are developed and built for various high/low data rate information exchanges like TDRSS of USA and EDRSS of Europe. In these missions, a couple of task-dedicated communication satellites exist. In this regard, for Turkey a data relay system is attempted to be defined exchanging low data rate information (i.e. TTC) for Earth-observing LEO satellites appointing commercial GEO communication satellites all over the world. First, justification of this attempt is given, demonstrating duration enhancements in the link. Discussion of preference of RF communication is, also, given instead of laser communication. Then, preferred communication GEOs – including TURKSAT4A already belonging to Turkey- are given, together with the coverage enhancements through STK simulations and the corresponding link budget. Also, a block diagram of the communication system is given on the LEO satellite.

Keywords: Communication, satellite, data relay system, coverage.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1416

7284 An Efficient Approach to Mining Frequent Itemsets on Data Streams

Authors: Sara Ansari, Mohammad Hadi Sadreddini

Abstract:

The increasing importance of data stream arising in a wide range of advanced applications has led to the extensive study of mining frequent patterns. Mining data streams poses many new challenges amongst which are the one-scan nature, the unbounded memory requirement and the high arrival rate of data streams. In this paper, we propose a new approach for mining itemsets on data stream. Our approach SFIDS has been developed based on FIDS algorithm. The main attempts were to keep some advantages of the previous approach and resolve some of its drawbacks, and consequently to improve run time and memory consumption. Our approach has the following advantages: using a data structure similar to lattice for keeping frequent itemsets, separating regions from each other with deleting common nodes that results in a decrease in search space, memory consumption and run time; and Finally, considering CPU constraint, with increasing arrival rate of data that result in overloading system, SFIDS automatically detect this situation and discard some of unprocessing data. We guarantee that error of results is bounded to user pre-specified threshold, based on a probability technique. Final results show that SFIDS algorithm could attain about 50% run time improvement than FIDS approach.

Keywords: Data stream, frequent itemset, stream mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1420

7283 Anomaly Detection in a Data Center with a Reconstruction Method Using a Multi-Autoencoders Model

Authors: Victor Breux, Jérôme Boutet, Alain Goret, Viviane Cattin

Abstract:

Early detection of anomalies in data centers is important to reduce downtimes and the costs of periodic maintenance. However, there is little research on this topic and even fewer on the fusion of sensor data for the detection of abnormal events. The goal of this paper is to propose a method for anomaly detection in data centers by combining sensor data (temperature, humidity, power) and deep learning models. The model described in the paper uses one autoencoder per sensor to reconstruct the inputs. The auto-encoders contain Long-Short Term Memory (LSTM) layers and are trained using the normal samples of the relevant sensors selected by correlation analysis. The difference signal between the input and its reconstruction is then used to classify the samples using feature extraction and a random forest classifier. The data measured by the sensors of a data center between January 2019 and May 2020 are used to train the model, while the data between June 2020 and May 2021 are used to assess it. Performances of the model are assessed a posteriori through F1-score by comparing detected anomalies with the data center’s history. The proposed model outperforms the state-of-the-art reconstruction method, which uses only one autoencoder taking multivariate sequences and detects an anomaly with a threshold on the reconstruction error, with an F1-score of 83.60% compared to 24.16%.

Keywords: Anomaly detection, autoencoder, data centers, deep learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 744

7282 AnQL: A Query Language for Annotation Documents

Authors: Neerja Bhatnagar, Ben A. Juliano, Renee S. Renner

Abstract:

This paper presents data annotation models at five levels of granularity (database, relation, column, tuple, and cell) of relational data to address the problem of unsuitability of most relational databases to express annotations. These models do not require any structural and schematic changes to the underlying database. These models are also flexible, extensible, customizable, database-neutral, and platform-independent. This paper also presents an SQL-like query language, named Annotation Query Language (AnQL), to query annotation documents. AnQL is simple to understand and exploits the already-existent wide knowledge and skill set of SQL.

Keywords: Annotation query language, data annotations, data annotation models, semantic data annotations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1847

7281 Machine Learning-Enabled Classification of Climbing Using Small Data

Authors: Nicholas Milburn, Yu Liang, Dalei Wu

Abstract:

Athlete performance scoring within the climbing domain presents interesting challenges as the sport does not have an objective way to assign skill. Assessing skill levels within any sport is valuable as it can be used to mark progress while training, and it can help an athlete choose appropriate climbs to attempt. Machine learning-based methods are popular for complex problems like this. The dataset available was composed of dynamic force data recorded during climbing; however, this dataset came with challenges such as data scarcity, imbalance, and it was temporally heterogeneous. Investigated solutions to these challenges include data augmentation, temporal normalization, conversion of time series to the spectral domain, and cross validation strategies. The investigated solutions to the classification problem included light weight machine classifiers KNN and SVM as well as the deep learning with CNN. The best performing model had an 80% accuracy. In conclusion, there seems to be enough information within climbing force data to accurately categorize climbers by skill.

Keywords: Classification, climbing, data imbalance, data scarcity, machine learning, time sequence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 569

7280 Numerical Model of Low Cost Rubber Isolators for Masonry Housing in High Seismic Regions

Authors: Ahmad B. Habieb, Gabriele Milani, Tavio Tavio, Federico Milani

Abstract:

Housings in developing countries have often inadequate seismic protection, particularly for masonry. People choose this type of structure since the cost and application are relatively cheap. Seismic protection of masonry remains an interesting issue among researchers. In this study, we develop a low-cost seismic isolation system for masonry using fiber reinforced elastomeric isolators. The elastomer proposed consists of few layers of rubber pads and fiber lamina, making it lower in cost comparing to the conventional isolators. We present a finite element (FE) analysis to predict the behavior of the low cost rubber isolators undergoing moderate deformations. The FE model of the elastomer involves a hyperelastic material property for the rubber pad. We adopt a Yeoh hyperelasticity model and estimate its coefficients through the available experimental data. Having the shear behavior of the elastomers, we apply that isolation system onto small masonry housing. To attach the isolators on the building, we model the shear behavior of the isolation system by means of a damped nonlinear spring model. By this attempt, the FE analysis becomes computationally inexpensive. Several ground motion data are applied to observe its sensitivity. Roof acceleration and tensile damage of walls become the parameters to evaluate the performance of the isolators. In this study, a concrete damage plasticity model is used to model masonry in the nonlinear range. This tool is available in the standard package of Abaqus FE software. Finally, the results show that the low-cost isolators proposed are capable of reducing roof acceleration and damage level of masonry housing. Through this study, we are also capable of monitoring the shear deformation of isolators during seismic motion. It is useful to determine whether the isolator is applicable. According to the results, the deformations of isolators on the benchmark one story building are relatively small.

Keywords: Masonry, low cost elastomeric isolator, finite element analysis, hyperelasticity, damped non-linear spring, concrete damage plasticity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1187

7279 Evaluation of Water Quality of the Beshar River

Authors: Fardin Boustani, Mohammah Hosein Hojati, Masoud Hashemi

Abstract:

The Beshar River is one aquatic ecosystem, which is located next to the city of Yasuj in southern Iran. The Beshar river has been contaminated by industrial factories such as effluent of sugar factory, agricultural and other activities in this region such as, Imam Sajjad hospital, drainage from agricultural farms, Yasuj urban surface runoff and effluent of wastewater treatment plants ,specially Yasuj waste water treatment plant. In order to evaluate the effects of these pollutants on the quality of the Beshar river, five monitoring stations were selected along its course. The first station is located upstream of Yasuj near the Dehnow village; stations 2 to 4 are located east, south and west of city; and the 5th station is located downstream of Yasuj. Several water quality parameters were sampled. These include pH, dissolved oxygen, biological oxygen demand (BOD), temperature, conductivity, turbidity, total dissolved solids and discharge or flow measurements. Water samples from the five stations were collected and analyzed to determine the following physicochemical parameters: EC, pH, T.D.S, T.H, No2, DO, BOD5, COD during 2008 to 2010. The study shows that the BOD5 value of station 1 is at a minimum (1.7 ppm) and increases downstream from stations 2 to 4 to a maximum (11.6 ppm), and then decreases at station 5. The DO values of station 1 is a maximum (8.45 ppm), decreases downstream to stations 2 - 4 which are at a minimum (3.1 ppm), before increasing at station 5. The amount of BOD and TDS are highest at the 4th station and the amount of DO is lowest at this station, marking the 4th station as more highly polluted than the other stations .This study shows average amount of the water quality parameters in first year of sampling (2008) have had a better quality relation to third year in 2010 because of recent drought in this region and pollutant increasing .As the Beshar river path after 5th station goes through the mountain area with more slope and flow velocity ,so the physicochemical parameters improve at the 5th station due to pollutant degradation and dilution. Finally the point and nonpoint pollutant sources of Beshar river were determined and compared to the monitoring results.

Keywords: Beshar river, physicochemical parameter, waterpollution, water quality, Yasuj

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1650

7278 FCNN-MR: A Parallel Instance Selection Method Based on Fast Condensed Nearest Neighbor Rule

Authors: Lu Si, Jie Yu, Shasha Li, Jun Ma, Lei Luo, Qingbo Wu, Yongqi Ma, Zhengji Liu

Abstract:

Instance selection (IS) technique is used to reduce the data size to improve the performance of data mining methods. Recently, to process very large data set, several proposed methods divide the training set into some disjoint subsets and apply IS algorithms independently to each subset. In this paper, we analyze the limitation of these methods and give our viewpoint about how to divide and conquer in IS procedure. Then, based on fast condensed nearest neighbor (FCNN) rule, we propose a large data sets instance selection method with MapReduce framework. Besides ensuring the prediction accuracy and reduction rate, it has two desirable properties: First, it reduces the work load in the aggregation node; Second and most important, it produces the same result with the sequential version, which other parallel methods cannot achieve. We evaluate the performance of FCNN-MR on one small data set and two large data sets. The experimental results show that it is effective and practical.

Keywords: Instance selection, data reduction, MapReduce, kNN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1018

7277 Computation of Natural Logarithm Using Abstract Chemical Reaction Networks

Authors: Iuliia Zarubiieva, Joyun Tseng, Vishwesh Kulkarni

Abstract:

Recent researches has focused on nucleic acids as a substrate for designing biomolecular circuits for in situ monitoring and control. A common approach is to express them by a set of idealised abstract chemical reaction networks (ACRNs). Here, we present new results on how abstract chemical reactions, viz., catalysis, annihilation and degradation, can be used to implement circuit that accurately computes logarithm function using the method of Arithmetic-Geometric Mean (AGM), which has not been previously used in conjunction with ACRNs.

Keywords: Abstract chemical reaction network, DNA strand displacement, natural logarithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1028

7276 Methods for Material and Process Monitoring by Characterization of (Second and Third Order) Elastic Properties with Lamb Waves

Authors: R. Meier, M. Pander

Abstract:

In accordance with the industry 4.0 concept, manufacturing process steps as well as the materials themselves are going to be more and more digitalized within the next years. The “digital twin” representing the simulated and measured dataset of the (semi-finished) product can be used to control and optimize the individual processing steps and help to reduce costs and expenditure of time in product development, manufacturing, and recycling. In the present work, two material characterization methods based on Lamb waves were evaluated and compared. For demonstration purpose, both methods were shown at a standard industrial product - copper ribbons, often used in photovoltaic modules as well as in high-current microelectronic devices. By numerical approximation of the Rayleigh-Lamb dispersion model on measured phase velocities second order elastic constants (Young’s modulus, Poisson’s ratio) were determined. Furthermore, the effective third order elastic constants were evaluated by applying elastic, “non-destructive”, mechanical stress on the samples. In this way, small microstructural variations due to mechanical preconditioning could be detected for the first time. Both methods were compared with respect to precision and inline application capabilities. Microstructure of the samples was systematically varied by mechanical loading and annealing. Changes in the elastic ultrasound transport properties were correlated with results from microstructural analysis and mechanical testing. In summary, monitoring the elastic material properties of plate-like structures using Lamb waves is valuable for inline and non-destructive material characterization and manufacturing process control. Second order elastic constants analysis is robust over wide environmental and sample conditions, whereas the effective third order elastic constants highly increase the sensitivity with respect to small microstructural changes. Both Lamb wave based characterization methods are fitting perfectly into the industry 4.0 concept.

Keywords: Lamb waves, industry 4.0, process control, elasticity, acoustoelasticity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1099