Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 6979

Search results for: Data tables

6979 A Methodology for Data Migration between Different Database Management Systems

Authors: Bogdan Walek, Cyril Klimes

Abstract:

Data migration is currently a highly topical area. Current data migration tools for relational databases have several disadvantages, which are presented in this paper. We propose a methodology for migrating database tables and their data between various types of relational database management systems (RDBMS). The proposed methodology includes an expert system whose knowledge base is composed of IF-THEN rules and which, based on the input data, suggests appropriate data types for the columns of the database tables. The proposed tool, which contains the expert system, can also optimize the data types in the target RDBMS database tables based on the processed data of the source RDBMS database tables. The proposed expert system is demonstrated on the migration of a selected database from the source RDBMS to the target RDBMS.
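The idea of suggesting column data types from IF-THEN rules can be illustrated with a minimal sketch. The abstract does not list the actual rules, so the column profile fields, the thresholds, and the target type names below are all assumptions chosen for illustration; the sketch also uses crisp rules rather than the fuzzy reasoning mentioned in the keywords.

```python
from dataclasses import dataclass


@dataclass
class ColumnProfile:
    """Hypothetical summary of one source column, gathered by scanning its data."""
    source_type: str     # type name in the source RDBMS, e.g. "VARCHAR"
    max_length: int      # longest value observed, in characters
    all_integers: bool   # True if every observed value parses as an integer
    max_abs_value: int   # largest absolute integer observed (0 if not numeric)


def suggest_target_type(col: ColumnProfile) -> str:
    """Apply illustrative IF-THEN rules top to bottom; the first match wins."""
    # IF all values are integers AND they fit in 16 bits THEN SMALLINT
    if col.all_integers and col.max_abs_value < 2**15:
        return "SMALLINT"
    # IF all values are integers AND they fit in 32 bits THEN INTEGER
    if col.all_integers and col.max_abs_value < 2**31:
        return "INTEGER"
    # IF all values are integers (any size) THEN BIGINT
    if col.all_integers:
        return "BIGINT"
    # IF the text is short THEN a bounded VARCHAR sized to the data
    if col.max_length <= 255:
        return f"VARCHAR({max(1, col.max_length)})"
    # OTHERWISE fall back to an unbounded text type
    return "TEXT"
```

For example, a source column declared as VARCHAR that only ever holds small integers would be suggested as SMALLINT, which is the kind of data-driven type optimization the abstract describes.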

Keywords: Expert system, fuzzy, data migration, database, relational database, data type, relational database management system.

Downloads: 3024
6978 Dynamic Bus Binding for Low Power Using Multiple Binding Tables

Authors: Jihyung Kim, Taejin Kim, Sungho Park, Jun-Dong Cho

Abstract:

A conventional binding method for low power in high-level synthesis mainly focuses on finding an optimal binding for assumed input data and obtains only one binding table. In this paper, we show that a binding method that uses multiple binding tables obtains a better solution than conventional methods that use a single binding table, and we propose a dynamic bus binding scheme for low power using multiple binding tables. The proposed method finds multiple binding tables for the proper partitions of the input data and switches binding tables dynamically to produce the minimum total switching activity. Experimental results show that the proposed method obtains a binding solution with 12.6-28.9% smaller total switching activity than the conventional methods.
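The benefit of switching binding tables can be shown with a toy model. This is a sketch under stated assumptions, not the authors' algorithm: switching activity is modeled as the Hamming distance between consecutive words on each bus, a binding table is a permutation assigning each transferred word to a bus, buses are assumed to start at zero, and the best tables are found by brute force (feasible only for tiny inputs).

```python
from itertools import permutations, product


def hamming(a: int, b: int) -> int:
    """Number of bit positions that toggle between two words."""
    return bin(a ^ b).count("1")


def total_switching(steps, bindings):
    """Total bus toggles for a sequence of transfers.

    steps[t] is a tuple of words transferred at control step t;
    bindings[t][i] is the bus that carries steps[t][i].
    """
    n_buses = len(steps[0])
    prev = [0] * n_buses  # assumption: every bus initially holds 0
    total = 0
    for words, binding in zip(steps, bindings):
        cur = [0] * n_buses
        for word, bus in zip(words, binding):
            cur[bus] = word
        total += sum(hamming(p, c) for p, c in zip(prev, cur))
        prev = cur
    return total


def best_single_table(steps):
    """Minimum switching when one binding table is used for all steps."""
    tables = permutations(range(len(steps[0])))
    return min(total_switching(steps, [t] * len(steps)) for t in tables)


def best_multi_table(steps, partition_size):
    """Minimum switching when each partition of steps may use its own table."""
    tables = list(permutations(range(len(steps[0]))))
    n_parts = -(-len(steps) // partition_size)  # ceiling division
    best = None
    for choice in product(tables, repeat=n_parts):
        bindings = [choice[t // partition_size] for t in range(len(steps))]
        cost = total_switching(steps, bindings)
        best = cost if best is None else min(best, cost)
    return best
```

Because the single-table solution is always among the multi-table candidates, the multi-table minimum can never be worse in this model; the size of the gain depends entirely on the input data.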

Keywords: low power, bus binding, switching activity, multiple binding tables

Downloads: 956
6977 Spatial Data Mining by Decision Trees

Authors: S. Oujdi, H. Belbachir

Abstract:

Existing data mining methods cannot be applied directly to spatial data because such data require spatial specificities, such as spatial relationships, to be taken into account. This paper focuses on classification with decision trees, one of the data mining techniques. We propose an extension of the C4.5 algorithm for spatial data, based on two different approaches: join materialization and querying the different tables on the fly. Similar work has been done on these two main approaches. The first, join materialization, favors processing time at the expense of memory space, whereas the second, querying the different tables on the fly, favors memory space at the expense of processing time. The modified C4.5 algorithm requires three input tables: a target table, a neighbor table, and a spatial join index that contains the possible spatial relationships between the objects in the target table and those in the neighbor table. The proposed algorithms are applied to a spatial data pattern in the accidentology domain. A comparative study of our approach with other works on classification by spatial decision trees is detailed.

Keywords: C4.5 Algorithm, Decision trees, S-CART, Spatial data mining.

Downloads: 2532
6976 Balanced k-Anonymization

Authors: Sabah S. Al-Fedaghi

Abstract:

The technique of k-anonymization has been proposed to obfuscate private data by associating it with at least k identities. This paper investigates the basic tabular structures that underlie the notion of k-anonymization using cell suppression. These structures are studied under idealized conditions to identify the essential features of the k-anonymization notion. We optimize data k-anonymization by requiring a minimum number of anonymized values that are balanced over all columns and rows. We study the relationship between the sizes of the anonymized tables, the value k, and the number of attributes. This study has theoretical value in that it contributes to developing a mathematical foundation for the k-anonymization concept. Its practical significance is still to be investigated.
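As a concrete reading of k-anonymization by cell suppression, the sketch below checks whether a small table is k-anonymous after selected cells are replaced by a suppression marker. It assumes the simplest definition, in which a suppressed cell is treated as the literal value "*" and every distinct row must occur at least k times; the function names are illustrative, and the balancing criterion studied in the paper is not implemented.

```python
from collections import Counter


def suppress(rows, cells):
    """Return a copy of rows with the given (row, column) cells replaced by '*'."""
    out = [list(r) for r in rows]
    for i, j in cells:
        out[i][j] = "*"
    return [tuple(r) for r in out]


def is_k_anonymous(rows, k):
    """True if every distinct row occurs at least k times."""
    counts = Counter(tuple(r) for r in rows)
    return all(c >= k for c in counts.values())


def suppressed_per_column(rows):
    """Count suppressed cells in each column, e.g. to inspect how balanced they are."""
    n_cols = len(rows[0])
    return [sum(1 for r in rows if r[j] == "*") for j in range(n_cols)]
```

With four rows that are all distinct, suppressing the second column of every row leaves two groups of two identical rows, so the table becomes 2-anonymous at the cost of four suppressed cells.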

Keywords: Balanced tables, k-anonymization, private data

Downloads: 984
6975 Migration of the Relational Data Base (RDB) to the Object Relational Data Base (ORDB)

Authors: Alae El Alami, Mohamed Bahaj

Abstract:

This paper proposes an approach for translating an existing relational database (RDB) schema into an object-relational database (ORDB) schema. The transition is done with methods that can extract various functions from an RDB, based on aggregations, associations between the various tables, and reflexive relationships. These methods can even extract inheritance, although no reverse engineering process can know that a relationship is an inheritance; our approach therefore goes beyond all previous studies of the transition from RDB to ORDB. In summary, we create a New Data Model (NDM) that stores the RDB in the form of a structured table, and from the NDM we create our navigational model in order to simplify the object implementation from which we develop our different types. Through these types we proceed to the last step, the creation of tables.

The step mentioned above does not require any human intervention. All this is done automatically, and a prototype has already been created which proves the effectiveness of this approach.

Keywords: Relational databases, Object-relational databases, Semantic enrichment.

Downloads: 1700
6974 Developing Structured Sizing Systems for Manufacturing Ready-Made Garments of Indian Females Using Decision Tree-Based Data Mining

Authors: Hina Kausher, Sangita Srivastava

Abstract:

In India, there is a lack of a standard, systematic sizing approach for producing ready-made garments. Garment manufacturing companies use their own size tables, created by modifying international sizing charts for ready-made garments. The purpose of this study is to tabulate anthropometric data that cover the variety of figure proportions in both height and girth. A total of 3,000 data records were collected through an anthropometric survey of females between the ages of 16 and 80 years from several states of India to produce a sizing system suitable for clothing manufacture and retailing. The data are used for the statistical analysis of body measurements and the formulation of sizing systems and body measurement tables. Factor analysis is used to filter the control body dimensions from the large number of variables. Decision tree-based data mining is used to cluster the data. The standard and structured sizing system can facilitate pattern grading and garment production. Moreover, it can exceed buying ratios and upgrade size allocations to retail segments.

Keywords: Anthropometric data, data mining, decision tree, garments manufacturing, ready-made garments, sizing systems.

Downloads: 435
6973 A Very Efficient Pseudo-Random Number Generator Based On Chaotic Maps and S-Box Tables

Authors: M. Hamdi, R. Rhouma, S. Belghith

Abstract:

Random number generation is mainly used to create secret keys or random sequences, and it can be carried out by various techniques. In this paper we present a very simple and efficient pseudo-random number generator (PRNG) based on chaotic maps and S-Box tables. The technique adopts two main operations: the first generates chaotic values using two logistic maps, and the second transforms them into binary words using random S-Box tables. The simulation analysis indicates that our PRNG possesses excellent statistical and cryptographic properties.
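The two operations named in the abstract (iterate two logistic maps, then pass the result through an S-Box) can be sketched as follows. The abstract gives no parameters, so everything specific here is an assumption: the logistic parameter r = 3.99, the initial values, the way the two map outputs are combined (XOR of their quantized values), and the S-Box being a seeded random permutation of 0..255. This toy is for illustration only and should not be treated as cryptographically secure.

```python
import random


def make_sbox(seed: int) -> list:
    """Hypothetical S-Box: a seeded random permutation of the byte values 0..255."""
    box = list(range(256))
    random.Random(seed).shuffle(box)
    return box


def chaotic_bytes(n: int, x0: float = 0.123456, y0: float = 0.654321,
                  r: float = 3.99, sbox_seed: int = 1) -> bytes:
    """Generate n pseudo-random bytes from two logistic maps and an S-Box."""
    sbox = make_sbox(sbox_seed)
    x, y = x0, y0
    out = bytearray()
    for _ in range(n):
        # Logistic map iteration: x_{k+1} = r * x_k * (1 - x_k), values stay in (0, 1)
        x = r * x * (1.0 - x)
        y = r * y * (1.0 - y)
        # Quantize both chaotic values to a byte and combine them (assumed scheme)
        index = (int(x * 256) ^ int(y * 256)) & 0xFF
        # Substitute through the S-Box to produce the output byte
        out.append(sbox[index])
    return bytes(out)
```

The same initial values and seed always reproduce the same byte stream, which is what makes the generator pseudo-random rather than random; the initial conditions play the role of the key.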

Keywords: Chaotic map, Cryptography, Random Numbers, Statistical tests, S-box.

Downloads: 3241
6972 Data and Spatial Analysis for Economy and Education of 28 E.U. Member-States for 2014

Authors: Alexiou Dimitra, Fragkaki Maria

Abstract:

The objective of the paper is to study geographic, economic, and educational variables and their contribution to determining the position of each member state among the EU-28 countries, based on the values of seven variables as given by Eurostat. The data analysis methods of Multiple Factorial Correspondence Analysis (MFCA), Principal Component Analysis, and Factor Analysis have been used. The cross-tabulation tables of data consist of the values of seven variables for the 28 countries for 2014. The data are processed using the CHIC Analysis V 1.1 software package. The results of this program using MFCA and Ascending Hierarchical Classification are given in arithmetic and graphical form. For comparison, the Factor procedure of the statistical package IBM SPSS 20 has been applied to the same data. The numerical and graphical results, presented in tables and graphs, demonstrate the agreement between the two methods. The most important result is the study of the relation between the 28 countries and the position of each country in groups or clouds, which are formed according to the values of the corresponding variables.

Keywords: Multiple factorial correspondence analysis, principal component analysis, factor analysis, E.U.-28 countries, statistical package IBM SPSS 20, CHIC Analysis V 1.1 Software, Eurostat.eu statistics.

Downloads: 764
6971 Metadata Update Mechanism Improvements in Data Grid

Authors: S. Farokhzad, M. Reza Salehnamadi

Abstract:

Grid environments aggregate geographically distributed resources. Grids come in three types: computational, data, and storage. This paper presents research on data grids. A data grid is used to provide and secure access to data from many heterogeneous sources. Users need not be concerned with where the data is located, provided that they can access it. Metadata is used to access data in a data grid. At present, the application metadata catalogue and the SRB middleware package are used in data grids for metadata management. In this paper, updating, streamlining, and searching are made possible simultaneously and rapidly by classifying the tables that hold metadata and converting each table into several tables. The most appropriate division is determined according to the specific application. As a result of this technique, some requests can be executed concurrently and in a pipelined manner.

Keywords: Grids, data grid, metadata, update.

Downloads: 1424
6970 Programming Language Extension Using Structured Query Language for Database Access

Authors: Chapman Eze Nnadozie

Abstract:

Relational databases constitute a vital tool for the effective management and administration of both personal and organizational data. Data access ranges from single-user database management software to more complex distributed server systems. This paper appraises the use of a programming language extension, structured query language (SQL), to establish links to a relational database (Microsoft Access 2013) from the Visual C++ 9 programming environment. The methodology involves creating tables to form a database using Microsoft Access 2013, which is Object Linking and Embedding (OLE) database compliant. SQL commands are used to query the tables in the database for easy extraction of expected records inside the Visual C++ environment. The findings of this paper reveal that records can easily be accessed and manipulated to filter exactly what the user wants, such as retrieval of records matching specified criteria, updating of records, and deletion of some or all of the records in a table.

Keywords: Data access, database, database management system, OLE, programming language, records, relational database, software, SQL, table.

Downloads: 440
6969 Concurrency without Locking in Parallel Hash Structures used for Data Processing

Authors: Ákos Dudás, Sándor Juhász

Abstract:

Various mechanisms providing mutual exclusion and thread synchronization can be used to support parallel processing within a single computer. Instead of using locks, semaphores, barriers or other traditional approaches in this paper we focus on alternative ways for making better use of modern multithreaded architectures and preparing hash tables for concurrent accesses. Hash structures will be used to demonstrate and compare two entirely different approaches (rule based cooperation and hardware synchronization support) to an efficient parallel implementation using traditional locks. Comparison includes implementation details, performance ranking and scalability issues. We aim at understanding the effects the parallelization schemes have on the execution environment with special focus on the memory system and memory access characteristics.

Keywords: Lock-free synchronization, mutual exclusion, parallel hash tables, parallel performance

Downloads: 1550
6968 Reduce the Complexity of Material Requirement Planning on Excel by an Algorithm

Authors: Sumitra Nuanmeesri, Kanate Ploydanai

Abstract:

Many companies have Excel, so it is economical and effective to use Excel for material requirement planning (MRP). For several products, however, linking the relationships between the product tables is a complex problem, because the relationships depend on the bill of material (BOM). This paper presents an algorithm that creates MRP on Excel and links the relationships between tables. The study reveals that MRP created by the algorithm is easier and faster to produce than MRP created by hand. With this technique, MRP on Excel might be a good way to improve a company's productivity.

Keywords: Material requirement planning, Algorithm, Spreadsheet.

Downloads: 2973
6967 DD Models for Reports Building

Authors: Ljerka Hrženjak-Šego, Željko Polić, Zdravka Aljinović

Abstract:

In general, reports are a form of representing data in such a way that the user gets the information he needs. They can be built in various ways, from the simplest ("select from") to the most complex (results derived from different sources/tables with complex formulas applied). Furthermore, calculation rules can be hard-coded in a program or built into the database to be used by dynamic code. This paper introduces two types of reports defined in the DB structure. The main goal is to manage calculations in an optimal way, keeping the maintenance of reports as simple and smooth as possible.

Keywords: Data Definition diagram, Server Model Diagram, system modelling, reports.

Downloads: 1049
6966 Design and Development of Real-Time Optimal Energy Management System for Hybrid Electric Vehicles

Authors: Masood Roohi, Amir Taghavipour

Abstract:

This paper describes a strategy to develop an energy management system (EMS) for a charge-sustaining power-split hybrid electric vehicle. This kind of hybrid electric vehicle (HEV) benefits from the advantages of both parallel and series architectures. However, managing the power flow between the battery and the engine optimally becomes relatively more complicated. The strategy applied in this paper is based on a nonlinear model predictive control approach. First, an appropriate control-oriented model that was both sufficiently accurate and simple was derived. To use this controller in real time, the problem was solved off-line for a wide range of reference signals and initial conditions, and the computed manipulated variables were stored in look-up tables. Look-up tables require little memory. The computational load also decreased dramatically, because the controller needed only a simple interpolation between tables to find the required manipulated variables.
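The look-up-table idea (solve off-line, interpolate on-line) can be sketched in one dimension. The abstract does not give the table layout, so this assumes a single sorted grid of operating points with one precomputed control value per point, linear interpolation between neighbors, and clamping outside the grid; the names and sample numbers are illustrative only.

```python
from bisect import bisect_right


def interpolate(grid, values, x):
    """Linearly interpolate a precomputed table at x.

    grid   : sorted operating points at which the problem was solved off-line
    values : stored manipulated variable for each grid point
    x      : current operating point measured at run time
    """
    # Clamp to the table range instead of extrapolating
    if x <= grid[0]:
        return values[0]
    if x >= grid[-1]:
        return values[-1]
    # Find the interval [grid[i], grid[i + 1]) that contains x
    i = bisect_right(grid, x) - 1
    t = (x - grid[i]) / (grid[i + 1] - grid[i])
    return values[i] + t * (values[i + 1] - values[i])
```

At run time the controller then performs only a binary search and one multiply-add per lookup, instead of solving an optimization problem at every sample, which is the source of the reduced computational load.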

Keywords: Hybrid electric vehicles, energy management system, nonlinear model predictive control, real-time.

Downloads: 965
6965 Exploiting Machine Learning Techniques for the Enhancement of Acceptance Sampling

Authors: Aikaterini Fountoulaki, Nikos Karacapilidis, Manolis Manatakis

Abstract:

This paper proposes an innovative methodology for Acceptance Sampling by Variables, which is a particular category of Statistical Quality Control dealing with the assurance of product quality. Our contribution lies in the exploitation of machine learning techniques to address the complexity and remedy the drawbacks of existing approaches. More specifically, the proposed methodology exploits Artificial Neural Networks (ANNs) to aid decision making about the acceptance or rejection of an inspected sample. For any type of inspection, ANNs are trained by data from the corresponding tables of a standard's sampling plan schemes. Once trained, ANNs can give closed-form solutions for any acceptance quality level and sample size, thus leading to an automation of the reading of the sampling plan tables, without any need to compromise with the values of the specific standard chosen each time. The proposed methodology provides enough flexibility to quality control engineers during the inspection of their samples, allowing the consideration of specific needs, while it also reduces the time and the cost required for these inspections. Its applicability and advantages are demonstrated through two numerical examples.

Keywords: Acceptance Sampling, Neural Networks, Statistical Quality Control.

Downloads: 1370
6964 Materialized View Effect on Query Performance

Authors: Yusuf Ziya Ayık, Ferhat Kahveci

Abstract:

Currently, database management systems offer various tools, such as backup and maintenance, and also provide statistical information such as resource usage and security. In terms of query performance, this paper covers query optimization, views, indexed tables, pre-computation with materialized views, and query performance analysis, in which query plan alternatives can be created and the least costly one selected to optimize a query. Indexes and views can be created for related table columns. The literature review of this study showed that, over time, despite the growing capabilities of database management systems, only database administrators are aware of the need to deal with archival and transactional data types differently. The data may be constantly changing data used in everyday life, or may come from a completed questionnaire whose data input has finished. The database uses its capabilities for both types of data; but, as shown in the findings section, instead of repeating similar heavy calculations that produce the same results when the same query is run over survey results, using materialized view results is simpler. In this study, this performance difference was observed quantitatively in terms of query cost.
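The pre-computation idea can be illustrated with Python's built-in sqlite3 module. SQLite has no native materialized views, so this sketch emulates one by storing the aggregate result in an ordinary table with CREATE TABLE ... AS SELECT; the table names, columns, and sample survey rows are assumptions made for the example, and no refresh logic is included because the survey data are taken to be final.

```python
import sqlite3

# The "heavy" aggregate that would otherwise be recomputed on every request
AGGREGATE_SQL = (
    "SELECT question_id, AVG(score) AS avg_score, COUNT(*) AS n "
    "FROM answers GROUP BY question_id"
)


def build_demo_database() -> sqlite3.Connection:
    """Create an in-memory survey table and an emulated materialized view."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE answers (question_id INTEGER, score INTEGER)")
    conn.executemany(
        "INSERT INTO answers VALUES (?, ?)",
        [(1, 4), (1, 5), (2, 3), (2, 1), (2, 2)],
    )
    # Emulated materialized view: run the aggregate once and store its rows
    conn.execute("CREATE TABLE answers_summary AS " + AGGREGATE_SQL)
    conn.commit()
    return conn


def query_direct(conn):
    """Recompute the aggregate from the base table."""
    return conn.execute(AGGREGATE_SQL + " ORDER BY question_id").fetchall()


def query_precomputed(conn):
    """Read the stored result instead of recomputing it."""
    return conn.execute(
        "SELECT question_id, avg_score, n FROM answers_summary "
        "ORDER BY question_id"
    ).fetchall()
```

Both functions return the same rows, but the second reads one stored row per question rather than scanning and grouping every answer. In a DBMS with real materialized views the stored result would be declared as such and refreshed by the system when the base data change.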

Keywords: Materialized view, pre-computation, query cost, query performance.

Downloads: 806
6963 The Impact of Parent Involvement in Preschool Disabled Children

Authors: Sheng-Min Cheng

Abstract:

The purpose of this study was to investigate the relationship between parent involvement and preschool disabled children’s development. Parents of 3-year-old disabled children (N=440) and 5-year-old disabled children (N=937) participating in the Special Needs Education Longitudinal Study were interviewed or answered a web-based questionnaire about their actions in parenting their disabled children. The children’s development was also evaluated by their teachers. Data were analyzed using Structural Equation Modeling. Results are presented in tables and figures. Based on the results, the researcher made some suggestions for future studies.

Keywords: Child development, longitudinal data analysis, parent involvement, preschool disabled children.

Downloads: 1898
6962 An Efficient Architecture for Interleaved Modular Multiplication

Authors: Ahmad M. Abdel Fattah, Ayman M. Bahaa El-Din, Hossam M.A. Fahmy

Abstract:

Modular multiplication is the basic operation in most public key cryptosystems, such as RSA, DSA, ECC, and DH key exchange. Unfortunately, very large operands (in order of 1024 or 2048 bits) must be used to provide sufficient security strength. The use of such big numbers dramatically slows down the whole cipher system, especially when running on embedded processors. So far, customized hardware accelerators - developed on FPGAs or ASICs - were the best choice for accelerating modular multiplication in embedded environments. On the other hand, many algorithms have been developed to speed up such operations. Examples are the Montgomery modular multiplication and the interleaved modular multiplication algorithms. Combining both customized hardware with an efficient algorithm is expected to provide a much faster cipher system. This paper introduces an enhanced architecture for computing the modular multiplication of two large numbers X and Y modulo a given modulus M. The proposed design is compared with three previous architectures depending on carry save adders and look up tables. Look up tables should be loaded with a set of pre-computed values. Our proposed architecture uses the same carry save addition, but replaces both look up tables and pre-computations with an enhanced version of sign detection techniques. The proposed architecture supports higher frequencies than other architectures. It also has a better overall absolute time for a single operation.
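For reference, the textbook interleaved modular multiplication that such architectures build on can be sketched in a few lines. This is the plain bit-serial baseline with explicit comparisons, not the proposed hardware design: it uses neither carry-save adders, look-up tables, nor sign detection, and it assumes both operands are already reduced modulo M.

```python
def interleaved_modmul(x: int, y: int, m: int) -> int:
    """Compute (x * y) mod m by interleaving shift, add, and reduce steps.

    Requires 0 <= x < m and 0 <= y < m.
    """
    if m <= 0 or not (0 <= x < m and 0 <= y < m):
        raise ValueError("operands must satisfy 0 <= x, y < m")
    p = 0
    # Scan the bits of x from most significant to least significant
    for i in range(x.bit_length() - 1, -1, -1):
        p <<= 1                  # p = 2 * p
        if (x >> i) & 1:
            p += y               # add y when the current bit of x is 1
        # Here p < 3 * m, so at most two subtractions bring it below m
        if p >= m:
            p -= m
        if p >= m:
            p -= m
    return p
```

Each loop iteration keeps the partial result below M, so intermediate values never grow beyond a few bits more than the modulus. The comparisons against M in every iteration are the costly part in hardware, which is the step that look-up tables or sign detection techniques aim to replace.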

Keywords: Montgomery multiplication, modular multiplication, efficient architecture, FPGA, RSA

Downloads: 2135
6961 Applying Spanning Tree Graph Theory for Automatic Database Normalization

Authors: Chetneti Srisa-an

Abstract:

In the field of knowledge and data engineering, the relational database is the best repository for storing real-world data. It has been in use around the world for several decades. Normalization is the most important process in the analysis and design of relational databases. It aims at creating a set of relational tables with minimum data redundancy that preserve consistency and facilitate correct insertion, deletion, and modification. Despite its importance, very few algorithms have been developed for use in commercial automatic normalization tools, and normalization is rarely done automatically rather than manually. Moreover, for today's large and complex databases, it is even harder to do manually. This paper presents a new, fully automated relational database normalization method. It first produces the directed graph and spanning tree. It then proceeds to generate the 2NF, 3NF, and BCNF normal forms. The benefit of this new algorithm is that it can cope with a large set of complex functional dependencies.
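Any automated normalization procedure has to reason about functional dependencies, and the standard building block for that is the attribute closure. The sketch below shows this well-known computation and a superkey test built on it; it is not the spanning-tree method of the paper, and the representation of dependencies as pairs of Python sets is an assumption made for the example.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs.

    fds is a list of (lhs, rhs) pairs, each a set of attribute names,
    meaning lhs -> rhs.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If everything on the left is already known, the right follows
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attrs, all_attrs, fds):
    """True if attrs determines every attribute of the relation."""
    return closure(attrs, fds) == set(all_attrs)
```

For a relation R(A, B, C) with A -> B and B -> C, the closure of {A} is {A, B, C}, so A is a key, while B -> C is a transitive dependency of the kind that 3NF decomposition removes.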

Keywords: Relational Database, Functional Dependency, Automatic Normalization, Primary Key, Spanning tree.

Downloads: 2571
6960 Target Concept Selection by Property Overlap in Ontology Population

Authors: Seong-Bae Park, Sang-Soo Kim, Sewook Oh, Zooyl Zeong, Hojin Lee, Seong Rae Park

Abstract:

Ontologies are widely used in many kinds of applications as a knowledge representation tool for domain knowledge. However, even when an ontology schema is well prepared by domain experts, adding instances to the ontology is tedious and cost-intensive. The most confident and trustworthy way to add instances to the ontology is to gather them from tables in related Web pages. In automatic instance population, the primary task is to find the most appropriate concept among all possible concepts within the ontology for a given table. This paper proposes a novel method for this problem that defines the similarity between the table and the concept using the overlap of their properties. According to a series of experiments, the proposed method achieves an accuracy of 76.98%. This implies that the proposed method is a plausible approach to automatic ontology population from Web tables.
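One simple way to score the overlap between a table's properties and a concept's properties is the Jaccard coefficient. The abstract does not state the exact similarity formula, so the choice of Jaccard, the case-insensitive exact matching of property names, and the example ontology below are all assumptions made for illustration.

```python
def property_overlap(table_props, concept_props):
    """Jaccard overlap between two collections of property names."""
    t = {p.strip().lower() for p in table_props}
    c = {p.strip().lower() for p in concept_props}
    if not t or not c:
        return 0.0
    return len(t & c) / len(t | c)


def select_concept(table_props, ontology):
    """Pick the concept whose properties overlap most with the table header.

    ontology maps each concept name to a list of its property names.
    """
    return max(ontology, key=lambda name: property_overlap(table_props, ontology[name]))
```

A Web table with the columns Title, Year, Director, and Runtime would be matched to a hypothetical Film concept with the properties title, director, and year (overlap 3/4) rather than to a Person concept with no properties in common. Real column headers rarely match ontology property names exactly, so a practical system would need a more tolerant matching step than this one.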

Keywords: Ontology population, domain knowledge consolidation, target concept selection, property overlap.

Downloads: 1405
6959 An Intelligent Approach of Rough Set in Knowledge Discovery Databases

Authors: Hrudaya Ku. Tripathy, B. K. Tripathy, Pradip K. Das

Abstract:

Knowledge Discovery in Databases (KDD) has evolved into an important and active area of research because of theoretical challenges and practical applications associated with the problem of discovering (or extracting) interesting and previously unknown knowledge from very large real-world databases. Rough Set Theory (RST) is a mathematical formalism for representing uncertainty that can be considered an extension of classical set theory. It has been used in many different research areas, including those related to inductive machine learning and reduction of knowledge in knowledge-based systems. One important concept related to RST is that of a rough relation. In this paper we present the current status of research on applying rough set theory to KDD, which will be helpful for handling the characteristics of real-world databases. The main aim is to show how rough sets and rough set analysis can be effectively used to extract knowledge from large databases.
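The core rough set construction on a data table is the pair of lower and upper approximations of a target set. The sketch below computes both from the indiscernibility classes induced by a chosen set of attributes; the table layout (a dictionary of rows keyed by object identifier) and the sample attribute names are assumptions made for the example.

```python
from collections import defaultdict


def approximations(table, attributes, target):
    """Return (lower, upper) approximations of target.

    table      : dict mapping object id -> dict of attribute -> value
    attributes : attributes used to decide which objects are indiscernible
    target     : set of object ids to approximate
    """
    # Group objects that share the same values on the chosen attributes
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attributes)].add(obj)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block overlaps the target
            upper |= block
    return lower, upper
```

Objects in the lower approximation certainly belong to the target given the available attributes, objects in the upper approximation possibly belong, and the difference between the two is the boundary region that makes the set rough.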

Keywords: Data mining, Data tables, Knowledge discovery in database (KDD), Rough sets.

Downloads: 2040
6958 Power Production Performance of Different Wave Energy Converters in the Southwestern Black Sea

Authors: Ajab G. Majidi, Bilal Bingölbali, Adem Akpınar

Abstract:

This study investigates the amount of energy (the economic wave energy potential) that can be obtained from existing wave energy converters in the high wave energy potential region of the Black Sea, and their performance at different depths in the region. The data needed for this purpose were obtained using the calibrated nested layered SWAN wave modeling program version 41.01AB, which was forced with Climate Forecast System Reanalysis (CFSR) winds from 1979 to 2009. The wave dataset, at a time interval of 2 hours, was accumulated for a sub-grid domain around Karaburun beach in Arnavutkoy, a district of Istanbul. The annual sea state characteristic matrices for five different depths along a line perpendicular to the coastline were calculated for 31 years. According to the power matrices of different wave energy converter systems and the characteristic matrices for each possible installation depth, the probability distribution tables of the specified mean wave period or wave energy period and significant wave height were calculated. Then, by using the relationship between these distribution tables, according to the present wave climate, the energy that the wave energy converter systems at each depth can produce was determined. Thus, the economically feasible potential of the relevant coastal zone was revealed, and the effect of different depths on energy converter systems is presented. The Oceantic at 50, 75 and 100 m depths and the Oyster at 5 and 25 m depths present the best performance. Over the 31-year period, 1998 was the most dynamic year and 1989 the least.

Keywords: Annual power production, Black Sea, efficiency, power production performance, wave energy converter.

Downloads: 281
6957 Measurement Tools of the Maturity Model for IT Service Outsourcing in Higher Education Institutions

Authors: Victoriano Valencia García, Luis Usero Aragonés, Eugenio J. Fernández Vicente

Abstract:

Nowadays, the successful implementation of ICTs is vital for almost any kind of organization. Good governance and ICT management are essential for delivering value, managing technological risks, managing resources and performance measurement. In addition, outsourcing is a strategic IT service solution which complements IT services provided internally in organizations. This paper proposes the measurement tools of a new holistic maturity model based on standards ISO/IEC 20000 and ISO/IEC 38500, and the frameworks and best practices of ITIL and COBIT, with a specific focus on IT outsourcing. These measurement tools allow independent validation and practical application in the field of higher education, using a questionnaire, metrics tables, and continuous improvement plan tables as part of the measurement process. Guidelines and standards are proposed in the model for facilitating adaptation to universities and achieving excellence in the outsourcing of IT services.

Keywords: IT Governance, IT Management, IT Services, Maturity Model, Measurement Tools, Outsourcing.

Downloads: 2544
6956 Big Data: Big Challenges to Privacy and Data Protection

Authors: Abu Bakar Munir, Siti Hajar Mohd Yasin, Firdaus Muhammad-Sukki

Abstract:

This paper seeks to analyse the benefits of big data and, more importantly, the challenges it poses to privacy and data protection. First, the nature of big data is briefly discussed before presenting the potential of big data today. Afterwards, the issue of privacy and data protection is highlighted before discussing the challenges of implementing it in big data. In conclusion, the paper puts forward the debate on the adequacy of the existing legal framework in protecting personal data in the era of big data.

Keywords: Big data, data protection, information, privacy.

Downloads: 3319
6955 A Visual Analytics Tool for the Structural Health Monitoring of an Aircraft Panel

Authors: F. M. Pisano, M. Ciminello

Abstract:

Aerospace, mechanical, and civil engineering infrastructures can benefit from damage detection and identification strategies in terms of maintenance cost reduction and operational life improvements, as well as for safety purposes. The challenge is to detect so-called "barely visible impact damage" (BVID), due to low/medium energy impacts, which can progressively compromise structural integrity. The occurrence of any local change in material properties that can degrade structural performance is to be monitored using so-called Structural Health Monitoring (SHM) systems, which compare the states of the structure before and after damage occurs. SHM looks for any "anomalous" response collected by means of sensor networks and then analyzed using appropriate algorithms. Independently of the specific analysis approach adopted for structural damage detection and localization, textual reports, tables and graphs describing possible outlier coordinates and damage severity are usually provided as artifacts to be processed in order to extract information about the current health condition of the structure under investigation. Visual Analytics can support the processing of monitored measurements by offering data navigation and exploration tools that leverage the native human capability of understanding images faster than text and tables. Herein, the enrichment of an SHM system by integration of a Visual Analytics component is investigated. Analytical dashboards have been created by combining worksheets, so that a useful Visual Analytics tool is provided to structural analysts for exploring the structural health conditions examined by a Principal Component Analysis based algorithm.

Keywords: Interactive dashboards, optical fibers, structural health monitoring, visual analytics.

Downloads 269
6954 Testing Database of Information System using Conceptual Modeling

Authors: Bogdan Walek, Cyril Klimes

Abstract:

This paper focuses on testing the database of an existing information system. First, we describe the basic problems of implemented databases, such as data redundancy, poor design of the logical database structure, or inappropriate data types in the columns of database tables. These problems are often the result of an incorrect understanding of the primary requirements for the database of an information system. We then propose an algorithm that compares the conceptual model created from vague requirements for a database with a conceptual model reconstructed from the implemented database. The algorithm also suggests steps leading to optimization of the implemented database. The proposed algorithm is verified by an implemented prototype. The paper also describes a fuzzy system that works with the vague requirements for the database of an information system, a procedure for creating a conceptual model from vague requirements, and an algorithm for reconstructing a conceptual model from an implemented database.
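The abstract does not give the details of the comparison algorithm, so the following is only a minimal sketch of the general idea: comparing a required conceptual model against one reconstructed from an implemented database. It uses a crisp (non-fuzzy) comparison, and the model representation (`{entity: {attribute: type}}`), the function name `compare_models`, and the sample entities are all hypothetical and not taken from the paper.

```python
def compare_models(required, implemented):
    """Compare two conceptual models given as {entity: {attribute: type}}."""
    report = {
        "missing_entities": sorted(required.keys() - implemented.keys()),
        "extra_entities": sorted(implemented.keys() - required.keys()),
        "attribute_issues": {},
    }
    # For entities present in both models, compare attributes and their types.
    for entity in sorted(required.keys() & implemented.keys()):
        req, imp = required[entity], implemented[entity]
        issues = {
            "missing": sorted(req.keys() - imp.keys()),
            "extra": sorted(imp.keys() - req.keys()),
            "type_mismatch": sorted(
                a for a in req.keys() & imp.keys() if req[a] != imp[a]
            ),
        }
        if any(issues.values()):
            report["attribute_issues"][entity] = issues
    return report


# Hypothetical models, invented for illustration only.
required = {
    "Customer": {"id": "integer", "name": "text"},
    "Order": {"id": "integer", "total": "decimal"},
}
implemented = {
    "Customer": {"id": "integer", "name": "integer", "fax": "text"},
    "Order": {"id": "integer", "total": "decimal"},
    "Log": {"id": "integer"},
}
report = compare_models(required, implemented)
```

In this sketch, each reported difference (an extra entity, an unexpected attribute, a mismatched type) would be a candidate for the kind of optimization step the abstract mentions.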

Keywords: Testing, database, relational database, information system, conceptual model, fuzzy, uncertain information, database testing, reconstruction, requirements, optimization.

Downloads 1196
6953 Data Preprocessing for Supervised Leaning

Authors: S. B. Kotsiantis, D. Kanellopoulos, P. E. Pintelas

Abstract:

Many factors affect the success of Machine Learning (ML) on a given task. The representation and quality of the instance data are first and foremost. If much irrelevant and redundant information is present, or the data are noisy and unreliable, then knowledge discovery during the training phase is more difficult. It is well known that data preparation and filtering steps take a considerable amount of processing time in ML problems. Data pre-processing includes data cleaning, normalization, transformation, feature extraction and selection, etc. The product of data pre-processing is the final training set. It would be convenient if a single sequence of data pre-processing algorithms had the best performance for every data set, but this is not the case. Thus, we present the best-known algorithms for each step of data pre-processing so that one can achieve the best performance for a given data set.
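As a rough illustration of how some of the steps named above can be chained, the sketch below applies cleaning, normalization, and feature selection in sequence to produce a final training set. The specific choices (dropping instances with missing values, min-max scaling, a variance threshold) are assumptions made for this example and are not necessarily the algorithms the paper surveys; all function names and the sample data are hypothetical.

```python
from statistics import pvariance


def clean(rows):
    """Data cleaning: drop instances that contain a missing (None) value."""
    return [r for r in rows if all(v is not None for v in r)]


def min_max_normalize(rows):
    """Normalization: rescale every feature column to the range [0, 1]."""
    scaled = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column has no spread; map it to 0.0 to avoid dividing by zero.
        scaled.append([(v - lo) / span if span else 0.0 for v in col])
    return [list(r) for r in zip(*scaled)]


def select_features(rows, min_variance=1e-12):
    """Feature selection: keep only columns whose variance exceeds a threshold."""
    cols = list(zip(*rows))
    keep = [i for i, col in enumerate(cols) if pvariance(col) > min_variance]
    return [[r[i] for i in keep] for r in rows], keep


# Invented sample data: the second row has a missing value,
# and the third feature is constant.
raw = [
    [1.0, 10.0, 5.0],
    [2.0, None, 5.0],
    [3.0, 30.0, 5.0],
    [5.0, 50.0, 5.0],
]
training_set, kept = select_features(min_max_normalize(clean(raw)))
```

The order of the steps matters in this sketch: normalizing before cleaning would fail on the missing value, which reflects the abstract's point that no single sequence suits every data set.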

Keywords: Data mining, feature selection, data cleaning.

Downloads 4949
6952 Applications of Big Data in Education

Authors: Faisal Kalota

Abstract:

Big Data and analytics have gained huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA), which may allow academic institutions to better understand learners’ needs and proactively address them. Hence, it is important to have an understanding of Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in education. Additionally, it discusses some of the concerns related to Big Data and current research trends. While Big Data can provide big benefits, it is important that institutions understand their own needs, infrastructure, resources, and limitations before jumping on the Big Data bandwagon.

Keywords: Analytics, Big Data in Education, Hadoop, Learning Analytics.

Downloads 3770
6951 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, Wang Qun Lin

Abstract:

This paper introduces the concept and principle of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of a typical cleaning process. It puts forward a scalable and versatile data cleaning framework. For data with attribute dependency relations, it designs several violation data discovery algorithms by formal formula, which can obtain inconsistent data for all target columns with conditional attribute dependencies, regardless of whether the data is structured (SQL) or unstructured (NoSQL). Finally, it gives 6 data cleaning methods based on these algorithms.
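The abstract does not spell out the discovery algorithms, so the following is only a generic sketch of dependency-rule violation detection, not the authors' method. It checks a rule of the form "records that agree on the attributes `lhs` must agree on `rhs`" over a list of dictionaries, which can stand in for either SQL rows or NoSQL documents. The function name `find_violations` and the zip/city sample data are hypothetical.

```python
from collections import defaultdict


def find_violations(records, lhs, rhs):
    """Return groups of records that agree on `lhs` but disagree on `rhs`."""
    groups = defaultdict(list)
    for rec in records:
        # Documents without all attributes named in the rule are skipped,
        # since the rule cannot be evaluated on them.
        if all(a in rec for a in lhs) and rhs in rec:
            groups[tuple(rec[a] for a in lhs)].append(rec)
    violations = {}
    for key, recs in groups.items():
        # More than one distinct `rhs` value within a group breaks the rule.
        if len({r[rhs] for r in recs}) > 1:
            violations[key] = recs
    return violations


# Invented sample data: the rule "zip determines city" is violated for 10001.
records = [
    {"id": 1, "zip": "10001", "city": "New York"},
    {"id": 2, "zip": "10001", "city": "Newark"},
    {"id": 3, "zip": "94105", "city": "San Francisco"},
    {"id": 4, "zip": "94105", "city": "San Francisco"},
    {"id": 5, "city": "Boston"},
]
bad = find_violations(records, lhs=["zip"], rhs="city")
```

The sketch assumes attribute values are hashable. A repair step, such as replacing minority values with the majority value in each violating group, would build on the groups returned here.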

Keywords: Data cleaning, dependency rules, violation data discovery, data repair.

Downloads 2058
6950 Coalescing Data Marts

Authors: N. Parimala, P. Pahwa

Abstract:

OLAP uses multidimensional structures to provide access to data for analysis. Traditionally, OLAP operations focus on retrieving data from a single data mart. An exception is the drill-across operator. This, however, is restricted to retrieving facts on the common dimensions of multiple data marts. Our concern is to define further operations for retrieving data from multiple data marts. To this end, we have defined six operations that coalesce data marts. In doing so, we consider the common as well as the non-common dimensions of the data marts.
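To make the restriction of drill-across concrete, the sketch below rolls two fact tables up to their common dimensions and pairs the resulting measures. It illustrates only the conventional drill-across operator, not the six coalescing operations the paper defines, which the abstract does not describe. The function names, dimension names, and sample facts are all hypothetical.

```python
from collections import defaultdict


def roll_up(facts, dims, measure):
    """Aggregate a fact table to the given dimensions by summing one measure."""
    totals = defaultdict(float)
    for row in facts:
        totals[tuple(row[d] for d in dims)] += row[measure]
    return totals


def drill_across(facts_a, measure_a, facts_b, measure_b, common_dims):
    """Pair the measures of two data marts on their common dimensions only."""
    a = roll_up(facts_a, common_dims, measure_a)
    b = roll_up(facts_b, common_dims, measure_b)
    # Only dimension values present in both marts appear in the result.
    return {k: (a[k], b[k]) for k in sorted(a.keys() & b.keys())}


# Invented marts: "store" and "carrier" are non-common dimensions
# and are aggregated away before the two marts are combined.
sales = [
    {"product": "P1", "month": "Jan", "store": "S1", "amount": 100.0},
    {"product": "P1", "month": "Jan", "store": "S2", "amount": 50.0},
    {"product": "P2", "month": "Jan", "store": "S1", "amount": 70.0},
]
shipments = [
    {"product": "P1", "month": "Jan", "carrier": "C1", "units": 8.0},
    {"product": "P1", "month": "Jan", "carrier": "C2", "units": 2.0},
    {"product": "P3", "month": "Jan", "carrier": "C1", "units": 5.0},
]
result = drill_across(sales, "amount", shipments, "units", ["product", "month"])
```

Note how the non-common dimensions are lost in the result; operations that also take those dimensions into account are the gap the abstract sets out to address.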

Keywords: Data warehouse, Dimension, OLAP, Star Schema.

Downloads 1327