Search results for: data transformation

7541 Thailand National Biodiversity Database System with webMathematica and Google Earth

Authors: W. Katsarapong, W. Srisang, K. Jaroensutasinee, M. Jaroensutasinee

Abstract:

National Biodiversity Database System (NBIDS) has been developed for collecting Thai biodiversity data. The goal of this project is to provide advanced tools for querying, analyzing, modeling, and visualizing patterns of species distribution for researchers and scientists. NBIDS data record two types of datasets: biodiversity data and environmental data. Biodiversity data are specie presence data and species status. The attributes of biodiversity data can be further classified into two groups: universal and projectspecific attributes. Universal attributes are attributes that are common to all of the records, e.g. X/Y coordinates, year, and collector name. Project-specific attributes are attributes that are unique to one or a few projects, e.g., flowering stage. Environmental data include atmospheric data, hydrology data, soil data, and land cover data collecting by using GLOBE protocols. We have developed webbased tools for data entry. Google Earth KML and ArcGIS were used as tools for map visualization. webMathematica was used for simple data visualization and also for advanced data analysis and visualization, e.g., spatial interpolation, and statistical analysis. NBIDS will be used by park rangers at Khao Nan National Park, and researchers.

Keywords: GLOBE protocol, Biodiversity, Database System, ArcGIS, Google Earth and webMathematica.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1986

7540 Evaluation of Clustering Based on Preprocessing in Gene Expression Data

Authors: Seo Young Kim, Toshimitsu Hamasaki

Abstract:

Microarrays have become the effective, broadly used tools in biological and medical research to address a wide range of problems, including classification of disease subtypes and tumors. Many statistical methods are available for analyzing and systematizing these complex data into meaningful information, and one of the main goals in analyzing gene expression data is the detection of samples or genes with similar expression patterns. In this paper, we express and compare the performance of several clustering methods based on data preprocessing including strategies of normalization or noise clearness. We also evaluate each of these clustering methods with validation measures for both simulated data and real gene expression data. Consequently, clustering methods which are common used in microarray data analysis are affected by normalization and degree of noise and clearness for datasets.

Keywords: Gene expression, clustering, data preprocessing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1744

7539 Addressing Data Security in the Cloud

Authors: Marinela Mircea

Abstract:

The development of information and communication technology, the increased use of the internet, as well as the effects of the recession within the last years, have lead to the increased use of cloud computing based solutions, also called on-demand solutions. These solutions offer a large number of benefits to organizations as well as challenges and risks, mainly determined by data visualization in different geographic locations on the internet. As far as the specific risks of cloud environment are concerned, data security is still considered a peak barrier in adopting cloud computing. The present study offers an approach upon ensuring the security of cloud data, oriented towards the whole data life cycle. The final part of the study focuses on the assessment of data security in the cloud, this representing the bases in determining the potential losses and the premise for subsequent improvements and continuous learning.

Keywords: cloud computing, data life cycle, data security, security assessment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2164

7538 Simulation of a Process Design Model for Anaerobic Digestion of Municipal Solid Wastes

Authors: Asok Adak, Debabrata Mazumder, Pratip Bandyopadhyay

Abstract:

Anaerobic Digestion has become a promising technology for biological transformation of organic fraction of the municipal solid wastes (MSW). In order to represent the kinetic behavior of such biological process and thereby to design a reactor system, development of a mathematical model is essential. Addressing this issue, a simplistic mathematical model has been developed for anaerobic digestion of MSW in a continuous flow reactor unit under homogeneous steady state condition. Upon simulated hydrolysis, the kinetics of biomass growth and substrate utilization rate are assumed to follow first order reaction kinetics. Simulation of this model has been conducted by studying sensitivity of various process variables. The model was simulated using typical kinetic data of anaerobic digestion MSW and typical MSW characteristics of Kolkata. The hydraulic retention time (HRT) and solid retention time (SRT) time were mainly estimated by varying different model parameters like efficiency of reactor, influent substrate concentration and biomass concentration. Consequently, design table and charts have also been prepared for ready use in the actual plant operation.

Keywords: Anaerobic digestion, municipal solid waste (MSW), process design model, simulation study, hydraulic retention time(HRT), solid retention time (SRT).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2688

7537 A Network Traffic Prediction Algorithm Based On Data Mining Technique

Authors: D. Prangchumpol

Abstract:

This paper is a description approach to predict incoming and outgoing data rate in network system by using association rule discover, which is one of the data mining techniques. Information of incoming and outgoing data in each times and network bandwidth are network performance parameters, which needed to solve in the traffic problem. Since congestion and data loss are important network problems. The result of this technique can predicted future network traffic. In addition, this research is useful for network routing selection and network performance improvement.

Keywords: Traffic prediction, association rule, data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3674

7536 Fuzzy Processing of Uncertain Data

Authors: Petr Morávek, Miloš Šeda

Abstract:

In practice, we often come across situations where it is necessary to make decisions based on incomplete or uncertain data. In control systems it may be due to the unknown exact mathematical model, or its excessive complexity (e.g. nonlinearity) when it is necessary to simplify it, respectively, to solve it using a rule base. In the case of databases, searching data we compare a similarity measure with of the requirements of the selection with stored data, where both the select query and the data itself may contain vague terms, for example in the form of linguistic qualifiers. In this paper, we focus on the processing of uncertain data in databases and demonstrate it on the example multi-criteria decision making in the selection of variants, specified by higher number of technical parameters.

Keywords: fuzzy logic, linguistic variable, multicriteria decision

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1421

7535 Automated Stereophotogrammetry Data Cleansing

Authors: Stuart Henry, Philip Morrow, John Winder, Bryan Scotney

Abstract:

The stereophotogrammetry modality is gaining more widespread use in the clinical setting. Registration and visualization of this data, in conjunction with conventional 3D volumetric image modalities, provides virtual human data with textured soft tissue and internal anatomical and structural information. In this investigation computed tomography (CT) and stereophotogrammetry data is acquired from 4 anatomical phantoms and registered using the trimmed iterative closest point (TrICP) algorithm. This paper fully addresses the issue of imaging artifacts around the stereophotogrammetry surface edge using the registered CT data as a reference. Several iterative algorithms are implemented to automatically identify and remove stereophotogrammetry surface edge outliers, improving the overall visualization of the combined stereophotogrammetry and CT data. This paper shows that outliers at the surface edge of stereophotogrammetry data can be successfully removed automatically.

Keywords: Data cleansing, stereophotogrammetry.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1844

7534 An Improved Data Mining Method Applied to the Search of Relationship between Metabolic Syndrome and Lifestyles

Authors: Yi Chao Huang, Yu Ling Liao, Chiu Shuang Lin

Abstract:

A data cutting and sorting method (DCSM) is proposed to optimize the performance of data mining. DCSM reduces the calculation time by getting rid of redundant data during the data mining process. In addition, DCSM minimizes the computational units by splitting the database and by sorting data with support counts. In the process of searching for the relationship between metabolic syndrome and lifestyles with the health examination database of an electronics manufacturing company, DCSM demonstrates higher search efficiency than the traditional Apriori algorithm in tests with different support counts.

Keywords: Data mining, Data cutting and sorting method, Apriori algorithm, Metabolic syndrome

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1590

7533 Data Mining Approach for Commercial Data Classification and Migration in Hybrid Storage Systems

Authors: Mais Haj Qasem, Maen M. Al Assaf, Ali Rodan

Abstract:

Parallel hybrid storage systems consist of a hierarchy of different storage devices that vary in terms of data reading speed performance. As we ascend in the hierarchy, data reading speed becomes faster. Thus, migrating the application’ important data that will be accessed in the near future to the uppermost level will reduce the application I/O waiting time; hence, reducing its execution elapsed time. In this research, we implement trace-driven two-levels parallel hybrid storage system prototype that consists of HDDs and SSDs. The prototype uses data mining techniques to classify application’ data in order to determine its near future data accesses in parallel with the its on-demand request. The important data (i.e. the data that the application will access in the near future) are continuously migrated to the uppermost level of the hierarchy. Our simulation results show that our data migration approach integrated with data mining techniques reduces the application execution elapsed time when using variety of traces in at least to 22%.

Keywords: Data mining, hybrid storage system, recurrent neural network, support vector machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1737

7532 Association Rules Mining and NOSQL Oriented Document in Big Data

Authors: Sarra Senhadji, Imene Benzeguimi, Zohra Yagoub

Abstract:

Big Data represents the recent technology of manipulating voluminous and unstructured data sets over multiple sources. Therefore, NOSQL appears to handle the problem of unstructured data. Association rules mining is one of the popular techniques of data mining to extract hidden relationship from transactional databases. The algorithm for finding association dependencies is well-solved with Map Reduce. The goal of our work is to reduce the time of generating of frequent itemsets by using Map Reduce and NOSQL database oriented document. A comparative study is given to evaluate the performances of our algorithm with the classical algorithm Apriori.

Keywords: Apriori, Association rules mining, Big Data, data mining, Hadoop, Map Reduce, MongoDB, NoSQL.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 700

7531 An Adaptive Dimensionality Reduction Approach for Hyperspectral Imagery Semantic Interpretation

Authors: Akrem Sellami, Imed Riadh Farah, Basel Solaiman

Abstract:

With the development of HyperSpectral Imagery (HSI) technology, the spectral resolution of HSI became denser, which resulted in large number of spectral bands, high correlation between neighboring, and high data redundancy. However, the semantic interpretation is a challenging task for HSI analysis due to the high dimensionality and the high correlation of the different spectral bands. In fact, this work presents a dimensionality reduction approach that allows to overcome the different issues improving the semantic interpretation of HSI. Therefore, in order to preserve the spatial information, the Tensor Locality Preserving Projection (TLPP) has been applied to transform the original HSI. In the second step, knowledge has been extracted based on the adjacency graph to describe the different pixels. Based on the transformation matrix using TLPP, a weighted matrix has been constructed to rank the different spectral bands based on their contribution score. Thus, the relevant bands have been adaptively selected based on the weighted matrix. The performance of the presented approach has been validated by implementing several experiments, and the obtained results demonstrate the efficiency of this approach compared to various existing dimensionality reduction techniques. Also, according to the experimental results, we can conclude that this approach can adaptively select the relevant spectral improving the semantic interpretation of HSI.

Keywords: Band selection, dimensionality reduction, feature extraction, hyperspectral imagery, semantic interpretation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1173

7530 Identifying Critical Success Factors for Data Quality Management through a Delphi Study

Authors: Maria Paula Santos, Ana Lucas

Abstract:

Organizations support their operations and decision making on the data they have at their disposal, so the quality of these data is remarkably important and Data Quality (DQ) is currently a relevant issue, the literature being unanimous in pointing out that poor DQ can result in large costs for organizations. The literature review identified and described 24 Critical Success Factors (CSF) for Data Quality Management (DQM) that were presented to a panel of experts, who ordered them according to their degree of importance, using the Delphi method with the Q-sort technique, based on an online questionnaire. The study shows that the five most important CSF for DQM are: definition of appropriate policies and standards, control of inputs, definition of a strategic plan for DQ, organizational culture focused on quality of the data and obtaining top management commitment and support.

Keywords: Critical success factors, data quality, data quality management, Delphi, Q-Sort.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1111

7529 Secure Data Aggregation Using Clusters in Sensor Networks

Authors: Prakash G L, Thejaswini M, S H Manjula, K R Venugopal, L M Patnaik

Abstract:

Wireless sensor network can be applied to both abominable and military environments. A primary goal in the design of wireless sensor networks is lifetime maximization, constrained by the energy capacity of batteries. One well-known method to reduce energy consumption in such networks is data aggregation. Providing efcient data aggregation while preserving data privacy is a challenging problem in wireless sensor networks research. In this paper, we present privacy-preserving data aggregation scheme for additive aggregation functions. The Cluster-based Private Data Aggregation (CPDA)leverages clustering protocol and algebraic properties of polynomials. It has the advantage of incurring less communication overhead. The goal of our work is to bridge the gap between collaborative data collection by wireless sensor networks and data privacy. We present simulation results of our schemes and compare their performance to a typical data aggregation scheme TAG, where no data privacy protection is provided. Results show the efficacy and efficiency of our schemes.

Keywords: Aggregation, Clustering, Query Processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1736

7528 Equivalent Transformation for Heterogeneous Traffic Cellular Automata

Authors: Shih-Ching Lo

Abstract:

Understanding driving behavior is a complicated researching topic. To describe accurate speed, flow and density of a multiclass users traffic flow, an adequate model is needed. In this study, we propose the concept of standard passenger car equivalent (SPCE) instead of passenger car equivalent (PCE) to estimate the influence of heavy vehicles and slow cars. Traffic cellular automata model is employed to calibrate and validate the results. According to the simulated results, the SPCE transformations present good accuracy.

Keywords: traffic flow, passenger car equivalent, cellular automata

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1713

7527 A New Approach to Feedback Shift Registers

Authors: Myat Su Mon Win

Abstract:

The pseudorandom number generators based on linear feedback shift registers (LFSRs), are very quick, easy and secure in the implementation of hardware and software. Thus they are very popular and widely used. But LFSRs lead to fairly easy cryptanalysis due to their completely linearity properties. In this paper, we propose a stochastic generator, which is called Random Feedback Shift Register (RFSR), using stochastic transformation (Random block) with one-way and non-linearity properties.

Keywords: Linear Feedback Shift Register, Non Linearity, R_Block, Random Feedback Shift Register

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1813

7526 Rubric in Vocational Education

Authors: Azmanirah Ab Rahman, Jamil Ahmad, Ruhizan Muhammad Yasin

Abstract:

Rubric is a very important tool for teachers and students for a variety of purposes. Teachers use the rubric for evaluating student work while students use rubrics for self-assessment. Therefore, this paper was emphasized scoring rubric as a scoring tool for teachers in an environment of Competency Based Education and Training (CBET) in Malaysia Vocational College. A total of three teachers in the fields of electrical and electronics engineering were interviewed to identify how the use of rubrics practiced since vocational transformation implemented in 2012. Overall holistic rubric used to determine the performance of students in the skills area.

Keywords: Rubric, Vocational Education.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3601

7525 Incorporation Mechanism of Stabilizing Simulated Lead-Laden Sludge in Aluminum-Rich Ceramics

Authors: Xingwen Lu, Kaimin Shih

Abstract:

This study investigated a strategy of blending lead-laden sludge and Al-rich precursors to reduce the release of metals from the stabilized products. Using PbO as the simulated lead-laden sludge to sinter with γ-Al2O3 by Pb:Al molar ratios of 1:2 and 1:12, PbAl2O4 and PbAl12O19 were formed as final products during the sintering process, respectively. By firing the PbO + γ-Al2O3 mixtures with different Pb/Al molar ratios at 600 to 1000 °C, the lead transformation was determined through X-ray diffraction (XRD) data. In Pb/Al molar ratio of 1/2 system, the formation of PbAl2O4 is initiated at 700 °C, but an effective formation was observed above 750 °C. An intermediate phase, Pb9Al8O21, was detected in the temperature range of 800-900 °C. However, different incorporation behavior for sintering PbO with Al-rich precursors at a Pb/Al molar ratio of 1/12 was observed during the formation of PbAl12O19 in this system. In the sintering process, both temperature and time effect on the formation of PbAl2O4 and PbAl12O19 phases were estimated. Finally, a prolonged leaching test modified from the U.S. Environmental Protection Agency-s toxicity characteristic leaching procedure (TCLP) was used to evaluate the durability of PbO, Pb9Al8O21, PbAl2O4 and PbAl12O19 phases. Comparison for the leaching results of the four phases demonstrated the higher intrinsic resistance of PbAl12O19 against acid attack.

Keywords: Sludge, Lead, Stabilization, Leaching behavior

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1918

7524 A New Protocol for Concealed Data Aggregation in Wireless Sensor Networks

Authors: M. Abbasi Dezfouli, S. Mazraeh, M. H. Yektaie

Abstract:

Wireless sensor networks (WSN) consists of many sensor nodes that are placed on unattended environments such as military sites in order to collect important information. Implementing a secure protocol that can prevent forwarding forged data and modifying content of aggregated data and has low delay and overhead of communication, computing and storage is very important. This paper presents a new protocol for concealed data aggregation (CDA). In this protocol, the network is divided to virtual cells, nodes within each cell produce a shared key to send and receive of concealed data with each other. Considering to data aggregation in each cell is locally and implementing a secure authentication mechanism, data aggregation delay is very low and producing false data in the network by malicious nodes is not possible. To evaluate the performance of our proposed protocol, we have presented computational models that show the performance and low overhead in our protocol.

Keywords: Wireless Sensor Networks, Security, Concealed Data Aggregation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1738

7523 IMDC: An Image-Mapped Data Clustering Technique for Large Datasets

Authors: Faruq A. Al-Omari, Nabeel I. Al-Fayoumi

Abstract:

In this paper, we present a new algorithm for clustering data in large datasets using image processing approaches. First the dataset is mapped into a binary image plane. The synthesized image is then processed utilizing efficient image processing techniques to cluster the data in the dataset. Henceforth, the algorithm avoids exhaustive search to identify clusters. The algorithm considers only a small set of the data that contains critical boundary information sufficient to identify contained clusters. Compared to available data clustering techniques, the proposed algorithm produces similar quality results and outperforms them in execution time and storage requirements.

Keywords: Data clustering, Data mining, Image-mapping, Pattern discovery, Predictive analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1502

7522 The New Method of Concealed Data Aggregation in Wireless Sensor: A Case Study

Authors: M. Abbasi Dezfouli, S. Mazraeh, M. H. Yektaie

Abstract:

Wireless sensor networks (WSN) consists of many sensor nodes that are placed on unattended environments such as military sites in order to collect important information. Implementing a secure protocol that can prevent forwarding forged data and modifying content of aggregated data and has low delay and overhead of communication, computing and storage is very important. This paper presents a new protocol for concealed data aggregation (CDA). In this protocol, the network is divided to virtual cells, nodes within each cell produce a shared key to send and receive of concealed data with each other. Considering to data aggregation in each cell is locally and implementing a secure authentication mechanism, data aggregation delay is very low and producing false data in the network by malicious nodes is not possible. To evaluate the performance of our proposed protocol, we have presented computational models that show the performance and low overhead in our protocol.

Keywords: Wireless Sensor Networks, Security, Concealed Data Aggregation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1770

7521 The Semantic Web: a New Approach for Future World Wide Web

Authors: Sahar Nasrolahi, Mahdi Nikdast, Mehrdad Mahdavi Boroujerdi

Abstract:

The purpose of semantic web research is to transform the Web from a linked document repository into a distributed knowledge base and application platform, thus allowing the vast range of available information and services to be more efficiently exploited. As a first step in this transformation, languages such as OWL have been developed. Although fully realizing the Semantic Web still seems some way off, OWL has already been very successful and has rapidly become a defacto standard for ontology development in fields as diverse as geography, geology, astronomy, agriculture, defence and the life sciences. The aim of this paper is to classify key concepts of Semantic Web as well as introducing a new practical approach which uses these concepts to outperform Word Wide Web.

Keywords: Semantic Web, Ontology, OWL, Microformat, Word Wide Web.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1605

7520 The Extent to Which Social Factors Affect Urban Functional Mutations and Transformations

Authors: S. Mozuriunaite

Abstract:

Contemporary metropolitan areas and large cities are dynamic, rapidly growing and continuously changing. Thus, urban transformations and mutations are not a new phenomenon, but rather a continuous process. Basic factors of urban transformation are related to development of technologies, globalisation, lifestyle, etc., which in combination with local factors have generated an extremely great variety of urban development conditions. This article discusses the main urbanisation processes in Lithuania during last 50-year period and social factors affecting urban functional mutations.

Keywords: Dispersion, functional mutations, urbanisation, urban mutations, social factors.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1550

7519 Response Spectrum Transformation for Seismic Qualification Testing

Authors: Nouredine Bourahla, Farid Bouriche, Yacine Benghalia

Abstract:

Seismic qualification testing for equipments to be mounted on upper storeys of buildings is very demanding in terms of floor spectra. The latter is characterized by high accelerations amplitudes within a narrow frequency band. This article presents a method which permits to cover specified required response spectra beyond the shaking table capability by amplifying the acceleration amplitudes at an appropriate frequency range using a physical intermediate mounted on the platform of the shaker.

Keywords: floor spectra, response spectrum, seismicqualification testing, shaking table

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1841

7518 Explicit Feedback Linearization of Magnetic Levitation System

Authors: Bhawna Tandon, Shiv Narayan, Jagdish Kumar

Abstract:

This study proposes the transformation of nonlinear Magnetic Levitation System into linear one, via state and feedback transformations using explicit algorithm. This algorithm allows computing explicitly the linearizing state coordinates and feedback for any nonlinear control system, which is feedback linearizable, without solving the Partial Differential Equations. The algorithm is performed using a maximum of N-1 steps where N being the dimension of the system.

Keywords: Explicit Algorithm, Feedback Linearization, Nonlinear control, Magnetic Levitation System.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2979

7517 Peakwise Smoothing of Data Models using Wavelets

Authors: D Sudheer Reddy, N Gopal Reddy, P V Radhadevi, J Saibaba, Geeta Varadan

Abstract:

Smoothing or filtering of data is first preprocessing step for noise suppression in many applications involving data analysis. Moving average is the most popular method of smoothing the data, generalization of this led to the development of Savitzky-Golay filter. Many window smoothing methods were developed by convolving the data with different window functions for different applications; most widely used window functions are Gaussian or Kaiser. Function approximation of the data by polynomial regression or Fourier expansion or wavelet expansion also gives a smoothed data. Wavelets also smooth the data to great extent by thresholding the wavelet coefficients. Almost all smoothing methods destroys the peaks and flatten them when the support of the window is increased. In certain applications it is desirable to retain peaks while smoothing the data as much as possible. In this paper we present a methodology called as peak-wise smoothing that will smooth the data to any desired level without losing the major peak features.

Keywords: smoothing, moving average, peakwise smoothing, spatialdensity models, planar shape models, wavelets.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1753

7516 A New Precautionary Method for Measurement and Improvement the Data Quality

Authors: Seyed Mohammad Hossein Moossavizadeh, Mehran Mohsenzadeh, Nasrin Arshadi

Abstract:

the data quality is a kind of complex and unstructured concept, which is concerned by information systems managers. The reason of this attention is the high amount of Expenses for maintenance and cleaning of the inefficient data. Such a data more than its expenses of lack of quality, cause wrong statistics, analysis and decisions in organizations. Therefor the managers intend to improve the quality of their information systems' data. One of the basic subjects of quality improvement is the evaluation of the amount of it. In this paper, we present a precautionary method, which with its application the data of information systems would have a better quality. Our method would cover different dimensions of data quality; therefor it has necessary integrity. The presented method has tested on three dimensions of accuracy, value-added and believability and the results confirm the improvement and integrity of this method.

Keywords: Data quality, precaution, information system, measurement, improvement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1470

7515 An Efficient Data Mining Approach on Compressed Transactions

Authors: Jia-Yu Dai, Don-Lin Yang, Jungpin Wu, Ming-Chuan Hung

Abstract:

In an era of knowledge explosion, the growth of data increases rapidly day by day. Since data storage is a limited resource, how to reduce the data space in the process becomes a challenge issue. Data compression provides a good solution which can lower the required space. Data mining has many useful applications in recent years because it can help users discover interesting knowledge in large databases. However, existing compression algorithms are not appropriate for data mining. In [1, 2], two different approaches were proposed to compress databases and then perform the data mining process. However, they all lack the ability to decompress the data to their original state and improve the data mining performance. In this research a new approach called Mining Merged Transactions with the Quantification Table (M2TQT) was proposed to solve these problems. M2TQT uses the relationship of transactions to merge related transactions and builds a quantification table to prune the candidate itemsets which are impossible to become frequent in order to improve the performance of mining association rules. The experiments show that M2TQT performs better than existing approaches.

Keywords: Association rule, data mining, merged transaction, quantification table.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1962

7514 Weigh-in-Motion Data Analysis Software for Developing Traffic Data for Mechanistic Empirical Pavement Design

Authors: M. A. Hasan, M. R. Islam, R. A. Tarefder

Abstract:

Currently, there are few user friendly Weigh-in- Motion (WIM) data analysis softwares available which can produce traffic input data for the recently developed AASHTOWare pavement Mechanistic-Empirical (ME) design software. However, these softwares have only rudimentary Quality Control (QC) processes. Therefore, they cannot properly deal with erroneous WIM data. As the pavement performance is highly sensible to the quality of WIM data, it is highly recommended to use more refined QC process on raw WIM data to get a good result. This study develops a userfriendly software, which can produce traffic input for the ME design software. This software takes the raw data (Class and Weight data) collected from the WIM station and processes it with a sophisticated QC procedure. Traffic data such as traffic volume, traffic distribution, axle load spectra, etc. can be obtained from this software; which can directly be used in the ME design software.

Keywords: Weigh-in-motion, software, axle load spectra, traffic distribution, AASHTOWare.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1899

7513 Nonlinear Transformation of Laser Generated Ultrasonic Pulses in Geomaterials

Authors: Elena B. Cherepetskaya, Alexander A. Karabutov, Natalia B. Podymova, Ivan Sas

Abstract:

Nonlinear evolution of broadband ultrasonic pulses passed through the rock specimens is studied using the apparatus “GEOSCAN-02M”. Ultrasonic pulses are excited by the pulses of Qswitched Nd:YAG laser with the time duration of 10 ns and with the energy of 260 mJ. This energy can be reduced to 20 mJ by some light filters. The laser beam radius did not exceed 5 mm. As a result of the absorption of the laser pulse in the special material – the optoacoustic generator–the pulses of longitudinal ultrasonic waves are excited with the time duration of 100 ns and with the maximum pressure amplitude of 10 MPa. The immersion technique is used to measure the parameters of these ultrasonic pulses passed through a specimen, the immersion liquid is distilled water. The reference pulse passed through the cell with water has the compression and the rarefaction phases. The amplitude of the rarefaction phase is five times lower than that of the compression phase. The spectral range of the reference pulse reaches 10 MHz. The cubic-shaped specimens of the Karelian gabbro are studied with the rib length 3 cm. The ultimate strength of the specimens by the uniaxial compression is (300±10) MPa. As the reference pulse passes through the area of the specimen without cracks the compression phase decreases and the rarefaction one increases due to diffraction and scattering of ultrasound, so the ratio of these phases becomes 2.3:1. After preloading some horizontal cracks appear in the specimens. Their location is found by one-sided scanning of the specimen using the backward mode detection of the ultrasonic pulses reflected from the structure defects. Using the computer processing of these signals the images are obtained of the cross-sections of the specimens with cracks. By the increase of the reference pulse amplitude from 0.1 MPa to 5 MPa the nonlinear transformation of the ultrasonic pulse passed through the specimen with horizontal cracks results in the decrease by 2.5 times of the amplitude of the rarefaction phase and in the increase of its duration by 2.1 times. By the increase of the reference pulse amplitude from 5 MPa to 10 MPa the time splitting of the phases is observed for the bipolar pulse passed through the specimen. The compression and rarefaction phases propagate with different velocities. These features of the powerful broadband ultrasonic pulses passed through the rock specimens can be described by the hysteresis model of Preisach- Mayergoyz and can be used for the location of cracks in the optically opaque materials.

Keywords: Cracks, geological materials, nonlinear evolution of ultrasonic pulses, rock.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1900

7512 Human Growth Curve Estimation through a Combination of Longitudinal and Cross-sectional Data

Authors: Sedigheh Mirzaei S., Debasis Sengupta

Abstract:

Parametric models have been quite popular for studying human growth, particularly in relation to biological parameters such as peak size velocity and age at peak size velocity. Longitudinal data are generally considered to be vital for fittinga parametric model to individual-specific data, and for studying the distribution of these biological parameters in a human population. However, cross-sectional data are easier to obtain than longitudinal data. In this paper, we present a method of combining longitudinal and cross-sectional data for the purpose of estimating the distribution of the biological parameters. We demonstrate, through simulations in the special case ofthePreece Baines model, how estimates based on longitudinal data can be improved upon by harnessing the information contained in cross-sectional data.We study the extent of improvement for different mixes of the two types of data, and finally illustrate the use of the method through data collected by the Indian Statistical Institute.

Keywords: Preece-Baines growth model, MCMC method, Mixed effect model

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2142