Search results for: data preprocessing.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7419

Search results for: data preprocessing.

7359 Long-Term Simulation of Digestive Sound Signals by CEPSTRAL Technique

Authors: Einalou Z., Najafi Z., Maghooli K. Zandi Y, Sheibeigi A

Abstract:

In this study, an investigation over digestive diseases has been done in which the sound acts as a detector medium. Pursue to the preprocessing the extracted signal in cepstrum domain is registered. After classification of digestive diseases, the system selects random samples based on their features and generates the interest nonstationary, long-term signals via inverse transform in cepstral domain which is presented in digital and sonic form as the output. This structure is updatable or on the other word, by receiving a new signal the corresponding disease classification is updated in the feature domain.

Keywords: Cepstrum, databank, digestive disease, acousticsignal.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1507
7358 Spectral Analysis of Speech: A New Technique

Authors: Neeta Awasthy, J.P.Saini, D.S.Chauhan

Abstract:

ICA which is generally used for blind source separation problem has been tested for feature extraction in Speech recognition system to replace the phoneme based approach of MFCC. Applying the Cepstral coefficients generated to ICA as preprocessing has developed a new signal processing approach. This gives much better results against MFCC and ICA separately, both for word and speaker recognition. The mixing matrix A is different before and after MFCC as expected. As Mel is a nonlinear scale. However, cepstrals generated from Linear Predictive Coefficient being independent prove to be the right candidate for ICA. Matlab is the tool used for all comparisons. The database used is samples of ISOLET.

Keywords: Cepstral Coefficient, Distance measures, Independent Component Analysis, Linear Predictive Coefficients.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1911
7357 Edge Detection in Low Contrast Images

Authors: Koushlendra Kumar Singh, Manish Kumar Bajpai, Rajesh K. Pandey

Abstract:

The edges of low contrast images are not clearly distinguishable to human eye. It is difficult to find the edges and boundaries in it. The present work encompasses a new approach for low contrast images. The Chebyshev polynomial based fractional order filter has been used for filtering operation on an image. The preprocessing has been performed by this filter on the input image. Laplacian of Gaussian method has been applied on preprocessed image for edge detection. The algorithm has been tested on two test images.

Keywords: Chebyshev polynomials, Fractional order differentiator, Laplacian of Gaussian (LoG) method, Low contrast image.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3201
7356 Image Enhancement using α-Trimmed Mean ε-Filters

Authors: Mahdi Shaneh, Arash Golibagh Mahyari

Abstract:

Image enhancement is the most important challenging preprocessing for almost all applications of Image Processing. By now, various methods such as Median filter, α-trimmed mean filter, etc. have been suggested. It was proved that the α-trimmed mean filter is the modification of median and mean filters. On the other hand, ε-filters have shown excellent performance in suppressing noise. In spite of their simplicity, they achieve good results. However, conventional ε-filter is based on moving average. In this paper, we suggested a new ε-filter which utilizes α-trimmed mean. We argue that this new method gives better outcomes compared to previous ones and the experimental results confirmed this claim.

Keywords: Image enhancement, median filter, ε-filter – α-trimmed mean filter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5448
7355 Improved Computational Efficiency of Machine Learning Algorithms Based on Evaluation Metrics to Control the Spread of Coronavirus in the UK

Authors: Swathi Ganesan, Nalinda Somasiri, Rebecca Jeyavadhanam, Gayathri Karthick

Abstract:

The COVID-19 crisis presents a substantial and critical hazard to worldwide health. Since the occurrence of the disease in late January 2020 in the UK, the number of infected people confirmed to acquire the illness has increased tremendously across the country, and the number of individuals affected is undoubtedly considerably high. The purpose of this research is to figure out a predictive machine learning (ML) archetypal that could forecast the COVID-19 cases within the UK. This study concentrates on the statistical data collected from 31st January 2020 to 31st March 2021 in the United Kingdom. Information on total COVID-19 cases registered, new cases encountered on a daily basis, total death registered, and patients’ death per day due to Coronavirus is collected from World Health Organization (WHO). Data preprocessing is carried out to identify any missing values, outliers, or anomalies in the dataset. The data are split into 8:2 ratio for training and testing purposes to forecast future new COVID-19 cases. Support Vector Machine (SVM), Random Forest (RF), and linear regression (LR) algorithms are chosen to study the model performance in the prediction of new COVID-19 cases. From the evaluation metrics such as r-squared value and mean squared error, the statistical performance of the model in predicting the new COVID-19 cases is evaluated. RF outperformed the other two ML algorithms with a training accuracy of 99.47% and testing accuracy of 98.26% when n = 30. The mean square error obtained for RF is 4.05e11, which is lesser compared to the other predictive models used for this study. From the experimental analysis, RF algorithm can perform more effectively and efficiently in predicting the new COVID-19 cases, which could help the health sector to take relevant control measures for the spread of the virus.

Keywords: COVID-19, machine learning, supervised learning, unsupervised learning, linear regression, support vector machine, random forest.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 99
7354 3D Anisotropic Diffusion for Liver Segmentation

Authors: Wan Nural Jawahir Wan Yussof, Hans Burkhardt

Abstract:

Liver segmentation is the first significant process for liver diagnosis of the Computed Tomography. It segments the liver structure from other abdominal organs. Sophisticated filtering techniques are indispensable for a proper segmentation. In this paper, we employ a 3D anisotropic diffusion as a preprocessing step. While removing image noise, this technique preserve the significant parts of the image, typically edges, lines or other details that are important for the interpretation of the image. The segmentation task is done by using thresholding with automatic threshold values selection and finally the false liver region is eliminated using 3D connected component. The result shows that by employing the 3D anisotropic filtering, better liver segmentation results could be achieved eventhough simple segmentation technique is used.

Keywords: 3D Anisotropic Diffusion, non-linear filtering, CT Liver.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1543
7353 Applications of Big Data in Education

Authors: Faisal Kalota

Abstract:

Big Data and analytics have gained a huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA) that may allow academic institutions to better understand the learners’ needs and proactively address them. Hence, it is important to have an understanding of Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in education. Additionally, it discusses some of the concerns related to Big Data and current research trends. While Big Data can provide big benefits, it is important that institutions understand their own needs, infrastructure, resources, and limitation before jumping on the Big Data bandwagon.

Keywords: Analytics, Big Data in Education, Hadoop, Learning Analytics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4815
7352 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, Wang Qun Lin

Abstract:

This paper introduces the concept and principle of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of typical cleaning process, puts forward a well scalability and versatility data cleaning framework, in view of data with attribute dependency relation, designs several of violation data discovery algorithms by formal formula, which can obtain inconsistent data to all target columns with condition attribute dependent no matter data is structured (SQL) or unstructured (NoSql), and gives 6 data cleaning methods based on these algorithms.

Keywords: Data cleaning, dependency rules, violation data discovery, data repair.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2561
7351 Component-based Segmentation of Words from Handwritten Arabic Text

Authors: Jawad H AlKhateeb, Jianmin Jiang, Jinchang Ren, Stan S Ipson

Abstract:

Efficient preprocessing is very essential for automatic recognition of handwritten documents. In this paper, techniques on segmenting words in handwritten Arabic text are presented. Firstly, connected components (ccs) are extracted, and distances among different components are analyzed. The statistical distribution of this distance is then obtained to determine an optimal threshold for words segmentation. Meanwhile, an improved projection based method is also employed for baseline detection. The proposed method has been successfully tested on IFN/ENIT database consisting of 26459 Arabic words handwritten by 411 different writers, and the results were promising and very encouraging in more accurate detection of the baseline and segmentation of words for further recognition.

Keywords: Arabic OCR, off-line recognition, Baseline estimation, Word segmentation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2164
7350 Coalescing Data Marts

Authors: N. Parimala, P. Pahwa

Abstract:

OLAP uses multidimensional structures, to provide access to data for analysis. Traditionally, OLAP operations are more focused on retrieving data from a single data mart. An exception is the drill across operator. This, however, is restricted to retrieving facts on common dimensions of the multiple data marts. Our concern is to define further operations while retrieving data from multiple data marts. Towards this, we have defined six operations which coalesce data marts. While doing so we consider the common as well as the non-common dimensions of the data marts.

Keywords: Data warehouse, Dimension, OLAP, Star Schema.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1515
7349 Improving Worm Detection with Artificial Neural Networks through Feature Selection and Temporal Analysis Techniques

Authors: Dima Stopel, Zvi Boger, Robert Moskovitch, Yuval Shahar, Yuval Elovici

Abstract:

Computer worm detection is commonly performed by antivirus software tools that rely on prior explicit knowledge of the worm-s code (detection based on code signatures). We present an approach for detection of the presence of computer worms based on Artificial Neural Networks (ANN) using the computer's behavioral measures. Identification of significant features, which describe the activity of a worm within a host, is commonly acquired from security experts. We suggest acquiring these features by applying feature selection methods. We compare three different feature selection techniques for the dimensionality reduction and identification of the most prominent features to capture efficiently the computer behavior in the context of worm activity. Additionally, we explore three different temporal representation techniques for the most prominent features. In order to evaluate the different techniques, several computers were infected with five different worms and 323 different features of the infected computers were measured. We evaluated each technique by preprocessing the dataset according to each one and training the ANN model with the preprocessed data. We then evaluated the ability of the model to detect the presence of a new computer worm, in particular, during heavy user activity on the infected computers.

Keywords: Artificial Neural Networks, Feature Selection, Temporal Analysis, Worm Detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1682
7348 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many types of research are concerned with mining massive amounts of data and big data streams. Mining big data faces a lot of challenges including scalability, speed, heterogeneity, accuracy, provenance and privacy. In telecommunication industry, mining big data is like a mining for gold; it represents a big opportunity and maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication and the benefits and opportunities gained from them.

Keywords: Mining Big Data, Big Data, Machine learning, Data Streams, Telecommunication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2418
7347 String Matching using Inverted Lists

Authors: Chouvalit Khancome, Veera Boonjing

Abstract:

This paper proposes a new solution to string matching problem. This solution constructs an inverted list representing a  string pattern to be searched for. It then uses a new algorithm to process an input string in a single pass. The preprocessing phase  takes 1) time complexity O(m) 2) space complexity O(1) where m is  the length of pattern. The searching phase time complexity takes 1)  O(m+α ) in average case 2) O(n/m) in the best case and 3) O(n) in  the worst case, where α is the number of comparing leading to  mismatch and n is the length of input text.

Keywords: String matching, inverted list, inverted index, pattern, algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1512
7346 A New Face Recognition Method using PCA, LDA and Neural Network

Authors: A. Hossein Sahoolizadeh, B. Zargham Heidari, C. Hamid Dehghani

Abstract:

In this paper, a new face recognition method based on PCA (principal Component Analysis), LDA (Linear Discriminant Analysis) and neural networks is proposed. This method consists of four steps: i) Preprocessing, ii) Dimension reduction using PCA, iii) feature extraction using LDA and iv) classification using neural network. Combination of PCA and LDA is used for improving the capability of LDA when a few samples of images are available and neural classifier is used to reduce number misclassification caused by not-linearly separable classes. The proposed method was tested on Yale face database. Experimental results on this database demonstrated the effectiveness of the proposed method for face recognition with less misclassification in comparison with previous methods.

Keywords: Face recognition Principal component analysis, Linear discriminant analysis, Neural networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3159
7345 Evolutionary Feature Selection for Text Documents using the SVM

Authors: Daniel I. Morariu, Lucian N. Vintan, Volker Tresp

Abstract:

Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step, the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of feature selection methods to reduce the dimensionality of the document-representation vector. In this paper, we present three feature selection methods: Information Gain, Support Vector Machine feature selection called (SVM_FS) and Genetic Algorithm with SVM (called GA_SVM). We show that the best results were obtained with GA_SVM method for a relatively small dimension of the feature vector.

Keywords: Feature Selection, Learning with Kernels, Support Vector Machine, Genetic Algorithm, and Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1658
7344 A Study of Touching Characters in Degraded Gurmukhi Text

Authors: M. K. Jindal, G. S. Lehal, R. K. Sharma

Abstract:

Character segmentation is an important preprocessing step for text recognition. In degraded documents, existence of touching characters decreases recognition rate drastically, for any optical character recognition (OCR) system. In this paper a study of touching Gurmukhi characters is carried out and these characters have been divided into various categories after a careful analysis.Structural properties of the Gurmukhi characters are used for defining the categories. New algorithms have been proposed to segment the touching characters in middle zone. These algorithms have shown a reasonable improvement in segmenting the touching characters in degraded Gurmukhi script. The algorithms proposed in this paper are applicable only to machine printed text.

Keywords: Character Segmentation, Middle Zone, Touching Characters.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1791
7343 Feature Selection Methods for an Improved SVM Classifier

Authors: Daniel Morariu, Lucian N. Vintan, Volker Tresp

Abstract:

Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step, the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of feature selection methods to reduce the dimensionality of the document-representation vector. In this paper, three feature selection methods are evaluated: Random Selection, Information Gain (IG) and Support Vector Machine feature selection (called SVM_FS). We show that the best results were obtained with SVM_FS method for a relatively small dimension of the feature vector. Also we present a novel method to better correlate SVM kernel-s parameters (Polynomial or Gaussian kernel).

Keywords: Feature Selection, Learning with Kernels, SupportVector Machine, and Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1775
7342 Evolving Neural Networks using Moment Method for Handwritten Digit Recognition

Authors: H. El Fadili, K. Zenkouar, H. Qjidaa

Abstract:

This paper proposes a neural network weights and topology optimization using genetic evolution and the backpropagation training algorithm. The proposed crossover and mutation operators aims to adapt the networks architectures and weights during the evolution process. Through a specific inheritance procedure, the weights are transmitted from the parents to their offsprings, which allows re-exploitation of the already trained networks and hence the acceleration of the global convergence of the algorithm. In the preprocessing phase, a new feature extraction method is proposed based on Legendre moments with the Maximum entropy principle MEP as a selection criterion. This allows a global search space reduction in the design of the networks. The proposed method has been applied and tested on the well known MNIST database of handwritten digits.

Keywords: Genetic algorithm, Legendre Moments, MEP, Neural Network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1621
7341 Comparative Analysis of Diverse Collection of Big Data Analytics Tools

Authors: S. Vidhya, S. Sarumathi, N. Shanthi

Abstract:

Over the past era, there have been a lot of efforts and studies are carried out in growing proficient tools for performing various tasks in big data. Recently big data have gotten a lot of publicity for their good reasons. Due to the large and complex collection of datasets it is difficult to process on traditional data processing applications. This concern turns to be further mandatory for producing various tools in big data. Moreover, the main aim of big data analytics is to utilize the advanced analytic techniques besides very huge, different datasets which contain diverse sizes from terabytes to zettabytes and diverse types such as structured or unstructured and batch or streaming. Big data is useful for data sets where their size or type is away from the capability of traditional relational databases for capturing, managing and processing the data with low-latency. Thus the out coming challenges tend to the occurrence of powerful big data tools. In this survey, a various collection of big data tools are illustrated and also compared with the salient features.

Keywords: Big data, Big data analytics, Business analytics, Data analysis, Data visualization, Data discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3727
7340 Robust Detection of R-Wave Using Wavelet Technique

Authors: Awadhesh Pachauri, Manabendra Bhuyan

Abstract:

Electrocardiogram (ECG) is considered to be the backbone of cardiology. ECG is composed of P, QRS & T waves and information related to cardiac diseases can be extracted from the intervals and amplitudes of these waves. The first step in extracting ECG features starts from the accurate detection of R peaks in the QRS complex. We have developed a robust R wave detector using wavelets. The wavelets used for detection are Daubechies and Symmetric. The method does not require any preprocessing therefore, only needs the ECG correct recordings while implementing the detection. The database has been collected from MIT-BIH arrhythmia database and the signals from Lead-II have been analyzed. MatLab 7.0 has been used to develop the algorithm. The ECG signal under test has been decomposed to the required level using the selected wavelet and the selection of detail coefficient d4 has been done based on energy, frequency and cross-correlation analysis of decomposition structure of ECG signal. The robustness of the method is apparent from the obtained results.

Keywords: ECG, P-QRS-T waves, Wavelet Transform, Hard Thresholding, R-wave Detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2425
7339 3D Face Recognition Using Modified PCA Methods

Authors: Omid Gervei, Ahmad Ayatollahi, Navid Gervei

Abstract:

In this paper we present an approach for 3D face recognition based on extracting principal components of range images by utilizing modified PCA methods namely 2DPCA and bidirectional 2DPCA also known as (2D) 2 PCA.A preprocessing stage was implemented on the images to smooth them using median and Gaussian filtering. In the normalization stage we locate the nose tip to lay it at the center of images then crop each image to a standard size of 100*100. In the face recognition stage we extract the principal component of each image using both 2DPCA and (2D) 2 PCA. Finally, we use Euclidean distance to measure the minimum distance between a given test image to the training images in the database. We also compare the result of using both methods. The best result achieved by experiments on a public face database shows that 83.3 percent is the rate of face recognition for a random facial expression.

Keywords: 3D face recognition, 2DPCA, (2D) 2 PCA, Rangeimage

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3016
7338 Multi-labeled Data Expressed by a Set of Labels

Authors: Tetsuya Furukawa, Masahiro Kuzunishi

Abstract:

Collected data must be organized to be utilized efficiently, and hierarchical classification of data is efficient approach to organize data. When data is classified to multiple categories or annotated with a set of labels, users request multi-labeled data by giving a set of labels. There are several interpretations of the data expressed by a set of labels. This paper discusses which data is expressed by a set of labels by introducing orders for sets of labels and shows that there are four types of orders, which are characterized by whether the labels of expressed data includes every label of the given set of labels within the range of the set. Desirable properties of the orders, data is also expressed by the higher set of labels and different sets of labels express different data, are discussed for the orders.

Keywords: Classification Hierarchies, Multi-labeled Data, Multiple Classificaiton, Orders of Sets of Labels

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1256
7337 Time Series Forecasting Using Independent Component Analysis

Authors: Theodor D. Popescu

Abstract:

The paper presents a method for multivariate time series forecasting using Independent Component Analysis (ICA), as a preprocessing tool. The idea of this approach is to do the forecasting in the space of independent components (sources), and then to transform back the results to the original time series space. The forecasting can be done separately and with a different method for each component, depending on its time structure. The paper gives also a review of the main algorithms for independent component analysis in the case of instantaneous mixture models, using second and high-order statistics. The method has been applied in simulation to an artificial multivariate time series with five components, generated from three sources and a mixing matrix, randomly generated.

Keywords: Independent Component Analysis, second order statistics, simulation, time series forecasting

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1737
7336 A Comprehensive Evaluation of Supervised Machine Learning for the Phase Identification Problem

Authors: Brandon Foggo, Nanpeng Yu

Abstract:

Power distribution circuits undergo frequent network topology changes that are often left undocumented. As a result, the documentation of a circuit’s connectivity becomes inaccurate with time. The lack of reliable circuit connectivity information is one of the biggest obstacles to model, monitor, and control modern distribution systems. To enhance the reliability and efficiency of electric power distribution systems, the circuit’s connectivity information must be updated periodically. This paper focuses on one critical component of a distribution circuit’s topology - the secondary transformer to phase association. This topology component describes the set of phase lines that feed power to a given secondary transformer (and therefore a given group of power consumers). Finding the documentation of this component is call Phase Identification, and is typically performed with physical measurements. These measurements can take time lengths on the order of several months, but with supervised learning, the time length can be reduced significantly. This paper compares several such methods applied to Phase Identification for a large range of real distribution circuits, describes a method of training data selection, describes preprocessing steps unique to the Phase Identification problem, and ultimately describes a method which obtains high accuracy (> 96% in most cases, > 92% in the worst case) using only 5% of the measurements typically used for Phase Identification.

Keywords: Distribution network, machine learning, network topology, phase identification, smart grid.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 998
7335 High Resolution Images: Segmenting, Extracting Information and GIS Integration

Authors: Erick López-Ornelas

Abstract:

As the world changes more rapidly, the demand for update information for resource management, environment monitoring, planning are increasing exponentially. Integration of Remote Sensing with GIS technology will significantly promote the ability for addressing these concerns. This paper presents an alternative way of update GIS applications using image processing and high resolution images. We show a method of high-resolution image segmentation using graphs and morphological operations, where a preprocessing step (watershed operation) is required. A morphological process is then applied using the opening and closing operations. After this segmentation we can extract significant cartographic elements such as urban areas, streets or green areas. The result of this segmentation and this extraction is then used to update GIS applications. Some examples are shown using aerial photography.

Keywords: GIS, Remote Sensing, image segmentation, feature extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1597
7334 The Comparison of Data Replication in Distributed Systems

Authors: Iman Zangeneh, Mostafa Moradi, Ali Mokhtarbaf

Abstract:

The necessity of ever-increasing use of distributed data in computer networks is obvious for all. One technique that is performed on the distributed data for increasing of efficiency and reliablity is data rplication. In this paper, after introducing this technique and its advantages, we will examine some dynamic data replication. We will examine their characteristies for some overus scenario and the we will propose some suggestion for their improvement.

Keywords: data replication, data hiding, consistency, dynamicdata replication strategy

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1594
7333 Simulation Based VLSI Implementation of Fast Efficient Lossless Image Compression System Using Adjusted Binary Code & Golumb Rice Code

Authors: N. Muthukumaran, R. Ravi

Abstract:

The Simulation based VLSI Implementation of FELICS (Fast Efficient Lossless Image Compression System) Algorithm is proposed to provide the lossless image compression and is implemented in simulation oriented VLSI (Very Large Scale Integrated). To analysis the performance of Lossless image compression and to reduce the image without losing image quality and then implemented in VLSI based FELICS algorithm. In FELICS algorithm, which consists of simplified adjusted binary code for Image compression and these compression image is converted in pixel and then implemented in VLSI domain. This parameter is used to achieve high processing speed and minimize the area and power. The simplified adjusted binary code reduces the number of arithmetic operation and achieved high processing speed. The color difference preprocessing is also proposed to improve coding efficiency with simple arithmetic operation. Although VLSI based FELICS Algorithm provides effective solution for hardware architecture design for regular pipelining data flow parallelism with four stages. With two level parallelisms, consecutive pixels can be classified into even and odd samples and the individual hardware engine is dedicated for each one. This method can be further enhanced by multilevel parallelisms.

Keywords: Image compression, Pixel, Compression Ratio, Adjusted Binary code, Golumb Rice code, High Definition display, VLSI Implementation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2028
7332 Key Frames Extraction for Sign Language Video Analysis and Recognition

Authors: Jaroslav Polec, Petra Heribanová, Tomáš Hirner

Abstract:

In this paper we proposed a method for finding video frames representing one sign in the finger alphabet. The method is based on determining hands location, segmentation and the use of standard video quality evaluation metrics. Metric calculation is performed only in regions of interest. Sliding mechanism for finding local extrema and adaptive threshold based on local averaging is used for key frames selection. The success rate is evaluated by recall, precision and F1 measure. The method effectiveness is compared with metrics applied to all frames. Proposed method is fast, effective and relatively easy to realize by simple input video preprocessing and subsequent use of tools designed for video quality measuring.

Keywords: Key frame, video, quality, metric, MSE, MSAD, SSIM, VQM, sign language, finger alphabet.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1981
7331 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, more various information can be extracted. In addition, development and dissemination of boards such as Arduino and Raspberry Pie have made it possible to easily test various sensors, and it is possible to collect sensor data directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, there are many difficulties in using the board to collect data, and there are many difficulties in using it when the user is not a computer programmer, or when using it for the first time. Even if data are collected, lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: Clustering, data mining, DBSCAN, k-means, k-medoids, sensor data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1958
7330 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, veracity, and come from a variety of sources. Public Administrations are using (big) data, implementing base registries, and enforcing data sharing within the entire government to deliver (big) data related integrated services, provision of insights to users, and for good governance. Government (Big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services like data storage, hosting services to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationship in the government (big) data ecosystem. We also discuss our research findings. We did not find too much published research articles about the government (big) data ecosystem, including its definition and classification of actors and their roles. Therefore, we lent ideas for the government (big) data ecosystem from numerous areas that include scientific research data, humanitarian data, open government data, industry data, in the literature.

Keywords: Big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1987