Search results for: Statistical Data Analysis.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 13750

13390 Statistical Measures and Optimization Algorithms for Gene Selection in Lung and Ovarian Tumor

Authors: C. Gunavathi, K. Premalatha

Abstract:

Microarray technology is widely used to study disease diagnosis through gene expression levels. The main shortcoming of gene expression data is that it includes thousands of genes but only a small number of samples. Many methods and techniques have been proposed for tumor classification using microarray gene expression data. Feature (gene) selection methods can be used to identify the genes that are directly involved in the classification and to eliminate irrelevant genes. In this paper, statistical measures such as T-Statistics, Signal-to-Noise Ratio (SNR) and F-Statistics are used to rank the genes. The ranked genes are used for further classification. The Particle Swarm Optimization (PSO) algorithm and the Shuffled Frog Leaping (SFL) algorithm are used to find the significant genes among the top-m ranked genes. The Naïve Bayes Classifier (NBC) is used to classify the samples based on the significant genes. The proposed work is applied to Lung and Ovarian datasets. The experimental results show that the proposed method achieves 100% accuracy in all the three datasets and the results are compared with previous works.

Keywords: Microarray, T-Statistics, Signal-to-Noise Ratio, F-Statistics, Particle Swarm Optimization, Shuffled Frog Leaping, Naïve Bayes Classifier.
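
The ranking step described in this abstract can be illustrated with a small sketch. The abstract does not define its SNR measure, so the code assumes the commonly used form (mean_a - mean_b) / (sd_a + sd_b) computed per gene across two classes; the function names and the toy expression values are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev


def snr_score(class_a, class_b):
    """Assumed SNR of one gene: (mean_a - mean_b) / (sd_a + sd_b)."""
    return (mean(class_a) - mean(class_b)) / (stdev(class_a) + stdev(class_b))


def rank_genes(expr_a, expr_b, top_m):
    """Rank genes by |SNR| and return the names of the top_m genes.

    expr_a and expr_b map a gene name to its expression values in each class.
    """
    scores = {g: abs(snr_score(expr_a[g], expr_b[g])) for g in expr_a}
    return sorted(scores, key=scores.get, reverse=True)[:top_m]


# Hypothetical toy data: three genes, three samples per class.
tumor = {"g1": [5.0, 6.0, 7.0], "g2": [1.0, 3.0, 2.0], "g3": [9.0, 9.5, 10.0]}
normal = {"g1": [4.0, 5.0, 6.0], "g2": [2.0, 1.0, 3.0], "g3": [2.0, 2.5, 3.0]}
top_genes = rank_genes(tumor, normal, top_m=2)
```

In the workflow the abstract describes, a top-m list like this would then be passed to the optimization step (PSO or SFL) before classification.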

PDF Downloads 1931
13389 Application of Statistical Approach for Optimizing CMCase Production by Bacillus tequilensis S28 Strain via Submerged Fermentation Using Wheat Bran as Carbon Source

Authors: A. Sharma, R. Tewari, S. K. Soni

Abstract:

Biofuel production has emerged as a future technology to combat the problem of depleting fossil fuels. Bio-based ethanol production from enzymatic degradation of lignocellulosic biomass serves as an efficient method and is catching the eye of the scientific community. The high cost of the enzyme is the major obstacle preventing the commercialization of this process. Thus, the main objective of the present study was to optimize the composition of medium components to enhance cellulase production by a newly isolated strain of Bacillus tequilensis. Nineteen factors were taken into account using a statistical Plackett-Burman Design. The significant variables influencing cellulase production were further employed in statistical Response Surface Methodology using a Central Composite Design to maximize cellulase production. The optimum medium composition for cellulase production was: peptone (4.94 g/L), ammonium chloride (4.99 g/L), yeast extract (2.00 g/L), Tween-20 (0.53 g/L), calcium chloride (0.20 g/L) and cobalt chloride (0.60 g/L), with pH 7, agitation speed 150 rpm and 72 h incubation at 37°C. Analysis of variance (ANOVA) revealed a high coefficient of determination (R²) of 0.99. A maximum cellulase productivity of 11.5 IU/ml was observed against the model-predicted value of 13 IU/ml. The enzyme was found to be optimally active at 60°C and pH 5.5.

Keywords: Bacillus tequilensis, CMCase, Submerged Fermentation, Optimization, Plackett-Burman Design, Response Surface Methodology.

PDF Downloads 3044
13388 Enhanced Clustering Analysis and Visualization Using Kohonen's Self-Organizing Feature Map Networks

Authors: Kasthurirangan Gopalakrishnan, Siddhartha Khaitan, Anshu Manik

Abstract:

Cluster analysis is the name given to a diverse collection of techniques that can be used to classify objects (e.g. individuals, quadrats, species etc.). While Kohonen's Self-Organizing Feature Map (SOFM) or Self-Organizing Map (SOM) networks have been successfully applied as a classification tool in various problem domains, including speech recognition, image data compression, image or character recognition, robot control and medical diagnosis, their potential as a robust substitute for clustering analysis remains relatively unresearched. SOM networks combine competitive learning with dimensionality reduction by smoothing the clusters with respect to an a priori grid, and they provide a powerful tool for data visualization. In this paper, SOM is used to create a toroidal mapping of a two-dimensional lattice to perform cluster analysis on the results of a chemical analysis of wines produced in the same region in Italy but derived from three different cultivars, referred to as the “wine recognition data” and located in the University of California-Irvine database. The results are encouraging, and it is believed that SOM would make an appealing and powerful decision-support tool for clustering tasks and for data visualization.

Keywords: Artificial neural networks, cluster analysis, Kohonen maps, wine recognition.

PDF Downloads 2108
13387 Analysis and Comparison of Image Encryption Algorithms

Authors: İsmet Öztürk, İbrahim Soğukpınar

Abstract:

With the rapid growth of electronic data exchange, information security is becoming more important in data storage and transmission. Because images are widely used in industrial processes, it is important to protect confidential image data from unauthorized access. In this paper, we analyze current image encryption algorithms and add compression to two of them (Mirror-like image encryption and Visual Cryptography). These two algorithms have been implemented for experimental purposes. The results of the analysis are given in this paper.

Keywords: image encryption, image cryptosystem, security, transmission

PDF Downloads 4938
13386 Dynamics of Marital Status and Information Search through Consumer Generated Media: An Exploratory Study

Authors: Shivakumar Krishnamurti, Ruchi Agarwal

Abstract:

The study examines the influence of marital status on consumers of products and services who use blogs as a source of information. A pre-designed questionnaire was used to collect primary data on the respondents’ experiences. Data were collected from one hundred and eighty-seven respondents residing in and around the Emirates of Sharjah and Dubai in the United Arab Emirates. The collected data were analyzed with statistical tools such as averages, percentages, factor analysis, Student’s t-test and Structural Equation Modelling. The objectives of the study are to understand why married and unmarried (single) consumers of products and services are motivated to use blogs as a source of information, to determine whether consumers, irrespective of their marital status, share their views and experiences with other bloggers, and to learn the respondents’ future intentions towards blogging. The study revealed that the majority of respondents are motivated to blog because they are willing to receive comments on what they post about services, because blogs are a convenient way to search for information about services and products, because blogging lets them share information on the symptoms of a disease or disorder that someone may be experiencing, and because it helps them share information about ready-to-cook mix products. The majority are also keen to spend more time blogging in the future.

Keywords: Blog, consumer, information, marital status.

PDF Downloads 1684
13385 Efficacy of Polyfluoroalkyl Substances Filtration with Low-Cost Organic Fiber Filter

Authors: Gautham Das, Edward Morrone, Erik Treble, Clinton Binder

Abstract:

The purpose of this study was to evaluate the efficacy of a low-cost filter in removing per- and polyfluoroalkyl substances (PFAS). PFAS are commonly used man-made chemicals that can be found in a variety of household and industrial products and have deleterious effects on humans. The filter consists of a combination of low-cost materials that could be locally procured. Water testing results for 4 different PFAS contaminants indicated that for Perfluorooctane sulfonic acid (PFOS), the Agency for Toxic Substances and Disease Registry (ATSDR) regulation is 7 ppt, the initial concentration was 15 ppt, and the final concentration was 3.9 ppt. For Perfluorononanoic acid (PFNA), the ATSDR regulation is 10.5 ppt, the initial concentration was 15 ppt, and the final concentration was 3.9 ppt. For Perfluorooctanoic acid (PFOA), the ATSDR regulation is 11 ppt, the initial concentration was 15 ppt, and the final concentration was 3.9 ppt. For Perfluorohexane sulfonic acid (PFHxS), the ATSDR regulation is 70 ppt, the initial concentration was 15 ppt, and the final concentration was 3.9 ppt. The results indicated a 74% reduction in PFAS concentration in filtered samples. Statistical data through regression analysis showed 0.9 validity of the sample data. Initial tests show that the efficiency of the proposed filter could be far greater if tested at a greater scale. Further testing is highly recommended to validate the data for an innovative solution to a ubiquitous problem.

Keywords: PFAS, PFOS, PFOA, PFHxS, low-cost filter.
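
The 74% figure follows directly from the concentrations reported in the abstract, as the short sketch below shows. All numbers are taken from the abstract; the variable names are illustrative only.

```python
# Concentrations reported in the abstract, in parts per trillion (ppt).
initial_ppt = 15.0
final_ppt = 3.9

# Fractional reduction: (15 - 3.9) / 15, i.e. about 0.74.
reduction = (initial_ppt - final_ppt) / initial_ppt

# ATSDR values quoted in the abstract for each contaminant (ppt).
limits_ppt = {"PFOS": 7.0, "PFNA": 10.5, "PFOA": 11.0, "PFHxS": 70.0}

# True where the filtered concentration falls below the quoted value.
below_limit = {name: final_ppt < limit for name, limit in limits_ppt.items()}
```

With these inputs the filtered concentration is below every quoted value, while the initial 15 ppt exceeded the values quoted for PFOS, PFNA and PFOA.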

PDF Downloads 617
13384 Deterioration Assessment Models for Water Pipelines

Authors: L. Parvizsedghy, I. Gkountis, A. Senouci, T. Zayed, M. Alsharqawi, H. El Chanati, M. El-Abbasy, F. Mosleh

Abstract:

The aging and deterioration of water pipelines in cities worldwide result in more frequent water main breaks, water service disruptions, and flooding damage. Therefore, there is an urgent need to undertake proper maintenance procedures to avoid breaks and disastrous failures. However, due to budget limitations, the maintenance of water pipeline networks needs to be prioritized through efficient deterioration assessment models. Previous studies focused on the development of structural or physical deterioration assessment models, which require expensive inspection data. In contrast, this paper aims to develop deterioration assessment models for water pipelines using statistical techniques. Several deterioration models were developed based on pipeline size, material type, and soil type using linear regression analysis. The categorical nature of some variables affecting pipeline deterioration was taken into account by developing several categorical models. The developed models were validated with an average validity percentage greater than 95%. Moreover, a sensitivity analysis was carried out against different classifications, and it showed that pipe age is more important than the other factors. The developed models will help water municipalities and asset managers to assess the condition of their pipes and prioritize them for maintenance and inspection.

Keywords: Water pipelines, deterioration assessment models, regression analysis.

PDF Downloads 1180
13383 Using TRACE and SNAP Codes to Establish the Model of Maanshan PWR for SBO Accident

Authors: B. R. Shen, J. R. Wang, J. H. Yang, S. W. Chen, C. Shih, Y. Chiang, Y. F. Chang, Y. H. Huang

Abstract:

In this research, the TRACE code with the interface code SNAP was used to simulate and analyze the SBO (station blackout) accident which occurred in the Maanshan PWR (pressurized water reactor) nuclear power plant (NPP). There are four main steps in this research. First, the SBO accident data of Maanshan NPP were collected. Second, the TRACE/SNAP model of Maanshan NPP was established using these data. Third, this TRACE/SNAP model was used to perform the simulation and analysis of the SBO accident. Finally, the simulation and analysis of the SBO with mitigation equipment were performed. The analysis results of TRACE are consistent with the data of Maanshan NPP. According to the TRACE predictions, the mitigation equipment of Maanshan can maintain the safety of the plant during an SBO.

Keywords: PWR, TRACE, SBO, Maanshan.

PDF Downloads 733
13382 Plant Varieties Selection System

Authors: Kitti Koonsanit, Chuleerat Jaruskulchai, Poonsak Miphokasap, Apisit Eiumnoh

Abstract:

Nowadays, meteorological and environmental data are widely used in applications such as plant varieties selection systems. Selecting the plant variety for a planted area is of utmost importance for all crops, including sugarcane, which has many varieties. A variety that is not selected for the planted area may not be adapted to its climate or soil conditions. Poor growth, bloom drop, poor fruit, and low prices can result from varieties that were not recommended for the planted area. This paper presents a plant varieties selection system for planted areas in Thailand that uses meteorological and environmental data together with decision tree techniques. The software, developed as an environmental data analysis tool, makes analyzing the results easier and faster. It is a front end to WEKA that provides fundamental data mining functions such as classification, clustering, and analysis. It also supports pre-processing, analysis, and decision tree output with result export. Finally, the software can export the results to the Google Maps API in order to display them and plot plant icons effectively.

Keywords: Plant varieties selection system, decision tree, expert recommendation.

PDF Downloads 1778
13381 Real-Time Image Analysis of Capsule Endoscopy for Bleeding Discrimination in Embedded System Platform

Authors: Yong-Gyu Lee, Gilwon Yoon

Abstract:

Image processing for capsule endoscopy requires large memory, and diagnosis takes hours since the operation time is normally more than 8 hours. A real-time analysis algorithm for capsule images can be clinically very useful. It can differentiate abnormal tissue from healthy structure and provide correlation information among the images. Bleeding is our interest in this regard, and we propose a method of detecting frames with potential bleeding in real time. Our detection algorithm is based on statistical analysis and the shapes of bleeding spots. We tested our algorithm with 30 cases of capsule endoscopy in the digestive tract. The results were excellent: a sensitivity of 99% and a specificity of 97% were achieved in detecting the image frames with bleeding spots.

Keywords: bleeding, capsule endoscopy, image processing, real time analysis
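
For readers unfamiliar with the two reported metrics, the sketch below shows how sensitivity and specificity are computed from frame-level counts. The counts are hypothetical, chosen only so that the result matches the 99% and 97% quoted in the abstract; the paper's actual frame counts are not given here.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of bleeding frames that the detector flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-bleeding frames that the detector leaves unflagged."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for 100 bleeding and 100 non-bleeding frames.
sens = sensitivity(true_pos=99, false_neg=1)
spec = specificity(true_neg=97, false_pos=3)
```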

PDF Downloads 1849
13380 Program Memories Error Detection and Correction On-Board Earth Observation Satellites

Authors: Y. Bentoutou

Abstract:

Memory Errors Detection and Correction aim to secure the transaction of data between the central processing unit of a satellite onboard computer and its local memory. In this paper, the application of a double-bit error detection and correction method is described and implemented in Field Programmable Gate Array (FPGA) technology. The performance of the proposed EDAC method is measured and compared with two different EDAC devices, using the same FPGA technology. Statistical analysis of single-event upset (SEU) and multiple-bit upset (MBU) activity in commercial memories onboard the first Algerian microsatellite Alsat-1 is given.

Keywords: Error Detection and Correction, On-board computer, small satellite missions.

PDF Downloads 2201
13379 A Novel SVM-Based OOK Detector in Low SNR Infrared Channels

Authors: J. P. Dubois, O. M. Abdul-Latif

Abstract:

Support Vector Machine (SVM) is a recent class of statistical classification and regression techniques playing an increasing role in detection problems across various engineering fields, notably in statistical signal processing, pattern recognition, image analysis, and communication systems. In this paper, SVM is applied to an infrared (IR) binary communication system with different types of channel models, including Ricean multipath fading and partially developed scattering channels with additive white Gaussian noise (AWGN) at the receiver. The structure and performance of SVM in terms of the bit error rate (BER) metric are derived and simulated for these stochastic channel models, and the computational complexity of the implementation, in terms of average computational time per bit, is also presented. The performance of SVM is then compared to classical binary signal maximum likelihood detection using a matched filter driven by On-Off keying (OOK) modulation. We found that the performance of SVM is superior to that of the traditional optimal detection schemes used in statistical communication, especially for very low signal-to-noise ratio (SNR) ranges. For large SNR, the performance of the SVM is similar to that of the classical detectors. The implication of these results is that SVM can prove very beneficial to IR communication systems, which notoriously suffer from low SNR, at the cost of increased computational complexity.

Keywords: Least square-support vector machine, on-off keying, matched filter, maximum likelihood detector, wireless infrared communication.

PDF Downloads 1936
13378 Dicotyledon Weed Quantification Algorithm for Selective Herbicide Application in Maize Crops: Statistical Evaluation of the Potential Herbicide Savings

Authors: Morten Stigaard Laursen, Rasmus Nyholm Jørgensen, Henrik Skov Midtiby, Anders Krogh Mortensen, Sanmohan Baby

Abstract:

This work contributes a statistical model and simulation framework that yield the best possible estimate of the potential herbicide reduction when using the MoDiCoVi algorithm, while requiring an efficacy comparable to conventional spraying. In June 2013, a maize field located in Denmark was seeded. The field was divided into parcels, each of which was assigned to one of two main groups: 1) Control, consisting of subgroups of no spray and full dose spray; 2) MoDiCoVi algorithm, subdivided into five different leaf cover thresholds for spray activation. In addition, approximately 25% of the parcels were seeded with additional weeds perpendicular to the maize rows. In total, 299 parcels were randomly assigned to the 28 different treatment combinations. In the statistical analysis, bootstrapping was used to balance the number of replicates. The achieved potential herbicide savings were found to be 70% to 95%, depending on the initial weed coverage. However, additional field trials covering more seasons and locations are needed to verify that these results generalise. There is a potential for further herbicide savings, as the time interval between the first and second spraying sessions was not long enough for the weeds to turn yellow; instead they only stagnated in growth.

Keywords: Weed crop discrimination, macrosprayer, herbicide reduction, site-specific, sprayer-boom.
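
The abstract mentions bootstrapping to balance the number of replicates but gives no procedure. One plausible reading, sketched below, is to resample each treatment group with replacement up to a common size. The function name, the group labels and the measurements are all hypothetical, and the paper's actual procedure may differ.

```python
import random


def balance_by_bootstrap(groups, n_per_group, seed=0):
    """Resample each group with replacement so every group has n_per_group values."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    return {
        name: rng.choices(values, k=n_per_group)
        for name, values in groups.items()
    }


# Hypothetical weed-coverage measurements with unequal replicate counts.
groups = {
    "no_spray": [0.42, 0.38, 0.51],
    "full_dose": [0.05, 0.07, 0.04, 0.06, 0.05],
}
balanced = balance_by_bootstrap(groups, n_per_group=5)
```

After resampling, every group contributes the same number of values to the comparison, each drawn from that group's original measurements.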

PDF Downloads 1029
13377 A Survey of Semantic Integration Approaches in Bioinformatics

Authors: Chaimaa Messaoudi, Rachida Fissoune, Hassan Badir

Abstract:

Technological advances in computer science and data analysis continuously produce huge volumes of biological data, which are available on the web. Such advances require powerful data integration techniques to extract pertinent knowledge and information for a specific question. Biomedical exploration of these big data often requires complex queries across multiple autonomous, heterogeneous and distributed data sources. Semantic integration is an active area of research in several disciplines, such as databases, information integration, and ontology. We provide a survey of some approaches and techniques for integrating biological data, focusing on those developed in the ontology community.

Keywords: Semantic data integration, biological ontology, linked data, semantic web, OWL, RDF.

PDF Downloads 1806
13376 Ship Detection Requirements Analysis for Different Sea States: Validation on Real SAR Data

Authors: Jaime Martín-de-Nicolás, David Mata-Moya, Nerea del-Rey-Maestre, Pedro Gómez-del-Hoyo, María-Pilar Jarabo-Amores

Abstract:

Ship detection is nowadays an important issue in tasks related to sea traffic control, fishery management, and ship search and rescue. Although it has traditionally been carried out by patrol ships or aircraft, coverage, weather conditions and sea state can become a problem. Synthetic aperture radars can overcome these coverage limitations and work under any climatological condition. A fast CFAR ship detector based on a robust statistical modeling of sea clutter with respect to sea states in SAR images is used. In this paper, the minimum SNR required to obtain a given detection probability with a given false alarm rate for any sea state is determined. A Gaussian target model using real SAR data is considered. Results show that SNR does not depend heavily on the class considered. Provided there is some variation in the backscattering of targets in SAR imagery, the detection probability is limited and a post-processing stage based on morphology would be suitable.

Keywords: SAR, generalized gamma distribution, detection curves, radar detection.

PDF Downloads 1155
13375 An Approach Based on Statistics and Multi-Resolution Representation to Classify Mammograms

Authors: Nebi Gedik

Abstract:

One of the significant and continual public health problems in the world is breast cancer. Early detection is very important to fight the disease, and mammography has been one of the most common and reliable methods to detect the disease in the early stages. However, it is a difficult task, and computer-aided diagnosis (CAD) systems are needed to assist radiologists in providing both accurate and uniform evaluation for mass in mammograms. In this study, a multiresolution statistical method to classify mammograms as normal and abnormal in digitized mammograms is used to construct a CAD system. The mammogram images are represented by wave atom transform, and this representation is made by certain groups of coefficients, independently. The CAD system is designed by calculating some statistical features using each group of coefficients. The classification is performed by using support vector machine (SVM).

Keywords: Wave atom transform, statistical features, multi-resolution representation, mammogram.

PDF Downloads 864
13374 A Review: Comparative Analysis of Different Categorical Data Clustering Ensemble Methods

Authors: S. Sarumathi, N. Shanthi, M. Sharmila

Abstract:

Over the past years, a great deal of work has been done in data clustering research under the unsupervised learning paradigm in data mining. Several algorithms and methods have been proposed that focus on clustering different data types, the representation of cluster models, and the accuracy rates of the clusters. However, no single clustering algorithm proves to be the most efficient in providing the best results. To address this issue, a new technique called the cluster ensemble method has emerged. The cluster ensemble is a good alternative approach to the cluster analysis problem. Its main aim is to merge different clustering solutions in such a way as to achieve accuracy and to improve on the quality of individual data clusterings. The substantial and continuing development of new methods in data mining, together with the persistent interest in inventing new algorithms, makes it necessary to critically analyze the existing techniques and future directions. This paper presents a comparative study of different cluster ensemble methods along with their features, systematic working process, and the average accuracy and error rates of each ensemble method. This comprehensive analysis should be useful for the community of clustering practitioners and should help in choosing the most suitable method for the problem at hand.

Keywords: Clustering, Cluster Ensemble methods, Co-association matrix, Consensus function, Median partition.

PDF Downloads 2585
13373 Adaptive Kernel Principal Analysis for Online Feature Extraction

Authors: Mingtao Ding, Zheng Tian, Haixia Xu

Abstract:

The batch nature of standard kernel principal component analysis (KPCA) methods limits their use in numerous applications, especially for dynamic or large-scale data. In this paper, an efficient adaptive approach is presented for online extraction of the kernel principal components (KPC). The contribution of this paper may be divided into two parts. First, the kernel covariance matrix is correctly updated to adapt to the changing characteristics of the data. Second, the KPC are recursively formulated to overcome the batch nature of standard KPCA. This formulation is derived from the recursive eigen-decomposition of the kernel covariance matrix and indicates the KPC variation caused by the new data. The proposed method not only alleviates the sub-optimality of the KPCA method for non-stationary data, but also maintains constant update speed and memory usage as the data size increases. Experiments on simulation data and real applications demonstrate that our approach yields improvements in terms of both computational speed and approximation accuracy.

Keywords: adaptive method, kernel principal component analysis, online extraction, recursive algorithm

PDF Downloads 1536
13372 Analysis of Textual Data Based On Multiple 2-Class Classification Models

Authors: Shigeaki Sakurai, Ryohei Orihara

Abstract:

This paper proposes a new method for analyzing textual data. The method deals with items of textual data, where each item is described from various viewpoints. The method acquires 2-class classification models of the viewpoints by applying an inductive learning method to items with multiple viewpoints. Using these models, the method infers whether or not each viewpoint should be assigned to a new item. The method extracts expressions from the new items classified into the viewpoints and extracts characteristic expressions corresponding to the viewpoints by comparing the frequency of expressions among the viewpoints. This paper also applies the method to questionnaire data given by guests at a hotel and verifies its effect through numerical experiments.

Keywords: Text mining, Multiple viewpoints, Differential analysis, Questionnaire data

PDF Downloads 1275
13371 Quality of Concrete of Recent Development Projects in Libya

Authors: Mohamed S. Alazhari, Milad M. Al Shebani

Abstract:

Numerous concrete structure projects are currently under way in Libya as part of US$50 billion in government funding. The quality of the concrete used in 20 different construction projects was assessed based mainly on the concrete compressive strength achieved. The projects are scattered all over the country and are at various levels of completeness. For most of these projects, the concrete compressive strength was obtained from test results of a 150 mm standard cube mold. Statistical analysis of the collected concrete compressive strengths reveals that the data in general followed a normal distribution pattern. The study covers comparison and assessment of concrete quality aspects such as quality control, strength range, data standard deviation, data scatter, and ratio of minimum strength to design strength. Site quality control for these projects ranged from very good to poor according to ACI214 criteria [1]. The range (Rg) of the strength (max. strength – min. strength) divided by the average strength is from 34% to 160%. Data scatter is measured as the range (Rg) divided by the standard deviation (σ) and is found to be 1.82 to 11.04, indicating that the range is ±3σ. International construction companies working in Libya follow different assessment criteria for concrete compressive strength in lieu of a unified national procedure. The study reveals that assessments of concrete quality conducted by these construction companies usually meet their adopted (internal) standards, but sometimes fail to meet internationally known standard requirements. The assessment of concrete presented in this paper is based on ACI, British standards and proposed Libyan concrete strength assessment criteria.

Keywords: Acceptance criteria, Concrete, Compressive strength, quality control
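
The two scatter measures named in the abstract, range over average strength and range over standard deviation, can be computed as in the sketch below. The function name and the cube strengths are hypothetical, and the use of the sample standard deviation is an assumption, since the abstract does not say which form was used.

```python
from statistics import mean, stdev


def strength_summary(strengths_mpa):
    """Return range/mean (in percent) and range/standard deviation."""
    rg = max(strengths_mpa) - min(strengths_mpa)  # range Rg
    return {
        "range_over_mean_pct": 100.0 * rg / mean(strengths_mpa),
        # Assumption: sample standard deviation (n - 1 in the denominator).
        "range_over_sd": rg / stdev(strengths_mpa),
    }


# Hypothetical cube compressive strengths for one project, in MPa.
project_strengths = [28.0, 31.5, 35.0, 30.0, 33.5, 29.0]
summary = strength_summary(project_strengths)
```

Running the same function over each project's results would give per-project values comparable to the 34% to 160% and 1.82 to 11.04 intervals reported in the abstract.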

PDF Downloads 1747
13370 A Comprehensive Review on Different Mixed Data Clustering Ensemble Methods

Authors: S. Sarumathi, N. Shanthi, S. Vidhya, M. Sharmila

Abstract:

An extensive amount of work has been done in data clustering research under the unsupervised learning paradigm in data mining during the past two decades. Moreover, several approaches and methods have emerged that focus on clustering diverse data types, features of cluster models and similarity rates of clusters. However, no single clustering algorithm performs best in extracting efficient clusters. To rectify this issue, a new technique called the Cluster Ensemble method has emerged. This approach serves as an alternative method for the cluster analysis problem. The main objective of the Cluster Ensemble is to aggregate diverse clustering solutions in such a way as to attain accuracy and to improve on the quality of the individual clustering algorithms. Due to the massive and rapid development of new methods in data mining, a critical analysis of the existing techniques and future directions is necessary. This paper presents a comparative analysis of different cluster ensemble methods along with their methodologies and salient features. This analysis should be useful for the community of clustering experts and should help in choosing the most appropriate method for the problem at hand.

Keywords: Clustering, Cluster Ensemble Methods, Coassociation matrix, Consensus Function, Median Partition.

PDF Downloads 2093
13369 Time Domain and Frequency Domain Analyses of Measured Metocean Data for Malaysian Waters

Authors: Duong Vannak, Mohd Shahir Liew, Guo Zheng Yew

Abstract:

Wave height and wind speed data were collected from three existing oil fields in the South China Sea, offshore of the Peninsular Malaysia, Sarawak and Sabah regions. Extreme values and other significant data were employed for analysis. The data were recorded from 1999 until 2008. The results show that offshore structures are susceptible to unacceptable motions initiated by wind and waves, with the worst structural impacts caused by extreme wave heights. To protect offshore structures from damage, there is a need to quantify descriptive statistics, to determine the spectral envelope of wind speed and wave height, and to ascertain the frequency content of each spectrum for offshore structures in the shallow waters of the South China Sea using measured time series. The results indicate that the process is nonstationary; it is converted to a stationary process by first differencing the time series. In the descriptive statistical analysis, both wind speed and wave height have a significant influence on the offshore structure during the northeast monsoon, with a high mean wind speed of 13.5195 knots (σ = 6.3566 knots) and a high mean wave height of 2.3597 m (σ = 0.8690 m). Observation of the spectra shows no clear dominant peak, and the peaks fluctuate randomly. Each wind speed spectrum and wave height spectrum has its own identifiable pattern. The wind speed spectrum tends to grow gradually in the lower frequency range and keeps increasing until it doubles in the higher frequency range, with a mean peak frequency range of 0.4104 Hz to 0.4721 Hz, while the wave height spectrum tends to grow drastically in the low frequency range and then fluctuates and decreases slightly in the high frequency range, with a mean peak frequency range of 0.2911 Hz to 0.3425 Hz.

Keywords: Metocean, Offshore Engineering, Time Series, Descriptive Statistics, Autospectral Density Function, Wind, Wave.

Downloads 3659
13368 Defect Detection of Tiles Using 2D-Wavelet Transform and Statistical Features

Authors: M. Ghazvini, S. A. Monadjemi, N. Movahhedinia, K. Jamshidi

Abstract:

In this article, a method is proposed to classify normal and defective tiles using the wavelet transform and artificial neural networks. The proposed algorithm calculates the max and min medians as well as the standard deviation and average of the detail images obtained from wavelet filters, then derives feature vectors and attempts to classify the given tile using a Perceptron neural network with a single hidden layer. Along with proposing the median of optimum points as the basic feature and comparing it with the other statistical features in the wavelet domain, this study investigates the relative advantages of the Haar wavelet. The method has been tested on a variety of tile designs and, on average, was valid for over 90% of the cases. Among its other advantages, high speed and low computational load are prominent.
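As a rough illustration of the feature-extraction step described in this abstract, the sketch below computes a single-level 2D Haar decomposition and then the mean, standard deviation and median of each detail image. The function names, the averaging normalization (division by 4), and the choice of exactly these three statistics per band are assumptions; the abstract's "max and min medians" feature and the neural network classifier are not reproduced here.

```python
import statistics


def haar_details(image):
    """Single-level 2D Haar detail images for a list-of-rows image.

    Assumes even height and width. Each 2x2 block [[a, b], [d, e]] yields
    one coefficient per detail band (averaging normalization, an assumption).
    Returns (horizontal_diff, vertical_diff, diagonal_diff).
    """
    horizontal, vertical, diagonal = [], [], []
    for r in range(0, len(image), 2):
        h_row, v_row, d_row = [], [], []
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            h_row.append((a - b + d - e) / 4)  # left minus right columns
            v_row.append((a + b - d - e) / 4)  # top minus bottom rows
            d_row.append((a - b - d + e) / 4)  # diagonal contrast
        horizontal.append(h_row)
        vertical.append(v_row)
        diagonal.append(d_row)
    return horizontal, vertical, diagonal


def detail_features(image):
    """Feature vector: mean, population std and median of each detail band."""
    features = []
    for band in haar_details(image):
        values = [v for row in band for v in row]
        features.extend([
            statistics.mean(values),
            statistics.pstdev(values),
            statistics.median(values),
        ])
    return features
```

A perfectly uniform tile produces an all-zero feature vector, while a local defect perturbs the detail coefficients of the block that contains it, which is what makes such features usable as classifier input.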

Keywords: Defect detection, tile and ceramic quality inspection, wavelet transform, classification, neural networks, statistical features.

Downloads 2351
13367 Optimization of the Nutrient Supplients for Cellulase Production with the Basal Medium Palm Oil Mill Effluent

Authors: Rashid S S, Alam M Z, Karim M I A, Salleh M H

Abstract:

A statistical optimization was carried out to design a medium composition for optimum cellulolytic enzyme production, in which palm oil mill effluent (POME) was used as the basal medium and the filamentous fungus Trichoderma reesei RUT-C30 was used in liquid state bioconversion (LSB). A medium of 2% (w/v) total suspended solids (TSS) of POME supplemented with 1% (w/v) cellulose, 0.5% (w/v) peptone and 0.02% (v/v) Tween 80 was estimated to produce the optimum CMCase activity of 18.53 U/ml through statistical analysis based on the face-centered central composite design (FCCCD). The probability values of cellulose (<0.0011) and peptone (0.0021) indicated a significant effect on cellulase production, with a determination coefficient (R2) of 0.995.

Keywords: Face centered central composite design (FCCCD), Liquid state bioconversion (LSB), Palm oil mill effluent, Trichoderma reesei RUT C-30.

Downloads 2126
13366 The Impact of Environmental Dynamism on Strategic Outsourcing Success

Authors: Mohamad Ghozali Hassan, Abdul Aziz Othman, Mohd Azril Ismail

Abstract:

Adapting quickly to environmental dynamism is essential for an organization to develop its outsourcing strategy and management in order to sustain competitive advantage. This research used the Partial Least Squares Structural Equation Modeling (PLS-SEM) tool to investigate how the factors of environmental dynamism affect strategic outsourcing success among electrical and electronic manufacturing industries in outsourcing management. Statistical results confirm that the inclusion of customer demand, technological change, and competition level as a new combined concept of environmental dynamism has positive effects on outsourcing success. Additionally, this research demonstrates the acceptability of PLS-SEM as a statistical analysis technique for furnishing a better understanding of environmental dynamism in outsourcing management in Malaysia. The practical findings contribute to academics and practitioners in the field of outsourcing management.

Keywords: Environmental Dynamism, Customer Demand, Technological Change, Competition Level, Outsourcing Success.

Downloads 2181
13365 Dynamical Analysis of Circadian Gene Expression

Authors: Carla Layana, Luis Diambra

Abstract:

The microarray technique allows the simultaneous measurement of the expression levels of thousands of mRNAs. By mining these data, one can identify the dynamics of the gene expression time series. Using principal component analysis (PCA), we uncover the circadian rhythmic patterns underlying the gene expression profiles from the cyanobacterium Synechocystis. We applied PCA to reduce the dimensionality of the data set. Examination of the components also provides insight into the underlying factors measured in the experiments. Our results suggest that all rhythmic content of the data can be reduced to three main components.
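The dimensionality-reduction idea in this abstract can be sketched with a small, dependency-free PCA. This is an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the eigenvectors are found by power iteration with deflation, and the starting vector is an arbitrary choice that could in rare cases be orthogonal to the leading component.

```python
import math


def covariance_matrix(rows):
    """Sample covariance of the columns (rows are observations, e.g. time points)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def leading_component(cov, iterations=200):
    """Power iteration: returns (eigenvalue, unit eigenvector) of a PSD matrix."""
    d = len(cov)
    start_norm = math.sqrt(sum((i + 1) ** 2 for i in range(d)))
    v = [(i + 1) / start_norm for i in range(d)]  # arbitrary start (assumption)
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
        eigenvalue = norm
    return eigenvalue, v


def top_components(cov, k):
    """First k principal components, removing each one by deflation."""
    work = [row[:] for row in cov]
    d = len(cov)
    components = []
    for _ in range(k):
        eigenvalue, v = leading_component(work)
        components.append((eigenvalue, v))
        for i in range(d):
            for j in range(d):
                work[i][j] -= eigenvalue * v[i] * v[j]
    return components


def explained_variance_ratio(cov, eigenvalue):
    """Share of total variance carried by one component."""
    return eigenvalue / sum(cov[i][i] for i in range(len(cov)))
```

Summing `explained_variance_ratio` over the first few components gives a simple way to judge how many components are needed to capture most of the variation in the data.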

Keywords: circadian rhythms, clustering, gene expression, PCA.

Downloads 1576
13364 Applying Hybrid Graph Drawing and Clustering Methods on Stock Investment Analysis

Authors: Mouataz Zreika, Maria Estela Varua

Abstract:

Stock investment decisions are often made based on current events in the global economy and the analysis of historical data. In addition, visual representation could help investors gain a deeper understanding of and better insight into stock market trends more efficiently. The trend analysis is based on long-term data collection. The study adopts a hybrid method that combines a clustering algorithm and a force-directed algorithm to overcome the scalability problem when visualizing large data. This method exemplifies the potential relationships between stocks and determines the degree of strength and connectivity, which gives investors another view of stock relationships for reference. Information derived from the visualization will also help them make informed decisions. The results of the experiments show that the proposed method is able to produce aesthetically visualized data by providing clearer views of connectivity and edge weights.

Keywords: Clustering, force-directed, graph drawing, stock investment analysis.

Downloads 1582
13363 Wind Power Forecast Error Simulation Model

Authors: Josip Vasilj, Petar Sarajcev, Damir Jakus

Abstract:

One of the major difficulties introduced with wind power penetration is the inherent uncertainty in production originating from uncertain wind conditions. This uncertainty impacts many different aspects of power system operation, especially the balancing power requirements. For this reason, in power system development planning, it is necessary to evaluate the potential uncertainty in future wind power generation. For this purpose, simulation models that reproduce the performance of wind power forecasts are required. This paper presents wind power forecast error simulation models based on stochastic process simulation. The proposed models capture the most important statistical parameters recognized in wind power forecast error time series. Furthermore, two distinct models are presented based on data availability. The first model uses wind speed measurements at potential or existing wind power plant locations, while the second model uses the statistical distribution of wind speeds.
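To make the idea of simulating forecast errors as a stochastic process concrete, here is a minimal sketch that is not taken from the paper. It assumes the errors follow a zero-mean first-order autoregressive, AR(1), process with a target standard deviation `sigma` and lag-1 autocorrelation `phi`; the abstract does not state which process or parameters the authors use, and all names below are hypothetical.

```python
import math
import random


def simulate_forecast_errors(n_steps, sigma, phi, seed=None):
    """Simulate a stationary AR(1) error series (an assumed model).

    sigma: target standard deviation of the errors
    phi:   target lag-1 autocorrelation, must satisfy |phi| < 1
    """
    if not -1.0 < phi < 1.0:
        raise ValueError("phi must lie strictly between -1 and 1")
    rng = random.Random(seed)
    # Innovation std chosen so the stationary std of the series equals sigma.
    innovation_std = sigma * math.sqrt(1.0 - phi * phi)
    errors = [rng.gauss(0.0, sigma)]  # start from the stationary distribution
    for _ in range(n_steps - 1):
        errors.append(phi * errors[-1] + rng.gauss(0.0, innovation_std))
    return errors


def sample_std(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))


def lag1_autocorrelation(values):
    """Sample lag-1 autocorrelation."""
    mean = sum(values) / len(values)
    num = sum((a - mean) * (b - mean) for a, b in zip(values, values[1:]))
    den = sum((v - mean) ** 2 for v in values)
    return num / den
```

Repeating the simulation with different seeds gives a Monte Carlo ensemble of error trajectories, and `sample_std` and `lag1_autocorrelation` provide a quick check that the simulated series reproduces the statistics it was asked to match.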

Keywords: Wind power, Uncertainty, Stochastic process, Monte Carlo simulation.

Downloads 3910
13362 Assessing the Theoretical Suitability of Sentinel-2 and WorldView-3 Data for Hydrocarbon Mapping of Spill Events, Using HYSS

Authors: K. Tunde Olagunju, C. Scott Allen, F.D. (Freek) van der Meer

Abstract:

Identification of hydrocarbon oil in remote sensing images is often the first step in monitoring oil during spill events. Most remote sensing methods adopt techniques for hydrocarbon identification to achieve detection in order to model an appropriate cleanup program. Identification on optical sensors allows not only detection but also characterization and quantification. Until recently, in optical remote sensing, quantification and characterization were only potentially possible using high-resolution laboratory and airborne imaging spectrometers (hyperspectral data). Unlike multispectral data, hyperspectral data are not freely available, as this data category is mainly obtained via airborne survey at present. In this research, two operational high-resolution multispectral satellites (WorldView-3 and Sentinel-2) are theoretically assessed for their suitability for hydrocarbon characterization, using the Hydrocarbon Spectra Slope model (HYSS). This method uses the two most persistent hydrocarbon diagnostic/absorption features at 1.73 µm and 2.30 µm for hydrocarbon mapping on multispectral data. In this research, spectral measurements of seven different hydrocarbon oils (crude and refined oil) taken on 10 different substrates with a laboratory ASD Fieldspec were convolved to Sentinel-2 and WorldView-3 resolution, using their full width half maximum (FWHM) parameter. The resulting hydrocarbon slope values obtained from the studied samples enable clear qualitative discrimination of most hydrocarbons, despite the presence of different background substrates, particularly on WorldView-3. Due to the close conformity of central wavelengths and narrow bandwidths to the key hydrocarbon bands used in HYSS, the qualitative analysis on WorldView-3 sensors was statistically significant at the 95% confidence level (P-value < 0.01) for all studied hydrocarbon oils, except for Diesel. Using multifactor analysis of variance (MANOVA), the discriminating power of HYSS is statistically significant for most hydrocarbon-substrate combinations on Sentinel-2 and WorldView-3 FWHM, revealing the potential of these two operational multispectral sensors as rapid response tools for hydrocarbon mapping. One notable exception is highly transmissive hydrocarbons on Sentinel-2 data, due to the non-conformity of spectral bands with key hydrocarbon absorptions and the relatively coarse bandwidth (> 100 nm).

Keywords: hydrocarbon, oil spill, remote sensing, hyperspectral, multispectral, hydrocarbon-substrate combination, Sentinel-2, WorldView-3

Downloads 674
13361 Predictive Maintenance of Industrial Shredders: Efficient Operation through Real-Time Monitoring Using Statistical Machine Learning

Authors: Federico Pittino, Dominik Holzmann, Krithika Sayar-Chand, Stefan Moser, Sebastian Pliessnig, Thomas Arnold

Abstract:

The shredding of waste materials is a key step in the recycling process towards a circular economy. Industrial shredders for waste processing operate in very harsh conditions, leading to the need for frequent maintenance of critical components. Maintenance optimization is also particularly important for increasing the machine's efficiency, thereby reducing operational costs. In this work, a monitoring system has been developed and deployed on an industrial shredder located at a waste recycling plant in Austria. The machine has been monitored for several months, and methods for predictive maintenance have been developed for two key components: the cutting knives and the drive belt. The large amount of collected data is leveraged by statistical machine learning techniques, which do not require very detailed knowledge of the machine or its live operating conditions. The results show that, despite the wide range of operating conditions, a reliable estimate of the optimal time for maintenance can be derived. Moreover, the trade-off between the cost of maintenance and the increase in power consumption due to the wear state of the monitored components is investigated. This work demonstrates the benefits of a real-time monitoring system for the efficient operation of industrial shredders.

Keywords: predictive maintenance, circular economy, industrial shredder, cost optimization, statistical machine learning

Downloads 610