Search results for: window data envelopment analysis
13419 Normalization Discriminant Independent Component Analysis
Authors: Liew Yee Ping, Pang Ying Han, Lau Siong Hoe, Ooi Shih Yin, Housam Khalifa Bashier Babiker
Abstract:
In face recognition, feature extraction techniques attempts to search for appropriate representation of the data. However, when the feature dimension is larger than the samples size, it brings performance degradation. Hence, we propose a method called Normalization Discriminant Independent Component Analysis (NDICA). The input data will be regularized to obtain the most reliable features from the data and processed using Independent Component Analysis (ICA). The proposed method is evaluated on three face databases, Olivetti Research Ltd (ORL), Face Recognition Technology (FERET) and Face Recognition Grand Challenge (FRGC). NDICA showed it effectiveness compared with other unsupervised and supervised techniques.
Keywords: Face recognition, small sample size, regularization, independent component analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 195413418 Frequent Itemset Mining Using Rough-Sets
Authors: Usman Qamar, Younus Javed
Abstract:
Frequent pattern mining is the process of finding a pattern (a set of items, subsequences, substructures, etc.) that occurs frequently in a data set. It was proposed in the context of frequent itemsets and association rule mining. Frequent pattern mining is used to find inherent regularities in data. What products were often purchased together? Its applications include basket data analysis, cross-marketing, catalog design, sale campaign analysis, Web log (click stream) analysis, and DNA sequence analysis. However, one of the bottlenecks of frequent itemset mining is that as the data increase the amount of time and resources required to mining the data increases at an exponential rate. In this investigation a new algorithm is proposed which can be uses as a pre-processor for frequent itemset mining. FASTER (FeAture SelecTion using Entropy and Rough sets) is a hybrid pre-processor algorithm which utilizes entropy and roughsets to carry out record reduction and feature (attribute) selection respectively. FASTER for frequent itemset mining can produce a speed up of 3.1 times when compared to original algorithm while maintaining an accuracy of 71%.
Keywords: Rough-sets, Classification, Feature Selection, Entropy, Outliers, Frequent itemset mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 243413417 Using SNAP and RADTRAD to Establish the Analysis Model for Maanshan PWR Plant
Authors: J. R. Wang, H. C. Chen, C. Shih, S. W. Chen, J. H. Yang, Y. Chiang
Abstract:
In this study, we focus on the establishment of the analysis model for Maanshan PWR nuclear power plant (NPP) by using RADTRAD and SNAP codes with the FSAR, manuals, and other data. In order to evaluate the cumulative dose at the Exclusion Area Boundary (EAB) and Low Population Zone (LPZ) outer boundary, Maanshan NPP RADTRAD/SNAP model was used to perform the analysis of the DBA LOCA case. The analysis results of RADTRAD were similar to FSAR data. These analysis results were lower than the failure criteria of 10 CFR 100.11 (a total radiation dose to the whole body, 250 mSv; a total radiation dose to the thyroid from iodine exposure, 3000 mSv).Keywords: RADionuclide, transport, removal, and dose estimation, RADTRAD, symbolic nuclear analysis package, SNAP, dose, PWR.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 102513416 Attribute Analysis of Quick Response Code Payment Users Using Discriminant Non-negative Matrix Factorization
Authors: Hironori Karachi, Haruka Yamashita
Abstract:
Recently, the system of quick response (QR) code is getting popular. Many companies introduce new QR code payment services and the services are competing with each other to increase the number of users. For increasing the number of users, we should grasp the difference of feature of the demographic information, usage information, and value of users between services. In this study, we conduct an analysis of real-world data provided by Nomura Research Institute including the demographic data of users and information of users’ usages of two services; LINE Pay, and PayPay. For analyzing such data and interpret the feature of them, Nonnegative Matrix Factorization (NMF) is widely used; however, in case of the target data, there is a problem of the missing data. EM-algorithm NMF (EMNMF) to complete unknown values for understanding the feature of the given data presented by matrix shape. Moreover, for comparing the result of the NMF analysis of two matrices, there is Discriminant NMF (DNMF) shows the difference of users features between two matrices. In this study, we combine EMNMF and DNMF and also analyze the target data. As the interpretation, we show the difference of the features of users between LINE Pay and Paypay.
Keywords: Data science, non-negative matrix factorization, missing data, quality of services.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 45313415 Spatial Variability of Brahmaputra River Flow Characteristics
Authors: Hemant Kumar
Abstract:
Brahmaputra River is known according to the Hindu mythology the son of the Lord Brahma. According to this name, the river Brahmaputra creates mass destruction during the monsoon season in Assam, India. It is a state situated in North-East part of India. This is one of the essential states out of the seven countries of eastern India, where almost all entire Brahmaputra flow carried out. The other states carry their tributaries. In the present case study, the spatial analysis performed in this specific case the number of MODIS data are acquired. In the method of detecting the change, the spray content was found during heavy rainfall and in the flooded monsoon season. By this method, particularly the analysis over the Brahmaputra outflow determines the flooded season. The charged particle-associated in aerosol content genuinely verifies the heavy water content below the ground surface, which is validated by trend analysis through rainfall spectrum data. This is confirmed by in-situ sampled view data from a different position of Brahmaputra River. Further, a Hyperion Hyperspectral 30 m resolution data were used to scan the sediment deposits, which is also confirmed by in-situ sampled view data from a different position.
Keywords: Spatial analysis, change detection, aerosol, trend analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 54213414 3D Modelling and Numerical Analysis of Human Inner Ear by Means of Finite Elements Method
Authors: C. Castro-Egler, A. Durán-Escalante, A. García-González
Abstract:
This paper presents a method to generate a finite element model of the human auditory inner ear system. The geometric model has been realized using 2D images from a virtual model of temporal bones. A point cloud has been gotten manually from those images to construct a whole mesh with hexahedral elements. The main difference with the predecessor models is the spiral shape of the cochlea with its three scales completely defined: scala tympani, scala media and scala vestibuli; which are separate by basilar membrane and Reissner membrane. To validate this model, numerical simulations have been realised with two models: an isolated inner ear and a whole model of human auditory system. Ideal conditions of displacement are applied over the oval window in the isolated Inner Ear model. The whole model is made up of the outer auditory channel, the tympani, the ossicular chain, and the inner ear. The boundary condition for the whole model is 1Pa over the auditory channel entrance. The numerical simulations by FEM have been done using a harmonic analysis with a frequency range between 100-10.000 Hz with an interval of 100Hz. The following results have been carried out: basilar membrane displacement; the scala media pressure according to the cochlea length and the transfer function of the middle ear normalized with the pressure in the tympanic membrane. The basilar membrane displacements and the pressure in the scala media make it possible to validate the response in frequency of the basilar membrane.
Keywords: Finite elements method, human auditory system model, numerical analysis, 3D modelling cochlea.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 153213413 Speech Enhancement of Vowels Based on Pitch and Formant Frequency
Authors: R. Rishma Rodrigo, R. Radhika, M. Vanitha Lakshmi
Abstract:
Numerous signal processing based speech enhancement systems have been proposed to improve intelligibility in the presence of noise. Traditionally, studies of neural vowel encoding have focused on the representation of formants (peaks in vowel spectra) in the discharge patterns of the population of auditory-nerve (AN) fibers. A method is presented for recording high-frequency speech components into a low-frequency region, to increase audibility for hearing loss listeners. The purpose of the paper is to enhance the formant of the speech based on the Kaiser window. The pitch and formant of the signal is based on the auto correlation, zero crossing and magnitude difference function. The formant enhancement stage aims to restore the representation of formants at the level of the midbrain. A MATLAB software’s are used for the implementation of the system with low complexity is developed.
Keywords: Formant estimation, formant enhancement, pitch detection, speech analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 164113412 Simulation Data Summarization Based on Spatial Histograms
Authors: Jing Zhao, Yoshiharu Ishikawa, Chuan Xiao, Kento Sugiura
Abstract:
In order to analyze large-scale scientific data, research on data exploration and visualization has gained popularity. In this paper, we focus on the exploration and visualization of scientific simulation data, and define a spatial V-Optimal histogram for data summarization. We propose histogram construction algorithms based on a general binary hierarchical partitioning as well as a more specific one, the l-grid partitioning. For effective data summarization and efficient data visualization in scientific data analysis, we propose an optimal algorithm as well as a heuristic algorithm for histogram construction. To verify the effectiveness and efficiency of the proposed methods, we conduct experiments on the massive evacuation simulation data.Keywords: Simulation data, data summarization, spatial histograms, exploration and visualization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 75313411 Data Mining Classification Methods Applied in Drug Design
Authors: Mária Stachová, Lukáš Sobíšek
Abstract:
Data mining incorporates a group of statistical methods used to analyze a set of information, or a data set. It operates with models and algorithms, which are powerful tools with the great potential. They can help people to understand the patterns in certain chunk of information so it is obvious that the data mining tools have a wide area of applications. For example in the theoretical chemistry data mining tools can be used to predict moleculeproperties or improve computer-assisted drug design. Classification analysis is one of the major data mining methodologies. The aim of thecontribution is to create a classification model, which would be able to deal with a huge data set with high accuracy. For this purpose logistic regression, Bayesian logistic regression and random forest models were built using R software. TheBayesian logistic regression in Latent GOLD software was created as well. These classification methods belong to supervised learning methods. It was necessary to reduce data matrix dimension before construct models and thus the factor analysis (FA) was used. Those models were applied to predict the biological activity of molecules, potential new drug candidates.Keywords: data mining, classification, drug design, QSAR
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 284913410 Analysis of Web User Identification Methods
Authors: Renáta Iváncsy, Sándor Juhász
Abstract:
Web usage mining has become a popular research area, as a huge amount of data is available online. These data can be used for several purposes, such as web personalization, web structure enhancement, web navigation prediction etc. However, the raw log files are not directly usable; they have to be preprocessed in order to transform them into a suitable format for different data mining tasks. One of the key issues in the preprocessing phase is to identify web users. Identifying users based on web log files is not a straightforward problem, thus various methods have been developed. There are several difficulties that have to be overcome, such as client side caching, changing and shared IP addresses and so on. This paper presents three different methods for identifying web users. Two of them are the most commonly used methods in web log mining systems, whereas the third on is our novel approach that uses a complex cookie-based method to identify web users. Furthermore we also take steps towards identifying the individuals behind the impersonal web users. To demonstrate the efficiency of the new method we developed an implementation called Web Activity Tracking (WAT) system that aims at a more precise distinction of web users based on log data. We present some statistical analysis created by the WAT on real data about the behavior of the Hungarian web users and a comprehensive analysis and comparison of the three methodsKeywords: Data preparation, Tracking individuals, Web useridentification, Web usage mining
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 439213409 A Meta-Analytic Path Analysis of e-Learning Acceptance Model
Authors: David W.S. Tai, Ren-Cheng Zhang, Sheng-Hung Chang, Chin-Pin Chen, Jia-Ling Chen
Abstract:
This study reports results of a meta-analytic path analysis e-learning Acceptance Model with k = 27 studies, Databases searched included Information Sciences Institute (ISI) website. Variables recorded included perceived usefulness, perceived ease of use, attitude toward behavior, and behavioral intention to use e-learning. A correlation matrix of these variables was derived from meta-analytic data and then analyzed by using structural path analysis to test the fitness of the e-learning acceptance model to the observed aggregated data. Results showed the revised hypothesized model to be a reasonable, good fit to aggregated data. Furthermore, discussions and implications are given in this article.
Keywords: E-learning, Meta Analytic Path Analysis, Technology Acceptance Model
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 244513408 Big Data: Concepts, Technologies and Applications in the Public Sector
Authors: A. Alexandru, C. A. Alexandru, D. Coardos, E. Tudora
Abstract:
Big Data (BD) is associated with a new generation of technologies and architectures which can harness the value of extremely large volumes of very varied data through real time processing and analysis. It involves changes in (1) data types, (2) accumulation speed, and (3) data volume. This paper presents the main concepts related to the BD paradigm, and introduces architectures and technologies for BD and BD sets. The integration of BD with the Hadoop Framework is also underlined. BD has attracted a lot of attention in the public sector due to the newly emerging technologies that allow the availability of network access. The volume of different types of data has exponentially increased. Some applications of BD in the public sector in Romania are briefly presented.
Keywords: Big data, big data Analytics, Hadoop framework, cloud computing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 232113407 Extreme Temperature Forecast in Mbonge, Cameroon through Return Level Analysis of the Generalized Extreme Value (GEV) Distribution
Authors: Nkongho Ayuketang Arreyndip, Ebobenow Joseph
Abstract:
In this paper, temperature extremes are forecast by employing the block maxima method of the Generalized extreme value(GEV) distribution to analyse temperature data from the Cameroon Development Corporation (C.D.C). By considering two sets of data (Raw data and simulated data) and two (stationary and non-stationary) models of the GEV distribution, return levels analysis is carried out and it was found that in the stationary model, the return values are constant over time with the raw data while in the simulated data, the return values show an increasing trend but with an upper bound. In the non-stationary model, the return levels of both the raw data and simulated data show an increasing trend but with an upper bound. This clearly shows that temperatures in the tropics even-though show a sign of increasing in the future, there is a maximum temperature at which there is no exceedence. The results of this paper are very vital in Agricultural and Environmental research.Keywords: Return level, Generalized extreme value (GEV), Meteorology, Forecasting.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 210613406 Analysis of Lead Time Delays in Supply Chain: A Case Study
Authors: Abdel-Aziz M. Mohamed, Nermeen Coutry
Abstract:
Lead time is a critical measure of a supply chain's performance. It impacts both the customer satisfactions as well as the total cost of inventory. This paper presents the result of a study on the analysis of the customer order lead-time for a multinational company. In the study, the lead time was divided into three stages respectively: order entry, order fulfillment, and order delivery. A sample of size 2,425 order lines was extracted from the company's records to use for this study. The sample data entails information regarding customer orders from the time of order entry until order delivery. Data regarding the lead time of each stage for different orders were also provided. Summary statistics on lead time data reveals that about 30% of the orders were delivered later than the scheduled due date. The result of the multiple linear regression analysis technique revealed that component type, logistics parameter, order size and the customer type have significant impacts on lead time. Data analysis on the stages of lead time indicates that stage 2 consumed over 50% of the lead time. Pareto analysis was made to study the reasons for the customer order delay in each stage. Recommendation was given to resolve the problem.Keywords: Lead time reduction, customer satisfaction, service quality, statistical analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 669013405 Analysis and Comparison of Image Encryption Algorithms
Authors: İsmet Öztürk, İbrahim Soğukpınar
Abstract:
With the fast progression of data exchange in electronic way, information security is becoming more important in data storage and transmission. Because of widely using images in industrial process, it is important to protect the confidential image data from unauthorized access. In this paper, we analyzed current image encryption algorithms and compression is added for two of them (Mirror-like image encryption and Visual Cryptography). Implementations of these two algorithms have been realized for experimental purposes. The results of analysis are given in this paper.
Keywords: image encryption, image cryptosystem, security, transmission
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 495813404 Enhanced Clustering Analysis and Visualization Using Kohonen's Self-Organizing Feature Map Networks
Authors: Kasthurirangan Gopalakrishnan, Siddhartha Khaitan, Anshu Manik
Abstract:
Cluster analysis is the name given to a diverse collection of techniques that can be used to classify objects (e.g. individuals, quadrats, species etc). While Kohonen's Self-Organizing Feature Map (SOFM) or Self-Organizing Map (SOM) networks have been successfully applied as a classification tool to various problem domains, including speech recognition, image data compression, image or character recognition, robot control and medical diagnosis, its potential as a robust substitute for clustering analysis remains relatively unresearched. SOM networks combine competitive learning with dimensionality reduction by smoothing the clusters with respect to an a priori grid and provide a powerful tool for data visualization. In this paper, SOM is used for creating a toroidal mapping of two-dimensional lattice to perform cluster analysis on results of a chemical analysis of wines produced in the same region in Italy but derived from three different cultivators, referred to as the “wine recognition data" located in the University of California-Irvine database. The results are encouraging and it is believed that SOM would make an appealing and powerful decision-support system tool for clustering tasks and for data visualization.
Keywords: Artificial neural networks, cluster analysis, Kohonen maps, wine recognition.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 212313403 Plant Varieties Selection System
Authors: Kitti Koonsanit, Chuleerat Jaruskulchai, Poonsak Miphokasap, Apisit Eiumnoh
Abstract:
In the end of the day, meteorological data and environmental data becomes widely used such as plant varieties selection system. Variety plant selection for planted area is of almost importance for all crops, including varieties of sugarcane. Since sugarcane have many varieties. Variety plant non selection for planting may not be adapted to the climate or soil conditions for planted area. Poor growth, bloom drop, poor fruit, and low price are to be from varieties which were not recommended for those planted area. This paper presents plant varieties selection system for planted areas in Thailand from meteorological data and environmental data by the use of decision tree techniques. With this software developed as an environmental data analysis tool, it can analyze resulting easier and faster. Our software is a front end of WEKA that provides fundamental data mining functions such as classify, clustering, and analysis functions. It also supports pre-processing, analysis, and decision tree output with exporting result. After that, our software can export and display data result to Google maps API in order to display result and plot plant icons effectively.
Keywords: Plant varieties selection system, decision tree, expert recommendation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 179313402 Using TRACE and SNAP Codes to Establish the Model of Maanshan PWR for SBO Accident
Authors: B. R. Shen, J. R. Wang, J. H. Yang, S. W. Chen, C. Shih, Y. Chiang, Y. F. Chang, Y. H. Huang
Abstract:
In this research, TRACE code with the interface code-SNAP was used to simulate and analyze the SBO (station blackout) accident which occurred in Maanshan PWR (pressurized water reactor) nuclear power plant (NPP). There are four main steps in this research. First, the SBO accident data of Maanshan NPP were collected. Second, the TRACE/SNAP model of Maanshan NPP was established by using these data. Third, this TRACE/SNAP model was used to perform the simulation and analysis of SBO accident. Finally, the simulation and analysis of SBO with mitigation equipments was performed. The analysis results of TRACE are consistent with the data of Maanshan NPP. The mitigation equipments of Maanshan can maintain the safety of Maanshan in the SBO according to the TRACE predictions.
Keywords: PWR, TRACE, SBO, Maanshan.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 76613401 Collocation Assessment between GEO and GSO Satellites
Authors: A. E. Emam, M. Abd Elghany
Abstract:
The change in orbit evolution between collocated satellites (X, Y) inside +/-0.09° E/W and +/- 0.07° N/S cluster, after one of these satellites is placed in an inclined orbit (satellite X) and the effect of this change in the collocation safety inside the cluster window has been studied and evaluated. Several collocation scenarios had been studied in order to adjust the location of both satellites inside their cluster to maximize the separation between them and safe the mission.Keywords: Satellite, GEO, collocation, risk assessment.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 232713400 A Survey of Semantic Integration Approaches in Bioinformatics
Authors: Chaimaa Messaoudi, Rachida Fissoune, Hassan Badir
Abstract:
Technological advances of computer science and data analysis are helping to provide continuously huge volumes of biological data, which are available on the web. Such advances involve and require powerful techniques for data integration to extract pertinent knowledge and information for a specific question. Biomedical exploration of these big data often requires the use of complex queries across multiple autonomous, heterogeneous and distributed data sources. Semantic integration is an active area of research in several disciplines, such as databases, information-integration, and ontology. We provide a survey of some approaches and techniques for integrating biological data, we focus on those developed in the ontology community.Keywords: Semantic data integration, biological ontology, linked data, semantic web, OWL, RDF.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 181913399 Thin Bed Reservoir Delineation Using Spectral Decomposition and Instantaneous Seismic Attributes, Pohokura Field, Taranaki Basin, New Zealand
Authors: P. Sophon, M. Kruachanta, S. Chaisri, G. Leaungvongpaisan, P. Wongpornchai
Abstract:
The thick bed hydrocarbon reservoirs are primarily interested because of the more prolific production. When the amount of petroleum in the thick bed starts decreasing, the thin bed reservoirs are the alternative targets to maintain the reserves. The conventional interpretation of seismic data cannot delineate the thin bed having thickness less than the vertical seismic resolution. Therefore, spectral decomposition and instantaneous seismic attributes were used to delineate the thin bed in this study. Short Window Discrete Fourier Transform (SWDFT) spectral decomposition and instantaneous frequency attributes were used to reveal the thin bed reservoir, while Continuous Wavelet Transform (CWT) spectral decomposition and envelope (instantaneous amplitude) attributes were used to indicate hydrocarbon bearing zone. The study area is located in the Pohokura Field, Taranaki Basin, New Zealand. The thin bed target is the uppermost part of Mangahewa Formation, the most productive in the gas-condensate production in the Pohokura Field. According to the time-frequency analysis, SWDFT spectral decomposition can reveal the thin bed using a 72 Hz SWDFT isofrequency section and map, and that is confirmed by the instantaneous frequency attribute. The envelope attribute showing the high anomaly indicates the hydrocarbon accumulation area at the thin bed target. Moreover, the CWT spectral decomposition shows the low-frequency shadow zone and abnormal seismic attenuation in the higher isofrequencies below the thin bed confirms that the thin bed can be a prospective hydrocarbon zone.
Keywords: Hydrocarbon indication, instantaneous seismic attribute, spectral decomposition, thin bed delineation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 64013398 A Review: Comparative Analysis of Different Categorical Data Clustering Ensemble Methods
Authors: S. Sarumathi, N. Shanthi, M. Sharmila
Abstract:
Over the past epoch a rampant amount of work has been done in the data clustering research under the unsupervised learning technique in Data mining. Furthermore several algorithms and methods have been proposed focusing on clustering different data types, representation of cluster models, and accuracy rates of the clusters. However no single clustering algorithm proves to be the most efficient in providing best results. Accordingly in order to find the solution to this issue a new technique, called Cluster ensemble method was bloomed. This cluster ensemble is a good alternative approach for facing the cluster analysis problem. The main hope of the cluster ensemble is to merge different clustering solutions in such a way to achieve accuracy and to improve the quality of individual data clustering. Due to the substantial and unremitting development of new methods in the sphere of data mining and also the incessant interest in inventing new algorithms, makes obligatory to scrutinize a critical analysis of the existing techniques and the future novelty. This paper exposes the comparative study of different cluster ensemble methods along with their features, systematic working process and the average accuracy and error rates of each ensemble methods. Consequently this speculative and comprehensive analysis will be very useful for the community of clustering practitioners and also helps in deciding the most suitable one to rectify the problem in hand.
Keywords: Clustering, Cluster Ensemble methods, Co-association matrix, Consensus function, Median partition.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 260313397 Analysis of Textual Data Based On Multiple 2-Class Classification Models
Authors: Shigeaki Sakurai, Ryohei Orihara
Abstract:
This paper proposes a new method for analyzing textual data. The method deals with items of textual data, where each item is described based on various viewpoints. The method acquires 2- class classification models of the viewpoints by applying an inductive learning method to items with multiple viewpoints. The method infers whether the viewpoints are assigned to the new items or not by using the models. The method extracts expressions from the new items classified into the viewpoints and extracts characteristic expressions corresponding to the viewpoints by comparing the frequency of expressions among the viewpoints. This paper also applies the method to questionnaire data given by guests at a hotel and verifies its effect through numerical experiments.
Keywords: Text mining, Multiple viewpoints, Differential analysis, Questionnaire data
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 129013396 Adaptive Kernel Principal Analysis for Online Feature Extraction
Authors: Mingtao Ding, Zheng Tian, Haixia Xu
Abstract:
The batch nature limits the standard kernel principal component analysis (KPCA) methods in numerous applications, especially for dynamic or large-scale data. In this paper, an efficient adaptive approach is presented for online extraction of the kernel principal components (KPC). The contribution of this paper may be divided into two parts. First, kernel covariance matrix is correctly updated to adapt to the changing characteristics of data. Second, KPC are recursively formulated to overcome the batch nature of standard KPCA.This formulation is derived from the recursive eigen-decomposition of kernel covariance matrix and indicates the KPC variation caused by the new data. The proposed method not only alleviates sub-optimality of the KPCA method for non-stationary data, but also maintains constant update speed and memory usage as the data-size increases. Experiments for simulation data and real applications demonstrate that our approach yields improvements in terms of both computational speed and approximation accuracy.
Keywords: adaptive method, kernel principal component analysis, online extraction, recursive algorithm
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 155213395 A Comprehensive Review on Different Mixed Data Clustering Ensemble Methods
Authors: S. Sarumathi, N. Shanthi, S. Vidhya, M. Sharmila
Abstract:
An extensive amount of work has been done in data clustering research under the unsupervised learning technique in Data Mining during the past two decades. Moreover, several approaches and methods have been emerged focusing on clustering diverse data types, features of cluster models and similarity rates of clusters. However, none of the single clustering algorithm exemplifies its best nature in extracting efficient clusters. Consequently, in order to rectify this issue, a new challenging technique called Cluster Ensemble method was bloomed. This new approach tends to be the alternative method for the cluster analysis problem. The main objective of the Cluster Ensemble is to aggregate the diverse clustering solutions in such a way to attain accuracy and also to improve the eminence the individual clustering algorithms. Due to the massive and rapid development of new methods in the globe of data mining, it is highly mandatory to scrutinize a vital analysis of existing techniques and the future novelty. This paper shows the comparative analysis of different cluster ensemble methods along with their methodologies and salient features. Henceforth this unambiguous analysis will be very useful for the society of clustering experts and also helps in deciding the most appropriate one to resolve the problem in hand.
Keywords: Clustering, Cluster Ensemble Methods, Coassociation matrix, Consensus Function, Median Partition.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 210413394 Solar Architecture of Low-Energy Buildings for Industrial Applications
Authors: P. Brinks, O. Kornadt, R. Oly
Abstract:
This research focuses on the optimization of glazed surfaces and the assessment of possible solar gains in industrial buildings. Existing window rating methods for single windows were evaluated and a new method for a simple analysis of energy gains and losses by single windows was introduced. Furthermore extensive transient building simulations were carried out to appraise the performance of low cost polycarbonate multi-cell sheets in interaction with typical buildings for industrial applications. Mainly energy saving potential was determined by optimizing the orientation and area of such glazing systems in dependency on their thermal qualities. Moreover the impact on critical aspects such as summer overheating and daylight illumination was considered to ensure the user comfort and avoid additional energy demand for lighting or cooling. Hereby the simulated heating demand could be reduced by up to 1/3 compared to traditional architecture of industrial halls using mainly skylights.
Keywords: Solar Architecture, Passive Solar Building Design, Glazing, Low-Energy Buildings, Industrial Buildings.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 198313393 Dynamical Analysis of Circadian Gene Expression
Authors: Carla Layana Luis Diambra
Abstract:
Microarrays technique allows the simultaneous measurements of the expression levels of thousands of mRNAs. By mining this data one can identify the dynamics of the gene expression time series. By recourse of principal component analysis, we uncover the circadian rhythmic patterns underlying the gene expression profiles from Cyanobacterium Synechocystis. We applied PCA to reduce the dimensionality of the data set. Examination of the components also provides insight into the underlying factors measured in the experiments. Our results suggest that all rhythmic content of data can be reduced to three main components.
Keywords: circadian rhythms, clustering, gene expression, PCA.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 159213392 Differential Protection for Power Transformer Using Wavelet Transform and PNN
Authors: S. Sendilkumar, B. L. Mathur, Joseph Henry
Abstract:
A new approach for protection of power transformer is presented using a time-frequency transform known as Wavelet transform. Different operating conditions such as inrush, Normal, load, External fault and internal fault current are sampled and processed to obtain wavelet coefficients. Different Operating conditions provide variation in wavelet coefficients. Features like energy and Standard deviation are calculated using Parsevals theorem. These features are used as inputs to PNN (Probabilistic neural network) for fault classification. The proposed algorithm provides more accurate results even in the presence of noise inputs and accurately identifies inrush and fault currents. Overall classification accuracy of the proposed method is found to be 96.45%. Simulation of the fault (with and without noise) was done using MATLAB AND SIMULINK software taking 2 cycles of data window (40 m sec) containing 800 samples. The algorithm was evaluated by using 10 % Gaussian white noise.Keywords: Power Transformer, differential Protection, internalfault, inrush current, Wavelet Energy, Db9.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 313113391 Applying Hybrid Graph Drawing and Clustering Methods on Stock Investment Analysis
Authors: Mouataz Zreika, Maria Estela Varua
Abstract:
Stock investment decisions are often made based on current events of the global economy and the analysis of historical data. Conversely, visual representation could assist investors’ gain deeper understanding and better insight on stock market trends more efficiently. The trend analysis is based on long-term data collection. The study adopts a hybrid method that combines the Clustering algorithm and Force-directed algorithm to overcome the scalability problem when visualizing large data. This method exemplifies the potential relationships between each stock, as well as determining the degree of strength and connectivity, which will provide investors another understanding of the stock relationship for reference. Information derived from visualization will also help them make an informed decision. The results of the experiments show that the proposed method is able to produced visualized data aesthetically by providing clearer views for connectivity and edge weights.Keywords: Clustering, force-directed, graph drawing, stock investment analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 159513390 Securing Justice: A Critical Analysis of Kenya-s Post 9/11 Security Apparatus
Authors: Peter Ndichu Muriuki
Abstract:
The 9/11 suicide attacks in New York, Washington, D.C., and Pennsylvania, triggered a number of security responses both in the United States of America and other Countries in the World. Kenya, which is an ally and a close partner to North America and Europe, was not left behind. While many states had been parties to numerous terrorism conventions, their response in implementing them had been slow and needed this catalyst. This special case offered a window of opportunity for many “security conscious" regimes in cementing their legal-criminological and political security apparatus. At the international level, the 9/11 case led to the hasty adoption of Security Council resolution 1373 in 2001, which called upon states to adopt wide-ranging and comprehensive steps and strategies to combat international terrorism and to become parties to the relevant international conventions and protocols relating to terrorism. Since then, Kenya has responded with speed in devising social-legal-criminological-political actions.
Keywords: Justice, Policing, Security, Terrorism
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1665