Search results for: subgraph mining
228 Iterative Clustering Algorithm for Analyzing Temporal Patterns of Gene Expression
Authors: Seo Young Kim, Jae Won Lee, Jong Sung Bae
Abstract:
Microarray experiments are information rich; however, extensive data mining is required to identify the patterns that characterize the underlying mechanisms of action. For biologists, a key aim when analyzing microarray data is to group genes based on the temporal patterns of their expression levels. In this paper, we used an iterative clustering method to find temporal patterns of gene expression. We evaluated the performance of this method by applying it to real sporulation data and simulated data. The patterns obtained using the iterative clustering were found to be superior to those obtained using existing clustering algorithms.
Keywords: Clustering, microarray experiment, temporal pattern of gene expression data.
Downloads 1356
227 WebGD: A CORBA-based Document Classification and Retrieval System on the Web
Authors: Fuyang Peng, Bo Deng, Chao Qi, Mou Zhan
Abstract:
This paper presents the design and implementation of WebGD, a CORBA-based document classification and retrieval system on the Internet. WebGD makes use of techniques such as the Web, CORBA, Java, NLP, fuzzy techniques, knowledge-based processing and database technology. Its main features include a unified classification and retrieval model, classification and retrieval with a single reasoning engine, and flexible working-mode configuration. The architecture of WebGD, the unified classification and retrieval model, the components of the WebGD server and the fuzzy inference engine are discussed in detail.
Keywords: Text Mining, document classification, knowledge processing, fuzzy logic, Web, CORBA
Downloads 1848
226 An Overview of Construction and Demolition Waste as Coarse Aggregate in Concrete
Authors: S. R. Shamili, J. Karthikeyan
Abstract:
Rapid population growth and widespread urbanization have remarkably expanded the construction industry. As a result, old structures are being demolished to make way for new buildings. These large-scale demolitions generate a huge amount of debris all over the world, which ends up in landfill. Using construction and demolition waste as landfill causes hazardous groundwater contamination. Using construction and demolition waste as aggregate can reduce the use of natural aggregates and the problems of mining. The objective of this study is to provide a detailed overview of how construction and demolition waste has been used as aggregate in structural concrete. The preparation, classification, and composition of construction and demolition wastes are also discussed.
Keywords: Aggregate, construction and demolition waste, landfill, large scale demolition.
Downloads 643
225 Eclectic Rule-Extraction from Support Vector Machines
Authors: Nahla Barakat, Joachim Diederich
Abstract:
Support vector machines (SVMs) have shown superior performance compared to other machine learning techniques, especially in classification problems. Yet one limitation of SVMs is the lack of an explanation capability, which is crucial in some applications, e.g. in the medical and security domains. In this paper, a novel approach for eclectic rule-extraction from support vector machines is presented. This approach utilizes the knowledge acquired by the SVM and represented in its support vectors as well as the parameters associated with them. The approach includes three stages: training, propositional rule-extraction and rule quality evaluation. Results from four different experiments have demonstrated the value of the approach for extracting comprehensible rules of high accuracy and fidelity.
Keywords: Data mining, hybrid rule-extraction algorithms, medical diagnosis, SVMs
Downloads 1710
224 Discovery of Sequential Patterns Based On Constraint Patterns
Authors: Shigeaki Sakurai, Youichi Kitahata, Ryohei Orihara
Abstract:
This paper proposes a method that discovers sequential patterns corresponding to a user's interests from sequential data. The method expresses the interests as constraint patterns. The constraint patterns can define relationships among attributes of the items composing the data. The method recursively decomposes the constraint patterns into constraint subpatterns and evaluates the subpatterns in order to efficiently discover sequential patterns satisfying the constraint patterns. This paper also applies the method to sequential data composed of stock price indexes and verifies its effectiveness by comparing it with a method that does not use the constraint patterns.
Keywords: Sequential pattern mining, Constraint pattern, Attribute constraint, Stock price indexes
Downloads 1423
223 Issue Reorganization Using the Measure of Relevance
Authors: William Wong Xiu Shun, Yoonjin Hyun, Mingyu Kim, Seongi Choi, Namgyu Kim
Abstract:
The need to extract R&D keywords from issues and use them to retrieve R&D information is increasing rapidly. However, it is difficult to identify related issues or distinguish them. Although the similarity between issues cannot be identified, with an R&D lexicon, issues that always share the same R&D keywords can be determined. In detail, the R&D keywords that are associated with a particular issue imply the key technology elements that are needed to solve a particular issue. Furthermore, the relationship among issues that share the same R&D keywords can be shown in a more systematic way by clustering them according to keywords. Thus, sharing R&D results and reusing R&D technology can be facilitated. Indirectly, redundant investment in R&D can be reduced as the relevant R&D information can be shared among corresponding issues and the reusability of related R&D can be improved. Therefore, a methodology to cluster issues from the perspective of common R&D keywords is proposed to satisfy these demands.
Keywords: Clustering, Social Network Analysis, Text Mining, Topic Analysis.
Downloads 2038
222 Modeling Language for Constructing Solvers in Machine Learning: Reductionist Perspectives
Authors: Tsuyoshi Okita
Abstract:
For a given problem, the usual matter of study has been an efficient algorithm. An alternative approach, orthogonal to this one, is called a reduction. In general, the reduction approach studies how to convert an original problem into subproblems. This paper proposes a formal modeling language that supports the reduction approach so that a solver can be built quickly. We show three examples from the wide area of learning problems. The benefit is fast prototyping of algorithms for a new problem. Note that our formal modeling language is not intended to provide an efficient notation for data mining applications, but to assist a designer who develops solvers in machine learning.
Keywords: Formal language, statistical inference problem, reduction.
Downloads 1328
221 Ozone Decomposition over Silver-Loaded Perlite
Authors: Krassimir Genov, Vladimir Georgiev, Todor Batakliev, Dipak K. Sarker
Abstract:
The Bulgarian natural expanded mineral obtained from Bentonite AD perlite (a deposit of "The Broken Mountain" for perlite mining, near the village of Vodenicharsko, in the municipality of Djebel) was loaded with silver, both in ionic form (Ag+, 2 and 5 wt%, by the incipient wetness impregnation method) and as atomic silver (Ag0) using Tollens' reagent (silver mirror reaction). Physicochemical characterization of the samples is provided via DC arc-AES, XRD, DR-IR and UV-VIS. The aim of this work was to obtain and test the silver-loaded catalyst for ozone decomposition. The samples loaded with atomic silver show ca. 80% conversion of ozone 20 minutes after the reaction starts. The conversion then decreases to ca. 20% but stays stable over prolonged time.
Keywords: aluminum-silicates, Ag/perlite expanded glass, ozone decomposition
Downloads 2269
220 Methodology of Restoration Research in Czech Republic
Authors: M. Rehor, V. Ondracek
Abstract:
Restoration research has recently become fundamentally important in the Czech Republic. The reason is simple: more than 70% of mined brown coal currently comes from the North Bohemian Basin. Open-cast brown coal mining has led to extensive damage to the landscape. Reclamation of phytotoxic areas is one of the serious problems in the North Bohemian Basin. It mainly concerns areas where overburden rocks from the coal bed, enriched with coal, occur. The paper presents the characteristics of the important phytotoxic areas and the methodology of their reclamation. The results are documented by long-term monitoring of physical, mineralogical, chemical and pedological parameters of rocks in the testing areas.
Keywords: Brown coal, dump, methodology, restoration.
Downloads 1543
219 Knowledge Discovery from Production Databases for Hierarchical Process Control
Authors: Pavol Tanuska, Pavel Vazan, Michal Kebisek, Dominika Jurovata
Abstract:
The paper gives the results of a project oriented toward using knowledge discovered from production systems for the needs of hierarchical process control. One of the main project goals was to propose a knowledge discovery model for process control. Specific data mining methods and techniques were used for defined process control problems. The gained knowledge was applied to a real production system, and the proposed solution was thereby verified. The paper documents how the newly discovered knowledge can be applied in real hierarchical process control. Opportunities for applying the proposed knowledge discovery model to hierarchical process control are also specified.
Keywords: Hierarchical process control, knowledge discovery from databases, neural network.
Downloads 1776
218 IMDC: An Image-Mapped Data Clustering Technique for Large Datasets
Authors: Faruq A. Al-Omari, Nabeel I. Al-Fayoumi
Abstract:
In this paper, we present a new algorithm for clustering data in large datasets using image processing approaches. First the dataset is mapped into a binary image plane. The synthesized image is then processed using efficient image processing techniques to cluster the data in the dataset. Hence, the algorithm avoids an exhaustive search to identify clusters. The algorithm considers only a small subset of the data that contains critical boundary information sufficient to identify the contained clusters. Compared to available data clustering techniques, the proposed algorithm produces results of similar quality and outperforms them in execution time and storage requirements.
Keywords: Data clustering, Data mining, Image-mapping, Pattern discovery, Predictive analysis.
Downloads 1500
217 A Fast Block-based Evolutional Algorithm for Combinatorial Problems
Authors: Wei-Hsiu Huang, Pei-Chann Chang, Lien-Chun Wang
Abstract:
High-complexity problems have long been a challenge in combinatorial optimization. Due to their non-deterministic polynomial (NP) characteristics, these problems usually require an unreasonable search budget. Hence combinatorial optimization has attracted numerous researchers to develop better algorithms. Most recent academic research focuses on enhancing conventional evolutional algorithms and on facilitating local heuristics such as VNS, 2-opt and 3-opt. Although introducing local strategies yields significant performance gains, these improvements do not carry over to different problems. Therefore, this research proposes a meta-heuristic evolutional algorithm which can be applied to solve several types of problems. The results show that BBEA is able to solve the problems even without the design of local strategies.
Keywords: Combinatorial problems, Artificial Chromosomes, Blocks Mining, Block Recombination
Downloads 1418
216 An Engineering Approach to Forecast Volatility of Financial Indices
Authors: Irwin Ma, Tony Wong, Thiagas Sankar
Abstract:
By systematically applying different engineering methods, difficult financial problems become approachable. Using a combination of theory and techniques such as wavelet transform, time series data mining, Markov chain based discrete stochastic optimization, and evolutionary algorithms, this work formulated a strategy to characterize and forecast non-linear time series. It attempted to extract typical features from the volatility data sets of S&P100 and S&P500 indices that include abrupt drops, jumps and other non-linearity. As a result, accuracy of forecasting has reached an average of over 75% surpassing any other publicly available results on the forecast of any financial index.
Keywords: Discrete stochastic optimization, genetic algorithms, genetic programming, volatility forecast
Downloads 1631
215 Sequential Partitioning Brainbow Image Segmentation Using Bayesian
Authors: Yayun Hsu, Henry Horng-Shing Lu
Abstract:
This paper proposes a data-driven, biology-inspired neural segmentation method of 3D drosophila Brainbow images. We use Bayesian Sequential Partitioning algorithm for probabilistic modeling, which can be used to detect somas and to eliminate crosstalk effects. This work attempts to develop an automatic methodology for neuron image segmentation, which nowadays still lacks a complete solution due to the complexity of the image. The proposed method does not need any predetermined, risk-prone thresholds, since biological information is inherently included inside the image processing procedure. Therefore, it is less sensitive to variations in neuron morphology; meanwhile, its flexibility would be beneficial for tracing the intertwining structure of neurons.
Keywords: Brainbow, 3D imaging, image segmentation, neuron morphology, biological data mining, non-parametric learning.
Downloads 2260
214 Learning Classifier Systems Approach for Automated Discovery of Censored Production Rules
Authors: Suraiya Jabin, Kamal K. Bharadwaj
Abstract:
In the recent past, Learning Classifier Systems have been successfully used for data mining. A Learning Classifier System (LCS) is basically a machine learning technique which combines evolutionary computing, reinforcement learning, supervised or unsupervised learning and heuristics to produce adaptive systems. An LCS learns by interacting with an environment from which it receives feedback in the form of numerical reward. Learning is achieved by trying to maximize the amount of reward received. All LCS models, more or less, comprise four main components: a finite population of condition–action rules, called classifiers; the performance component, which governs the interaction with the environment; the credit assignment component, which distributes the reward received from the environment to the classifiers accountable for the rewards obtained; and the discovery component, which is responsible for discovering better rules and improving existing ones through a genetic algorithm (GA). The concatenation of the production rules in the LCS forms the genotype, and therefore the GA should operate on a population of classifier systems. This approach is known as the 'Pittsburgh' Classifier System. Other LCSs that perform their GA at the rule level within a population are known as 'Michigan' Classifier Systems. The predominant representation of the discovered knowledge is the standard production rule (PR) in the form IF P THEN D. PRs, however, are unable to handle exceptions and do not exhibit variable precision. Censored Production Rules (CPRs), an extension of PRs proposed by Michalski and Winston, exhibit variable precision and support an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form IF P THEN D UNLESS C, where the censor C is an exception to the rule. Such rules are employed in situations in which the conditional statement IF P THEN D holds frequently and the assertion C holds rarely.
By using a rule of this type we are free to ignore the exception conditions when the resources needed to establish their presence are tight or there is simply no information available as to whether they hold or not. Thus, the IF P THEN D part of a CPR expresses important information, while the UNLESS C part acts only as a switch and changes the polarity of D to ~D. In this paper, a Pittsburgh-style LCS approach is used for automated discovery of CPRs. An appropriate encoding scheme is suggested to represent a chromosome consisting of a fixed-size set of CPRs. Suitable genetic operators are designed for the set of CPRs and for individual CPRs, and an appropriate fitness function is proposed that incorporates basic constraints on CPRs. Experimental results are presented to demonstrate the performance of the proposed learning classifier system.
Keywords: Censored Production Rule, Data Mining, Genetic Algorithm, Learning Classifier System, Machine Learning, Pittsburgh Approach, Reinforcement learning.
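The IF P THEN D UNLESS C semantics described in this abstract can be illustrated with a minimal Python sketch. The function name `apply_cpr` and the use of `None` for a censor that was not checked are assumptions made for this illustration only; they are not taken from the paper, and the sketch does not cover the LCS or genetic-algorithm discovery process.

```python
from typing import Optional


def apply_cpr(p: bool, c: Optional[bool] = None) -> Optional[bool]:
    """Evaluate a censored production rule IF P THEN D UNLESS C.

    Returns True for D, False for ~D, and None when the rule does not fire.
    Passing c=None models the case where the censor is ignored, for example
    because checking it is too costly or no information is available.
    """
    if not p:
        return None      # premise fails: the rule says nothing
    if c is None:
        return True      # censor ignored: fall back to the frequent case D
    return not c         # censor holds -> ~D, censor fails -> D


# Hypothetical usage: premise holds in every call below.
print(apply_cpr(True))          # censor not checked
print(apply_cpr(True, False))   # censor checked and absent
print(apply_cpr(True, True))    # censor checked and present
```

Treating the unchecked censor as a separate case is the design choice that reflects variable precision: the same rule gives a fast default answer or a more certain one depending on available resources.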
Downloads 1530
213 A Modified Fuzzy C-Means Algorithm for Natural Data Exploration
Authors: Binu Thomas, Raju G., Sonam Wangmo
Abstract:
In data mining, fuzzy clustering algorithms have demonstrated advantages over crisp clustering algorithms in dealing with the challenges posed by large collections of vague and uncertain natural data. This paper reviews the concepts of fuzzy logic and fuzzy clustering. The classical fuzzy c-means algorithm is presented and its limitations are highlighted. Based on the study of the fuzzy c-means algorithm and its extensions, we propose a modification to the c-means algorithm to overcome its limitations in calculating the new cluster centers and in finding the membership values with natural data. The efficiency of the modified method is demonstrated on real data collected for Bhutan's Gross National Happiness (GNH) program.
Keywords: Adaptive fuzzy clustering, clustering, fuzzy logic, fuzzy clustering, c-means.
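For orientation, the membership update of the classical fuzzy c-means algorithm (not the modification proposed in the paper, which the abstract does not specify) can be sketched as follows. The function name `fcm_memberships`, the one-dimensional data and the default fuzzifier m = 2 are assumptions chosen to keep the illustration short.

```python
def fcm_memberships(point, centers, m=2.0):
    """Classical fuzzy c-means membership of one 1-D point in each cluster.

    Uses u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), where d_i is the
    distance from the point to center i and m > 1 is the fuzzifier.
    """
    dists = [abs(point - c) for c in centers]
    # A point sitting exactly on one or more centers belongs fully to them.
    if any(d == 0 for d in dists):
        on_center = [d == 0 for d in dists]
        share = 1.0 / sum(on_center)
        return [share if hit else 0.0 for hit in on_center]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
        for d_i in dists
    ]


# Hypothetical usage: a point near the first of two cluster centers.
u = fcm_memberships(1.0, [0.0, 4.0])
print(u)  # memberships sum to 1, with the larger share for the nearer center
```

Unlike a crisp assignment, every point receives a degree of membership in every cluster, which is what lets the algorithm represent vague or overlapping data.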
Downloads 1992
212 Improved C-Fuzzy Decision Tree for Intrusion Detection
Authors: Krishnamoorthi Makkithaya, N. V. Subba Reddy, U. Dinesh Acharya
Abstract:
As the number of networked computers grows, intrusion detection is an essential component in keeping networks secure. Various approaches to intrusion detection are currently in use, each with its own merits and demerits. This paper presents our work to test and improve the performance of a new class of decision tree, the c-fuzzy decision tree, for detecting intrusions. The work also includes identifying the best candidate feature subset for building an efficient c-fuzzy decision tree based Intrusion Detection System (IDS). We investigated the usefulness of the c-fuzzy decision tree for developing an IDS with a data partition based on horizontal fragmentation. Empirical results indicate the usefulness of our approach in developing an efficient IDS.
Keywords: Data mining, Decision tree, Feature selection, Fuzzy c-means clustering, Intrusion detection.
Downloads 1577
211 Review for Identifying Online Opinion Leaders
Authors: Yu Wang
Abstract:
Nowadays, the Internet enables its users to share information online and to interact with others. Faced with an abundance of information, Internet users become confused and begin to rely on opinion leaders’ recommendations. Online opinion leaders are individuals who have professional knowledge, who use online channels to spread word-of-mouth information, and who can affect the attitudes or even the behavior of their followers to some degree. Because using online opinion leaders is seen as an important way to influence potential consumers, how to identify them has become one of the hottest topics in the related field. Hence, in this article, the concepts and characteristics of online opinion leaders are introduced, and the research related to identifying opinion leaders is collected and divided into three categories. Finally, the implications for future studies are provided.
Keywords: Online opinion leaders, user attributes analysis, text mining analysis, network structure analysis.
Downloads 1824
210 Finding an Optimized Discriminate Function for Internet Application Recognition
Authors: E. Khorram, S.M. Mirzababaei
Abstract:
Internet usage increases every day, making a world of data easily accessible. Network providers do not want the services they provide to be used for harmful or terrorist purposes, so they use a variety of methods to protect particular regions from harmful data. One of the most important methods is the firewall. A firewall stops the transfer of such packets in several ways, but in some cases it is not used because of its blind packet blocking, the high processing power it needs and its high price. Here we propose a method for finding a discriminate function that distinguishes usual packets from harmful ones by statistical processing of network router logs, so that an administrator can alert the user. This method is very fast and can easily be used alongside Internet routers.
Keywords: Data Mining, Firewall, Optimization, Packet classification, Statistical Pattern Recognition.
Downloads 1408
209 K-Means for Spherical Clusters with Large Variance in Sizes
Authors: A. M. Fahim, G. Saake, A. M. Salem, F. A. Torkey, M. A. Ramadan
Abstract:
Data clustering is an important data exploration technique with many applications in data mining. The k-means algorithm is well known for its efficiency in clustering large data sets. However, this algorithm is suitable for spherical shaped clusters of similar sizes and densities. The quality of the resulting clusters decreases when the data set contains spherical shaped clusters with large variance in sizes. In this paper, we introduce a competent procedure to overcome this problem. The proposed method is based on shifting the center of the large cluster toward the small cluster and recomputing the membership of small cluster points. The experimental results reveal that the proposed algorithm produces satisfactory results.
Keywords: K-Means, Data Clustering, Cluster Analysis.
Downloads 3281
208 Learning an Overcomplete Dictionary using a Cauchy Mixture Model for Sparse Decay
Authors: E. S. Gower, M. O. J. Hawksford
Abstract:
An algorithm for learning an overcomplete dictionary using a Cauchy mixture model for sparse decomposition of an underdetermined mixing system is introduced. The mixture density function is derived from a ratio sample of the observed mixture signals where 1) there are at least two but not necessarily more mixture signals observed, 2) the source signals are statistically independent and 3) the sources are sparse. The basis vectors of the dictionary are learned via the optimization of the location parameters of the Cauchy mixture components, which is shown to be more accurate and robust than the conventional data mining methods usually employed for this task. Using a well known sparse decomposition algorithm, we extract three speech signals from two mixtures based on the estimated dictionary. Further tests with additive Gaussian noise are used to demonstrate the proposed algorithm's robustness to outliers.
Keywords: expectation-maximization, Pitman estimator, sparse decomposition
Downloads 1949
207 Development of Subjective Measures of Interestingness: From Unexpectedness to Shocking
Authors: Eiad Yafi, M. A. Alam, Ranjit Biswas
Abstract:
Knowledge Discovery in Databases (KDD) is the process of extracting previously unknown but useful and significant information from massive volumes of data. Data mining is a stage in the overall KDD process which applies an algorithm to extract interesting patterns. Usually, such algorithms generate a huge volume of patterns. These patterns have to be evaluated using interestingness measures that reflect the user's requirements. Interestingness is defined in two ways: (i) objective measures and (ii) subjective measures. Objective measures such as support and confidence extract meaningful patterns based on the structure of the patterns, while subjective measures such as unexpectedness and novelty reflect the user's perspective. In this report, we briefly review the more widespread and successful subjective measures and propose a new subjective measure of interestingness, i.e. shocking.
Keywords: Shocking rules (SHR).
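The two objective measures named in this abstract, support and confidence, have standard definitions for association rules, sketched below in Python. The function names and the toy transactions are assumptions made for this illustration; the subjective measures discussed in the paper, including the proposed "shocking" measure, are not defined in the abstract and are not modeled here.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule A -> B."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # rule never applies; report zero rather than divide by 0
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / base


# Hypothetical toy data: four market-basket transactions.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(support(baskets, {"bread", "milk"}))       # 2 of 4 baskets
print(confidence(baskets, {"bread"}, {"milk"}))  # 2 of the 3 bread baskets
```

Both measures depend only on counts in the data, which is why they are called objective: they say nothing about whether a rule would surprise a particular user.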
Downloads 1536
206 Literature-Based Discoveries in Lupus Treatment
Authors: Oluwaseyi Jaiyeoba, Vetria Byrd
Abstract:
Systemic lupus erythematosus (aka lupus) is a chronic disease known for its chameleon-like ability to mimic symptoms of other diseases, rendering it hard to detect, diagnose and treat. The heterogeneous nature of the disease generates disparate data that are often multifaceted and multi-dimensional. Musculoskeletal manifestation of lupus is one of the most common clinical manifestations of lupus. This research links disparate literature on the treatment of lupus as it affects the musculoskeletal system using the discoveries from literature-based research articles available on the PubMed database. Several Natural Language Processing (NLP) tools exist to connect disjointed but related literature, such as Connected Papers, Bitola, and Gopalakrishnan. Literature-based discovery (LBD) has been used to bridge unconnected disciplines based on text mining procedures. The technical/medical literature consists of many technical/medical concepts, each having its sub-literature. This approach has been used to link Parkinson’s, Raynaud, and Multiple Sclerosis treatment within works of literature. Literature-based discovery methods can connect two or more related but disjointed literature concepts to produce a novel and plausible approach to solving a research problem. Data visualization techniques with the help of natural language processing tools are used to visually represent the result of literature-based discoveries. Literature search results can be voluminous, but data visualization processes can provide insight and detect subtle patterns in large data. These insights and patterns can lead to discoveries that would have otherwise been hidden from disjointed literature. In this research, literature data are mined and combined with visualization techniques for heterogeneous data to discover viable treatments reported in the literature for lupus expression in the musculoskeletal system.
This research answers the question of using literature-based discovery to identify potential treatments for a multifaceted disease like lupus. A three-pronged methodology is used in this research: text mining, natural language processing, and data visualization. These three research-related fields are employed to identify patterns in lupus-related data that, when visually represented, could aid research in the treatment of lupus. This work introduces a method for visually representing interconnections of various lupus-related literature. The methodology outlined in this work is the first step toward literature-based research and treatment planning for the musculoskeletal manifestation of lupus. The results also outline the interconnection of complex, disparate data associated with the manifestation of lupus in the musculoskeletal system. The societal impact of this work is broad. Advances in this work will improve the quality of life for millions of persons in the workforce currently diagnosed and silently living with a musculoskeletal disease associated with lupus.
Keywords: Systemic lupus erythematosus, LBD, Data Visualization, musculoskeletal system, treatment.
Downloads 507
205 Integration of Support Vector Machine and Bayesian Neural Network for Data Mining and Classification
Authors: Essam Al-Daoud
Abstract:
Several combinations of preprocessing algorithms, feature selection techniques and classifiers can be applied to data classification tasks. This study introduces a new accurate classifier consisting of four components: Signal-to-Noise as a feature selection technique, a support vector machine, a Bayesian neural network and AdaBoost as an ensemble algorithm. To verify the effectiveness of the proposed classifier, seven well-known classifiers are applied to four datasets. The experiments show that using the suggested classifier enhances the classification rates for all datasets.
Keywords: AdaBoost, Bayesian neural network, Signal-to-Noise, support vector machine, MCMC.
Downloads 2020
204 Analysis of Textual Data Based On Multiple 2-Class Classification Models
Authors: Shigeaki Sakurai, Ryohei Orihara
Abstract:
This paper proposes a new method for analyzing textual data. The method deals with items of textual data, where each item is described based on various viewpoints. The method acquires 2-class classification models of the viewpoints by applying an inductive learning method to items with multiple viewpoints. The method infers whether the viewpoints are assigned to the new items or not by using the models. The method extracts expressions from the new items classified into the viewpoints and extracts characteristic expressions corresponding to the viewpoints by comparing the frequency of expressions among the viewpoints. This paper also applies the method to questionnaire data given by guests at a hotel and verifies its effect through numerical experiments.
Keywords: Text mining, Multiple viewpoints, Differential analysis, Questionnaire data
Downloads 1290
203 Network Anomaly Detection using Soft Computing
Authors: Surat Srinoy, Werasak Kurutach, Witcha Chimphlee, Siriporn Chimphlee
Abstract:
One main drawback of intrusion detection systems is their inability to detect new attacks that do not have known signatures. In this paper we discuss an intrusion detection method that uses independent component analysis (ICA) based feature selection heuristics and rough fuzzy clustering of the data. ICA separates the independent components (ICs) from the monitored variables. Rough sets reduce the amount of data and remove redundancy, while fuzzy methods allow objects to belong to several clusters simultaneously, with different degrees of membership. Our approach allows us to recognize not only known attacks but also activity that may be the result of a new, unknown attack. Experimental results are reported on the Knowledge Discovery and Data Mining (KDD Cup 1999) dataset.
Keywords: Network security, intrusion detection, rough set, ICA, anomaly detection, independent component analysis, rough fuzzy.
Downloads 1955
202 Web Usability: A Fuzzy Approach to the Navigation Structure Enhancement in a Website System, Case of Iranian Civil Aviation Organization Website
Authors: Hamed Qahri Saremi, Gholam Ali Montazer
Abstract:
With the proliferation of the World Wide Web, the development of web-based technologies, and the growth in web content, website structure becomes more complex and web navigation becomes a critical issue for both web designers and users. In this paper we define content and web pages as two important and influential factors in website navigation, and we frame the enhancement of website navigation as making useful changes to the link structure of the website based on these factors. We then suggest a new method that uses a fuzzy approach to propose these changes and optimize the website architecture. Applying the proposed method to the real case of the Iranian Civil Aviation Organization (CAO) website, we discuss the results of the approach in the final section.
Keywords: Web content, Web navigation, Website system, Web usage mining.
Downloads 1788
201 Context-aware Recommender Systems using Data Mining Techniques
Authors: Kyoung-jae Kim, Hyunchul Ahn, Sangwon Jeong
Abstract:
This study proposes a novel recommender system for delivering advertisements in context-aware services. The proposed model applies a modified collaborative filtering (CF) algorithm that takes into account several dimensions for the personalization of mobile devices: location, time, and the user's needs type. In particular, we employ a classification rule, obtained with a decision tree algorithm, to understand the user's needs type. In addition, we collect primary data from mobile phone users and apply them to the proposed model to validate its effectiveness. Experimental results show that the proposed system delivers more accurate and satisfactory advertisements than the comparison systems.
Keywords: Location-based advertisement, Recommender system, Collaborative filtering, User needs type, Mobile user.
Downloads 2174
200 Genetic Programming Approach to Hierarchical Production Rule Discovery
Authors: Basheer M. Al-Maqaleh, Kamal K. Bharadwaj
Abstract:
Automated discovery of hierarchical structures in large data sets has been an active research area in the recent past. This paper focuses on mining generalized rules with a crisp hierarchical structure using a Genetic Programming (GP) approach to knowledge discovery. The post-processing scheme presented in this work uses flat rules as the initial individuals of GP and discovers the hierarchical structure. Suitable genetic operators are proposed for the suggested encoding, and an appropriate fitness function based on the Subsumption Matrix (SM) is suggested. Finally, Hierarchical Production Rules (HPRs) are generated from the discovered hierarchy. Experimental results are presented to demonstrate the performance of the proposed algorithm.
Keywords: Genetic Programming, Hierarchy, Knowledge Discovery in Database, Subsumption Matrix.
Downloads 1451
199 Towards Achieving Energy Efficiency in Kazakhstan
Authors: Aigerim Uyzbayeva, Valeriya Tyo, Nurlan Ibrayev
Abstract:
Kazakhstan is currently one of the dynamically developing states in its region. Stable growth in all sectors of the economy leads to a corresponding increase in energy consumption. The country consumes a significant amount of energy because of its high level of industrialisation and the presence of energy-intensive industries such as mining and metallurgy, which in turn leads to low energy efficiency. In view of this, the Government has set several priorities for the transition of the Republic of Kazakhstan to a "green economy". This article provides an overview of Kazakhstan's energy efficiency situation for the period 1991-2014. First, the dynamics of production and consumption of conventional energy resources are given. Second, the potential of renewable energy sources is summarised, followed by a description of GHG emission trends in the country. Third, Kazakhstan's national initiatives, policies, and locally implemented projects in the field of energy efficiency are described.
Keywords: Energy efficiency in Kazakhstan, greenhouse gases, renewable energy, sustainable development.
Downloads 3538