Search results for: Text Mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1058

428 A Structural Equation Model of Risk Perception of Rockfall for Revisit Intention

Authors: Ya-Fen Lee, Yun-Yao Chi

Abstract:

The study aims to explore the relationship between risk perception of rockfall and revisit intention using a Structural Equation Modeling (SEM) analysis. A total of 573 valid questionnaires were collected from travelers to Taroko National Park, Taiwan. The findings show that the majority of travelers have a medium perception of rockfall risk and are willing to revisit Taroko National Park. The revisit intention is influenced by hazardous preferences, willingness-to-pay, obstruction and attraction. Risk perception has an indirect effect on revisit intention by influencing willingness-to-pay. The study results can serve as a reference for mitigating rockfall disasters.

Keywords: Risk perception, rockfall, revisit intention, structural equation modeling.

Downloads: 2154
427 Lexicon-Based Sentiment Analysis for Stock Movement Prediction

Authors: Zane Turner, Kevin Labille, Susan Gauch

Abstract:

Sentiment analysis is a broad and expanding field that aims to extract and classify opinions from textual data. Lexicon-based approaches are based on the use of a sentiment lexicon, i.e., a list of words each mapped to a sentiment score, to rate the sentiment of a text chunk. Our work focuses on predicting stock price change using a sentiment lexicon built from financial conference call logs. We introduce a method to generate a sentiment lexicon based upon an existing probabilistic approach. By using a domain-specific lexicon, we outperform traditional techniques and demonstrate that domain-specific sentiment lexicons provide higher accuracy than generic sentiment lexicons when predicting stock price change.
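
As a rough illustration of the lexicon-based scoring described above, the following minimal sketch rates a text chunk by averaging the scores of the words it shares with a lexicon. The lexicon entries and the `sentiment_score` name are hypothetical and are not taken from the paper, which builds its lexicon from conference call logs.

```python
import re

# Hypothetical toy lexicon; the words and scores are illustrative only.
LEXICON = {"growth": 1.0, "strong": 0.5, "loss": -1.0, "weak": -0.5}


def sentiment_score(text, lexicon=LEXICON):
    """Return the mean lexicon score of the matched words, or 0.0 if none match."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[w] for w in words if w in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


print(sentiment_score("Strong growth this quarter"))
```

A positive score could then be read as a signal of upward price movement, although the mapping from score to prediction is a design choice left open here.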

Keywords: Computational finance, sentiment analysis, sentiment lexicon, stock movement prediction.

Downloads: 1140
426 Methodology of Restoration Research in Czech Republic

Authors: M. Rehor, V. Ondracek

Abstract:

Restoration research has recently become fundamentally important in the Czech Republic. The reason is simple. More than 70% of mined brown coal comes from the North Bohemian Basin these days. Open cast brown coal mining has led to extensive damage to the landscape. Reclamation of phytotoxic areas is one of the serious problems in the North Bohemian Basin. It mainly concerns the areas with the occurrence of overburden rocks from the coal bed enriched with coal. The presented paper includes the characteristics of the important phytotoxic areas and the methodology of their reclamation. The results are documented with the long-term monitoring of physical, mineralogical, chemical and pedological parameters of rocks in the testing areas.

Keywords: Brown coal, dump, methodology, restoration.

Downloads: 1544
425 Knowledge Discovery from Production Databases for Hierarchical Process Control

Authors: Pavol Tanuska, Pavel Vazan, Michal Kebisek, Dominika Jurovata

Abstract:

The paper gives the results of a project oriented toward the use of knowledge discovered from production systems for the needs of hierarchical process control. One of the main project goals was the proposal of a knowledge discovery model for process control. Specific data mining methods and techniques were used for defined problems of process control. The gained knowledge was used on a real production system, and thus the proposed solution has been verified. The paper documents how the newly discovered knowledge can be applied in real hierarchical process control. The opportunities for applying the proposed knowledge discovery model to hierarchical process control are specified.

Keywords: Hierarchical process control, knowledge discovery from databases, neural network.

Downloads: 1777
424 IMDC: An Image-Mapped Data Clustering Technique for Large Datasets

Authors: Faruq A. Al-Omari, Nabeel I. Al-Fayoumi

Abstract:

In this paper, we present a new algorithm for clustering data in large datasets using image processing approaches. First, the dataset is mapped into a binary image plane. The synthesized image is then processed utilizing efficient image processing techniques to cluster the data in the dataset. Hence, the algorithm avoids an exhaustive search to identify clusters. The algorithm considers only a small set of the data that contains critical boundary information sufficient to identify contained clusters. Compared to available data clustering techniques, the proposed algorithm produces results of similar quality and outperforms them in execution time and storage requirements.
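
To make the image-mapping idea concrete, here is a minimal sketch under stated assumptions: 2-D points are binned into grid cells that act as "on" pixels of a binary image, and 8-connected components of those pixels are treated as clusters. Connected-component labelling is a stand-in chosen for illustration; the abstract does not say which image processing techniques IMDC actually uses, and `cluster_via_grid` and `cell` are hypothetical names.

```python
from collections import deque


def cluster_via_grid(points, cell=1.0):
    """Group 2-D points whose grid cells form 8-connected regions."""
    # Map every point to the pixel (grid cell) that contains it.
    pixels = {}
    for x, y in points:
        key = (int(x // cell), int(y // cell))
        pixels.setdefault(key, []).append((x, y))

    clusters, seen = [], set()
    for start in pixels:
        if start in seen:
            continue
        seen.add(start)
        queue, members = deque([start]), []
        # Breadth-first flood fill over occupied neighbouring pixels.
        while queue:
            cx, cy = queue.popleft()
            members.extend(pixels[(cx, cy)])
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in pixels and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
        clusters.append(members)
    return clusters


points = [(0.2, 0.3), (1.1, 0.9), (1.8, 1.5), (10.0, 10.0), (10.5, 11.2)]
print(len(cluster_via_grid(points)))
```

The flood fill visits occupied pixels rather than comparing all pairs of points, which is one way to read the abstract's claim of avoiding an exhaustive search.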

Keywords: Data clustering, Data mining, Image-mapping, Pattern discovery, Predictive analysis.

Downloads: 1501
423 Design, Development and Evaluation of a Portable Recording System to Capture Dynamic Presentations Using the Teacher's Tablet PC

Authors: Enrique Barra, Abel Carril, Aldo Gordillo, Joaquín Salvachúa, Juan Quemada

Abstract:

Computers and multimedia equipment have improved a lot in recent years. They have reduced their cost and size while at the same time increasing their capabilities. These improvements allowed us to design and implement a portable recording system that also integrates the teacher's tablet PC to capture what he/she writes on the slides and all that happens in it. This paper explains this system in detail and the validation of the recordings that we did after using it to record all the lectures of the “Communications Software” course in our university. The results show that pupils used the recordings for different purposes and consider them useful for a variety of things, especially after missing a lecture.

Keywords: Recording System, capture dynamic presentations, lecture recording.

Downloads: 1927
422 A Fast Block-based Evolutional Algorithm for Combinatorial Problems

Authors: Wei-Hsiu Huang, Pei-Chann Chang, Lien-Chun Wang

Abstract:

High-complexity problems have long been a challenge in combinatorial optimization. Because of their non-deterministic polynomial (NP) characteristics, these problems usually require an unreasonable search budget. Hence, combinatorial optimization has attracted numerous researchers to develop better algorithms. Recent academic research has mostly focused on enhancing conventional evolutional algorithms and facilitating local heuristics such as VNS, 2-opt and 3-opt. Although the performance gains from introducing local strategies are significant, these improvements do not carry over to different problems. Therefore, this research proposes a meta-heuristic evolutional algorithm which can be applied to solve several types of problems. The results validate that BBEA is able to solve the problems even without the design of local strategies.

Keywords: Combinatorial problems, Artificial Chromosomes, Blocks Mining, Block Recombination

Downloads: 1418
421 The Effects of Perceived Organizational Support and Abusive Supervision on Employee’s Turnover Intention: The Mediating Roles of Psychological Contract and Emotional Exhaustion

Authors: Seung Yeon Son

Abstract:

Workers (especially competent personnel) have been recognized as core contributors to overall organizational effectiveness. Hence, verifying the determinants of turnover intention is one of the most important research issues. This study tested the influence of perceived organizational support and abusive supervision on employees' turnover intention. In addition, the mediating roles of psychological contract and emotional exhaustion were examined. Data from 255 Korean employees supported all hypotheses. Implications for research and directions for future research are discussed.

Keywords: Abusive Supervision, Emotional Exhaustion, Perceived Organizational Support, Psychological Contract, Turnover Intention.

Downloads: 3220
420 An Engineering Approach to Forecast Volatility of Financial Indices

Authors: Irwin Ma, Tony Wong, Thiagas Sankar

Abstract:

By systematically applying different engineering methods, difficult financial problems become approachable. Using a combination of theory and techniques such as wavelet transform, time series data mining, Markov chain based discrete stochastic optimization, and evolutionary algorithms, this work formulated a strategy to characterize and forecast non-linear time series. It attempted to extract typical features from the volatility data sets of S&P100 and S&P500 indices that include abrupt drops, jumps and other non-linearity. As a result, accuracy of forecasting has reached an average of over 75% surpassing any other publicly available results on the forecast of any financial index.

Keywords: Discrete stochastic optimization, genetic algorithms, genetic programming, volatility forecast

Downloads: 1631
419 Dynamic Analysis by a Family of Time Marching Procedures Based On Numerically Computed Green’s Functions

Authors: Delfim Soares Jr.

Abstract:

In this work, a new family of time marching procedures based on Green’s function matrices is presented. The formulation is based on the development of new recurrence relationships, which employ time integral terms to treat initial condition values. These integral terms are numerically evaluated using Newton-Cotes formulas. The Green’s matrices of the model are also numerically computed, using the generalized-α method and subcycling techniques. As discussed and illustrated throughout the text, the proposed procedure is efficient and accurate, providing a very attractive time marching technique.
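
For readers unfamiliar with Newton-Cotes formulas, the sketch below implements the composite Simpson rule, one well-known member of that family. It is only an illustration of the quadrature idea; the abstract does not state which Newton-Cotes formula the paper uses, and the `simpson` function and its parameters are hypothetical names.

```python
def simpson(f, a, b, n=100):
    """Approximate the integral of f over [a, b] with composite Simpson's rule."""
    if n < 2 or n % 2:
        raise ValueError("n must be a positive even number of subintervals")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior nodes alternate between weight 4 (odd) and weight 2 (even).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


# The rule is exact for polynomials up to degree three.
print(simpson(lambda t: t ** 2, 0.0, 3.0, n=2))
```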

Keywords: Dynamics, Time-Marching, Green’s Function, Generalized-α Method, Subcycling.

Downloads: 1515
418 Heavy Metals in PM2.5 Aerosols in Urban Sites of Győr, Hungary

Authors: Zs. Csanádi, A. Szabó Nagy, J. Szabó, J. Erdős

Abstract:

Atmospheric concentrations of some heavy metal compounds (Pb, Cd, Ni) and the metalloid As were identified and determined in airborne PM2.5 particles in urban sites of Győr, in the northwest of Hungary. PM2.5 aerosol samples were collected at two different sampling sites and the trace metal(loid) (Pb, Ni, Cd and As) content was analyzed by atomic absorption spectroscopy. The concentration of the PM2.5 fraction varied between 12.22 and 36.92 μg/m3 at the two sampling sites. The heavy metal mean concentrations, averaged over the two urban sites of Győr, were found in the decreasing order Pb > Ni > Cd. The mean values were 7.59 ng/m3 for Pb, 0.34 ng/m3 for Ni and 0.11 ng/m3 for Cd. The metalloid As could be detected in only 3.57% of the total collected samples. The levels of PM2.5-bound heavy metals were determined and compared with those of other cities located in Hungary.

Keywords: Aerosol, air quality, heavy metals, PM2.5.

Downloads: 919
417 Sequential Partitioning Brainbow Image Segmentation Using Bayesian

Authors: Yayun Hsu, Henry Horng-Shing Lu

Abstract:

This paper proposes a data-driven, biology-inspired neural segmentation method for 3D Drosophila Brainbow images. We use the Bayesian Sequential Partitioning algorithm for probabilistic modeling, which can be used to detect somas and to eliminate crosstalk effects. This work attempts to develop an automatic methodology for neuron image segmentation, which still lacks a complete solution due to the complexity of the image. The proposed method does not need any predetermined, risk-prone thresholds, since biological information is inherently included in the image processing procedure. Therefore, it is less sensitive to variations in neuron morphology; meanwhile, its flexibility would be beneficial for tracing the intertwining structure of neurons.

Keywords: Brainbow, 3D imaging, image segmentation, neuron morphology, biological data mining, non-parametric learning.

Downloads: 2261
416 An Analysis of Compression Methods and Implementation of Medical Images in Wireless Network

Authors: C. Rajan, K. Geetha, S. Geetha

Abstract:

The motivation for image compression is to reduce the irrelevance and redundancy of the image data in order to store or transmit data efficiently from one place to another. There are several types of compression methods available. Without compression, the file size is noticeably larger, usually several megabytes, but with compression it is possible to reduce the file size to as little as 10% of the original without noticeable loss in quality. Image compression can be lossless or lossy. Compression can be applied to images, audio, video and text data. This research work mainly concentrates on methods of encoding, DCT, compression methods, security, etc. Different methodologies and network simulations have been analyzed here. Various compression methodologies and their performance metrics have been investigated and presented in tabular form.

Keywords: Image compression techniques, encoding, DCT, lossy compression, lossless compression, JPEG.

Downloads: 1189
415 Pricing Strategy Selection Using Fuzzy Linear Programming

Authors: Elif Alaybeyoğlu, Y. Esra Albayrak

Abstract:

Marketing establishes a communication network between producers and consumers. Nowadays, the marketing approach is customer-focused and products are directly oriented to meet customer needs. Marketing, which is a long process, needs organization and management. Therefore, strategic marketing planning becomes more and more important in today’s competitive conditions. The main focus of this paper is to evaluate pricing strategies and select the best pricing strategy solution while considering internal and external factors influencing the company’s pricing decisions associated with new product development. To reflect the decision maker’s subjective preference information and to determine the weight vector of factors (attributes), the fuzzy linear programming technique for multidimensional analysis of preference (LINMAP) under intuitionistic fuzzy (IF) environments is used.

Keywords: IF Sets, LINMAP, MAGDM, Marketing.

Downloads: 2266
414 Indonesian News Classification using Support Vector Machine

Authors: Dewi Y. Liliana, Agung Hardianto, M. Ridok

Abstract:

Digital news on a variety of topics is abundant on the internet. The problem is to classify news into its appropriate category to help users find relevant news rapidly. A classifier engine is used to sort any news item automatically into its respective category. This research employs a Support Vector Machine (SVM) to classify Indonesian news. SVM is a robust method for classifying binary classes. The core processing of SVM is the formation of an optimum separating plane to separate the different classes. For the multiclass problem, a mechanism called one-against-one is used to combine the binary classification results. Documents were taken from the Indonesian digital news site www.kompas.com. The experiment showed a promising result with an accuracy rate of 85%. This system is feasible to implement for Indonesian news classification.
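
The one-against-one mechanism mentioned above can be sketched as majority voting over all pairs of classes. In this minimal illustration the pairwise deciders are toy nearest-center rules on a single number, standing in for trained binary SVMs; the class names, `CENTERS`, and the helper functions are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical 1-D class centers, used only to build toy binary deciders.
CENTERS = {"economy": 0.0, "sports": 5.0, "tech": 10.0}


def make_pair(a, b):
    """Return a toy binary classifier that picks the class with the nearer center."""
    return lambda x: a if abs(x - CENTERS[a]) <= abs(x - CENTERS[b]) else b


CLASSES = sorted(CENTERS)
PAIRWISE = {(a, b): make_pair(a, b) for a, b in combinations(CLASSES, 2)}


def one_vs_one_predict(x, classes=CLASSES, pairwise=PAIRWISE):
    """Combine every pairwise decision by majority vote."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise[(a, b)](x)] += 1
    return votes.most_common(1)[0][0]


print(one_vs_one_predict(4.0))
```

With k classes this scheme needs k(k-1)/2 binary classifiers, which is the usual cost of the one-against-one strategy.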

Keywords: classification, Indonesian news, text processing, support vector machine

Downloads: 3490
413 Lecture Video Indexing and Retrieval Using Topic Keywords

Authors: B. J. Sandesh, Saurabha Jirgi, S. Vidya, Prakash Eljer, Gowri Srinivasa

Abstract:

In this paper, we propose a framework to help users search for and retrieve the portions of a lecture video that interest them. This is achieved by temporally segmenting and indexing the lecture video using topic keywords. We use transcribed text from the video and documents relevant to the video topic extracted from the web for this purpose. The keywords for indexing are found by applying non-negative matrix factorization (NMF) topic modeling techniques to the web documents. Our proposed technique first creates indices on the transcribed documents using the topic keywords, and these are mapped to the video to find the start and end times of the portions of the video for a particular topic. This time information is stored in the index table along with the topic keyword, which is used to retrieve the specific portions of the video for the query provided by the users.
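
The index table described above can be pictured as a mapping from each topic keyword to the time spans where it occurs. The sketch below assumes the transcript is already split into timed segments and that the keywords are given; it uses plain substring matching rather than NMF, and `build_index` and the sample data are hypothetical.

```python
def build_index(segments, keywords):
    """Map each keyword to the (start, end) times of segments that mention it.

    segments: iterable of (start_seconds, end_seconds, transcript_text).
    """
    index = {kw: [] for kw in keywords}
    for start, end, text in segments:
        lowered = text.lower()
        for kw in keywords:
            if kw in lowered:
                index[kw].append((start, end))
    return index


segments = [
    (0, 30, "Today we cover gradient descent"),
    (30, 60, "The gradient points uphill"),
    (60, 90, "Next topic: regularization"),
]
print(build_index(segments, ["gradient", "regularization"]))
```

Answering a query then reduces to looking up the keyword and seeking to the stored start time.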

Keywords: Video indexing and retrieval, lecture videos, content based video search, multimodal indexing.

Downloads: 1556
412 Strategy Research for the Development of Thematic Commercial Streets - Based On the Survey of Eight Typical Thematic Commercial Streets in Harbin

Authors: Wang Zhenzhen, Wang Xu, Hong Liangping

Abstract:

The construction of thematic commercial streets has become a hot topic with the rapid development of cities. In order to improve their image and competitiveness, many cities are building or rebuilding thematic commercial streets. However, many contradictions and problems have emerged during this process. Therefore, it is significant, for both practice and research, to analyze the development of thematic commercial streets and provide some useful suggestions. Through in-depth research and a comparative study of eight typical thematic commercial streets in Harbin, this paper summarizes the current situation, laws and influencing factors of the development of these streets, and then puts forward some suggestions about the planning, construction and development of thematic commercial streets.

Keywords: Thematic commercial streets, laws of the development, influence factors, the constructions and developments, degrees of aggregation.

Downloads: 1622
411 Learning Classifier Systems Approach for Automated Discovery of Censored Production Rules

Authors: Suraiya Jabin, Kamal K. Bharadwaj

Abstract:

In the recent past, Learning Classifier Systems have been successfully used for data mining. A Learning Classifier System (LCS) is basically a machine learning technique which combines evolutionary computing, reinforcement learning, supervised or unsupervised learning and heuristics to produce adaptive systems. An LCS learns by interacting with an environment from which it receives feedback in the form of numerical reward. Learning is achieved by trying to maximize the amount of reward received. All LCS models comprise, more or less, four main components: a finite population of condition–action rules, called classifiers; the performance component, which governs the interaction with the environment; the credit assignment component, which distributes the reward received from the environment to the classifiers accountable for the rewards obtained; and the discovery component, which is responsible for discovering better rules and improving existing ones through a genetic algorithm (GA). The concatenation of the production rules in the LCS forms the genotype, and therefore the GA should operate on a population of classifier systems. This approach is known as the 'Pittsburgh' Classifier Systems. Other LCSs that perform their GA at the rule level within a population are known as 'Michigan' Classifier Systems. The most predominant representation of the discovered knowledge is the standard production rules (PRs) in the form of IF P THEN D. The PRs, however, are unable to handle exceptions and do not exhibit variable precision. The Censored Production Rules (CPRs), an extension of PRs proposed by Michalski and Winston, exhibit variable precision and support an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form: IF P THEN D UNLESS C, where Censor C is an exception to the rule. Such rules are employed in situations in which the conditional statement IF P THEN D holds frequently and the assertion C holds rarely.
By using a rule of this type we are free to ignore the exception conditions when the resources needed to establish their presence are tight or there is simply no information available as to whether they hold or not. Thus, the IF P THEN D part of a CPR expresses important information, while the UNLESS C part acts only as a switch and changes the polarity of D to ~D. In this paper, a Pittsburgh-style LCS approach is used for automated discovery of CPRs. An appropriate encoding scheme is suggested to represent a chromosome consisting of a fixed-size set of CPRs. Suitable genetic operators are designed for the set of CPRs and for individual CPRs, and an appropriate fitness function is proposed that incorporates basic constraints on CPRs. Experimental results are presented to demonstrate the performance of the proposed learning classifier system.
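
The semantics of a single rule IF P THEN D UNLESS C can be sketched in a few lines. The function below is a hypothetical illustration of how a censor flips the decision and how it may be skipped when it is unknown or too costly to check; it does not reproduce the paper's encoding, genetic operators, or fitness function.

```python
def apply_cpr(p, c=None):
    """Evaluate IF P THEN D UNLESS C.

    Returns True for D, False for ~D, and None when the rule does not fire.
    Passing c=None models the case where the censor is not checked.
    """
    if not p:
        return None   # premise fails, so the rule says nothing
    if c is None:
        return True   # censor ignored: fall back to the frequent case D
    return not c      # censor holds, so the polarity of D flips to ~D


# Illustrative rule: IF bird THEN flies UNLESS penguin.
print(apply_cpr(True))          # censor not checked
print(apply_cpr(True, c=True))  # censor holds
```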

Keywords: Censored Production Rule, Data Mining, Genetic Algorithm, Learning Classifier System, Machine Learning, Pittsburgh Approach, Reinforcement Learning.

Downloads: 1530
410 A Modified Fuzzy C-Means Algorithm for Natural Data Exploration

Authors: Binu Thomas, Raju G., Sonam Wangmo

Abstract:

In data mining, fuzzy clustering algorithms have demonstrated an advantage over crisp clustering algorithms in dealing with the challenges posed by large collections of vague and uncertain natural data. This paper reviews the concepts of fuzzy logic and fuzzy clustering. The classical fuzzy c-means algorithm is presented and its limitations are highlighted. Based on the study of the fuzzy c-means algorithm and its extensions, we propose a modification to the c-means algorithm to overcome its limitations in calculating the new cluster centers and in finding the membership values with natural data. The efficiency of the modified method is demonstrated on real data collected for Bhutan's Gross National Happiness (GNH) program.
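
For reference, the membership update of the classical fuzzy c-means algorithm (not the modified version proposed in the paper, which the abstract does not specify) can be sketched for 1-D data as follows. The function name and the default fuzzifier `m=2.0` are assumptions made for illustration.

```python
def memberships(x, centers, m=2.0):
    """Classical fuzzy c-means membership of point x in each cluster.

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with d_i = |x - center_i|.
    """
    dists = [abs(x - c) for c in centers]
    hits = dists.count(0.0)
    if hits:
        # The point coincides with a center: share full membership among hits.
        return [1.0 / hits if d == 0.0 else 0.0 for d in dists]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** power for dk in dists) for di in dists]


print(memberships(1.0, [0.0, 4.0]))
```

Unlike crisp clustering, every point receives a degree of membership in every cluster, and the degrees sum to one.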

Keywords: Adaptive fuzzy clustering, clustering, fuzzy logic, fuzzy clustering, c-means.

Downloads: 1992
409 Buzan Mind Mapping: An Efficient Technique for Note-Taking

Authors: T. K. Tee, M. N. A. Azman, S. Mohamed, Muhammad, M., M. M. Mohamad, J. Md Yunos, M. H. Yee, W. Othman

Abstract:

Buzan mind mapping is an efficient system of note-taking that makes revision a fun thing to do for students. Tony Buzan has been teaching children all over the world for the past thirty years and has proved that mind maps are the magic formula in the classroom for everyone. The purpose of this paper is to discuss the importance of Buzan mind mapping as a note-taking technique for secondary school students. This paper also examines the mind mapping technique and the advantages and disadvantages of hand-drawn mind maps. Samples of students’ mind maps are presented and discussed.

Keywords: Buzan Mind Mapping, note-taking technique, hand-drawn mind maps.

Downloads: 9398
408 Improved C-Fuzzy Decision Tree for Intrusion Detection

Authors: Krishnamoorthi Makkithaya, N. V. Subba Reddy, U. Dinesh Acharya

Abstract:

As the number of networked computers grows, intrusion detection is an essential component in keeping networks secure. Various approaches for intrusion detection are currently in use, each with its own merits and demerits. This paper presents our work to test and improve the performance of a new class of decision tree, the c-fuzzy decision tree, to detect intrusion. The work also includes identifying the best candidate feature subset to build an efficient c-fuzzy decision tree based Intrusion Detection System (IDS). We investigated the usefulness of the c-fuzzy decision tree for developing an IDS with a data partition based on horizontal fragmentation. Empirical results indicate the usefulness of our approach in developing an efficient IDS.

Keywords: Data mining, Decision tree, Feature selection, Fuzzy c-means clustering, Intrusion detection.

Downloads: 1577
407 The Paralinguistic Function of Emojis in Twitter Communication

Authors: Yasmin Tantawi, Mary Beth Rosson

Abstract:

In response to the dearth of information about emoji use for different purposes in different settings, this paper investigates the paralinguistic function of emojis within Twitter communication in the United States. To conduct this investigation, the Twitter feeds from 16 population centers spread throughout the United States were collected from the Twitter public API. One hundred tweets were collected from each population center, totaling 1,600 tweets. Tweets containing emojis were next extracted using the “emot” Python package; these were then analyzed via the IBM Watson API Natural Language Understanding module to identify the topics discussed. A manual content analysis was then conducted to ascertain the paralinguistic and emotional features of the emojis used in these tweets. We present our characterization of emoji usage in Twitter and discuss implications for the design of Twitter and other text-based communication tools.

Keywords: Computer mediated communication, content analysis, paralinguistics, sociology.

Downloads: 2029
406 Hybrid Modeling Algorithm for Continuous Tamil Speech Recognition

Authors: M. Kalamani, S. Valarmathy, M. Krishnamoorthi

Abstract:

In this paper, a hybrid modeling algorithm based on Fuzzy C-Means clustering with an Expectation Maximization-Gaussian Mixture Model is proposed for Continuous Tamil Speech Recognition. Speech sentences from various speakers are used for the training and testing phases, and objective measures are compared between the proposed and existing Continuous Speech Recognition algorithms. From the simulated results, it is observed that the proposed algorithm improves the recognition accuracy and F-measure by up to 3% compared to the existing algorithms for speech signals from various speakers. In addition, it reduces the Word Error Rate, Error Rate and Error by up to 4% compared to the existing algorithms. In all aspects, the proposed hybrid modeling for Tamil speech recognition provides significant improvements for speech-to-text conversion in various applications.

Keywords: Speech Segmentation, Feature Extraction, Clustering, HMM, EM-GMM, CSR.

Downloads: 2139
405 Active Control Improvement of Smart Cantilever Beam by Piezoelectric Materials and On-Line Differential Artificial Neural Networks

Authors: P. Karimi, A. H. Khedmati Bazkiaei

Abstract:

The main goal of this study is to test a differential neural network as a controller of a smart structure and to enumerate its advantages and disadvantages in comparison with other controllers. In this study, the smart structure is modeled as an Euler-Bernoulli cantilever beam, and an attempt is made to bring it under control using a neural network applied to the vibration resulting from movement. A linear observer is also considered as a reference controller, and its results are compared. The vibration charts and the controlled state are reported in the final part of this text. The obtained results show that the neural observer performs better than the implemented linear observer.

Keywords: Smart material, on-line differential artificial neural network, active control, finite element method.

Downloads: 816
404 Finding an Optimized Discriminate Function for Internet Application Recognition

Authors: E. Khorram, S.M. Mirzababaei

Abstract:

Internet usage increases every day, and a world of data becomes easily accessible. Network providers do not want the services they provide to be used for harmful or terrorist purposes, so they use a variety of methods to protect particular regions from harmful data. One of the most important methods is the firewall. A firewall stops the transfer of such packets in several ways, but in some cases providers do not use a firewall because of its blind packet stopping, the high processing power it needs and its high price. Here we propose a method to find a discriminate function that distinguishes between usual packets and harmful ones by statistical processing of the network router logs, so that an administrator can alert the user. This method is very fast and can be used simply alongside Internet routers.

Keywords: Data Mining, Firewall, Optimization, Packet classification, Statistical Pattern Recognition.

Downloads: 1410
403 K-Means for Spherical Clusters with Large Variance in Sizes

Authors: A. M. Fahim, G. Saake, A. M. Salem, F. A. Torkey, M. A. Ramadan

Abstract:

Data clustering is an important data exploration technique with many applications in data mining. The k-means algorithm is well known for its efficiency in clustering large data sets. However, this algorithm is suitable for spherical shaped clusters of similar sizes and densities. The quality of the resulting clusters decreases when the data set contains spherical shaped clusters with large variance in sizes. In this paper, we introduce a competent procedure to overcome this problem. The proposed method is based on shifting the center of the large cluster toward the small cluster and recomputing the membership of small cluster points. The experimental results reveal that the proposed algorithm produces satisfactory results.
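
As background, one iteration of the standard k-means (Lloyd) procedure on 1-D data looks like the sketch below: assign each point to its nearest center, then recompute each center as the mean of its cluster. This is the baseline algorithm only; the center-shifting step proposed in the paper is not reproduced because the abstract does not give its details, and `kmeans_step` is a hypothetical name.

```python
def kmeans_step(points, centers):
    """Run one assignment-and-update iteration of standard k-means on 1-D data."""
    clusters = [[] for _ in centers]
    for p in points:
        # Assign the point to the index of the nearest center.
        nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
        clusters[nearest].append(p)
    # An empty cluster keeps its previous center.
    new_centers = [
        sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
    ]
    return clusters, new_centers


print(kmeans_step([1.0, 2.0, 9.0, 11.0], [0.0, 10.0]))
```

Because assignment depends only on distance to the centers, points on the edge of a large cluster can be claimed by a nearby small cluster, which is the weakness the abstract sets out to address.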

Keywords: K-Means, Data Clustering, Cluster Analysis.

Downloads: 3283
402 Learning an Overcomplete Dictionary using a Cauchy Mixture Model for Sparse Decay

Authors: E. S. Gower, M. O. J. Hawksford

Abstract:

An algorithm for learning an overcomplete dictionary using a Cauchy mixture model for sparse decomposition of an underdetermined mixing system is introduced. The mixture density function is derived from a ratio sample of the observed mixture signals where 1) there are at least two but not necessarily more mixture signals observed, 2) the source signals are statistically independent and 3) the sources are sparse. The basis vectors of the dictionary are learned via the optimization of the location parameters of the Cauchy mixture components, which is shown to be more accurate and robust than the conventional data mining methods usually employed for this task. Using a well known sparse decomposition algorithm, we extract three speech signals from two mixtures based on the estimated dictionary. Further tests with additive Gaussian noise are used to demonstrate the proposed algorithm's robustness to outliers.

Keywords: expectation-maximization, Pitman estimator, sparse decomposition

Downloads: 1950
401 Development of Subjective Measures of Interestingness: From Unexpectedness to Shocking

Authors: Eiad Yafi, M. A. Alam, Ranjit Biswas

Abstract:

Knowledge Discovery in Databases (KDD) is the process of extracting previously unknown but useful and significant information from massive volumes of data. Data Mining is a stage in the entire process of KDD which applies an algorithm to extract interesting patterns. Usually, such algorithms generate a huge volume of patterns. These patterns have to be evaluated by using interestingness measures to reflect the user requirements. Interestingness is defined in two ways: (i) objective measures and (ii) subjective measures. Objective measures such as support and confidence extract meaningful patterns based on the structure of the patterns, while subjective measures such as unexpectedness and novelty reflect the user perspective. In this report, we briefly review the more widespread and successful subjective measures and propose a new subjective measure of interestingness, i.e. shocking.
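
The two objective measures named above have standard definitions that are easy to compute: support is the fraction of transactions containing an itemset, and confidence of a rule A → B is support(A ∪ B) divided by support(A). The sketch below uses made-up transactions and hypothetical function names; the subjective measures discussed in the paper are not modeled.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(items <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(A and B together) / support(A); assumes support(A) > 0."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Illustrative data only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(support(transactions, {"bread", "milk"}))
print(confidence(transactions, {"bread"}, {"milk"}))
```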

Keywords: Shocking rules (SHR).

Downloads: 1538
400 Feature Selection for Web Page Classification Using Swarm Optimization

Authors: B. Leela Devi, A. Sankar

Abstract:

The web’s increased popularity has brought with it a huge amount of information, due to which automated web page classification systems are essential to improve search engines’ performance. Web pages have many features like HTML or XML tags, hyperlinks, URLs and text contents which can be considered during an automated classification process. It is known that web page classification is enhanced by hyperlinks as they reflect web page linkages. The aim of this study is to reduce the number of features used in order to improve the accuracy of web page classification. In this paper, a novel feature selection method using an improved Particle Swarm Optimization (PSO) based on the principle of evolution is proposed. The extracted features were tested on the WebKB dataset using a parallel Neural Network to reduce the computational cost.

Keywords: Web page classification, WebKB Dataset, Term Frequency-Inverse Document Frequency (TF-IDF), Particle Swarm Optimization (PSO).

Downloads: 3262
399 A Study on Energy Efficiency of Vertical Water Treatment System with DC Power Supply

Authors: Young-Kwan Choi, Gang-Wook Shin, Sung-Taek Hong

Abstract:

Water supply systems consume a large amount of power during water treatment and transportation of purified water. Many energy-conserving, high-efficiency components such as DC motors and LED lights have recently been introduced to water supply systems for energy conservation. This paper performed an empirical analysis of BLDC and AC motors and comparatively analyzed the change in power according to the DC power supply ratio in order to conserve energy in a next-generation water treatment system called the vertical water treatment system. In addition, a DC distribution system linked with photovoltaic generation was simulated to analyze the energy-conserving effect of DC load.

Keywords: Vertical Water Treatment System, DC Power Supply, Energy Efficiency, BLDC.

Downloads: 2132