Search results for: navigation pattern mining.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1595

605 Unsupervised Clustering Methods for Identifying Rare Events in Anomaly Detection

Authors: Witcha Chimphlee, Abdul Hanan Abdullah, Mohd Noor Md Sap, Siriporn Chimphlee, Surat Srinoy

Abstract:

Increasing detection rates and reducing false positive rates are important problems in Intrusion Detection Systems (IDS). Although preventative techniques such as access control and authentication attempt to keep intruders out, these can fail, so intrusion detection has been introduced as a second line of defence. Rare events are events that occur very infrequently, and detecting them is a common problem in many domains. In this paper we propose an intrusion detection method that combines rough sets and fuzzy clustering. Rough sets are used to decrease the amount of data and remove redundancy. Fuzzy c-means clustering allows objects to belong to several clusters simultaneously, with different degrees of membership. Our approach allows us not only to recognize known attacks but also to detect suspicious activity that may be the result of a new, unknown attack. Experimental results on the Knowledge Discovery and Data Mining (KDD Cup 1999) dataset show that the method is efficient and practical for intrusion detection systems.

Keywords: Network and security, intrusion detection, fuzzy c-means, rough set.

Downloads 2861
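
As an illustration of the "degrees of membership" idea in the abstract above, the following is a minimal Python sketch of the standard fuzzy c-means membership formula. It is not the authors' implementation: the function name `fcm_memberships`, the default fuzzifier `m = 2.0`, and the handling of a point that coincides with a center are assumptions made for this example, and the rough set reduction step is not shown.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Return the fuzzy c-means membership of `point` in each cluster.

    Uses the standard formula u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)),
    where d_i is the Euclidean distance from the point to center i.
    The name and the default fuzzifier m are illustrative assumptions.
    """
    if m <= 1.0:
        raise ValueError("the fuzzifier m must be greater than 1")
    dists = [math.dist(point, c) for c in centers]

    # A point lying exactly on one or more centers belongs fully to them.
    zero = [d == 0.0 for d in dists]
    if any(zero):
        share = 1.0 / sum(zero)
        return [share if z else 0.0 for z in zero]

    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]
```

The memberships of a point always sum to 1. With centers at (0, 0) and (4, 0), for instance, the point (1, 0) gets memberships of about 0.9 and 0.1, so it belongs mostly, but not exclusively, to the first cluster.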
604 Application of Particle Swarm Optimization for Economic Load Dispatch and Loss Reduction

Authors: N. Phanthuna, J. Jaturacherdchaiskul, S. Lerdvanittip, S. Auchariyamet

Abstract:

This paper proposes a particle swarm optimization (PSO) technique to solve economic load dispatch (ELD) problems. For the ELD problem in this work, the objective is to minimize the total fuel cost of all generator units for a given daily load pattern, while the main constraints are the power balance and the generation output of each unit. A case study on a test system of 40 generating units with 6 load patterns is presented to demonstrate the performance of PSO in solving the ELD problem. The results show that the optimal solution given by PSO provides the minimum total generation cost while satisfying all the constraints, with substantial savings from power loss reduction.

Keywords: Particle Swarm Optimization, Economic Load Dispatch, Loss Reduction.

Downloads 1899
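
A minimal sketch of how PSO can be applied to a dispatch problem of this kind, under strong simplifying assumptions: a handful of hypothetical units with quadratic fuel-cost curves, identical output limits, a single load level, and no transmission loss. The function names, the cost coefficients in the usage section, and the PSO parameters (inertia 0.7, acceleration 1.5, penalty weight 1000) are illustrative choices, not values from the paper. The power balance is enforced by letting the last unit supply whatever the other units do not.

```python
import random


def fuel_cost(outputs, coeffs):
    """Total cost of quadratic curves a + b*P + c*P**2 (assumed cost model)."""
    return sum(a + b * p + c * p * p for p, (a, b, c) in zip(outputs, coeffs))


def dispatch_pso(coeffs, limits, demand, n_particles=30, n_iters=200, seed=0):
    """Search the outputs of all units but the last; the last unit is the slack."""
    rng = random.Random(seed)
    lo, hi = limits
    n_free = len(coeffs) - 1

    def complete(free):
        # The slack unit closes the power balance exactly.
        return list(free) + [demand - sum(free)]

    def fitness(free):
        outputs = complete(free)
        slack = outputs[-1]
        violation = max(0.0, lo - slack) + max(0.0, slack - hi)
        return fuel_cost(outputs, coeffs) + 1e3 * violation  # penalty term

    pos = [[rng.uniform(lo, hi) for _ in range(n_free)] for _ in range(n_particles)]
    vel = [[0.0] * n_free for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_free):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (best_pos[i][d] - pos[i][d])
                             + 1.5 * r2 * (g_pos[d] - pos[i][d]))
                # Keep the searched units inside their output limits.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]

    outputs = complete(g_pos)
    return outputs, fuel_cost(outputs, coeffs)


if __name__ == "__main__":
    # Hypothetical three-unit system: (a, b, c) per unit, limits in MW.
    units = [(100.0, 2.0, 0.010), (120.0, 1.8, 0.012), (90.0, 2.2, 0.008)]
    outputs, cost = dispatch_pso(units, (50.0, 200.0), 300.0)
    print([round(p, 2) for p in outputs], round(cost, 2))
```

Treating one unit as the slack keeps every candidate on the power-balance constraint, so the penalty only has to guard that unit's limits. A realistic study would add per-unit limits, loss terms, and one run per load pattern.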
603 Semi-Automatic Method to Assist Expert for Association Rules Validation

Authors: Amdouni Hamida, Gammoudi Mohamed Mohsen

Abstract:

To help the expert validate association rules extracted from data, several quality measures have been proposed in the literature. We distinguish two categories: objective and subjective measures. The first depends on a fixed threshold and on the quality of the data from which the rules are extracted. The second consists of providing the expert with tools to explore and visualize rules during the evaluation step. However, the number of extracted rules to validate remains high, so mining the rules manually is very hard. To solve this problem, we propose in this paper a semi-automatic method to assist the expert during association rule validation. Our method uses rule-based classification as follows: (i) we transform association rules into classification rules (classifiers); (ii) we use the generated classifiers for data classification; (iii) we visualize association rules together with their classification quality to give the expert an overview and to assist him during the validation process.

Keywords: Association rules, Rule-based classification, Classification quality, Validation.

Downloads 1791
602 The Negative Effect of Traditional Loops Style on the Performance of Algorithms

Authors: Mahmoud Moh'd Mhashi

Abstract:

A new algorithm called Character-Comparison to Character-Access (CCCA) is developed to test how the performance of the checking operation in string searching is affected by both 1) converting character-comparison and number-comparison into character-access and 2) the starting point of checking. An experiment is performed using both English text and DNA text of different sizes. The results are compared with five algorithms, namely Naive, BM, Inf_Suf_Pref, Raita, and Cycle. With the CCCA algorithm, the results suggest that the average number of total comparisons is improved by up to 35%. Furthermore, the results suggest that the new CCCA algorithm improves on the clock time required by the other algorithms by 22.13% to 42.33%.

Keywords: Pattern matching, string searching, character-comparison, character-access, text type, and checking

Downloads 1270
601 Analysis of Fixed Beamforming Algorithms for Smart Antenna Systems

Authors: Muhammad Umair Shahid, Abdul Rehman, Mudassir Mukhtar, Muhammad Nauman

Abstract:

The smart antenna is a prominent technology that has emerged in recent years to meet the growing demands of wireless communications. Its application in overcrowded environments is growing gradually. This paper presents a methodical evaluation of the performance of fixed beamforming algorithms for smart antennas, namely the Multiple Sidelobe Canceller (MSC), Maximum Signal-to-Interference Ratio (MSIR) and Minimum Variance Distortionless Response (MVDR) beamformers. Simulation results show that beamforming helps provide an optimized response towards desired directions. The MVDR beamformer provides the best solution.

Keywords: Fixed weight beamforming, array pattern, signal to interference ratio, power efficiency, element spacing, array elements, optimum weight vector.

Downloads 780
600 Color Shift of Printing with Hybrid Halftone Images for Overlay Misalignment

Authors: Xu Guoliang, Tan Qingping

Abstract:

Color printing proceeds by overlaying multiple halftone separations. Because of separation overlay misalignment in printing, the percentages of the different primary color combinations may vary, which results in color shift. In the traditional printing procedure with AM halftoning, every separation has a different screening angle so that the superposition pattern is random in style, which reduces the color shift. To evaluate the color shift of printing with hybrid halftoning, we simulate the printing procedure by overlaying halftone images and calculate the color difference between the expected color and the color in different overlay misalignment configurations. The color differences for hybrid halftone and AM halftone are very close, so the color shift for hybrid halftone is acceptable with the current color printing procedure.

Keywords: color printing, AM halftone, Hybrid halftone, misalignment, color shift, Neugebauer Color Equation

Downloads 1687
599 A Practical Distributed String Matching Algorithm Architecture and Implementation

Authors: Bi Kun, Gu Nai-jie, Tu Kun, Liu Xiao-hu, Liu Gang

Abstract:

Traditional parallel single string matching algorithms are usually based on the PRAM computation model. Those algorithms concentrate on cost-optimal design and theoretical speed. Based on the distributed string matching algorithm proposed by CHEN, this paper proposes a practical distributed string matching algorithm architecture. An improved single string matching algorithm based on a variant of the Boyer-Moore algorithm is also presented. We implement our algorithm on this architecture, and the experiments show that it is practical and efficient on distributed memory machines. Its computational complexity is O(n/p + m), where n is the length of the text, m is the length of the pattern, and p is the number of processors.

Keywords: Boyer-Moore algorithm, distributed algorithm, parallel string matching, string matching.

Downloads 2189
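
The O(n/p + m) bound in the abstract above suggests a partitioning in which each of the p processors scans about n/p characters plus a short overlap. The sketch below illustrates that idea only; it is not the architecture from the paper. The workers are simulated sequentially, Python's built-in `str.find` stands in for the Boyer-Moore variant, and the function names and the overlap of m - 1 characters are assumptions made for this example.

```python
def chunk_bounds(n, m, p):
    """Yield (start, end) slices of about n/p characters plus m - 1 overlap."""
    size = -(-n // p)  # ceiling of n / p
    for k in range(p):
        start = k * size
        if start >= n:
            break
        # The overlap lets a match that straddles a boundary be found once,
        # by the chunk in which it starts.
        yield start, min(n, start + size + m - 1)


def find_all(text, pattern, offset=0):
    """Return every (possibly overlapping) match position, shifted by offset."""
    hits = []
    i = text.find(pattern)
    while i != -1:
        hits.append(offset + i)
        i = text.find(pattern, i + 1)
    return hits


def distributed_search(text, pattern, p):
    """Search each chunk independently, as p separate workers would."""
    if not pattern or p < 1:
        raise ValueError("pattern must be non-empty and p must be at least 1")
    hits = []
    for start, end in chunk_bounds(len(text), len(pattern), p):
        hits.extend(find_all(text[start:end], pattern, start))
    return hits
```

Because a chunk extends only m - 1 characters past its own range, a match that starts in the next chunk can never fit inside it, so no position is reported twice and no merge step is needed beyond concatenating the per-chunk results.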
598 Lodging Business Management in Nakhon Pathom with Sufficient Economy Approach

Authors: Krisada Sungkhamanee

Abstract:

The objectives of this research are to identify the management patterns of Nakhon Pathom lodging entrepreneurs that follow the sufficient economy approach, to understand the threats that affect this sector, and to design a suitable arrangement model that sustains their businesses in the Nakhon Pathom style. What will happen if they do not use this approach? Will they have a financial crisis? The data and information are collected through informal discussions with 12 managers and 400 questionnaires. A mixed method combining qualitative and quantitative research is used. Bent Flyvbjerg’s phronesis is utilized for this analysis. Our research will show that the sufficient economy approach can help small business firms solve their problems. We think that the results of our research will provide a financial model that solves many of the entrepreneurs' problems and that can serve as a model for other provinces of Thailand.

Keywords: Nakhon Pathom Province, Lodging Business, Sufficient Economy.

Downloads 4019
597 Map UI Design of IoT Application Based on Passenger Evacuation Behaviors in Underground Station

Authors: Meng-Cong Zheng

Abstract:

When an emergency occurs in a public space, the urgent task is to quickly establish spatial cognition and find emergency shelter in the closed underground space. This study takes Taipei Station as the research base and aims to apply an Internet of Things (IoT) application to underground evacuation mobility design. The first experiment identified passengers' evacuation behaviors and spatial cognition in underground spaces through wayfinding tasks and thinking aloud, then defined the design conditions of the User Interface (UI) and proposed the UI design. The second experiment evaluated the UI design based on passengers' evacuation behaviors, using the same wayfinding tasks and think-aloud method as the first experiment. The first experiment found that the design condition the subjects were most concerned about was the "map": they hoped to learn their position relative to other landmarks from the map and to see the overall route. "Position" needs to be accurately labeled to determine the location in underground space. Each step of the escape instructions should be presented clearly in the "navigation bar." The "message bar" should indicate the next or final target exit. In the second experiment with the UI design, we found that the "spatial map," which distinguishes between walking and non-walking areas with shades of color, is useful. The addition of 2.5D maps to the UI design increased the user's perception of space. Amending the color of the corner diagram in the "escape route" also reduces the confusion between the symbol and other diagrams. In "Hardware facilities," the larger volume of toilets and elevators can help users judge their relative location. The fire extinguisher icon should be highlighted. The "Fire point tips" of the UI design, which indicate fire with a graphical fireball, can convey precise information to the escaping person. However, "Compass and return to present location" is less used in underground space.

Keywords: Evacuation behaviors, IoT application, map UI design, underground station.

Downloads 741
596 Analysis of Vortical Structures Generated by the Swirler of Combustion Chamber

Authors: Vladislav A. Nazukin, Valery G. Avgustinovich, Vakhtang V. Tsatiashvili

Abstract:

The most important part of modern lean low-NOx combustors is the premixer, where swirlers are often used to intensify mixing processes and to form the required flow pattern in the combustor liner. Swirling flow leads to the formation of complex eddy structures that cause flow perturbations, which can in turn cause combustion instability. Therefore, at the design phase, it is necessary to pay great attention to the aerodynamics of premixers. Analysis based on unsteady CFD modeling of swirling flow in a production combustor swirler showed the presence of a large number of different eddy structures that can be conditionally divided into three types according to their location of origin and propagation path. The features of each eddy type were then defined. Comparison of calculated and experimental pressure fluctuation spectra verified the correctness of the computations.

Keywords: DES simulation, swirler, vortical structures.

Downloads 1875
595 High Impedance Faults Detection Technique Based on Wavelet Transform

Authors: Ming-Ta Yang, Jin-Lung Guan, Jhy-Cherng Gu

Abstract:

The purpose of this paper is to solve the problem of protecting aerial lines from high impedance faults (HIFs) in distribution systems. This investigation successfully applies the 3I0 zero sequence current to solve HIF problems. A feature extraction system based on the discrete wavelet transform (DWT) and a feature identification technique founded on statistical confidence are then applied to discriminate effectively between HIFs and switch operations. Pattern recognition of HIFs based on the continuous wavelet transform (CWT) is also proposed. Staged fault testing results demonstrate that the proposed wavelet-based algorithm is feasible and performs well.

Keywords: Continuous wavelet transform, discrete wavelet transform, high impedance faults, statistical confidence.

Downloads 2324
594 A Computational Cost-Effective Clustering Algorithm in Multidimensional Space Using the Manhattan Metric: Application to the Global Terrorism Database

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

The increasing amount of collected data has limited the performance of current analysis algorithms. Thus, developing new algorithms that are cost-effective in terms of complexity, scalability, and accuracy has attracted significant interest. In this paper, a modified k-means based algorithm is developed and tested. The new algorithm aims to reduce the computational load without significantly affecting the quality of the clusterings. The algorithm uses the City Block distance and a new stop criterion to guarantee convergence. Experiments conducted on a real data set show its high performance when compared with the original k-means version.

Keywords: Pattern recognition, partitional clustering, K-means clustering, Manhattan distance, terrorism data analysis.

Downloads 1359
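
To make the role of the City Block (Manhattan) distance concrete, here is a minimal Python sketch of a k-means style loop that assigns points using that metric. It is an illustration under stated assumptions, not the authors' algorithm: the abstract does not describe the new stop criterion, so this sketch simply stops when the assignments no longer change (or after `max_iters` passes), and the function names and the mean-based center update are choices made for the example.

```python
def manhattan(a, b):
    """City Block distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def kmeans_manhattan(points, centers, max_iters=100):
    """Cluster `points` starting from the given initial `centers`.

    Assumed stop rule: halt when no point changes cluster.
    """
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iters):
        # Assignment step: nearest center under the Manhattan metric.
        new_labels = [
            min(range(len(centers)), key=lambda k: manhattan(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: coordinate-wise mean of each non-empty cluster.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

The Manhattan distance needs only additions and absolute values, with no squares or square roots, which is one plausible source of the reduced computational load. Replacing the mean with the coordinate-wise median would give the k-medians variant that minimizes this metric exactly.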
593 The Effect of Fine Aggregate Properties on the Fatigue Behavior of the Conventional and Polymer Modified Bituminous Mixtures Using Two Types of Sand as Fine Aggregate

Authors: S. G. Yasreen, N. B. Madzlan, K. Ibrahim

Abstract:

Fatigue cracking continues to be one of the main challenges in improving the performance of bituminous mixture pavements. The purpose of this paper is to examine some of the effects of fine aggregate properties on the fatigue behaviour of hot mix asphalt. Two types of sand (quarry and mining sand) with two conventional bitumens (PEN 50/60 and PEN 80/100) and four polymer-modified bitumens (PMB: PM1_82, PM1_76, PM2_82 and PM2_76) were used. Physical, chemical and mechanical tests were performed on the sands to determine their effect when incorporated into a bituminous mixture. According to the beam fatigue results, quarry sand, which is more angular and rougher and has higher shear strength and a higher percentage of aluminium oxide, presented higher resistance to fatigue. In addition, the PMB mixtures gave better fatigue results than the conventional mixtures, because the PMB has better viscosity properties than the conventional bitumen.

Keywords: Beam fatigue test, chemical property, mechanical property, physical property

Downloads 2813
592 Dual-Network Memory Model for Temporal Sequences

Authors: Motonobu Hattori, Rina Suzuki

Abstract:

In neural networks, when new patterns are learned by a network, they radically interfere with previously stored patterns. This drawback is called catastrophic forgetting. We have already proposed a biologically inspired dual-network memory model which can greatly reduce this forgetting for static patterns. In this model, information is first stored in the hippocampal network and is thereafter transferred to the neocortical network using pseudopatterns. Because temporal sequence learning is more important than static pattern learning in the real world, in this study we improve our conventional dual-network memory model so that it can deal with temporal sequences without catastrophic forgetting. The computer simulation results show the effectiveness of the proposed dual-network memory model.

Keywords: Catastrophic forgetting, dual-network, temporal sequences.

Downloads 1424
591 Experimental and Finite Element Study of Bending Fatigue Failure: A Case Study on Main Shaft of a Gyrator Crusher

Authors: Rahim Sotoudeh Bahreini, Alireza Foroughi Nematollahi, Akbar Jafari

Abstract:

This study investigates the mechanism of a gyratory crusher located at Golgohar Mining and Industrial Co., with a specific focus on the stress distribution and fatigue failure of its main shaft. As a first step, the cross section of the fractured shaft is studied and the crack growth is analyzed. Then, the rotational motion of the shaft and the oil temperature in the oil circuit of the equipment are monitored. Condition monitoring is used to help find a better modification. Based on the results of this study, the main causes of shaft failure are identified, and a corrective solution is offered to increase crusher performance, especially the life of its main shaft. To predict the efficiency of the proposed modification, finite element simulation is performed, and its results are compared with similar modified cases. The comparison and interpretation of the simulation results confirm the efficiency of the proposed corrective method.

Keywords: Fatigue failure, finite element method, gyratory crusher, condition monitoring.

Downloads 1635
590 Integrated Power Saving for Multiple Relays and UEs in LTE-TDD

Authors: Chun-Chuan Yang, Jeng-Yueng Chen, Yi-Ting Mai, Chen-Ming Yang

Abstract:

In this paper, the design of integrated sleep scheduling for relay nodes and user equipments under a Donor eNB (DeNB) in the mode of Time Division Duplex (TDD) in LTE-A is presented. The idea of virtual time is proposed to deal with the discontinuous pattern of the available radio resource in TDD, and based on the estimation of the traffic load, three power saving schemes in the top-down strategy are presented. Associated mechanisms in each scheme including calculation of the virtual subframe capacity, the algorithm of integrated sleep scheduling, and the mapping mechanisms for the backhaul link and the access link are presented in the paper. Simulation study shows the advantage of the proposed schemes in energy saving over the standard DRX scheme.

Keywords: LTE-A, Relay, TDD, Power Saving.

Downloads 1160
589 Water End-Use Classification with Contemporaneous Water-Energy Data and Deep Learning Network

Authors: Khoi A. Nguyen, Rodney A. Stewart, Hong Zhang

Abstract:

‘Water-related energy’ is energy use which is directly or indirectly influenced by changes to water use. Informatics applying a range of mathematical, statistical and rule-based approaches can be used to reveal important information on demand from the available data provided at second, minute or hourly intervals. This study aims to combine these two concepts to address the current water end-use disaggregation problem by applying a wide range of advanced pattern recognition techniques to analyse concurrent high-resolution water-energy consumption data. The results show that the recognition accuracies of all end uses have significantly increased, especially for mechanised categories, including clothes washer, dishwasher and evaporative air cooler, where over 95% of events were correctly classified.

Keywords: Deep learning network, smart metering, water end use, water-energy data.

Downloads 1363
588 Feature Selection with Kohonen Self Organizing Classification Algorithm

Authors: Francesco Maiorana

Abstract:

In this paper, a one-dimensional Self Organizing Map (SOM) algorithm to perform feature selection is presented. The algorithm is based on an initial classification of the input dataset in a similarity space. From this classification, a set of positive and negative features is computed for each class. This set of features is selected as the result of the procedure. The procedure is evaluated on an in-house dataset from a Knowledge Discovery from Text (KDT) application and on a set of publicly available datasets used in international feature selection competitions. These datasets come from KDT applications, drug discovery and other applications. The knowledge of the correct classification available for the training and validation datasets is used to optimize the parameters for positive and negative feature extraction. The process becomes feasible for large and sparse datasets, such as those obtained in KDT applications, by using both compression techniques to store the similarity matrix and speed-up techniques for the Kohonen algorithm that take advantage of the sparsity of the input matrix. These improvements make it feasible to apply the methodology to massive datasets by using the grid.

Keywords: Clustering algorithm, Data mining, Feature selection, Grid, Kohonen Self Organizing Map.

Downloads 3052
587 Review and Experiments on SDMSCue

Authors: Ashraf Anwar

Abstract:

In this work, I present a review of Sparse Distributed Memory for Small Cues (SDMSCue), a variant of Sparse Distributed Memory (SDM) that is capable of handling small cues. I then conduct and report some cognitive experiments on SDMSCue to test its cognitive soundness compared to SDM. Small cues are input cues that are presented to memory for reading associations but have many parts or fields missing. The original SDM failed to handle such cues. SDMSCue overcomes this pitfall. The main idea in SDMSCue is the repeated projection of the semantic space onto smaller subspaces that are selected based on the input cue length and pattern. This process allows for Read/Write operations using an input cue that is missing a large portion. SDMSCue is augmented with the use of genetic algorithms for memory allocation and initialization. I claim that SDM functionality is a subset of SDMSCue functionality.

Keywords: Artificial intelligence, recall, recognition, SDM, SDMSCue.

Downloads 1373
586 Automatic Extraction of Features and Opinion-Oriented Sentences from Customer Reviews

Authors: Khairullah Khan, Baharum B. Baharudin, Aurangzeb Khan, Fazal_e_Malik

Abstract:

Extracting opinions about products from customer reviews is becoming an interesting area of research. Customer reviews about products are nowadays available from blogs and review sites. Tools are also being developed to extract opinions from these reviews to help users as well as merchants track the most suitable choice of product. Efficient methods and techniques are therefore needed to extract opinions from reviews and blogs. Since product reviews mostly contain discussion of features, functions and services, efficient techniques are required to extract user comments about the desired features, functions and services. In this paper we propose a novel idea for finding the features of a product from user reviews in an efficient way. Our focus is on extracting features and opinion-oriented words about products from text through auxiliary verbs (AV) {is, was, are, were, has, have, had}. From the results of our experiments we found that 82% of features and 85% of opinion-oriented sentences include AVs. Thus these AVs are good indicators of features and opinion orientation in customer reviews.

Keywords: Classification, Customer Reviews, Helping Verbs, Opinion Mining.

Downloads 2096
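
The auxiliary-verb cue described in the abstract above can be illustrated with a short Python sketch that keeps only the sentences containing one of the listed AVs. This is a simplified stand-in rather than the authors' extraction pipeline: the regular-expression sentence splitter, the tokenizer, and the function names are assumptions made for this example, and no part-of-speech tagging or feature extraction is attempted.

```python
import re

# The auxiliary verbs listed in the abstract.
AUX_VERBS = {"is", "was", "are", "were", "has", "have", "had"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def has_aux_verb(sentence):
    """True if any whole word of the sentence is one of the listed AVs."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return any(w in AUX_VERBS for w in words)


def candidate_sentences(text):
    """Return the sentences that contain an AV, in their original order."""
    return [s for s in split_sentences(text) if has_aux_verb(s)]
```

For the review "The battery is excellent. Bought it last week. The screen was too dim!", the function keeps the first and third sentences and drops the second, which contains no AV. Contractions such as "isn't" are treated as single tokens here and therefore do not match.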
585 A Novel Non-Uniformity Correction Algorithm Based On Non-Linear Fit

Authors: Yang Weiping, Zhang Zhilong, Zhang Yan, Chen Zengping

Abstract:

Infrared focal plane array (IRFPA) sensors, due to their high sensitivity, high frame frequency and simple structure, have become the most prominently used detectors in military applications. However, they suffer from a common problem called fixed pattern noise (FPN), which severely degrades image quality and limits infrared imaging applications. Therefore, it is necessary to perform non-uniformity correction (NUC) on IR images. Non-uniformity correction algorithms are classified into two main categories: calibration-based and scene-based algorithms. Both kinds of algorithm have shortcomings, hence a novel non-uniformity correction algorithm based on non-linear fit is proposed, which combines the advantages of the two. Experimental results show that the proposed algorithm achieves good NUC performance with a lower non-uniformity ratio.

Keywords: Non-uniformity correction, non-linear fit, two-point correction, temporal Kalman filter.

Downloads 2316
584 Research on Hybrid Neural Network in Intrusion Detection System

Authors: Jianhua Wang, Yan Yu

Abstract:

This paper presents an intrusion detection system built on a hybrid neural network model based on RBF and Elman networks. It is used for anomaly detection and misuse detection. The model has a memory function, so it can effectively detect both discrete and related aggressive behavior. The RBF network is a real-time pattern classifier, and the Elman network provides memory of former events. The intrusion detection system based on the hybrid model is evaluated on the DARPA data set, and ROC curves are used to display the test results intuitively. The experiments show that this hybrid-model intrusion detection system can effectively improve the detection rate and reduce the false alarm and failure rates.

Keywords: RBF, Elman, anomaly detection, misuse detection, hybrid neural network.

Downloads 2327
583 Numerical Simulation and Experimental Validation of the Hydraulic L-Shaped Check Ball Behavior

Authors: Shinji Kajiwara

Abstract:

The spring-driven ball-type check valve is one of the most important components of hydraulic systems: it controls the position of the ball and prevents backward flow. To simplify the structure, the spring must be eliminated, and to accomplish this, the flow pattern and the behavior of the check ball in an L-shaped pipe must be determined. In this paper, we present a full-scale model of a check ball made of acrylic resin, and we determine the relationship among the initial position of the ball and the position and diameter of the inflow port. The check-flow rate increases in a standard center inflow model, and it is possible to greatly decrease the check-flow rate by shifting the inflow from the center.

Keywords: Hydraulics, Pipe Flow, Numerical Simulation, Flow Visualization, Check ball, L-shaped Pipe.

Downloads 2078
582 Topic Modeling Using Latent Dirichlet Allocation and Latent Semantic Indexing on South African Telco Twitter Data

Authors: Phumelele P. Kubheka, Pius A. Owolawi, Gbolahan Aiyetoro

Abstract:

Twitter is one of the most popular social media platforms where users share their opinions on different subjects. Twitter can be considered a great source for mining text due to the high volumes of data generated through the platform daily. Many industries such as telecommunication companies can leverage the availability of Twitter data to better understand their markets and make an appropriate business decision. This study performs topic modeling on Twitter data using Latent Dirichlet Allocation (LDA). The obtained results are benchmarked with another topic modeling technique, Latent Semantic Indexing (LSI). The study aims to retrieve topics on a Twitter dataset containing user tweets on South African Telcos. Results from this study show that LSI is much faster than LDA. However, LDA yields better results with higher topic coherence by 8% for the best-performing model in this experiment. A higher topic coherence score indicates better performance of the model.

Keywords: Big data, latent Dirichlet allocation, latent semantic indexing, Telco, topic modeling, Twitter.

Downloads 459
581 An Intelligent System for Phish Detection, using Dynamic Analysis and Template Matching

Authors: Chinmay Soman, Hrishikesh Pathak, Vishal Shah, Aniket Padhye, Amey Inamdar

Abstract:

Phishing, or the stealing of sensitive information on the web, has dealt a major blow to Internet security in recent times. Most existing anti-phishing solutions fail to handle the fuzziness involved in phish detection, leading to a large number of false positives. This fuzziness is attributed to the use of HTML, which is highly flexible and at the same time highly ambiguous. We introduce a new perspective against phishing that tries to prove systematically whether a given page is phished or not, using the corresponding original page as the basis of the comparison. It analyzes the layout of the pages under consideration to determine the percentage distortion between them, which is indicative of any form of malicious alteration. The system design represents an intelligent system employing dynamic assessment, which accurately identifies brand new phishing attacks and will prove effective in reducing the number of false positives. This framework could potentially be used as a knowledge base for educating internet users against phishing.

Keywords: World Wide Web, Phishing, Internet security, data mining.

Downloads 1832
580 A Distance Function for Data with Missing Values and Its Application

Authors: Loai AbdAllah, Ilan Shimshoni

Abstract:

Missing values in data are common in real world applications. Since the performance of many data mining algorithms depends critically on being given a good metric over the input space, we decided in this paper to define a distance function for unlabeled datasets with missing values. We use the Bhattacharyya distance, which measures the similarity of two probability distributions, to define our new distance function. According to this distance, the distance between two points without missing attribute values is simply the Mahalanobis distance. When, on the other hand, the value of one of the coordinates is missing, the distance is computed according to the distribution of the missing coordinate. Our distance is general and can be used as part of any algorithm that computes the distance between data points. Because the performance of the k nearest neighbor (kNN) classifier depends strongly on the chosen distance measure, we used it to evaluate how accurately our distance reflects object similarity. We experimented on standard numerical datasets from different fields in the UCI repository. On these datasets we simulated missing values and compared the performance of the kNN classifier using our distance to three other basic methods. Our experiments show that kNN using our distance function outperforms kNN using the other methods. Moreover, the runtime of our method is only slightly higher than that of the other methods.

Keywords: Missing values, Distance metric, Bhattacharyya distance.

Downloads 2751
579 Application of Artificial Neural Network to Classification of Surface Water Quality

Authors: S. Wechmongkhonkon, N. Poomtong, S. Areerachakul

Abstract:

Water quality is a subject of ongoing concern. Deterioration of water quality has initiated serious management efforts in many countries. This study endeavors to classify water quality automatically. The water quality classes are evaluated using six factor indices: pH value (pH), Dissolved Oxygen (DO), Biochemical Oxygen Demand (BOD), Nitrate Nitrogen (NO3N), Ammonia Nitrogen (NH3N) and Total Coliform (TColiform). The methodology applies data mining techniques using multilayer perceptron (MLP) neural network models. The data cover 11 canal sites in Dusit district, Bangkok, Thailand, and were obtained from the Department of Drainage and Sewerage, Bangkok Metropolitan Administration, for 2007-2011. The multilayer perceptron neural network achieved a high accuracy of 96.52% in classifying the water quality of the Dusit district canals. This encouraging result could be applied to the planning and management of water resources.

Keywords: artificial neural network, classification, surface water quality

Downloads 3209
578 Indoor Mobile Robot Positioning Based on Wireless Fingerprint Matching

Authors: Xu Huang, Jing Fan, Maonian Wu, Yonggen Gu

Abstract:

This paper discusses the design of an indoor mobile robot positioning system. The problem of indoor positioning is solved through Wi-Fi fingerprint positioning to achieve a low-cost deployment. A wireless fingerprint matching algorithm based on the similarity of unequal-length sequences is presented. Candidate sequence selection is defined as a set of mappings, and detection errors caused by wireless hotspot instability and changes in the interior layout can be corrected by transforming the unequal-length sequences into equal-length sequences. Experiments verified that the presented scheme meets the accuracy requirements of an indoor positioning system at low deployment cost.
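
The abstract does not spell out how unequal-length fingerprints are made comparable. One common approach, shown here only as a sketch under stated assumptions, is to take the union of the access points seen in two fingerprints and fill unheard access points with a floor signal strength, which yields two equal-length vectors. The names `align`, `fingerprint_distance` and `locate`, the `MISSING_RSSI` floor of -100 dBm, and the sample radio map are all hypothetical and not taken from the paper.

```python
import math

# Assumed floor value (dBm) for an access point that was not heard at all.
MISSING_RSSI = -100.0


def align(a, b):
    """Turn two fingerprints (dicts of AP id -> RSSI) into equal-length vectors."""
    keys = sorted(set(a) | set(b))
    vec_a = [a.get(k, MISSING_RSSI) for k in keys]
    vec_b = [b.get(k, MISSING_RSSI) for k in keys]
    return vec_a, vec_b


def fingerprint_distance(a, b):
    """Euclidean distance between two fingerprints after alignment."""
    vec_a, vec_b = align(a, b)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(vec_a, vec_b)))


def locate(observed, radio_map):
    """Return the name of the reference location with the closest fingerprint."""
    return min(radio_map, key=lambda name: fingerprint_distance(observed, radio_map[name]))


radio_map = {
    "lab": {"ap1": -40.0, "ap2": -70.0},
    "hall": {"ap2": -45.0, "ap3": -60.0, "ap4": -80.0},
}
observed = {"ap2": -48.0, "ap3": -63.0}
best_match = locate(observed, radio_map)
```

Padding with a floor value keeps a vanished hotspot from silently shortening the comparison, at the cost of penalizing it heavily; the paper's mapping-based candidate selection may treat such cases differently.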

Keywords: Fingerprint match, indoor positioning, mobile robot positioning system, Wi-Fi, wireless fingerprint.

Downloads 1605
577 Characterization of Inertial Confinement Fusion Targets Based on Transmission Holographic Mach-Zehnder Interferometer

Authors: B. Zare-Farsani, M. Valieghbal, M. Tarkashvand, A. H. Farahbod

Abstract:

Achieving the conditions for nuclear fusion with high-energy, high-power laser beams requires a high degree of symmetry and surface uniformity in the spherical capsules to reduce Rayleigh-Taylor hydrodynamic instabilities. In this paper, we use digital holographic microscopy based on a Mach-Zehnder interferometer to study the quality of targets for inertial fusion. The interferometric pattern of the target was registered by a CCD camera and analyzed with Holovision software. The uniformity of the surface and the shell thickness were investigated and measured in the reconstructed image. We measured the shell thickness in different zones and obtained a nonuniformity of 22.82 percent.

Keywords: Inertial confinement fusion, Mach-Zehnder interferometer, Digital holographic microscopy.

Downloads 1314
576 Adaptive Network Intrusion Detection Learning: Attribute Selection and Classification

Authors: Dewan Md. Farid, Jerome Darmont, Nouria Harbi, Nguyen Huu Hoa, Mohammad Zahidur Rahman

Abstract:

In this paper, a new learning approach for network intrusion detection using a naïve Bayesian classifier and the ID3 algorithm is presented. It identifies effective attributes from the training dataset, calculates the conditional probabilities for the best attribute values, and then correctly classifies all the examples of the training and testing datasets. Most current intrusion detection datasets are dynamic, complex and contain a large number of attributes. Some of the attributes may be redundant or contribute little to detection. Tests have shown that selecting significant attributes is important for designing real-world intrusion detection systems (IDS). The purpose of this study is to identify effective attributes from the training dataset to build a classifier for network intrusion detection using data mining algorithms. The experimental results on the KDD99 benchmark intrusion detection dataset demonstrate that this new approach achieves high classification rates and reduces false positives using limited computational resources.
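
The keywords name information gain, the splitting criterion used by ID3, as the basis for attribute selection. The sketch below shows the standard information gain calculation for categorical attributes and ranks attributes by it. It does not reproduce the paper's combined naïve Bayes and ID3 procedure; the function names and the four toy connection records are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the examples on one attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_attributes(rows, labels):
    """Attribute names sorted from most to least informative."""
    return sorted(rows[0], key=lambda a: information_gain(rows, labels, a), reverse=True)


# Toy records, invented for illustration only.
records = [
    {"protocol": "tcp", "flag": "SF"},
    {"protocol": "tcp", "flag": "S0"},
    {"protocol": "udp", "flag": "SF"},
    {"protocol": "udp", "flag": "S0"},
]
classes = ["normal", "attack", "normal", "attack"]
ranking = rank_attributes(records, classes)
```

In this toy data the `flag` attribute separates the classes perfectly while `protocol` carries no information, so a selection step would keep `flag` and could drop `protocol`.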

Keywords: Attribute selection, conditional probabilities, information gain, network intrusion detection.

Downloads 2698