Search results for: classification of patterns

1241 A Survey: Clustering Ensembles Techniques

Authors: Reza Ghaemi , Md. Nasir Sulaiman , Hamidah Ibrahim , Norwati Mustapha

Abstract:

The clustering ensembles combine multiple partitions generated by different clustering algorithms into a single clustering solution. Clustering ensembles have emerged as a prominent method for improving robustness, stability and accuracy of unsupervised classification solutions. So far, many contributions have been done to find consensus clustering. One of the major problems in clustering ensembles is the consensus function. In this paper, firstly, we introduce clustering ensembles, representation of multiple partitions, its challenges and present taxonomy of combination algorithms. Secondly, we describe consensus functions in clustering ensembles including Hypergraph partitioning, Voting approach, Mutual information, Co-association based functions and Finite mixture model, and next explain their advantages, disadvantages and computational complexity. Finally, we compare the characteristics of clustering ensembles algorithms such as computational complexity, robustness, simplicity and accuracy on different datasets in previous techniques.

Keywords: Clustering Ensembles, Combinational Algorithm, Consensus Function, Unsupervised Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3414

1240 Classification of Earthquake Distribution in the Banda Sea Collision Zone with Point Process Approach

Authors: Henry J. Wattimanela, Udjianna S. Pasaribu, Nanang T. Puspito, Sapto W. Indratno

Abstract:

Banda Sea Collision Zone (BSCZ) is the result of the interaction and convergence of Indo-Australian plate, Eurasian plate and Pacific plate. This location is located in eastern Indonesia. This zone has a very high seismic activity. In this research, we will calculate the rate (λ) and Mean Square Error (MSE). By this result, we will classification earthquakes distribution in the BSCZ with the point process approach. Chi-square is used to determine the type of earthquakes distribution in the sub region of BSCZ. The data used in this research is data of earthquakes with a magnitude ≥ 6 SR for the period 1964-2013 and sourced from BMKG Jakarta. This research is expected to contribute to the Moluccas Province and surrounding local governments in performing spatial plan document related to disaster management.

Keywords: Banda sea collision zone, earthquakes, mean square error, Poisson distribution, chi-square test.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2092

1239 Dynamic Interaction Network to Model the Interactive Patterns of International Stock Markets

Authors: Laura Lukmanto, Harya Widiputra, Lukas

Abstract:

Studies in economics domain tried to reveal the correlation between stock markets. Since the globalization era, interdependence between stock markets becomes more obvious. The Dynamic Interaction Network (DIN) algorithm, which was inspired by a Gene Regulatory Network (GRN) extraction method in the bioinformatics field, is applied to reveal important and complex dynamic relationship between stock markets. We use the data of the stock market indices from eight countries around the world in this study. Our results conclude that DIN is able to reveal and model patterns of dynamic interaction from the observed variables (i.e. stock market indices). Furthermore, it is also found that the extracted network models can be utilized to predict movement of the stock market indices with a considerably good accuracy.

Keywords: complex dynamic relationship, dynamic interaction network, interactive stock markets, stock market interdependence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1384

1238 Speech Enhancement Using Wavelet Coefficients Masking with Local Binary Patterns

Authors: Christian Arcos, Marley Vellasco, Abraham Alcaim

Abstract:

In this paper, we present a wavelet coefficients masking based on Local Binary Patterns (WLBP) approach to enhance the temporal spectra of the wavelet coefficients for speech enhancement. This technique exploits the wavelet denoising scheme, which splits the degraded speech into pyramidal subband components and extracts frequency information without losing temporal information. Speech enhancement in each high-frequency subband is performed by binary labels through the local binary pattern masking that encodes the ratio between the original value of each coefficient and the values of the neighbour coefficients. This approach enhances the high-frequency spectra of the wavelet transform instead of eliminating them through a threshold. A comparative analysis is carried out with conventional speech enhancement algorithms, demonstrating that the proposed technique achieves significant improvements in terms of PESQ, an international recommendation of objective measure for estimating subjective speech quality. Informal listening tests also show that the proposed method in an acoustic context improves the quality of speech, avoiding the annoying musical noise present in other speech enhancement techniques. Experimental results obtained with a DNN based speech recognizer in noisy environments corroborate the superiority of the proposed scheme in the robust speech recognition scenario.

Keywords: Binary labels, local binary patterns, mask, wavelet coefficients, speech enhancement, speech recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1000

1237 Numerical Simulations of Shear Driven Square and Triangular Cavity by Using Lattice Boltzmann Scheme

Authors: A. M. Fudhail, N. A. C. Sidik, M. Z. M. Rody, H. M. Zahir, M.T. Musthafah

Abstract:

In this paper, fluid flow patterns of steady incompressible flow inside shear driven cavity are studied. The numerical simulations are conducted by using lattice Boltzmann method (LBM) for different Reynolds numbers. In order to simulate the flow, derivation of macroscopic hydrodynamics equations from the continuous Boltzmann equation need to be performed. Then, the numerical results of shear-driven flow inside square and triangular cavity are compared with results found in literature review. Present study found that flow patterns are affected by the geometry of the cavity and the Reynolds numbers used.

Keywords: Lattice Boltzmann method, shear driven cavity, square cavity, triangular cavity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1945

1236 Knowledge Discovery and Data Mining Techniques in Textile Industry

Authors: Filiz Ersoz, Taner Ersoz, Erkin Guler

Abstract:

This paper addresses the issues and technique for textile industry using data mining techniques. Data mining has been applied to the stitching of garments products that were obtained from a textile company. Data mining techniques were applied to the data obtained from the CHAID algorithm, CART algorithm, Regression Analysis and, Artificial Neural Networks. Classification technique based analyses were used while data mining and decision model about the production per person and variables affecting about production were found by this method. In the study, the results show that as the daily working time increases, the production per person also decreases. In addition, the relationship between total daily working and production per person shows a negative result and the production per person show the highest and negative relationship.

Keywords: Data mining, textile production, decision trees, classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1526

1235 Abnormality Detection of Persons Living Alone Using Daily Life Patterns Obtained from Sensors

Authors: Ippei Kamihira, Takashi Nakajima, Taiyo Matsumura, Hikaru Miura, Takashi Ono

Abstract:

In this research, the goal was construction of a system by which multiple sensors were used to observe the daily life behavior of persons living alone (while respecting their privacy), using this information to judge such conditions as bad physical condition or falling in the home, etc., so that these abnormal conditions can be made known to relatives and third parties. The daily life patterns of persons living alone are expressed by the number of responses of sensors each time that a set time period has elapsed. By comparing data for the prior two weeks, it was possible to judge a situation as “normal” when the person was in good physical condition or as “abnormal” when the person was in bad physical condition.

Keywords: Sensors, Elderly living alone, Abnormality detection, Lifestyle habit.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1606

1234 A Dataset of Program Educational Objectives Mapped to ABET Outcomes: Data Cleansing, Exploratory Data Analysis and Modeling

Authors: Addin Osman, Anwar Ali Yahya, Mohammed Basit Kamal

Abstract:

Datasets or collections are becoming important assets by themselves and now they can be accepted as a primary intellectual output of a research. The quality and usage of the datasets depend mainly on the context under which they have been collected, processed, analyzed, validated, and interpreted. This paper aims to present a collection of program educational objectives mapped to student’s outcomes collected from self-study reports prepared by 32 engineering programs accredited by ABET. The manual mapping (classification) of this data is a notoriously tedious, time consuming process. In addition, it requires experts in the area, which are mostly not available. It has been shown the operational settings under which the collection has been produced. The collection has been cleansed, preprocessed, some features have been selected and preliminary exploratory data analysis has been performed so as to illustrate the properties and usefulness of the collection. At the end, the collection has been benchmarked using nine of the most widely used supervised multiclass classification techniques (Binary Relevance, Label Powerset, Classifier Chains, Pruned Sets, Random k-label sets, Ensemble of Classifier Chains, Ensemble of Pruned Sets, Multi-Label k-Nearest Neighbors and Back-Propagation Multi-Label Learning). The techniques have been compared to each other using five well-known measurements (Accuracy, Hamming Loss, Micro-F, Macro-F, and Macro-F). The Ensemble of Classifier Chains and Ensemble of Pruned Sets have achieved encouraging performance compared to other experimented multi-label classification methods. The Classifier Chains method has shown the worst performance. To recap, the benchmark has achieved promising results by utilizing preliminary exploratory data analysis performed on the collection, proposing new trends for research and providing a baseline for future studies.

Keywords: Benchmark collection, program educational objectives, student outcomes, ABET, Accreditation, machine learning, supervised multiclass classification, text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 820

1233 DIFFER: A Propositionalization approach for Learning from Structured Data

Authors: Thashmee Karunaratne, Henrik Böstrom

Abstract:

Logic based methods for learning from structured data is limited w.r.t. handling large search spaces, preventing large-sized substructures from being considered by the resulting classifiers. A novel approach to learning from structured data is introduced that employs a structure transformation method, called finger printing, for addressing these limitations. The method, which generates features corresponding to arbitrarily complex substructures, is implemented in a system, called DIFFER. The method is demonstrated to perform comparably to an existing state-of-art method on some benchmark data sets without requiring restrictions on the search space. Furthermore, learning from the union of features generated by finger printing and the previous method outperforms learning from each individual set of features on all benchmark data sets, demonstrating the benefit of developing complementary, rather than competing, methods for structure classification.

Keywords: Machine learning, Structure classification, Propositionalization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1208

1232 Classification System for a Collaborative Urban Retail Logistics

Authors: Volker Lange, Stephanie Moede, Christiane Auffermann

Abstract:

From an economic standpoint the current and future road traffic situation in urban areas is a cost factor. Traffic jams and congestion prolong journey times and tie up resources in trucks and personnel. Many discussions about imposing charges or tolls for cities in Europe in order to reduce traffic congestion are currently in progress. Both of these effects lead – directly or indirectly - to additional costs for the urban distribution systems in retail companies. One approach towards improving the efficiency of retail distribution systems, and thus towards avoiding negative environmental factors in urban areas, is horizontal collaboration for deliveries to retail outlets – Urban Retail Logistics. This paper presents a classification system to help reveal where cooperation between retail companies is possible and makes sense for deliveries to retail outlets in urban areas.

Keywords: City Logistics, Horizontal Collaboration, Urban Freight Transport, Urban Retail Logistics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2344

1231 The Phonology and Phonetics of Second Language Intonation in Case of “Downstep”

Authors: Tayebeh Norouzi

Abstract:

This study aims to investigate the acquisition process of intonation. It examines the intonation structure of Tokyo Japanese and its realization by Iranian learners of Japanese. Seven Iranian learners of Japanese, differing in fluency, and two Japanese speakers participated in the experiment. Two sentences were used to test the phonological and phonetic characteristics of lexical pitch-accent as well as the intonation patterns produced by the speakers. Both sentences consisted of similar words with the same number of syllables and lexical pitch-accents but different syntactic structure. Speakers were asked to read each sentence three times at normal speed, and the data were analyzed by Praat. The results show that lexical pitch-accent, Accentual Phrase (AP) and AP boundary tone realization vary depending on sentence type. For sentences of type XdeYwo, the lexical pitch-accent is realized properly. However, there is a rise in AP boundary tone regardless of speakers’ level of fluency. In contrast, in sentences of type XnoYwo, the lexical pitch-accent and AP boundary tone vary depending on the speakers’ fluency level. Advanced speakers are better at grouping words into phrases and produce more native-like intonation patterns, though they are not able to realize downstep properly. The non-native speakers tried to realize proper intonation patterns by making changes in lexical accent and boundary tone.

Keywords: Intonation, Iranian learners, Japanese prosody, lexical accent, second language acquisition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 967

1230 The Imaging Methods for Classifying Crispiness of Freeze-Dried Durian using Fuzzy Logic

Authors: Sitthichon Kanitthakun, Pinit Kumhom, Kosin Chamnongthai

Abstract:

In quality control of freeze-dried durian, crispiness is a key quality index of the product. Generally, crispy testing has to be done by a destructive method. A nondestructive testing of the crispiness is required because the samples can be reused for other kinds of testing. This paper proposed a crispiness classification method of freeze-dried durians using fuzzy logic for decision making. The physical changes of a freeze-dried durian include the pores appearing in the images. Three physical features including (1) the diameters of pores, (2) the ratio of the pore area and the remaining area, and (3) the distribution of the pores are considered to contribute to the crispiness. The fuzzy logic is applied for making the decision. The experimental results comparing with food expert opinion showed that the accuracy of the proposed classification method is 83.33 percent.

Keywords: Durian, crispiness, freeze drying, pore, fuzzy logic.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1956

1229 Genetic Algorithms and Kernel Matrix-based Criteria Combined Approach to Perform Feature and Model Selection for Support Vector Machines

Authors: A. Perolini

Abstract:

Feature and model selection are in the center of attention of many researches because of their impact on classifiers- performance. Both selections are usually performed separately but recent developments suggest using a combined GA-SVM approach to perform them simultaneously. This approach improves the performance of the classifier identifying the best subset of variables and the optimal parameters- values. Although GA-SVM is an effective method it is computationally expensive, thus a rough method can be considered. The paper investigates a joined approach of Genetic Algorithm and kernel matrix criteria to perform simultaneously feature and model selection for SVM classification problem. The purpose of this research is to improve the classification performance of SVM through an efficient approach, the Kernel Matrix Genetic Algorithm method (KMGA).

Keywords: Feature and model selection, Genetic Algorithms, Support Vector Machines, kernel matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1580

1228 One-Class Support Vector Machines for Protein-Protein Interactions Prediction

Authors: Hany Alashwal, Safaai Deris, Razib M. Othman

Abstract:

Predicting protein-protein interactions represent a key step in understanding proteins functions. This is due to the fact that proteins usually work in context of other proteins and rarely function alone. Machine learning techniques have been applied to predict protein-protein interactions. However, most of these techniques address this problem as a binary classification problem. Although it is easy to get a dataset of interacting proteins as positive examples, there are no experimentally confirmed non-interacting proteins to be considered as negative examples. Therefore, in this paper we solve this problem as a one-class classification problem using one-class support vector machines (SVM). Using only positive examples (interacting protein pairs) in training phase, the one-class SVM achieves accuracy of about 80%. These results imply that protein-protein interaction can be predicted using one-class classifier with comparable accuracy to the binary classifiers that use artificially constructed negative examples.

Keywords: Bioinformatics, Protein-protein interactions, One-Class Support Vector Machines

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1975

1227 Determining the Gender of Korean Names for Pronoun Generation

Authors: Seong-Bae Park, Hee-Geun Yoon

Abstract:

It is an important task in Korean-English machine translation to classify the gender of names correctly. When a sentence is composed of two or more clauses and only one subject is given as a proper noun, it is important to find the gender of the proper noun for correct translation of the sentence. This is because a singular pronoun has a gender in English while it does not in Korean. Thus, in Korean-English machine translation, the gender of a proper noun should be determined. More generally, this task can be expanded into the classification of the general Korean names. This paper proposes a statistical method for this problem. By considering a name as just a sequence of syllables, it is possible to get a statistics for each name from a collection of names. An evaluation of the proposed method yields the improvement in accuracy over the simple looking-up of the collection. While the accuracy of the looking-up method is 64.11%, that of the proposed method is 81.49%. This implies that the proposed method is more plausible for the gender classification of the Korean names.

Keywords: machine translation, natural language processing, gender of proper nouns, statistical method

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2344

1226 Fabric Printing Design, an Inspired from the Five-Color Porcelain (Benjarong)

Authors: Suwit Sadsunk

Abstract:

The study is about the designed and decorative fabric printing that derived from the Five-color porcelain (Benjarong). The researcher examined the pattern and creativity of the decorative design of the Five-color porcelain (Benjarong) by the artists in order to apply for contemporary arts so that young generation will acknowledge the importance of the Five-color porcelain (Benjarong). The research methodology is both quantitative and qualitative. The researcher conducted an in-depth interview with the operator of five-color porcelain (Benjarong) at Ampawa, Samutsongkram. The information from the interview can be useful and implemented for designing the fabric patterns. The researcher found that there were many formats and designs of the Five-color porcelain (Benjarong) from the past to the present. Its unique design can be applied for the fabric patterns and ready-to-wear clothes properly. After advertising and showing the work of the Five-color porcelain (Benjarong) publicly, there were more young people interested in the Five-color porcelain (Benjarong) than expected which exceeded the objective with positive attitudes towards the Five-color porcelain (Benjarong).

Keywords: Decorative fabric printing, Five-color porcelain (Benjarong).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2322

1225 A Low Complexity Frequency Offset Estimation for MB-OFDM based UWB Systems

Authors: Wang Xue, Liu Dan, Liu Ying, Wang Molin, Qian Zhihong

Abstract:

A low-complexity, high-accurate frequency offset estimation for multi-band orthogonal frequency division multiplexing (MB-OFDM) based ultra-wide band systems is presented regarding different carrier frequency offsets, different channel frequency responses, different preamble patterns in different bands. Utilizing a half-cycle Constant Amplitude Zero Auto Correlation (CAZAC) sequence as the preamble sequence, the estimator with a semi-cross contrast scheme between two successive OFDM symbols is proposed. The CRLB and complexity of the proposed algorithm are derived. Compared to the reference estimators, the proposed method achieves significantly less complexity (about 50%) for all preamble patterns of the MB-OFDM systems. The CRLBs turn out to be of well performance.

Keywords: CAZAC, Frequency Offset, Semi-cross Contrast, MB-OFDM, UWB

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1647

1224 The Effects of Wind Forcing on Surface Currents on the Continental Shelf Surrounding Rottnest Island

Authors: Jennifer Penton, Charitha Pattiaratchi

Abstract:

Surface currents play a major role in the distribution of contaminants, the connectivity of marine populations, and can influence the vertical and horizontal distribution of nutrients within the water column. This paper aims to determine the effects of sea breeze-wind patterns on the climatology of the surface currents on the continental shelf surrounding Rottnest Island, WA Australia. The alternating wind patterns allow for full cyclic rotations of wind direction, permitting the interpretation of the effect of the wind on the surface currents. It was found that the surface currents only clearly follow the northbound Capes Current in times when the Fremantle Doctor sets in. Surface currents react within an hour to a change of direction of the wind, allowing southerly currents to dominate during strong northerly sea breezes, often followed by mixed currents dominated by eddies in the inter-lying times.

Keywords: HF radar, surface currents, sea breeze.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1532

1223 Integrated ACOR/IACOMV-R-SVM Algorithm

Authors: Hiba Basim Alwan, Ku Ruhana Ku-Mahamud

Abstract:

A direction for ACO is to optimize continuous and mixed (discrete and continuous) variables in solving problems with various types of data. Support Vector Machine (SVM), which originates from the statistical approach, is a present day classification technique. The main problems of SVM are selecting feature subset and tuning the parameters. Discretizing the continuous value of the parameters is the most common approach in tuning SVM parameters. This process will result in loss of information which affects the classification accuracy. This paper presents two algorithms that can simultaneously tune SVM parameters and select the feature subset. The first algorithm, ACO_R-SVM, will tune SVM parameters, while the second IACO_MV-R-SVM algorithm will simultaneously tune SVM parameters and select the feature subset. Three benchmark UCI datasets were used in the experiments to validate the performance of the proposed algorithms. The results show that the proposed algorithms have good performances as compared to other approaches.

Keywords: Continuous ant colony optimization, incremental continuous ant colony, simultaneous optimization, support vector machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 861

1222 Analysis of the EEG Signal for a Practical Biometric System

Authors: Muhammad Kamil Abdullah, Khazaimatol S Subari, Justin Leo Cheang Loong, Nurul Nadia Ahmad

Abstract:

This paper discusses the effectiveness of the EEG signal for human identification using four or less of channels of two different types of EEG recordings. Studies have shown that the EEG signal has biometric potential because signal varies from person to person and impossible to replicate and steal. Data were collected from 10 male subjects while resting with eyes open and eyes closed in 5 separate sessions conducted over a course of two weeks. Features were extracted using the wavelet packet decomposition and analyzed to obtain the feature vectors. Subsequently, the neural networks algorithm was used to classify the feature vectors. Results show that, whether or not the subjects- eyes were open are insignificant for a 4– channel biometrics system with a classification rate of 81%. However, for a 2–channel system, the P4 channel should not be included if data is acquired with the subjects- eyes open. It was observed that for 2– channel system using only the C3 and C4 channels, a classification rate of 71% was achieved.

Keywords: Biometric, EEG, Wavelet Packet Decomposition, NeuralNetworks

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3012

1221 Optimal Multilayer Perceptron Structure For Classification of HIV Sub-Type Viruses

Authors: Zeyneb Kurt, Oguzhan Yavuz

Abstract:

The feature of HIV genome is in a wide range because of it is highly heterogeneous. Hence, the infection ability of the virus changes related with different chemokine receptors. From this point, R5 and X4 HIV viruses use CCR5 and CXCR5 coreceptors respectively while R5X4 viruses can utilize both coreceptors. Recently, in Bioinformatics, R5X4 viruses have been studied to classify by using the coreceptors of HIV genome. The aim of this study is to develop the optimal Multilayer Perceptron (MLP) for high classification accuracy of HIV sub-type viruses. To accomplish this purpose, the unit number in hidden layer was incremented one by one, from one to a particular number. The statistical data of R5X4, R5 and X4 viruses was preprocessed by the signal processing methods. Accessible residues of these virus sequences were extracted and modeled by Auto-Regressive Model (AR) due to the dimension of residues is large and different from each other. Finally the pre-processed dataset was used to evolve MLP with various number of hidden units to determine R5X4 viruses. Furthermore, ROC analysis was used to figure out the optimal MLP structure.

Keywords: Multilayer Perceptron, Auto-Regressive Model, HIV, ROC Analysis

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1427

1220 Fake Account Detection in Twitter Based on Minimum Weighted Feature set

Authors: Ahmed El Azab, Amira M. Idrees, Mahmoud A. Mahmoud, Hesham Hefny

Abstract:

Social networking sites such as Twitter and Facebook attracts over 500 million users across the world, for those users, their social life, even their practical life, has become interrelated. Their interaction with social networking has affected their life forever. Accordingly, social networking sites have become among the main channels that are responsible for vast dissemination of different kinds of information during real time events. This popularity in Social networking has led to different problems including the possibility of exposing incorrect information to their users through fake accounts which results to the spread of malicious content during life events. This situation can result to a huge damage in the real world to the society in general including citizens, business entities, and others. In this paper, we present a classification method for detecting the fake accounts on Twitter. The study determines the minimized set of the main factors that influence the detection of the fake accounts on Twitter, and then the determined factors are applied using different classification techniques. A comparison of the results of these techniques has been performed and the most accurate algorithm is selected according to the accuracy of the results. The study has been compared with different recent researches in the same area; this comparison has proved the accuracy of the proposed study. We claim that this study can be continuously applied on Twitter social network to automatically detect the fake accounts; moreover, the study can be applied on different social network sites such as Facebook with minor changes according to the nature of the social network which are discussed in this paper.

Keywords: Fake accounts detection, classification algorithms, twitter accounts analysis, features based techniques.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5818

1219 Evaluating some Feature Selection Methods for an Improved SVM Classifier

Authors: Daniel Morariu, Lucian N. Vintan, Volker Tresp

Abstract:

Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of features selection methods to reduce the dimensionality of the document-representation vector. Four feature selection methods are evaluated: Random Selection, Information Gain (IG), Support Vector Machine (called SVM_FS) and Genetic Algorithm with SVM (GA_FS). We showed that the best results were obtained with SVM_FS and GA_FS methods for a relatively small dimension of the features vector comparative with the IG method that involves longer vectors, for quite similar classification accuracies. Also we present a novel method to better correlate SVM kernel-s parameters (Polynomial or Gaussian kernel).

Keywords: Features selection, learning with kernels, support vector machine, genetic algorithms and classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1522

1218 Real-Time Testing of Steel Strip Welds based on Bayesian Decision Theory

Authors: Julio Molleda, Daniel F. García, Juan C. Granda, Francisco J. Suárez

Abstract:

One of the main trouble in a steel strip manufacturing line is the breakage of whatever weld carried out between steel coils, that are used to produce the continuous strip to be processed. A weld breakage results in a several hours stop of the manufacturing line. In this process the damages caused by the breakage must be repaired. After the reparation and in order to go on with the production it will be necessary a restarting process of the line. For minimizing this problem, a human operator must inspect visually and manually each weld in order to avoid its breakage during the manufacturing process. The work presented in this paper is based on the Bayesian decision theory and it presents an approach to detect, on real-time, steel strip defective welds. This approach is based on quantifying the tradeoffs between various classification decisions using probability and the costs that accompany such decisions.

Keywords: Classification, Pattern Recognition, ProbabilisticReasoning, Statistical Data Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1395

1217 A Study on Finding Similar Document with Multiple Categories

Authors: R. Saraçoğlu, N. Allahverdi

Abstract:

Searching similar documents and document management subjects have important place in text mining. One of the most important parts of similar document research studies is the process of classifying or clustering the documents. In this study, a similar document search approach that includes discussion of out the case of belonging to multiple categories (multiple categories problem) has been carried. The proposed method that based on Fuzzy Similarity Classification (FSC) has been compared with Rocchio algorithm and naive Bayes method which are widely used in text mining. Empirical results show that the proposed method is quite successful and can be applied effectively. For the second stage, multiple categories vector method based on information of categories regarding to frequency of being seen together has been used. Empirical results show that achievement is increased almost two times, when proposed method is compared with classical approach.

Keywords: Document similarity, Fuzzy classification, Multiple categories, Text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1690

1216 Inter-frame Collusion Attack in SS-N Video Watermarking System

Authors: Yaser Mohammad Taheri, Alireza Zolghadr–asli, Mehran Yazdi

Abstract:

Video watermarking is usually considered as watermarking of a set of still images. In frame-by-frame watermarking approach, each video frame is seen as a single watermarked image, so collusion attack is more critical in video watermarking. If the same or redundant watermark is used for embedding in every frame of video, the watermark can be estimated and then removed by watermark estimate remodolulation (WER) attack. Also if uncorrelated watermarks are used for every frame, these watermarks can be washed out with frame temporal filtering (FTF). Switching watermark system or so-called SS-N system has better performance against WER and FTF attacks. In this system, for each frame, the watermark is randomly picked up from a finite pool of watermark patterns. At first SS-N system will be surveyed and then a new collusion attack for SS-N system will be proposed using a new algorithm for separating video frame based on watermark pattern. So N sets will be built in which every set contains frames carrying the same watermark. After that, using WER attack in every set, N different watermark patterns will be estimated and removed later.

Keywords: Watermark estimation remodulation (WER), Frame Temporal Averaging (FTF), switching watermark system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1481

1215 Proposed Geometric Printed Patch Shapes for Microstrip Ultra-Wideband Antennas

Authors: Rashid A. Fayadh, F. Malek, Hilal A. Fadhil, Norshafinash Saudin

Abstract:

In this paper, a design of ultra wideband (UWB) printed microstrip antennas that fed by microstrip transmission line were presented and printed on a substrate Taconic TLY-5 material with relative dielectric constant of 2.2. The proposed antennas were designed to cover the frequency range of 3.5 to 12 GHz. The antennas of printed patch shapes are rectangular, triangle/rectangular, hexagonal, and circular with the same dimensions of feeder and ground plane. The proposed antennas were simulated using a package of CST microwave studio in the 2 to 12 GHz operating frequency range. Simulation results and comparison for return loss (S₁₁), radiation patterns, and voltage standing wave ratio (VSWR) were presented and discussed over the UWB frequency.

Keywords: Microstrip patch antenna, ultra-wideband frequency, wireless communication systems, return loss and radiation patterns.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2654

1214 Participatory Patterns of Community in Water and Waste Management: A Case Study of Municipality in Amphawa District, Samut Songkram Province

Authors: Srisuwan Kasemsawat

Abstract:

This is a survey research using quantitative and qualitative methodology. There were three objectives: 1) To study participatory level of community in water and waste environment management. 2) To study the affecting factors for community participation in water and waste environment management in Ampawa District, Samut Songkram Province. 3) To search for the participatory patterns in water and waste management. The population sample for the quantitative research was 1,364 people living in Ampawa District. The methodology was simple random sampling. Research instrument was a questionnaire and the qualitative research used purposive sampling in 6 Sub Districts which are Ta Ka, Suanluang, Bangkae, Muangmai, Kwae-om, and Bangnanglee Sub District Administration Organization. Total population is 63. For data analysis, the study used content analysis from quantitative research to synthesize and build question frame from the content for interview and conducting focus group interview. The study found that the community participatory in the issue of level in water and waste management are moderate of planning, operation, and evaluation. The issue of being beneficial is at low level. Therefore, the overall participatory level of community in water and waste environment management is at a medium level. The factors affecting the participatory of community in water and waste management are age, the period dwelling in the community and membership in which the mean difference is statistic significant at 0.05 in area of operation, being beneficial, and evaluation. For patterns of community participation, there is the correlation with water and waste management in 4 concerns which are 1) Participation in planning 2) Participation in operation 3) Participation in being beneficial both directly and indirectly benefited 4) Participation in evaluation and monitoring. The recommendation from this study is the need to create conscious awareness in order to increase participation level of people by organizing activities that promote participation with volunteer spirit. Government should open opportunities for people to participate in sharing ideas and create the culture of living together with equality which would build more concrete participation.

Keywords: Participation, Participatory Patterns, Water and Waste Management, Environmental Management.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2355

1213 Automatic Classification of the Stand-to-Sit Phase in the TUG Test Using Machine Learning

Authors: Y. A. Adla, R. Soubra, M. Kasab, M. O. Diab, A. Chkeir

Abstract:

Over the past several years, researchers have shown a great interest in assessing the mobility of elderly people to measure their functional status. Usually, such an assessment is done by conducting tests that require the subject to walk a certain distance, turn around, and finally sit back down. Consequently, this study aims to provide an at home monitoring system to assess the patient’s status continuously. Thus, we proposed a technique to automatically detect when a subject sits down while walking at home. In this study, we utilized a Doppler radar system to capture the motion of the subjects. More than 20 features were extracted from the radar signals out of which 11 were chosen based on their Intraclass Correlation Coefficient (ICC > 0.75). Accordingly, the sequential floating forward selection wrapper was applied to further narrow down the final feature vector. Finally, five features were introduced to the Linear Discriminant Analysis classifier and an accuracy of 93.75% was achieved as well as a precision and recall of 95% and 90% respectively.

Keywords: Doppler radar system, stand-to-sit phase, TUG test, machine learning, classification

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 427

1212 Differential Protection for Power Transformer Using Wavelet Transform and PNN

Authors: S. Sendilkumar, B. L. Mathur, Joseph Henry

Abstract:

A new approach for protection of power transformer is presented using a time-frequency transform known as Wavelet transform. Different operating conditions such as inrush, Normal, load, External fault and internal fault current are sampled and processed to obtain wavelet coefficients. Different Operating conditions provide variation in wavelet coefficients. Features like energy and Standard deviation are calculated using Parsevals theorem. These features are used as inputs to PNN (Probabilistic neural network) for fault classification. The proposed algorithm provides more accurate results even in the presence of noise inputs and accurately identifies inrush and fault currents. Overall classification accuracy of the proposed method is found to be 96.45%. Simulation of the fault (with and without noise) was done using MATLAB AND SIMULINK software taking 2 cycles of data window (40 m sec) containing 800 samples. The algorithm was evaluated by using 10 % Gaussian white noise.

Keywords: Power Transformer, differential Protection, internalfault, inrush current, Wavelet Energy, Db9.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3116