Search results for: Emotions in tweets emotion detection in text

1930 Evolutionary Feature Selection for Text Documents using the SVM

Authors: Daniel I. Morariu, Lucian N. Vintan, Volker Tresp

Abstract:

Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step, the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of feature selection methods to reduce the dimensionality of the document-representation vector. In this paper, we present three feature selection methods: Information Gain, Support Vector Machine feature selection called (SVM_FS) and Genetic Algorithm with SVM (called GA_SVM). We show that the best results were obtained with GA_SVM method for a relatively small dimension of the feature vector.

Keywords: Feature Selection, Learning with Kernels, Support Vector Machine, Genetic Algorithm, and Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1660

1929 A Study of Touching Characters in Degraded Gurmukhi Text

Authors: M. K. Jindal, G. S. Lehal, R. K. Sharma

Abstract:

Character segmentation is an important preprocessing step for text recognition. In degraded documents, existence of touching characters decreases recognition rate drastically, for any optical character recognition (OCR) system. In this paper a study of touching Gurmukhi characters is carried out and these characters have been divided into various categories after a careful analysis.Structural properties of the Gurmukhi characters are used for defining the categories. New algorithms have been proposed to segment the touching characters in middle zone. These algorithms have shown a reasonable improvement in segmenting the touching characters in degraded Gurmukhi script. The algorithms proposed in this paper are applicable only to machine printed text.

Keywords: Character Segmentation, Middle Zone, Touching Characters.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1793

1928 Network Anomaly Detection using Soft Computing

Authors: Surat Srinoy, Werasak Kurutach, Witcha Chimphlee, Siriporn Chimphlee

Abstract:

One main drawback of intrusion detection system is the inability of detecting new attacks which do not have known signatures. In this paper we discuss an intrusion detection method that proposes independent component analysis (ICA) based feature selection heuristics and using rough fuzzy for clustering data. ICA is to separate these independent components (ICs) from the monitored variables. Rough set has to decrease the amount of data and get rid of redundancy and Fuzzy methods allow objects to belong to several clusters simultaneously, with different degrees of membership. Our approach allows us to recognize not only known attacks but also to detect activity that may be the result of a new, unknown attack. The experimental results on Knowledge Discovery and Data Mining- (KDDCup 1999) dataset.

Keywords: Network security, intrusion detection, rough set, ICA, anomaly detection, independent component analysis, rough fuzzy .

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1906

1927 Providing Medical Information in Braille: Research and Development of Automatic Braille Translation Program for Japanese “eBraille“

Authors: Aki Sugano, Mika Ohta, Mineko Ikegami, Kenji Miura, Sayo Tsukamoto, Akihiro Ichinose, Toshiko Ohshima, Eiichi Maeda, Masako Matsuura, Yutaka Takao

Abstract:

Along with the advances in medicine, providing medical information to individual patient is becoming more important. In Japan such information via Braille is hardly provided to blind and partially sighted people. Thus we are researching and developing a Web-based automatic translation program “eBraille" to translate Japanese text into Japanese Braille. First we analyzed the Japanese transcription rules to implement them on our program. We then added medical words to the dictionary of the program to improve its translation accuracy for medical text. Finally we examined the efficacy of statistical learning models (SLMs) for further increase of word segmentation accuracy in braille translation. As a result, eBraille had the highest translation accuracy in the comparison with other translation programs, improved the accuracy for medical text and is utilized to make hospital brochures in braille for outpatients and inpatients.

Keywords: Automatic Braille translation, Medical text, Partially sighted people.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1559

1926 Medical Image Edge Detection Based on Neuro-Fuzzy Approach

Authors: J. Mehena, M. C. Adhikary

Abstract:

Edge detection is one of the most important tasks in image processing. Medical image edge detection plays an important role in segmentation and object recognition of the human organs. It refers to the process of identifying and locating sharp discontinuities in medical images. In this paper, a neuro-fuzzy based approach is introduced to detect the edges for noisy medical images. This approach uses desired number of neuro-fuzzy subdetectors with a postprocessor for detecting the edges of medical images. The internal parameters of the approach are optimized by training pattern using artificial images. The performance of the approach is evaluated on different medical images and compared with popular edge detection algorithm. From the experimental results, it is clear that this approach has better performance than those of other competing edge detection algorithms for noisy medical images.

Keywords: Edge detection, neuro-fuzzy, image segmentation, artificial image, object recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1229

1925 Anomaly Based On Frequent-Outlier for Outbreak Detection in Public Health Surveillance

Authors: Zalizah Awang Long, Abdul Razak Hamdan, Azuraliza Abu Bakar

Abstract:

Public health surveillance system focuses on outbreak detection and data sources used. Variation or aberration in the frequency distribution of health data, compared to historical data is often used to detect outbreaks. It is important that new techniques be developed to improve the detection rate, thereby reducing wastage of resources in public health. Thus, the objective is to developed technique by applying frequent mining and outlier mining techniques in outbreak detection. 14 datasets from the UCI were tested on the proposed technique. The performance of the effectiveness for each technique was measured by t-test. The overall performance shows that DTK can be used to detect outlier within frequent dataset. In conclusion the outbreak detection technique using anomaly-based on frequent-outlier technique can be used to identify the outlier within frequent dataset.

Keywords: Outlier detection, frequent-outlier, outbreak, anomaly, surveillance, public health

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2234

1924 Edit Distance Algorithm to Increase Storage Efficiency of Javanese Corpora

Authors: Aji P. Wibawa, Andrew Nafalski, Neil Murray, Wayan F. Mahmudy

Abstract:

Since the one-to-one word translator does not have the facility to translate pragmatic aspects of Javanese, the parallel text alignment model described uses a phrase pair combination. The algorithm aligns the parallel text automatically from the beginning to the end of each sentence. Even though the results of the phrase pair combination outperform the previous algorithm, it is still inefficient. Recording all possible combinations consume more space in the database and time consuming. The original algorithm is modified by applying the edit distance coefficient to improve the data-storage efficiency. As a result, the data-storage consumption is 90% reduced as well as its learning period (42s).

Keywords: edit distance coefficient, Javanese, parallel text alignment, phrase pair combination

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1673

1923 Intelligent Network-Based Stepping Stone Detection Approach

Authors: Mohd Nizam Omar, Rahmat Budiarto

Abstract:

This research intends to introduce a new usage of Artificial Intelligent (AI) approaches in Stepping Stone Detection (SSD) fields of research. By using Self-Organizing Map (SOM) approaches as the engine, through the experiment, it is shown that SOM has the capability to detect the number of connection chains that involved in a stepping stones. Realizing that by counting the number of connection chain is one of the important steps of stepping stone detection and it become the research focus currently, this research has chosen SOM as the AI techniques because of its capabilities. Through the experiment, it is shown that SOM can detect the number of involved connection chains in Network-based Stepping Stone Detection (NSSD).

Keywords: Artificial Intelligent, Self-Organizing Map (SOM), Stepping Stone Detection, Tracing Intruder.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1442

1922 A Study on the Nostalgia Contents Analysis of Hometown Alumni in the Online Community

Authors: Heejin Yun, Juanjuan Zang

Abstract:

This study aims to analyze the text terms posted on an online community of people from the same hometown and to understand the topic and trend of nostalgia composed online. For this purpose, this study collected 144 writings which the natives of Yeongjong Island, Incheon, South-Korea have posted on an online community. And it analyzed association relations. As a result, online community texts means that just defining nostalgia as ‘a mind longing for hometown’ is not an enough explanation. Second, texts composed online have abstractness rather than persons’ individual stories. This study figured out the relationship that had the most critical and closest mutual association among the terms that constituted nostalgia through literature research and association rule concerning nostalgia. The result of this study has a characteristic that it summed up the core terms and emotions related to nostalgia.

Keywords: Nostalgia, cultural memory, data mining, online community.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 994

1921 Efficient Web-Learning Collision Detection Tool on Five-Axis Machine

Authors: Chia-Jung Chen, Rong-Shine Lin, Rong-Guey Chang

Abstract:

As networking has become popular, Web-learning tends to be a trend while designing a tool. Moreover, five-axis machining has been widely used in industry recently; however, it has potential axial table colliding problems. Thus this paper aims at proposing an efficient web-learning collision detection tool on five-axis machining. However, collision detection consumes heavy resource that few devices can support, thus this research uses a systematic approach based on web knowledge to detect collision. The methodologies include the kinematics analyses for five-axis motions, separating axis method for collision detection, and computer simulation for verification. The machine structure is modeled as STL format in CAD software. The input to the detection system is the g-code part program, which describes the tool motions to produce the part surface. This research produced a simulation program with C programming language and demonstrated a five-axis machining example with collision detection on web site. The system simulates the five-axis CNC motion for tool trajectory and detects for any collisions according to the input g-codes and also supports high-performance web service benefiting from C. The result shows that our method improves 4.5 time of computational efficiency, comparing to the conventional detection method.

Keywords: Collision detection, Five-axis machining, Separating axis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2143

1920 A Weighted Group EI Incorporating Role Information for More Representative Group EI Measurement

Authors: Siyu Wang, Anthony Ward

Abstract:

Emotional intelligence (EI) is a well-established personal characteristic. It has been viewed as a critical factor which can influence an individual's academic achievement, ability to work and potential to succeed. When working in a group, EI is fundamentally connected to the group members' interaction and ability to work as a team. The ability of a group member to intelligently perceive and understand own emotions (Intrapersonal EI), to intelligently perceive and understand other members' emotions (Interpersonal EI), and to intelligently perceive and understand emotions between different groups (Cross-boundary EI) can be considered as Group emotional intelligence (Group EI). In this research, a more representative Group EI measurement approach, which incorporates the information of the composition of a group and an individual’s role in that group, is proposed. To demonstrate the claim of being more representative Group EI measurement approach, this study adopts a multi-method research design, involving a combination of both qualitative and quantitative techniques to establish a metric of Group EI. From the results, it can be concluded that by introducing the weight coefficient of each group member on group work into the measurement of Group EI, Group EI will be more representative and more capable of understanding what happens during teamwork than previous approaches.

Keywords: Emotional intelligence, EI, Group EI, multi-method research, teamwork.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 577

1919 A Text Clustering System based on k-means Type Subspace Clustering and Ontology

Authors: Liping Jing, Michael K. Ng, Xinhua Yang, Joshua Zhexue Huang

Abstract:

This paper presents a text clustering system developed based on a k-means type subspace clustering algorithm to cluster large, high dimensional and sparse text data. In this algorithm, a new step is added in the k-means clustering process to automatically calculate the weights of keywords in each cluster so that the important words of a cluster can be identified by the weight values. For understanding and interpretation of clustering results, a few keywords that can best represent the semantic topic are extracted from each cluster. Two methods are used to extract the representative words. The candidate words are first selected according to their weights calculated by our new algorithm. Then, the candidates are fed to the WordNet to identify the set of noun words and consolidate the synonymy and hyponymy words. Experimental results have shown that the clustering algorithm is superior to the other subspace clustering algorithms, such as PROCLUS and HARP and kmeans type algorithm, e.g., Bisecting-KMeans. Furthermore, the word extraction method is effective in selection of the words to represent the topics of the clusters.

Keywords: Subspace Clustering, Text Mining, Feature Weighting, Cluster Interpretation, Ontology

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2412

1918 Opinion Mining and Sentiment Analysis on DEFT

Authors: Najiba Ouled Omar, Azza Harbaoui, Henda Ben Ghezala

Abstract:

Current research practices sentiment analysis with a focus on social networks, DEfi Fouille de Texte (DEFT) (Text Mining Challenge) evaluation campaign focuses on opinion mining and sentiment analysis on social networks, especially social network Twitter. It aims to confront the systems produced by several teams from public and private research laboratories. DEFT offers participants the opportunity to work on regularly renewed themes and proposes to work on opinion mining in several editions. The purpose of this article is to scrutinize and analyze the works relating to opinions mining and sentiment analysis in the Twitter social network realized by DEFT. It examines the tasks proposed by the organizers of the challenge and the methods used by the participants.

Keywords: Opinion mining, sentiment analysis, emotion, polarity, annotation, OSEE, figurative language, DEFT, Twitter, Tweet.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 761

1917 Lung Nodule Detection in CT Scans

Authors: M. Antonelli, G. Frosini, B. Lazzerini, F. Marcelloni

Abstract:

In this paper we describe a computer-aided diagnosis (CAD) system for automated detection of pulmonary nodules in computed-tomography (CT) images. After extracting the pulmonary parenchyma using a combination of image processing techniques, a region growing method is applied to detect nodules based on 3D geometric features. We applied the CAD system to CT scans collected in a screening program for lung cancer detection. Each scan consists of a sequence of about 300 slices stored in DICOM (Digital Imaging and Communications in Medicine) format. All malignant nodules were detected and a low false-positive detection rate was achieved.

Keywords: computer assisted diagnosis, medical imagesegmentation, shape recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1784

1916 Automatic Change Detection for High-Resolution Satellite Images of Urban and Suburban Areas

Authors: Antigoni Panagiotopoulou, Lemonia Ragia

Abstract:

High-resolution satellite images can provide detailed information about change detection on the earth. In the present work, QuickBird images of spatial resolution 60 cm/pixel and WorldView images of resolution 30 cm/pixel are utilized to perform automatic change detection in urban and suburban areas of Crete, Greece. There is a relative time difference of 13 years among the satellite images. Multiindex scene representation is applied on the images to classify the scene into buildings, vegetation, water and ground. Then, automatic change detection is made possible by pixel-per-pixel comparison of the classified multi-temporal images. The vegetation index and the water index which have been developed in this study prove effective. Furthermore, the proposed change detection approach not only indicates whether changes have taken place or not but also provides specific information relative to the types of changes. Experimentations with other different scenes in the future could help optimize the proposed spectral indices as well as the entire change detection methodology.

Keywords: Change detection, multiindex scene representation, spectral index, QuickBird, WorldView.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 431

1915 The Negative Effect of Traditional Loops Style on the Performance of Algorithms

Authors: Mahmoud Moh'd Mhashi

Abstract:

A new algorithm called Character-Comparison to Character-Access (CCCA) is developed to test the effect of both: 1) converting character-comparison and number-comparison into character-access and 2) the starting point of checking on the performance of the checking operation in string searching. An experiment is performed using both English text and DNA text with different sizes. The results are compared with five algorithms, namely, Naive, BM, Inf_Suf_Pref, Raita, and Cycle. With the CCCA algorithm, the results suggest that the evaluation criteria of the average number of total comparisons are improved up to 35%. Furthermore, the results suggest that the clock time required by the other algorithms is improved in range from 22.13% to 42.33% by the new CCCA algorithm.

Keywords: Pattern matching, string searching, charactercomparison, character-access, text type, and checking

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1227

1914 The Comparison Study of Harmonic Detection Methods for Shunt Active Power Filters

Authors: K-L. Areerak, K-N. Areerak

Abstract:

The paper deals with the comparison study of harmonic detection methods for a shunt active power filter. The %THD and the power factor value at the PCC point after compensation are considered for the comparison. There are three harmonic detection methods used in the paper that are synchronous reference frame method, synchronous detection method, and DQ axis with Fourier method. In addition, the ideal current source is used to represent the active power filter by assuming an infinitely fast controller action of the active power filter. The simulation results show that the DQ axis with Fourier method provides the minimum %THD after compensation compared with other methods. However, the power factor value at the PCC point after compensation is slightly lower than that of synchronous detection method.

Keywords: Harmonic detection, shunt active power filter, DQaxis with Fourier, power factor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3238

1913 A Psychophysiological Evaluation of an Effective Recognition Technique Using Interactive Dynamic Virtual Environments

Authors: Mohammadhossein Moghimi, Robert Stone, Pia Rotshtein

Abstract:

Recording psychological and physiological correlates of human performance within virtual environments and interpreting their impacts on human engagement, ‘immersion’ and related emotional or ‘effective’ states is both academically and technologically challenging. By exposing participants to an effective, real-time (game-like) virtual environment, designed and evaluated in an earlier study, a psychophysiological database containing the EEG, GSR and Heart Rate of 30 male and female gamers, exposed to 10 games, was constructed. Some 174 features were subsequently identified and extracted from a number of windows, with 28 different timing lengths (e.g. 2, 3, 5, etc. seconds). After reducing the number of features to 30, using a feature selection technique, K-Nearest Neighbour (KNN) and Support Vector Machine (SVM) methods were subsequently employed for the classification process. The classifiers categorised the psychophysiological database into four effective clusters (defined based on a 3-dimensional space – valence, arousal and dominance) and eight emotion labels (relaxed, content, happy, excited, angry, afraid, sad, and bored). The KNN and SVM classifiers achieved average cross-validation accuracies of 97.01% (±1.3%) and 92.84% (±3.67%), respectively. However, no significant differences were found in the classification process based on effective clusters or emotion labels.

Keywords: Virtual Reality, effective computing, effective VR, emotion-based effective physiological database.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 950

1912 A Background Subtraction Based Moving Object Detection around the Host Vehicle

Authors: Hyojin Lim, Cuong Nguyen Khac, Ho-Youl Jung

Abstract:

In this paper, we propose moving object detection method which is helpful for driver to safely take his/her car out of parking lot. When moving objects such as motorbikes, pedestrians, the other cars and some obstacles are detected at the rear-side of host vehicle, the proposed algorithm can provide to driver warning. We assume that the host vehicle is just before departure. Gaussian Mixture Model (GMM) based background subtraction is basically applied. Pre-processing such as smoothing and post-processing as morphological filtering are added. We examine “which color space has better performance for detection of moving objects?” Three color spaces including RGB, YCbCr, and Y are applied and compared, in terms of detection rate. Through simulation, we prove that RGB space is more suitable for moving object detection based on background subtraction.

Keywords: Gaussian mixture model, background subtraction, Moving object detection, color space, morphological filtering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2515

1911 Light Tracking Fault Tolerant Control System

Authors: J. Florescu, T. Vinay, L. Wang

Abstract:

A fault detection and identification (FDI) technique is presented to create a fault tolerant control system (FTC). The fault detection is achieved by monitoring the position of the light source using an array of light sensors. When a decision is made about the presence of a fault an identification process is initiated to locate the faulty component and reconfigure the controller signals. The signals provided by the sensors are predictable; therefore the existence of a fault is easily identified. Identification of the faulty sensor is based on the dynamics of the frame. The technique is not restricted to a particular type of controllers and the results show consistency.

Keywords: algorithm, detection and diagnostic, fault-tolerantcontrol, fault detection and identification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1371

1910 Comparative Study of QRS Complex Detection in ECG

Authors: Ibtihel Nouira, Asma Ben Abdallah, Ibtissem Kouaja, Mohamed Hèdi Bedoui

Abstract:

The processing of the electrocardiogram (ECG) signal consists essentially in the detection of the characteristic points of signal which are an important tool in the diagnosis of heart diseases. The most suitable are the detection of R waves. In this paper, we present various mathematical tools used for filtering ECG using digital filtering and Discreet Wavelet Transform (DWT) filtering. In addition, this paper will include two main R peak detection methods by applying a windowing process: The first method is based on calculations derived, the second is a time-frequency method based on Dyadic Wavelet Transform DyWT.

Keywords: Derived calculation methods, Electrocardiogram, R peaks, Wavelet Transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2517

1909 Design of a Neural Networks Classifier for Face Detection

Authors: F. Smach, M. Atri, J. Mitéran, M. Abid

Abstract:

Face detection and recognition has many applications in a variety of fields such as security system, videoconferencing and identification. Face classification is currently implemented in software. A hardware implementation allows real-time processing, but has higher cost and time to-market. The objective of this work is to implement a classifier based on neural networks MLP (Multi-layer Perceptron) for face detection. The MLP is used to classify face and non-face patterns. The systm is described using C language on a P4 (2.4 Ghz) to extract weight values. Then a Hardware implementation is achieved using VHDL based Methodology. We target Xilinx FPGA as the implementation support.

Keywords: Classification, Face Detection, FPGA Hardware description, MLP.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2236

1908 The Laser Line Detection for Autonomous Mapping Based on Color Segmentation

Authors: Pavel Chmelar, Martin Dobrovolny

Abstract:

Laser projection or laser footprint detection is today widely used in many fields of robotics, measurement or electronics. The system accuracy strictly depends on precise laser footprint detection on target objects. This article deals with the laser line detection based on the RGB segmentation and the component labeling. As a measurement device was used the developed optical rangefinder. The optical rangefinder is equipped with vertical sweeping of the laser beam and high quality camera. This system was developed mainly for automatic exploration and mapping of unknown spaces. In the first section is presented a new detection algorithm. In the second section are presented measurements results. The measurements were performed in variable light conditions in interiors. The last part of the article present achieved results and their differences between day and night measurements.

Keywords: Automatic mapping, color segmentation, component labeling, distance measurement, laser line detection, vector map.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3511

1907 A Highly Sensitive Dip Strip for Detection of Phosphate in Water

Authors: Hojat Heidari-Bafroui, Amer Charbaji, Constantine Anagnostopoulos, Mohammad Faghri

Abstract:

Phosphorus is an essential nutrient for plant life which is most frequently found as phosphate in water. Once phosphate is found in abundance in surface water, a series of adverse effects on an ecosystem can be initiated. Therefore, a portable and reliable method is needed to monitor the phosphate concentrations in the field. In this paper, an inexpensive dip strip device with the ascorbic acid/antimony reagent dried on blotting paper along with wet chemistry is developed for the detection of low concentrations of phosphate in water. Ammonium molybdate and sulfuric acid are separately stored in liquid form so as to improve significantly the lifetime of the device and enhance the reproducibility of the device’s performance. The limit of detection and quantification for the optimized device are 0.134 ppm and 0.472 ppm for phosphate in water, respectively. The device’s shelf life, storage conditions, and limit of detection are superior to what has been previously reported for the paper-based phosphate detection devices.

Keywords: Phosphate detection, paper-based device, molybdenum blue method, colorimetric assay.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 433

1906 Adaptive Nonparametric Approach for Guaranteed Real-Time Detection of Targeted Signals in Multichannel Monitoring Systems

Authors: Andrey V. Timofeev

Abstract:

An adaptive nonparametric method is proposed for stable real-time detection of seismoacoustic sources in multichannel C-OTDR systems with a significant number of channels. This method guarantees given upper boundaries for probabilities of Type I and Type II errors. Properties of the proposed method are rigorously proved. The results of practical applications of the proposed method in a real C-OTDR-system are presented in this report.

Keywords: Adaptive detection, change point, interval estimation, guaranteed detection, multichannel monitoring systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1837

1905 A Text Mining Technique Using Association Rules Extraction

Authors: Hany Mahgoub, Dietmar Rösner, Nabil Ismail, Fawzy Torkey

Abstract:

This paper describes text mining technique for automatically extracting association rules from collections of textual documents. The technique called, Extracting Association Rules from Text (EART). It depends on keyword features for discover association rules amongst keywords labeling the documents. In this work, the EART system ignores the order in which the words occur, but instead focusing on the words and their statistical distributions in documents. The main contributions of the technique are that it integrates XML technology with Information Retrieval scheme (TFIDF) (for keyword/feature selection that automatically selects the most discriminative keywords for use in association rules generation) and use Data Mining technique for association rules discovery. It consists of three phases: Text Preprocessing phase (transformation, filtration, stemming and indexing of the documents), Association Rule Mining (ARM) phase (applying our designed algorithm for Generating Association Rules based on Weighting scheme GARW) and Visualization phase (visualization of results). Experiments applied on WebPages news documents related to the outbreak of the bird flu disease. The extracted association rules contain important features and describe the informative news included in the documents collection. The performance of the EART system compared with another system that uses the Apriori algorithm throughout the execution time and evaluating extracted association rules.

Keywords: Text mining, data mining, association rule mining

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4378

1904 Robust Probabilistic Online Change Detection Algorithm Based On the Continuous Wavelet Transform

Authors: Sergei Yendiyarov, Sergei Petrushenko

Abstract:

In this article we present a change point detection algorithm based on the continuous wavelet transform. At the beginning of the article we describe a necessary transformation of a signal which has to be made for the purpose of change detection. Then case study related to iron ore sinter production which can be solved using our proposed technique is discussed. After that we describe a probabilistic algorithm which can be used to find changes using our transformed signal. It is shown that our algorithm works well with the presence of some noise and abnormal random bursts.

Keywords: Change detection, sinter production, wavelet transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1420

1903 The Guaranteed Detection of the Seismoacoustic Emission Source in the C-OTDR Systems

Authors: Andrey V. Timofeev

Abstract:

A method is proposed for stable detection of seismoacoustic sources in C-OTDR systems that guarantee given upper bounds for probabilities of type I and type II errors. Properties of the proposed method are rigorously proved. The results of practical applications of the proposed method in a real C-OTDRsystem are presented.

Keywords: Guaranteed detection, C-OTDR systems, change point, interval estimation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1946

1902 Text Mining Technique for Data Mining Application

Authors: M. Govindarajan

Abstract:

Text Mining is around applying knowledge discovery techniques to unstructured text is termed knowledge discovery in text (KDT), or Text data mining or Text Mining. In decision tree approach is most useful in classification problem. With this technique, tree is constructed to model the classification process. There are two basic steps in the technique: building the tree and applying the tree to the database. This paper describes a proposed C5.0 classifier that performs rulesets, cross validation and boosting for original C5.0 in order to reduce the optimization of error ratio. The feasibility and the benefits of the proposed approach are demonstrated by means of medial data set like hypothyroid. It is shown that, the performance of a classifier on the training cases from which it was constructed gives a poor estimate by sampling or using a separate test file, either way, the classifier is evaluated on cases that were not used to build and evaluate the classifier are both are large. If the cases in hypothyroid.data and hypothyroid.test were to be shuffled and divided into a new 2772 case training set and a 1000 case test set, C5.0 might construct a different classifier with a lower or higher error rate on the test cases. An important feature of see5 is its ability to classifiers called rulesets. The ruleset has an error rate 0.5 % on the test cases. The standard errors of the means provide an estimate of the variability of results. One way to get a more reliable estimate of predictive is by f-fold –cross- validation. The error rate of a classifier produced from all the cases is estimated as the ratio of the total number of errors on the hold-out cases to the total number of cases. The Boost option with x trials instructs See5 to construct up to x classifiers in this manner. Trials over numerous datasets, large and small, show that on average 10-classifier boosting reduces the error rate for test cases by about 25%.

Keywords: C5.0, Error Ratio, text mining, training data, test data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2434

1901 A Novel Impulse Detector for Filtering of Highly Corrupted Images

Authors: Umesh Ghanekar

Abstract:

As the performance of the filtering system depends upon the accuracy of the noise detection scheme, in this paper, we present a new scheme for impulse noise detection based on two levels of decision. In this scheme in the first stage we coarsely identify the corrupted pixels and in the second stage we finally decide whether the pixel under consideration is really corrupt or not. The efficacy of the proposed filter has been confirmed by extensive simulations.

Keywords: Impulse detection, noise removal, image filtering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1369