Search results for: text mining analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 9315

Search results for: text mining analysis

9195 Analysis of Textual Data Based On Multiple 2-Class Classification Models

Authors: Shigeaki Sakurai, Ryohei Orihara

Abstract:

This paper proposes a new method for analyzing textual data. The method deals with items of textual data, where each item is described based on various viewpoints. The method acquires 2- class classification models of the viewpoints by applying an inductive learning method to items with multiple viewpoints. The method infers whether the viewpoints are assigned to the new items or not by using the models. The method extracts expressions from the new items classified into the viewpoints and extracts characteristic expressions corresponding to the viewpoints by comparing the frequency of expressions among the viewpoints. This paper also applies the method to questionnaire data given by guests at a hotel and verifies its effect through numerical experiments.

Keywords: Text mining, Multiple viewpoints, Differential analysis, Questionnaire data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1253
9194 Modelling of Powered Roof Supports Work

Authors: Marcin Michalak

Abstract:

Due to the increasing efforts on saving our natural environment a change in the structure of energy resources can be observed - an increasing fraction of a renewable energy sources. In many countries traditional underground coal mining loses its significance but there are still countries, like Poland or Germany, in which the coal based technologies have the greatest fraction in a total energy production. This necessitates to make an effort to limit the costs and negative effects of underground coal mining. The longwall complex is as essential part of the underground coal mining. The safety and the effectiveness of the work is strongly dependent of the diagnostic state of powered roof supports. The building of a useful and reliable diagnostic system requires a lot of data. As the acquisition of a data of any possible operating conditions it is important to have a possibility to generate a demanded artificial working characteristics. In this paper a new approach of modelling a leg pressure in the single unit of powered roof support. The model is a result of the analysis of a typical working cycles.

Keywords: Machine modelling, underground mining, coal mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1872
9193 An Efficient Data Mining Approach on Compressed Transactions

Authors: Jia-Yu Dai, Don-Lin Yang, Jungpin Wu, Ming-Chuan Hung

Abstract:

In an era of knowledge explosion, the growth of data increases rapidly day by day. Since data storage is a limited resource, how to reduce the data space in the process becomes a challenge issue. Data compression provides a good solution which can lower the required space. Data mining has many useful applications in recent years because it can help users discover interesting knowledge in large databases. However, existing compression algorithms are not appropriate for data mining. In [1, 2], two different approaches were proposed to compress databases and then perform the data mining process. However, they all lack the ability to decompress the data to their original state and improve the data mining performance. In this research a new approach called Mining Merged Transactions with the Quantification Table (M2TQT) was proposed to solve these problems. M2TQT uses the relationship of transactions to merge related transactions and builds a quantification table to prune the candidate itemsets which are impossible to become frequent in order to improve the performance of mining association rules. The experiments show that M2TQT performs better than existing approaches.

Keywords: Association rule, data mining, merged transaction, quantification table.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1920
9192 Explorative Data Mining of Constructivist Learning Experiences and Activities with Multiple Dimensions

Authors: Patrick Wessa, Bart Baesens

Abstract:

This paper discusses the use of explorative data mining tools that allow the educator to explore new relationships between reported learning experiences and actual activities, even if there are multiple dimensions with a large number of measured items. The underlying technology is based on the so-called Compendium Platform for Reproducible Computing (http://www.freestatistics.org) which was built on top the computational R Framework (http://www.wessa.net).

Keywords: Reproducible computing, data mining, explorative data analysis, compendium technology, computer assisted education

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1210
9191 Value Analysis of Islamic Banking and Conventional Banking to Measure Value Co-creation

Authors: Amna Javed, Hisashi Masuda, Youji Kohda

Abstract:

This study examines the value analysis in Islamic and conventional banking services in Pakistan. Many scholars have focused on co-creation of values in services but mainly economic values not non-economic. As Islamic banking is based on Islamic principles that are more concerned with non-economic values (well-being, partnership, fairness, trust worthy, and justice) than economic values as money in terms of interest.  This study is important to know the providers point of view about the co-created values, because, it may be more sustainable and appropriate for today’s unpredictable socio-economic environment. Data were collected from 4 banks (2 Islamic and 2 conventional banks). Text mining technique is applied for data analysis, and values with 100% occurrences in Islamic banking are chosen. The results reflect that Islamic banking is more centric towards non-economic values than economic values and it promotes team work and partnership concept by applying Islamic spirit and trust worthiness concept.

Keywords: Economic values, Islamic banking, Non-economic values, Value system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3197
9190 Moving Data Mining Tools toward a Business Intelligence System

Authors: Nittaya Kerdprasop, Kittisak Kerdprasop

Abstract:

Data mining (DM) is the process of finding and extracting frequent patterns that can describe the data, or predict unknown or future values. These goals are achieved by using various learning algorithms. Each algorithm may produce a mining result completely different from the others. Some algorithms may find millions of patterns. It is thus the difficult job for data analysts to select appropriate models and interpret the discovered knowledge. In this paper, we describe a framework of an intelligent and complete data mining system called SUT-Miner. Our system is comprised of a full complement of major DM algorithms, pre-DM and post-DM functionalities. It is the post-DM packages that ease the DM deployment for business intelligence applications.

Keywords: Business intelligence, data mining, functionalprogramming, intelligent system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1683
9189 Parallel Text Processing: Alignment of Indonesian to Javanese Language

Authors: Aji P. Wibawa, Andrew Nafalski, Neil Murray, Wayan F. Mahmudy

Abstract:

Parallel text alignment is proposed as a way of aligning bahasa Indonesia to words in Javanese. Since the one-to-one word translator does not have the facility to translate pragmatic aspects of Javanese, the parallel text alignment model described uses a phrase pair combination. The algorithm aligns the parallel text automatically from the beginning to the end of each sentence. Even though the results of the phrase pair combination outperform the previous algorithm, it is still inefficient. Recording all possible combinations consume more space in the database and time consuming. The original algorithm is modified by applying the edit distance coefficient to improve the data-storage efficiency. As a result, the data-storage consumption is 90% reduced as well as its learning period (42s).

Keywords: Parallel text alignment, phrase pair combination, edit distance coefficient, Javanese-Indonesian language.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2434
9188 AudioMine: Medical Data Mining in Heterogeneous Audiology Records

Authors: Shaun Cox, Michael Oakes, Stefan Wermter, Maurice Hawthorne

Abstract:

We report on the results of a pilot study in which a data-mining tool was developed for mining audiology records. The records were heterogeneous in that they contained numeric, category and textual data. The tools developed are designed to observe associations between any field in the records and any other field. The techniques employed were the statistical chi-squared test, and the use of self-organizing maps, an unsupervised neural learning approach.

Keywords: Audiology, data mining, chi-squared, self-organizing maps

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1615
9187 Association Rules Mining and NOSQL Oriented Document in Big Data

Authors: Sarra Senhadji, Imene Benzeguimi, Zohra Yagoub

Abstract:

Big Data represents the recent technology of manipulating voluminous and unstructured data sets over multiple sources. Therefore, NOSQL appears to handle the problem of unstructured data. Association rules mining is one of the popular techniques of data mining to extract hidden relationship from transactional databases. The algorithm for finding association dependencies is well-solved with Map Reduce. The goal of our work is to reduce the time of generating of frequent itemsets by using Map Reduce and NOSQL database oriented document. A comparative study is given to evaluate the performances of our algorithm with the classical algorithm Apriori.

Keywords: Apriori, Association rules mining, Big Data, data mining, Hadoop, Map Reduce, MongoDB, NoSQL.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 619
9186 Influence of Non-Structural Elements on Dynamic Response of Multi-Storey Rc Building to Mining Shock

Authors: Joanna M. Dulińska, Maria Fabijańska

Abstract:

In the paper the results of calculations of the dynamic response of a multi-storey reinforced concrete building to a strong mining shock originated from the main region of mining activity in Poland (i.e. the Legnica-Glogow Copper District) are presented. The representative time histories of accelerations registered in three directions were used as ground motion data in calculations of the dynamic response of the structure. Two variants of a numerical model were applied: the model including only structural elements of the building and the model including both structural and non-structural elements (i.e. partition walls and ventilation ducts made of brick). It turned out that non-structural elements of multi-storey RC buildings have a small impact of about 10 % on natural frequencies of these structures. It was also proved that the dynamic response of building to mining shock obtained in case of inclusion of all non-structural elements in the numerical model is about 20 % smaller than in case of consideration of structural elements only. The principal stresses obtained in calculations of dynamic response of multi-storey building to strong mining shock are situated on the level of about 30% of values obtained from static analysis (dead load).

Keywords: Dynamic characteristics of buildings, mining shocks, dynamic response of buildings, non-structural elements

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1827
9185 Performance Evaluation of an Online Text-Based Strategy Game

Authors: Nazleeni S. Haron, Mohd K. Zaime , Izzatdin A. Aziz, Mohd H. Hasan

Abstract:

Text-based game is supposed to be a low resource consumption application that delivers good performances when compared to graphical-intensive type of games. But, nowadays, some of the online text-based games are not offering performances that are acceptable to the users. Therefore, an online text-based game called Star_Quest has been developed in order to analyze its behavior under different performance measurements. Performance metrics such as throughput, scalability, response time and page loading time are captured to yield the performance of the game. The techniques in performing the load testing are also disclosed to exhibit the viability of our work. The comparative assessment between the results obtained and the accepted level of performances are conducted as to determine the performance level of the game. The study reveals that the developed game managed to meet all the performance objectives set forth.

Keywords: Online text-based games, performance evaluation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1560
9184 Analysis of Sequence Moves in Successful Chess Openings Using Data Mining with Association Rules

Authors: R.M.Rani

Abstract:

Chess is one of the indoor games, which improves the level of human confidence, concentration, planning skills and knowledge. The main objective of this paper is to help the chess players to improve their chess openings using data mining techniques. Budding Chess Players usually do practices by analyzing various existing openings. When they analyze and correlate thousands of openings it becomes tedious and complex for them. The work done in this paper is to analyze the best lines of Blackmar- Diemer Gambit(BDG) which opens with White D4... using data mining analysis. It is carried out on the collection of winning games by applying association rules. The first step of this analysis is assigning variables to each different sequence moves. In the second step, the sequence association rules were generated to calculate support and confidence factor which help us to find the best subsequence chess moves that may lead to winning position.

Keywords: Blackmar-Diemer Gambit(BDG), Confidence, sequence Association Rules, Support.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3025
9183 Social and Economic Effects of Mining Industry Restructuring in Romania -Case Studies

Authors: Andra Costache, Gica Pehoiu

Abstract:

As in other countries from Central and Eastern Europe, the economic restructuring occurred in the last decade of the twentieth century affected the mining industry in Romania, an oversize and heavily subsidized sector before 1989. After more than a decade since the beginning of mining restructuring, an evaluation of current social implications of the process it is required, together with an efficiency analysis of the adaptation mechanisms developed at governmental level. This article aims to provide an insight into these issues through case studies conducted in the most important coal basin of Romania, Petroşani Depression.

Keywords: case studies, government programs, miningrestructuring, social effects.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2617
9182 W3-Miner: Mining Weighted Frequent Subtree Patterns in a Collection of Trees

Authors: R. AliMohammadzadeh, M. Haghir Chehreghani, A. Zarnani, M. Rahgozar

Abstract:

Mining frequent tree patterns have many useful applications in XML mining, bioinformatics, network routing, etc. Most of the frequent subtree mining algorithms (i.e. FREQT, TreeMiner and CMTreeMiner) use anti-monotone property in the phase of candidate subtree generation. However, none of these algorithms have verified the correctness of this property in tree structured data. In this research it is shown that anti-monotonicity does not generally hold, when using weighed support in tree pattern discovery. As a result, tree mining algorithms that are based on this property would probably miss some of the valid frequent subtree patterns in a collection of trees. In this paper, we investigate the correctness of anti-monotone property for the problem of weighted frequent subtree mining. In addition we propose W3-Miner, a new algorithm for full extraction of frequent subtrees. The experimental results confirm that W3-Miner finds some frequent subtrees that the previously proposed algorithms are not able to discover.

Keywords: Semi-Structured Data Mining, Anti-Monotone Property, Trees.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1333
9181 A Study of Touching Characters in Degraded Gurmukhi Text

Authors: M. K. Jindal, G. S. Lehal, R. K. Sharma

Abstract:

Character segmentation is an important preprocessing step for text recognition. In degraded documents, existence of touching characters decreases recognition rate drastically, for any optical character recognition (OCR) system. In this paper a study of touching Gurmukhi characters is carried out and these characters have been divided into various categories after a careful analysis.Structural properties of the Gurmukhi characters are used for defining the categories. New algorithms have been proposed to segment the touching characters in middle zone. These algorithms have shown a reasonable improvement in segmenting the touching characters in degraded Gurmukhi script. The algorithms proposed in this paper are applicable only to machine printed text.

Keywords: Character Segmentation, Middle Zone, Touching Characters.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1791
9180 Mine Production Index (MPI): New Method to Evaluate Effectiveness of Mining Machinery

Authors: Amol Lanke, Hadi Hoseinie, Behzad Ghodrati

Abstract:

OEE has been used in many industries as measure of performance. However due to limitations of original OEE, it has been modified by various researchers. OEE for mining application is special version of classic equation, carries these limitation over. In this paper it has been aimed to modify the OEE for mining application by introducing the weights to the elements of it and termed as Mine Production index (MPi). As a special application of new index MPishovel has been developed by authors. This can be used for evaluating the shovel effectiveness. Based on analysis, utilization followed by performance and availability were ranked in this order. To check the applicability of this index, a case study was done on four electrical and one hydraulic shovel in a Swedish mine. The results shows that MPishovel can evaluate production effectiveness of shovels and can determine effectiveness values in optimistic view compared to OEE. MPi with calculation not only give the effectiveness but also can predict which elements should be focused for improving the productivity.

Keywords: Mining, Overall equipment efficiency (OEE), Mine Production index, Shovels.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4700
9179 A Comparative Study of Page Ranking Algorithms for Information Retrieval

Authors: Ashutosh Kumar Singh, Ravi Kumar P

Abstract:

This paper gives an introduction to Web mining, then describes Web Structure mining in detail, and explores the data structure used by the Web. This paper also explores different Page Rank algorithms and compare those algorithms used for Information Retrieval. In Web Mining, the basics of Web mining and the Web mining categories are explained. Different Page Rank based algorithms like PageRank (PR), WPR (Weighted PageRank), HITS (Hyperlink-Induced Topic Search), DistanceRank and DirichletRank algorithms are discussed and compared. PageRanks are calculated for PageRank and Weighted PageRank algorithms for a given hyperlink structure. Simulation Program is developed for PageRank algorithm because PageRank is the only ranking algorithm implemented in the search engine (Google). The outputs are shown in a table and chart format.

Keywords: Web Mining, Web Structure, Web Graph, LinkAnalysis, PageRank, Weighted PageRank, HITS, DistanceRank, DirichletRank,

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2769
9178 Video Mining for Creative Rendering

Authors: Mei Chen

Abstract:

More and more home videos are being generated with the ever growing popularity of digital cameras and camcorders. For many home videos, a photo rendering, whether capturing a moment or a scene within the video, provides a complementary representation to the video. In this paper, a video motion mining framework for creative rendering is presented. The user-s capture intent is derived by analyzing video motions, and respective metadata is generated for each capture type. The metadata can be used in a number of applications, such as creating video thumbnail, generating panorama posters, and producing slideshows of video.

Keywords: Motion mining, semantic abstraction, video mining, video representation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1603
9177 Interactive, Topic-Oriented Search Support by a Centroid-Based Text Categorisation

Authors: Mario Kubek, Herwig Unger

Abstract:

Centroid terms are single words that semantically and topically characterise text documents and so may serve as their very compact representation in automatic text processing. In the present paper, centroids are used to measure the relevance of text documents with respect to a given search query. Thus, a new graphbased paradigm for searching texts in large corpora is proposed and evaluated against keyword-based methods. The first, promising experimental results demonstrate the usefulness of the centroid-based search procedure. It is shown that especially the routing of search queries in interactive and decentralised search systems can be greatly improved by applying this approach. A detailed discussion on further fields of its application completes this contribution.

Keywords: Search algorithm, centroid, query, keyword, cooccurrence, categorisation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 578
9176 Application of Neural Networks in Financial Data Mining

Authors: Defu Zhang, Qingshan Jiang, Xin Li

Abstract:

This paper deals with the application of a well-known neural network technique, multilayer back-propagation (BP) neural network, in financial data mining. A modified neural network forecasting model is presented, and an intelligent mining system is developed. The system can forecast the buying and selling signs according to the prediction of future trends to stock market, and provide decision-making for stock investors. The simulation result of seven years to Shanghai Composite Index shows that the return achieved by this mining system is about three times as large as that achieved by the buy and hold strategy, so it is advantageous to apply neural networks to forecast financial time series, the different investors could benefit from it.

Keywords: Data mining, neural network, stock forecasting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3539
9175 Navigation Patterns Mining Approach based on Expectation Maximization Algorithm

Authors: Norwati Mustapha, Manijeh Jalali, Abolghasem Bozorgniya, Mehrdad Jalali

Abstract:

Web usage mining algorithms have been widely utilized for modeling user web navigation behavior. In this study we advance a model for mining of user-s navigation pattern. The model makes user model based on expectation-maximization (EM) algorithm.An EM algorithm is used in statistics for finding maximum likelihood estimates of parameters in probabilistic models, where the model depends on unobserved latent variables. The experimental results represent that by decreasing the number of clusters, the log likelihood converges toward lower values and probability of the largest cluster will be decreased while the number of the clusters increases in each treatment.

Keywords: Web Usage Mining, Expectation maximization, navigation pattern mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1532
9174 Connectionist Approach to Generic Text Summarization

Authors: Rajesh S.Prasad, U. V. Kulkarni, Jayashree.R.Prasad

Abstract:

As the enormous amount of on-line text grows on the World-Wide Web, the development of methods for automatically summarizing this text becomes more important. The primary goal of this research is to create an efficient tool that is able to summarize large documents automatically. We propose an Evolving connectionist System that is adaptive, incremental learning and knowledge representation system that evolves its structure and functionality. In this paper, we propose a novel approach for Part of Speech disambiguation using a recurrent neural network, a paradigm capable of dealing with sequential data. We observed that connectionist approach to text summarization has a natural way of learning grammatical structures through experience. Experimental results show that our approach achieves acceptable performance.

Keywords: Artificial Neural Networks (ANN); Computational Intelligence (CI); Connectionist Text Summarizer ECTS (ECTS); Evolving Connectionist systems; Evolving systems; Fuzzy systems (FS); Part of Speech (POS) disambiguation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1543
9173 Prospects, Problems of Marketing Research and Data Mining in Turkey

Authors: Sema Kurtuluş, Kemal Kurtuluş

Abstract:

The objective of this paper is to review and assess the methodological issues and problems in marketing research, data and knowledge mining in Turkey. As a summary, academic marketing research publications in Turkey have significant problems. The most vital problem seems to be related with modeling. Most of the publications had major weaknesses in modeling. There were also, serious problems regarding measurement and scaling, sampling and analyses. Analyses myopia seems to be the most important problem for young academia in Turkey. Another very important finding is the lack of publications on data and knowledge mining in the academic world.

Keywords: Marketing research, data mining, knowledge mining, research modeling, analyses.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1922
9172 Hybrid Machine Learning Approach for Text Categorization

Authors: Nerijus Remeikis, Ignas Skucas, Vida Melninkaite

Abstract:

Text categorization - the assignment of natural language documents to one or more predefined categories based on their semantic content - is an important component in many information organization and management tasks. Performance of neural networks learning is known to be sensitive to the initial weights and architecture. This paper discusses the use multilayer neural network initialization with decision tree classifier for improving text categorization accuracy. An adaptation of the algorithm is proposed in which a decision tree from root node until a final leave is used for initialization of multilayer neural network. The experimental evaluation demonstrates this approach provides better classification accuracy with Reuters-21578 corpus, one of the standard benchmarks for text categorization tasks. We present results comparing the accuracy of this approach with multilayer neural network initialized with traditional random method and decision tree classifiers.

Keywords: Text categorization, decision trees, neural networks, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1763
9171 Speech Encryption and Decryption Using Linear Feedback Shift Register (LFSR)

Authors: Tin Lai Win, Nant Christina Kyaw

Abstract:

This paper is taken into consideration the problem of cryptanalysis of stream ciphers. There is some attempts need to improve the existing attacks on stream cipher and to make an attempt to distinguish the portions of cipher text obtained by the encryption of plain text in which some parts of the text are random and the rest are non-random. This paper presents a tutorial introduction to symmetric cryptography. The basic information theoretic and computational properties of classic and modern cryptographic systems are presented, followed by an examination of the application of cryptography to the security of VoIP system in computer networks using LFSR algorithm. The implementation program will be developed Java 2. LFSR algorithm is appropriate for the encryption and decryption of online streaming data, e.g. VoIP (voice chatting over IP). This paper is implemented the encryption module of speech signals to cipher text and decryption module of cipher text to speech signals.

Keywords: Linear Feedback Shift Register.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3066
9170 Web Log Mining by an Improved AprioriAll Algorithm

Authors: Wang Tong, He Pi-lian

Abstract:

This paper sets forth the possibility and importance about applying Data Mining in Web logs mining and shows some problems in the conventional searching engines. Then it offers an improved algorithm based on the original AprioriAll algorithm which has been used in Web logs mining widely. The new algorithm adds the property of the User ID during the every step of producing the candidate set and every step of scanning the database by which to decide whether an item in the candidate set should be put into the large set which will be used to produce next candidate set. At the meantime, in order to reduce the number of the database scanning, the new algorithm, by using the property of the Apriori algorithm, limits the size of the candidate set in time whenever it is produced. Test results show the improved algorithm has a more lower complexity of time and space, better restrain noise and fit the capacity of memory.

Keywords: Candidate Sets Pruning, Data Mining, ImprovedAlgorithm, Noise Restrain, Web Log

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2236
9169 Improved Dynamic Bayesian Networks Applied to Arabic on Line Characters Recognition

Authors: Redouane Tlemsani, Abdelkader Benyettou

Abstract:

Work is in on line Arabic character recognition and the principal motivation is to study the Arab manuscript with on line technology.

This system is a Markovian system, which one can see as like a Dynamic Bayesian Network (DBN). One of the major interests of these systems resides in the complete models training (topology and parameters) starting from training data.

Our approach is based on the dynamic Bayesian Networks formalism. The DBNs theory is a Bayesians networks generalization to the dynamic processes. Among our objective, amounts finding better parameters, which represent the links (dependences) between dynamic network variables.

In applications in pattern recognition, one will carry out the fixing of the structure, which obliges us to admit some strong assumptions (for example independence between some variables). Our application will relate to the Arabic isolated characters on line recognition using our laboratory database: NOUN. A neural tester proposed for DBN external optimization.

The DBN scores and DBN mixed are respectively 70.24% and 62.50%, which lets predict their further development; other approaches taking account time were considered and implemented until obtaining a significant recognition rate 94.79%.

Keywords: Arabic on line character recognition, dynamic Bayesian network, pattern recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1689
9168 National Image in the Age of Mass Self-Communication: An Analysis of Internet Users' Perception of Portugal

Authors: L. Godinho, N. Teixeira

Abstract:

Nowadays, massification of Internet access represents one of the major challenges to the traditional powers of the State, among which the power to control its external image. The virtual world has also sparked the interest of social sciences which consider it a new field of study, an immense open text where sense is expressed. In this paper, that immense text has been accessed to so as to understand the perception Internet users from all over the world have of Portugal. Ours is a quantitative and qualitative approach, as we have resorted to buzz, thematic and category analysis. The results confirm the predominance of sea stereotype in others' vision of the Portuguese people, and evidence that national image has adapted to network communication through processes of individuation and paganization.

Keywords: Internet, national image, perception, web analytics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1018
9167 Proposing an Efficient Method for Frequent Pattern Mining

Authors: Vaibhav Kant Singh, Vijay Shah, Yogendra Kumar Jain, Anupam Shukla, A.S. Thoke, Vinay KumarSingh, Chhaya Dule, Vivek Parganiha

Abstract:

Data mining, which is the exploration of knowledge from the large set of data, generated as a result of the various data processing activities. Frequent Pattern Mining is a very important task in data mining. The previous approaches applied to generate frequent set generally adopt candidate generation and pruning techniques for the satisfaction of the desired objective. This paper shows how the different approaches achieve the objective of frequent mining along with the complexities required to perform the job. This paper will also look for hardware approach of cache coherence to improve efficiency of the above process. The process of data mining is helpful in generation of support systems that can help in Management, Bioinformatics, Biotechnology, Medical Science, Statistics, Mathematics, Banking, Networking and other Computer related applications. This paper proposes the use of both upward and downward closure property for the extraction of frequent item sets which reduces the total number of scans required for the generation of Candidate Sets.

Keywords: Data Mining, Candidate Sets, Frequent Item set, Pruning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1638
9166 SMaTTS: Standard Malay Text to Speech System

Authors: Othman O. Khalifa, Zakiah Hanim Ahmad, Teddy Surya Gunawan

Abstract:

This paper presents a rule-based text- to- speech (TTS) Synthesis System for Standard Malay, namely SMaTTS. The proposed system using sinusoidal method and some pre- recorded wave files in generating speech for the system. The use of phone database significantly decreases the amount of computer memory space used, thus making the system very light and embeddable. The overall system was comprised of two phases the Natural Language Processing (NLP) that consisted of the high-level processing of text analysis, phonetic analysis, text normalization and morphophonemic module. The module was designed specially for SM to overcome few problems in defining the rules for SM orthography system before it can be passed to the DSP module. The second phase is the Digital Signal Processing (DSP) which operated on the low-level process of the speech waveform generation. A developed an intelligible and adequately natural sounding formant-based speech synthesis system with a light and user-friendly Graphical User Interface (GUI) is introduced. A Standard Malay Language (SM) phoneme set and an inclusive set of phone database have been constructed carefully for this phone-based speech synthesizer. By applying the generative phonology, a comprehensive letter-to-sound (LTS) rules and a pronunciation lexicon have been invented for SMaTTS. As for the evaluation tests, a set of Diagnostic Rhyme Test (DRT) word list was compiled and several experiments have been performed to evaluate the quality of the synthesized speech by analyzing the Mean Opinion Score (MOS) obtained. The overall performance of the system as well as the room for improvements was thoroughly discussed.

Keywords: Natural Language Processing, Text-To-Speech (TTS), Diphone, source filter, low-/ high- level synthesis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1929