Search results for: Classification algorithms; data mining; tourism; knowledge discovery.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 10389


10029 Large-Dimensional Shells under Mining Tremors from Various Mining Regions in Poland

Authors: Joanna M. Dulińska, Maria Fabijańska

Abstract:

This paper presents a detailed analysis of the dynamic response of a cooling tower shell to mining tremors originating from the two main regions of mining activity in Poland (the Upper Silesian Coal Basin and the Legnica-Glogow Copper District). Representative time histories registered in both regions were used as ground motion data in calculations of the dynamic response of the structure. The dynamic response of the shell was shown to depend strongly not only on the level of vibration amplitudes but also on the dominant frequency range of the mining shock typical of each mining region. The vertical component of vibrations also turned out to have considerable influence on the total dynamic response of the shell. Finally, the non-uniformity of kinematic excitation resulting from the spatial variation of ground motion plays a significant role in the dynamic analysis of large-dimensional shells under mining shocks.

Keywords: Cooling towers, dynamic response, mining tremors, non-uniform kinematic excitation

Downloads: 1394
10028 An Integrated Planning Framework for Sustainable Tourism: Case Study of Tunisia

Authors: S. Halioui, I. Arikan, M. Schmidt

Abstract:

The tourism sector in Tunisia faces several problems that range from economic challenges to environmental degradation and social instability. These problems have been intensified by increased competition in the tourism market, political instability and financial crises, and recently terrorism has aggravated the situation. As a consequence, a new framework that promotes sustainable tourism in the country and increases its competitiveness is urgently needed. Planning for a sustainable tourism sector requires the integration of complex interactions between economic, social and environmental aspects. Sustainable tourism principles can be implemented with the help of the Strategic Environmental Assessment (SEA) process, which ensures the full integration of economic, social and environmental considerations while planning for the tourism sector in Tunisia. The results of the paper have broad implications for policy makers and tourism professionals.

Keywords: Sustainable tourism, strategic environmental assessment, tourism planning, policy.

Downloads: 1463
10027 Natural Disaster Impact on Annual Visitors of Recreation Area: The Taiwan Case

Authors: Ya-Fen Lee, Yun-Yao Chi

Abstract:

This paper aims to quantify the impact of natural disasters on tourism through the change in annual visitors to scenic spots. Data on visitors to Alishan, Sun Moon Lake, Sitou and the Palace Museum in Taiwan from 1986 to 2012 are collected, and trend analysis is used to predict the annual visitors to these scenic spots. The findings show that the 1999 Taiwan earthquake had a significant effect on visitors to Alishan, Sun Moon Lake and Sitou, with an average impact of 55.75% from 1999 to 2000, but not on visitors to the Palace Museum. The impact was greater the closer a site was to the epicenter of the 1999 earthquake. The recovery period of visitors is about 2 to 9 years. Further, the impact of heavy rainfall on Alishan, Taiwan is estimated: once the accumulated rainfall reaches 500 mm, the impact on visitors can be predicted.

Keywords: Impact, Natural disaster, tourism, visitors.

Downloads: 1978
10026 Gender Discrimination and Pay Gap on Tourism Labor Market

Authors: Alka Obadić

Abstract:

The research concentrates on the role of tourism in generating female employment and on the impact of gender discrimination in the tourism sector. Unfortunately, in many countries there are still barriers to the inclusion of women at all hierarchical levels of the tourism labor market. The analysis focuses on EU countries where tourism is a main employer of women. It shows that women represent over one third of persons employed in the non-financial business economy and almost two thirds of those in core tourism activities. Women's gross hourly earnings in accommodation and food services were below those of men in the European Union, and the only countries that recorded an increase in the gender pay gap since the beginning of the crisis are Bulgaria and Croatia. Women in the tourism industry are still overrepresented in lower status jobs with fewer opportunities for career progression and are often treated unequally.

Keywords: Employment, gender discrimination, tourism, women’s participation.

Downloads: 3459
10025 Tourism-Impact on Environment-Observations from North Coastal Districts of A.P, India

Authors: K. Mythili

Abstract:

This paper deals with the status of solid waste pollution at tourist spots in north coastal Andhra Pradesh. Case studies of eco-tourism, cultural tourism and pilgrim tourism are discussed in detail, and the study is based on both primary and secondary data. Data collection includes field collection of solid waste, semi-structured interviews and observation of tourists. Results indicate that 72% of the material generated at eco-tourism sites such as RK Beach, Visakhapatnam, and Araku Valley is non-biodegradable. Pydithalli Jathra is a famous cultural tourist attraction at which more than one lakh people converge. The solid waste at this spot includes 20% coconut shells, 50% plastic bottles and covers, and 20% banana peelings; the remainder is food material. Radhasapthami is the most important festival celebrated at the famous sun temple of Arasavalli in Srikakulam. Here the solid waste includes 50% water bottles and plastic covers, 10% paper, 10% hair, and 30% leftover food material and banana peelings.

Keywords: Cultural tourism, Eco tourism, Pilgrimage tourism, Solid waste.

Downloads: 3274
10024 Comparison between Associative Classification and Decision Tree for HCV Treatment Response Prediction

Authors: Enas M. F. El Houby, Marwa S. Hassan

Abstract:

Combined therapy using Interferon and Ribavirin is the standard treatment for patients with chronic hepatitis C. However, the number of responders to this treatment is low, whereas its cost and side effects are high. Therefore, there is a clear need to predict a patient’s response to the treatment from clinical information, in order to protect patients from intolerable side effects and wasted money. Different machine learning techniques have been developed for this purpose, among them Associative Classification (AC) and Decision Tree (DT). The aim of this research is to compare the performance of these two techniques in the prediction of virological response to the standard treatment of HCV from clinical information. A total of 200 patients treated with Interferon and Ribavirin were analyzed using AC and DT: 150 cases were used to train the classifiers and 50 cases to test them. The experimental results showed that both techniques gave acceptable results; the best accuracy reached 92% for AC and 80% for DT.

Keywords: Associative Classification, Data mining, Decision tree, HCV, interferon.

Downloads: 1860
10023 Sensitive Analysis of the ZF Model for ABC Multi Criteria Inventory Classification

Authors: Makram Ben Jeddou

Abstract:

ABC classification is widely used by managers for inventory control. The classical ABC classification is based on the Pareto principle and uses the annual use value as its only criterion. Single-criterion classification is often insufficient for close inventory control. Multi-criteria inventory classification models have been proposed by researchers in order to consider other important criteria. Among these models, we consider a specific model in order to perform a sensitivity analysis on the composite score calculated for each item. In fact, this score, based on a normalized average between a good and a bad optimized index, can affect the ABC-item classification. We focus on items assigned to different classes and then propose a classification compromise.
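
The classical single-criterion ABC classification mentioned in this abstract can be sketched roughly as follows. This is only an illustration, not the ZF model studied in the paper: the function name `abc_classify`, the 80% and 95% cut-offs and the sample items are assumptions chosen for the example.

```python
def abc_classify(annual_use_value, a_cut=0.80, b_cut=0.95):
    """Classical ABC classification on a single criterion (annual use value).

    Items are ranked by descending value. An item is put in class A while the
    cumulative share of the items ranked before it is below a_cut, in class B
    while that share is below b_cut, and in class C otherwise.
    The cut-offs are illustrative assumptions, not values from the paper.
    """
    total = sum(annual_use_value.values())
    ranked = sorted(annual_use_value.items(), key=lambda kv: kv[1], reverse=True)
    classes = {}
    cumulative = 0  # running sum of the values ranked before the current item
    for item, value in ranked:
        share_before = cumulative / total
        if share_before < a_cut:
            classes[item] = "A"
        elif share_before < b_cut:
            classes[item] = "B"
        else:
            classes[item] = "C"
        cumulative += value
    return classes


# Hypothetical inventory: item name -> annual use value
inventory = {"p1": 600, "p2": 220, "p3": 120, "p4": 40, "p5": 20}
print(abc_classify(inventory))
```

A multi-criteria model would replace the single annual use value with a composite score per item before ranking; the ranking and cut-off step stays the same.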

Keywords: ABC classification, Multi criteria inventory classification models, ZF-model.

Downloads: 2486
10022 Iterative Clustering Algorithm for Analyzing Temporal Patterns of Gene Expression

Authors: Seo Young Kim, Jae Won Lee, Jong Sung Bae

Abstract:

Microarray experiments are information rich; however, extensive data mining is required to identify the patterns that characterize the underlying mechanisms of action. For biologists, a key aim when analyzing microarray data is to group genes based on the temporal patterns of their expression levels. In this paper, we used an iterative clustering method to find temporal patterns of gene expression. We evaluated the performance of this method by applying it to real sporulation data and simulated data. The patterns obtained using the iterative clustering were found to be superior to those obtained using existing clustering algorithms.

Keywords: Clustering, microarray experiment, temporal pattern of gene expression data.

Downloads: 1328
10021 Characteristics of the Long-Term Regional Tourism Development in Georgia

Authors: Valeri Arghutashvili, Mari Gogochuri

Abstract:

Tourism industry development is one of the key priorities in Georgia, as it has a positive influence on economic activity. Its contribution is very important for the different regions as well as for the national economy. Benefits of the tourism industry include new jobs, service development and increased tax revenues. The main aim of this research is to review and analyze the potential of the Georgian tourism industry together with its long-term strategy and current challenges. Planning long-term development requires evaluating several factors at the regional and national levels. These factors include activities, transportation, services, lodging facilities, infrastructure and institutions. The major research contributions are practical estimates of regional tourism development, which plays an important role in the integration process with global markets.

Keywords: Regional tourism, tourism industry, tourism in Georgia, tourism benefits.

Downloads: 827
10020 Analysis of Web User Identification Methods

Authors: Renáta Iváncsy, Sándor Juhász

Abstract:

Web usage mining has become a popular research area, as a huge amount of data is available online. These data can be used for several purposes, such as web personalization, web structure enhancement and web navigation prediction. However, the raw log files are not directly usable; they have to be preprocessed in order to transform them into a suitable format for different data mining tasks. One of the key issues in the preprocessing phase is to identify web users. Identifying users based on web log files is not a straightforward problem, thus various methods have been developed. There are several difficulties that have to be overcome, such as client-side caching and changing and shared IP addresses. This paper presents three different methods for identifying web users. Two of them are the most commonly used methods in web log mining systems, whereas the third one is our novel approach that uses a complex cookie-based method to identify web users. Furthermore, we also take steps towards identifying the individuals behind the impersonal web users. To demonstrate the efficiency of the new method we developed an implementation called the Web Activity Tracking (WAT) system, which aims at a more precise distinction of web users based on log data. We present some statistical analysis created by the WAT on real data about the behavior of Hungarian web users, and a comprehensive analysis and comparison of the three methods.

Keywords: Data preparation, Tracking individuals, Web user identification, Web usage mining

Downloads: 4366
10019 Automatic Classification of Initial Categories of Alzheimer's Disease from Structural MRI Phase Images: A Comparison of PSVM, KNN and ANN Methods

Authors: Ahsan Bin Tufail, Ali Abidi, Adil Masood Siddiqui, Muhammad Shahzad Younis

Abstract:

Early and accurate detection of Alzheimer's disease (AD) is an important stage in the treatment of individuals suffering from AD. We present an approach based on the use of structural magnetic resonance imaging (sMRI) phase images to distinguish between normal controls (NC), mild cognitive impairment (MCI) and AD patients with a clinical dementia rating (CDR) of 1. The independent component analysis (ICA) technique is used for extracting useful features, which form the inputs to the support vector machine (SVM), K nearest neighbour (kNN) and multilayer artificial neural network (ANN) classifiers that discriminate between the three classes. The obtained results are encouraging in terms of classification accuracy and confirm the usefulness of phase images for the classification of different stages of Alzheimer's disease.

Keywords: Biomedical image processing, classification algorithms, feature extraction, statistical learning.

Downloads: 2733
10018 Policy of Tourism and Opportunities of Development of Wellness Industry in Georgia

Authors: G. Erkomaishvili, R. Gvelesiani, E. Kharaishvili, M. Chavleishvili

Abstract:

This paper reviews the current state of tourism in Georgia under globalization: tourist resources, the pace of development of tourism infrastructure, tourism policy, and the potential for developing the wellness industry in Georgia, the newest branch of medical tourism. The factors impeding the development of the tourism industry are studied and analyzed, namely the existence of conflict zones, high bank credit rates, deficiencies in the tax laws, the level of infrastructure development, the quality of services, a shortage of competitive staff, price increases in peak seasons, and insufficient promotion of Georgia's tourism opportunities on international markets. The level of tourism development in Georgia according to the World Economic Forum, aspects of cooperation with the European Union, and related issues are also reviewed. Based on these studies, a strategy for developing tourism and one of its branches, the wellness industry in Georgia, is introduced together with the relevant conclusions, on which recommendations are based.

Keywords: Tourism, Tourism Policy, Wellness Industry.

Downloads: 2830
10017 Combining Fuzzy Logic and Data Mining to Predict the Result of an EIA Review

Authors: Kevin Fong-Rey Liu, Jia-Shen Chen, Han-Hsi Liang, Cheng-Wu Chen, Yung-Shuen Shen

Abstract:

The purpose of determining impact significance is to place value on impacts. Environmental impact assessment review is a process that judges whether impact significance is acceptable or not in accordance with the scientific facts regarding environmental, ecological and socio-economical impacts described in environmental impact statements (EIS) or environmental impact assessment reports (EIAR). The first aim of this paper is to summarize the criteria of significance evaluation from past review results and accordingly utilize fuzzy logic to incorporate these criteria into scientific facts. The second aim is to employ a data mining technique to construct a prediction model for the results of EIS or EIAR reviews, which can help developers prepare and revise better environmental management plans in advance. The validity of the previous prediction model proposed by the authors in 2009 is 92.7%; the enhanced validity in this study attains 100.0%.

Keywords: Environmental impact assessment review, impact significance, fuzzy logic, data mining, classification tree.

Downloads: 1914
10016 Brainwave Classification for Brain Balancing Index (BBI) via 3D EEG Model Using k-NN Technique

Authors: N. Fuad, M. N. Taib, R. Jailani, M. E. Marwan

Abstract:

In this paper, a comparison of k-Nearest Neighbor (kNN) algorithms for classifying the 3D EEG model in brain balancing is presented. The EEG signal recording was conducted on 51 healthy subjects. Development of the 3D EEG models involves pre-processing of raw EEG signals and construction of spectrogram images. Then, maximum PSD values were extracted as features from the model. There are three indexes for a balanced brain: index 3, index 4 and index 5. There are significant differences in the EEG signals across the brain balancing index (BBI). Alpha-α (8–13 Hz) and beta-β (13–30 Hz) bands were used as input signals for the classification model. The k-NN classification accuracy is 88.46%. These results show that k-NN can be used for prediction in the brain balancing application.
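
As a rough illustration of the k-NN classification step described in this abstract, the sketch below classifies a feature vector by majority vote among its nearest training samples. The function name `knn_predict`, the two-feature layout (one value per band) and all numeric values are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query.

    Distances are Euclidean. On a tied vote, the label that appears first
    among the nearest neighbours wins.
    """
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical features: (max alpha-band PSD, max beta-band PSD) per subject
features = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.1, 4.8), (9.0, 9.0), (8.8, 9.2)]
# Hypothetical brain balancing index labels for the same subjects
labels = [3, 3, 4, 4, 5, 5]

print(knn_predict(features, labels, (1.1, 1.0)))
print(knn_predict(features, labels, (8.9, 9.0)))
```

The choice of k and of the distance measure would normally be tuned on held-out data; k=3 here is only a default for the example.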

Keywords: Brain balancing, kNN, power spectral density, 3D EEG model.

Downloads: 2590
10015 Design and Implementation of a Counting and Differentiation System for Vehicles through Video Processing

Authors: Derlis Gregor, Kevin Cikel, Mario Arzamendia, Raúl Gregor

Abstract:

This paper presents a self-sustaining mobile system for counting and classifying vehicles through video processing. It proposes a counting and classification algorithm divided into four steps that can be executed multiple times in parallel on an SBC (Single Board Computer), such as the Raspberry Pi 2, in such a way that it can be implemented in real time. The first step of the proposed algorithm limits the zone of the image that will be processed. The second step performs the detection of moving objects using a BGS (Background Subtraction) algorithm based on the GMM (Gaussian Mixture Model), as well as a shadow removal algorithm using physical-based features, followed by morphological operations. In the third step, vehicle detection is performed using edge detection algorithms and vehicle tracking using Kalman filters. The last step of the proposed algorithm registers each passing vehicle and classifies it according to its area. An auto-sustainable system is proposed, powered by batteries and photovoltaic solar panels, and data transmission is done through GPRS (General Packet Radio Service), eliminating the need for external cables, which facilitates its deployment and transfer to any location where it could operate. The self-sustaining trailer will allow the counting and classification of vehicles in specific zones with difficult access.

Keywords: Intelligent transportation systems, object detection, video processing, road traffic, vehicle counting, vehicle classification.

Downloads: 1593
10014 Automated Heart Sound Classification from Unsegmented Phonocardiogram Signals Using Time Frequency Features

Authors: Nadia Masood Khan, Muhammad Salman Khan, Gul Muhammad Khan

Abstract:

Cardiologists perform cardiac auscultation to detect abnormalities in heart sounds. Since accurate auscultation is a crucial first step in screening patients with heart diseases, there is a need to develop computer-aided detection/diagnosis (CAD) systems to assist cardiologists in interpreting heart sounds and provide second opinions. In this paper, different algorithms are implemented for automated heart sound classification using unsegmented phonocardiogram (PCG) signals. Support vector machine (SVM), artificial neural network (ANN) and cartesian genetic programming evolved artificial neural network (CGPANN) models, applied without any segmentation algorithm, are explored in this study. The signals are first pre-processed to remove any unwanted frequencies. Both time and frequency domain features are then extracted for training the different models. The different algorithms are tested in multiple scenarios and their strengths and weaknesses are discussed. Results indicate that SVM outperforms the rest with an accuracy of 73.64%.

Keywords: Pattern recognition, machine learning, computer aided diagnosis, heart sound classification, and feature extraction.

Downloads: 1242
10013 Pruning Algorithm for the Minimum Rule Reduct Generation

Authors: Şahin Emrah Amrahov, Fatih Aybar, Serhat Doğan

Abstract:

In this paper we consider the rule reduct generation problem. The Rule Reduct Generation (RG) and Modified Rule Generation (MRG) algorithms used to solve this problem are well known. As an alternative to these algorithms, we develop the Pruning Rule Generation (PRG) algorithm and compare it with RG and MRG.

Keywords: Rough sets, Decision rules, Rule induction, Classification.

Downloads: 2020
10012 Rank-Based Chain-Mode Ensemble for Binary Classification

Authors: Chongya Song, Kang Yen, Alexander Pons, Jin Liu

Abstract:

In the field of machine learning, ensembles are commonly employed to improve on the performance of multiple base classifiers. However, true predictions are often canceled out by false ones during consensus due to a phenomenon called the “curse of correlation”, which manifests as strong interference among the predictions produced by the base classifiers. In addition, existing practices are still not able to effectively mitigate the problem of imbalanced classification. Based on the analysis of our experimental results, we conclude that the two problems are caused by some inherent deficiencies in the consensus approach. Therefore, we create an enhanced ensemble algorithm which adopts a designed rank-based chain-mode consensus to overcome the two problems. To evaluate the proposed ensemble algorithm, we employ the well-known benchmark data set NSL-KDD (the improved version of the KDDCup99 data set produced by the University of New Brunswick) to compare the proposed algorithm with 8 common ensemble algorithms. In particular, each compared ensemble classifier uses the same 22 base classifiers, so that the differences in improvement in accuracy and reliability over the base classifiers can be truly revealed. As a result, the proposed rank-based chain-mode consensus is shown to be a more effective ensemble solution than the traditional consensus approach, outperforming the 8 ensemble algorithms by 20% on almost all compared metrics, which include accuracy, precision, recall, F1-score and area under the receiver operating characteristic curve.

Keywords: Consensus, curse of correlation, imbalanced classification, rank-based chain-mode ensemble.

Downloads: 688
10011 Comparison of Different Methods to Produce Fuzzy Tolerance Relations for Rainfall Data Classification in the Region of Central Greece

Authors: N. Samarinas, C. Evangelides, C. Vrekos

Abstract:

The aim of this paper is to compare three different methods for producing fuzzy tolerance relations for rainfall data classification: the correlation coefficient, cosine amplitude and max-min methods. The data were obtained from seven rainfall stations in the region of central Greece and refer to a 20-year time series of monthly average rainfall height. The three methods were used to express these data as a fuzzy relation. For all three methods, the resulting fuzzy tolerance relation is transformed into an equivalence relation by max-min composition. From the equivalence relation, the rainfall stations were categorized and classified according to the degree of confidence. The classification shows the similarities among the rainfall stations. Stations with high similarity can be utilized in water resource management scenarios interchangeably or to augment data from one to another. Due to the complexity of the calculations, it is important to find out which of the methods is computationally simpler and needs fewer compositions in order to give reliable results.
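
The pipeline described in this abstract (build a fuzzy tolerance relation, close it under max-min composition, then group stations by degree of confidence) can be sketched as follows for the cosine amplitude method. This is a minimal illustration under assumptions: the function names are hypothetical, each station is assumed to have a nonzero data series, and the closure is computed by repeatedly composing the relation with itself, which may differ from the composition scheme counted in the paper.

```python
import math


def cosine_amplitude(rows):
    """Tolerance relation r_ij = |x_i . x_j| / (|x_i| |x_j|), one row per station."""
    n = len(rows)
    norms = [math.sqrt(sum(v * v for v in row)) for row in rows]
    rel = [[1.0] * n for _ in range(n)]  # reflexive: r_ii = 1
    for i in range(n):
        for j in range(i + 1, n):
            dot = abs(sum(a * b for a, b in zip(rows[i], rows[j])))
            # Symmetric entry, clipped to 1 to guard against rounding
            rel[i][j] = rel[j][i] = min(1.0, dot / (norms[i] * norms[j]))
    return rel


def max_min_composition(r, s):
    """(R o S)_ij = max over k of min(r_ik, s_kj)."""
    n = len(r)
    return [[max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]


def to_equivalence(relation):
    """Compose the relation with itself until it stops changing.

    Returns the transitive relation and the number of compositions that
    actually changed it.
    """
    current = [list(row) for row in relation]
    changed = 0
    for _ in range(len(current)):
        composed = max_min_composition(current, current)
        if composed == current:
            break
        current = composed
        changed += 1
    return current, changed


def lambda_cut_classes(equivalence, lam):
    """Group indices whose degree of relation is at least lam."""
    classes, seen = [], set()
    for i, row in enumerate(equivalence):
        if i in seen:
            continue
        members = [j for j, degree in enumerate(row) if degree >= lam]
        seen.update(members)
        classes.append(members)
    return classes


# Hypothetical tolerance relation for three stations
tolerance = [[1.0, 0.8, 0.2],
             [0.8, 1.0, 0.5],
             [0.2, 0.5, 1.0]]
equivalence, compositions = to_equivalence(tolerance)
print(equivalence, compositions)
print(lambda_cut_classes(equivalence, 0.6))
```

Raising the cut level splits the stations into more, tighter classes; lowering it merges them.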

Keywords: Classification, fuzzy logic, tolerance relations, rainfall data.

Downloads: 993
10010 Developing Islamic Tourism in Kazakhstan: A Result of a Religious Revival or a New Trend of Tourism

Authors: A. A. Mustafayeva, G. E. Nadirova, Sh. S. Kaliyeva, B. Zh. Aktaulova

Abstract:

all of religions free towards society in Kazakhstan. Considering that Islam is the most widespread religion in the region, the Islamic industry is a developing sector of the economy. There are some new sectors of the Halal (Islamic) industry that are important for the development of the state as a whole. One of the youngest sectors of the Halal industry is Islamic tourism, which has become a subject of dispute and has led to a dilemma: is Islamic tourism a result of a religious revival, or is it a new trend in tourism? The paper was written under the research project "Islam in modern Kazakhstan: the nature and outcome of the religious revival".

Keywords: Halal industry, Islamic tourism, pillars, pilgrims.

Downloads: 2878
10009 Synthetic Aperture Radar Remote Sensing Classification Using the Bag of Visual Words Model to Land Cover Studies

Authors: Reza Mohammadi, Mahmod R. Sahebi, Mehrnoosh Omati, Milad Vahidi

Abstract:

Classification of high resolution polarimetric Synthetic Aperture Radar (PolSAR) images plays an important role in land cover and land use management. Recently, classification algorithms based on the Bag of Visual Words (BOVW) model have attracted significant interest among scholars and researchers in and out of the field of remote sensing. In this paper, the BOVW model with pixel-based low-level features has been implemented to classify a subset of a San Francisco Bay PolSAR image acquired by RADARSAT-2 in C-band. We have used a segment-based decision-making strategy and compared the result with that of a traditional Support Vector Machine (SVM) classifier. The 90.95% overall classification accuracy shows that the proposed algorithm is comparable with state-of-the-art methods. In addition to increasing the classification accuracy, the proposed method decreases the undesirable speckle effect of SAR images.

Keywords: Bag of Visual Words, classification, feature extraction, land cover management, Polarimetric Synthetic Aperture Radar.

Downloads: 748
10008 A Multiresolution Approach for Noised Texture Classification based on the Co-occurrence Matrix and First Order Statistics

Authors: M. Ben Othmen, M. Sayadi, F. Fnaiech

Abstract:

The wavelet transform provides several important characteristics which can be used in texture analysis and classification. In this work, an efficient texture classification method, which combines concepts from wavelet and co-occurrence matrices, is presented. A Euclidian distance classifier is used to evaluate the various methods of classification. A comparative study is essential to determine the ideal method. Using this conjecture, we developed a novel feature set for texture classification and demonstrate its effectiveness.

Keywords: Classification, Wavelet, Co-occurrence, Euclidian Distance, Classifier, Texture.

Downloads: 1455
10007 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text

Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni

Abstract:

The problem of Entity relation discovery in structured data, a well covered topic in literature, consists in searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary, or a specific collection of named items. In many cases machine learning and/or text mining techniques are used for this goal. These approaches might be unfeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists in collecting the cooccurrences of any two words (entities) in order to create a graph of relations, a cooccurrence graph. Indeed, each cooccurrence highlights some degree of semantic correlation between the words, because it is more common to have related words close to each other than to have them on opposite sides of the text. Some authors have used sliding windows for this problem: they count all the occurrences within a sliding window running over the whole text. In this paper we generalise this technique, arriving at a Weighted-Distance Sliding Window, where each occurrence of two named items within the window is counted with a weight depending on the distance between the items: a closer distance implies stronger evidence of a relationship. We develop an experiment in order to support this intuition, applying this technique to a data set consisting of the text of the Bible, split into verses.
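
The weighted-distance idea in this abstract can be sketched as follows. This is a simplified illustration and not the authors' implementation: the function name, the `1/d` weighting and the rule that each pair of positions is counted once (rather than once per window position) are all assumptions made for the example.

```python
from collections import defaultdict


def weighted_cooccurrences(tokens, entities, window=5, weight=lambda d: 1.0 / d):
    """Build a weighted cooccurrence graph over the named entities in tokens.

    Two entity occurrences contribute to an edge when they are fewer than
    `window` tokens apart; the contribution is weight(d), where d is their
    distance in tokens, so closer pairs add more evidence.
    Edges are keyed by the alphabetically sorted pair of entity names.
    """
    positions = [(i, tok) for i, tok in enumerate(tokens) if tok in entities]
    graph = defaultdict(float)
    for a, (i, first) in enumerate(positions):
        for j, second in positions[a + 1:]:
            distance = j - i
            if distance >= window:
                break  # later occurrences are even farther away
            if first != second:
                graph[tuple(sorted((first, second)))] += weight(distance)
    return dict(graph)


# Hypothetical verse-like input, already lowercased and tokenized
text = "moses spoke to aaron and aaron answered moses".split()
print(weighted_cooccurrences(text, {"moses", "aaron"}, window=4))
```

Passing `weight=lambda d: 1.0` recovers a plain unweighted sliding-window count, which makes it easy to compare the two schemes on the same text.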

Keywords: Cooccurrence graph, entity relation graph, unstructured text, weighted distance.

Downloads: 656
10006 Machine Scoring Model Using Data Mining Techniques

Authors: Wimalin S. Laosiritaworn, Pongsak Holimchayachotikul

Abstract:

This article proposes a methodology for computer numerical control (CNC) machine scoring. The case study company is a manufacturer of hard disk drive parts in Thailand. In this company, samples of parts manufactured by CNC machines are usually taken randomly for quality inspection. These inspection data are used to decide whether to shut down a machine if it tends to produce parts that are out of specification. A large amount of data is produced in this process, and data mining could be a very useful technique for analyzing it. In this research, data mining techniques were used to construct a machine scoring model called the 'machine priority assessment model (MPAM)'. This model helps to ensure that machines with a higher risk of producing defective parts are inspected before those with lower risk. If a defect-prone machine is identified sooner, defective parts and rework can be reduced, improving overall productivity. The results showed that the proposed method can be successfully implemented and that approximately 351,000 baht of opportunity cost could have been saved in the case study company.

Keywords: Computer Numerical Control, Data Mining, Hard Disk Drive.

Downloads: 1371
10005 Fuzzy Inference System Based Unhealthy Region Classification in Plant Leaf Image

Authors: K. Muthukannan, P. Latha

Abstract:

In addition to environmental parameters such as rain and temperature, crop disease is a major factor affecting the quality and quantity of crop yield. Hence disease management is a key issue in agriculture. To manage a disease, it needs to be detected at an early stage so that it can be treated properly and its spread controlled. Nowadays, it is possible to use images of diseased leaves to detect the type of disease using image processing techniques. This can be achieved by extracting features from the images, which can then be used with classification algorithms or content-based image retrieval systems. In this paper, a color image is used to extract features such as the mean and standard deviation after region cropping. The selected features are taken from the cropped image with different image size samples. Then, the extracted features are taken into account for classification using a Fuzzy Inference System (FIS).
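
The feature extraction step mentioned in this abstract (per-channel mean and standard deviation of a cropped region) might look roughly like the sketch below. The image representation (rows of RGB tuples), the function names and the pixel values are assumptions for illustration; the fuzzy inference stage is not shown.

```python
from statistics import fmean, pstdev


def crop(image, top, left, height, width):
    """Return the rectangular region of an image stored as rows of (R, G, B) tuples."""
    return [row[left:left + width] for row in image[top:top + height]]


def color_features(region):
    """Per-channel mean followed by per-channel (population) standard deviation."""
    pixels = [pixel for row in region for pixel in row]
    channels = list(zip(*pixels))  # one tuple of values per color channel
    means = [fmean(channel) for channel in channels]
    deviations = [pstdev(channel) for channel in channels]
    return means + deviations


# Hypothetical 4x4 image with a 2x2 patch of leaf pixels in the middle
image = [[(0, 0, 0)] * 4 for _ in range(4)]
image[1][1] = (10, 20, 30)
image[1][2] = (20, 20, 30)
image[2][1] = (10, 20, 30)
image[2][2] = (20, 20, 30)

print(color_features(crop(image, top=1, left=1, height=2, width=2)))
```

The resulting six numbers would serve as the crisp inputs that a fuzzy inference system maps to a class through its membership functions and rules.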

Keywords: Image Cropping, Classification, Color, Fuzzy Rule, Feature Extraction.

Downloads 1864
10004 Relationship between Hofstede’s Cultural Dimensions and Tourism Product Satisfaction

Authors: Thanawit Buafai, Siyathorn Khunon

Abstract:

This paper aims to explore the satisfaction levels of tourism product components on the island of Samui by studying their relationships with the cultural dimensions of Hofstede's classic theory. Both the six Hofstede cultural dimensions and tourism product satisfaction measures have been of interest worldwide. Therefore, the challenge of this study is to re-confirm previous research results in the ever-changing context of the modern globalized business era. Self-rated questionnaires were employed to collect data from tourists of six nationalities in Samui, totaling 386 samples. The reliability of this research methodology was 0.967. Correlation was applied to analyze the relationships. The results indicate that Masculinity is significantly related to tourism destination satisfaction for every factor, while the other five cultural dimensions are related to some factors of tourism satisfaction. Surprisingly, tourist satisfaction with the bar/restaurant factor is significantly correlated with all six cultural dimensions.

Keywords: Cultural dimensions, tourism products, Samui, Thailand.

Downloads 3294
10003 Application of Artificial Neural Network to Classification Surface Water Quality

Authors: S. Wechmongkhonkon, N. Poomtong, S. Areerachakul

Abstract:

Water quality is a subject of ongoing concern. Deterioration of water quality has initiated serious management efforts in many countries. This study endeavors to classify water quality automatically. The water quality classes are evaluated using six factor indices: pH value (pH), Dissolved Oxygen (DO), Biochemical Oxygen Demand (BOD), Nitrate Nitrogen (NO3N), Ammonia Nitrogen (NH3N) and Total Coliform (TColiform). The methodology involves applying data mining techniques using multilayer perceptron (MLP) neural network models. The data cover 11 canal sites in the Dusit district of Bangkok, Thailand, and were obtained from the Department of Drainage and Sewerage, Bangkok Metropolitan Administration, for 2007-2011. The multilayer perceptron neural network achieves a high accuracy rate of 96.52% in classifying the water quality of the Dusit district canals. This encouraging result could be applied to the planning and management of water quality sources.
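
As a rough illustration of how an MLP maps the six factor indices to class probabilities, the following Python sketch implements only a forward pass. The number of classes (`N_CLASSES`), the hidden layer size, the random untrained weights, and the normalized sample values are assumptions; the abstract does not report the network architecture or the class scheme, and the output is meaningless until the weights are trained on real data.

```python
import math
import random

FACTORS = ["pH", "DO", "BOD", "NO3N", "NH3N", "TColiform"]
N_CLASSES = 4   # assumed; the abstract does not state the number of classes
N_HIDDEN = 8    # assumed hidden layer size

def init_layer(n_in, n_out, rng):
    """Small random weights plus zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(weights, biases, inputs):
    """Weighted sum plus bias for every unit in the layer."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Convert raw scores to probabilities, shifted by the max for stability."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mlp_forward(sample, hidden, output):
    """One hidden tanh layer followed by a softmax output layer."""
    activations = [math.tanh(v) for v in dense(*hidden, sample)]
    return softmax(dense(*output, activations))

rng = random.Random(42)
hidden = init_layer(len(FACTORS), N_HIDDEN, rng)
output = init_layer(N_HIDDEN, N_CLASSES, rng)

# Made-up sample, assumed to be already scaled to the range [0, 1].
sample = [0.55, 0.30, 0.70, 0.20, 0.40, 0.65]
probs = mlp_forward(sample, hidden, output)
predicted_class = probs.index(max(probs))
print(probs, predicted_class)
```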

Keywords: artificial neural network, classification, surface water quality

Downloads 3179
10002 Forecasting Fraudulent Financial Statements using Data Mining

Authors: S. Kotsiantis, E. Koumanakos, D. Tzelepis, V. Tampakas

Abstract:

This paper explores the effectiveness of machine learning techniques in detecting firms that issue fraudulent financial statements (FFS) and deals with the identification of factors associated with FFS. To this end, a number of experiments have been conducted using representative learning algorithms, which were trained using a data set of 164 fraud and non-fraud Greek firms in the recent period 2001-2002. Deciding which particular method to choose is a complicated problem. A good alternative to choosing only one method is to create a hybrid forecasting system incorporating a number of possible solution methods as components (an ensemble of classifiers). For this purpose, we have implemented a hybrid decision support system that combines the representative algorithms using a stacking variant methodology and achieves better performance than any examined simple and ensemble method. To sum up, this study indicates that the investigation of financial information can be used in the identification of FFS and underlines the importance of financial ratios.
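
The general idea of stacking, in which a meta-learner is trained on the outputs of several base classifiers, can be sketched in Python as follows. This is plain stacking under strong simplifications, not the stacking variant used in the paper: the base learners are fixed threshold rules rather than trained algorithms, the two financial ratios and all data values are made up, and the meta-learner is a small logistic regression. With trained base learners, the meta-learner would normally be fitted on held-out (out-of-fold) predictions.

```python
import math

def stump(feature, threshold, above=True):
    """Base learner: flags a row when the feature is above (or below) a threshold."""
    def predict(row):
        exceeds = row[feature] > threshold
        return 1 if exceeds == above else 0
    return predict

def train_meta(meta_inputs, labels, epochs=500, lr=0.5):
    """Logistic-regression meta-learner fitted by stochastic gradient descent."""
    weights, bias = [0.0] * len(meta_inputs[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(meta_inputs, labels):
            z = sum(w * v for w, v in zip(weights, x)) + bias
            p = 1 / (1 + math.exp(-z))
            weights = [w + lr * (y - p) * v for w, v in zip(weights, x)]
            bias += lr * (y - p)
    return weights, bias

def stacked_predict(row, base_learners, meta):
    """Feed the base learners' votes to the meta-learner and threshold its score."""
    weights, bias = meta
    votes = [learner(row) for learner in base_learners]
    score = sum(w * v for w, v in zip(weights, votes)) + bias
    return 1 if score > 0 else 0

# Hypothetical rows: (debt ratio, profit margin); label 1 marks a fraud case.
rows = [(0.90, 0.02), (0.80, 0.01), (0.30, 0.15),
        (0.20, 0.20), (0.85, 0.12), (0.40, 0.03)]
labels = [1, 1, 0, 0, 1, 0]

base_learners = [stump(0, 0.6), stump(1, 0.05, above=False)]
meta_inputs = [[learner(row) for learner in base_learners] for row in rows]
meta = train_meta(meta_inputs, labels)
predictions = [stacked_predict(row, base_learners, meta) for row in rows]
print(predictions)
```

In this toy data the second base learner is wrong on the last row; the meta-learner can learn to down-weight it, which is the point of stacking over a simple majority vote.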

Keywords: Machine learning, stacking, classifier.

Downloads 3021
10001 Initializing K-Means using Genetic Algorithms

Authors: Bashar Al-Shboul, Sung-Hyon Myaeng

Abstract:

K-Means (KM) is considered one of the major algorithms widely used in clustering. However, it still has some problems. One of them lies in its initialization step, which is normally done randomly. Another problem is that KM converges to local minima. Genetic algorithms are evolutionary algorithms inspired by nature and are used in the field of clustering. In this paper, we propose two algorithms to solve the initialization problem: Genetic Algorithm Initializes KM (GAIK) and KM Initializes Genetic Algorithm (KIGA). To show the effectiveness and efficiency of our algorithms, a comparative study was done among GAIK, KIGA, Genetic-based Clustering Algorithm (GCA), and FCM [19].
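
The idea of letting a genetic algorithm choose the initial centroids for KM can be sketched in Python as below. This is one hypothetical reading of the GAIK direction only. The abstract does not describe the encoding, operators, or parameters, so the choices here (individuals as sets of k data points, truncation selection, one-point crossover, random-reset mutation, sum of squared errors as fitness) and the synthetic data are assumptions.

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(dist2(p, c) for c in centroids) for p in points)

def ga_init(points, k, rng, pop_size=20, generations=30, mutation=0.2):
    """Evolve sets of k data points as candidate initial centroids (lower SSE is fitter)."""
    population = [rng.sample(points, k) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda ind: sse(points, ind))
        parents = population[: pop_size // 2]              # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, k) if k > 1 else 0
            child = a[:cut] + b[cut:]                       # one-point crossover
            if rng.random() < mutation:
                child[rng.randrange(k)] = rng.choice(points)  # random-reset mutation
            children.append(child)
        population = parents + children
    return min(population, key=lambda ind: sse(points, ind))

def k_means(points, centroids, iterations=50):
    """Standard Lloyd iterations starting from the given centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist2(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [tuple(sum(dim) / len(c) for dim in zip(*c)) if c else old
                   for c, old in zip(clusters, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids

# Synthetic data: three Gaussian blobs (made-up centers and spread).
rng = random.Random(0)
centers = [(0, 0), (5, 5), (0, 5)]
points = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)) for cx, cy in centers for _ in range(20)]

initial = ga_init(points, 3, rng)
final = k_means(points, initial)
print(sse(points, initial), sse(points, final))
```

Because Lloyd iterations never increase the SSE, the refined centroids are at least as good as the GA's starting point; the GA's role is only to make a poor starting point less likely.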

Keywords: Clustering, Genetic Algorithms, K-means.

Downloads 2068
10000 Evaluating 8D Reports Using Text-Mining

Authors: Benjamin Kuester, Bjoern Eilert, Malte Stonis, Ludger Overmeyer

Abstract:

Increasing quality requirements make reliable and effective quality management indispensable. This includes complaint handling, in which the 8D method is widely used. The 8D report, as the written documentation of the 8D method, is one of the key quality documents, as it internally secures the quality standards and acts as a communication medium to the customer. In practice, however, the 8D report is mostly faulty and of poor quality. There is no quality control of 8D reports today. This paper describes the use of natural language processing for the automated evaluation of 8D reports. Based on semantic analysis and text-mining algorithms, the presented system is able to uncover content and formal quality deficiencies and thus increases the quality of complaint processing in the long term.
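
To give a sense of what an automated check for formal deficiencies might look like, the Python sketch below flags missing or very short sections in a report. It covers only a simple formal check and not the semantic analysis mentioned in the abstract. The report layout (one `D1:` to `D8:` heading per discipline), the minimum word count, and the sample text are assumptions; the system presented in the paper is not described at this level of detail.

```python
import re

# Assumed layout: each discipline starts on a line such as "D3: ...".
SECTION = re.compile(r"^D([1-8])\s*[:.\-]\s*(.*)$")

def split_sections(report):
    """Map each discipline label (D1..D8) found in the report to its text."""
    sections, current = {}, None
    for line in report.splitlines():
        match = SECTION.match(line.strip())
        if match:
            current = "D" + match.group(1)
            sections[current] = match.group(2)
        elif current:
            sections[current] += " " + line.strip()
    return sections

def formal_deficiencies(report, min_words=5):
    """List disciplines that are missing or shorter than min_words words."""
    sections = split_sections(report)
    issues = []
    for n in range(1, 9):
        key = f"D{n}"
        if key not in sections:
            issues.append(f"{key}: section missing")
        elif len(sections[key].split()) < min_words:
            issues.append(f"{key}: fewer than {min_words} words")
    return issues

# Made-up report fragment for illustration.
sample_report = """
D1: Team of quality, production and supplier staff formed.
D2: Housing cracks found at customer during final assembly.
D3: Stock blocked.
D4: Mould temperature drifted above the specified process window.
"""

issues = formal_deficiencies(sample_report)
for issue in issues:
    print(issue)
```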

Keywords: 8D report, complaint management, evaluation system, text-mining.

Downloads 995