Search results for: mining industry

1831 Text Mining Technique for Data Mining Application

Abstract:

Text Mining is around applying knowledge discovery techniques to unstructured text is termed knowledge discovery in text (KDT), or Text data mining or Text Mining. In decision tree approach is most useful in classification problem. With this technique, tree is constructed to model the classification process. There are two basic steps in the technique: building the tree and applying the tree to the database. This paper describes a proposed C5.0 classifier that performs rulesets, cross validation and boosting for original C5.0 in order to reduce the optimization of error ratio. The feasibility and the benefits of the proposed approach are demonstrated by means of medial data set like hypothyroid. It is shown that, the performance of a classifier on the training cases from which it was constructed gives a poor estimate by sampling or using a separate test file, either way, the classifier is evaluated on cases that were not used to build and evaluate the classifier are both are large. If the cases in hypothyroid.data and hypothyroid.test were to be shuffled and divided into a new 2772 case training set and a 1000 case test set, C5.0 might construct a different classifier with a lower or higher error rate on the test cases. An important feature of see5 is its ability to classifiers called rulesets. The ruleset has an error rate 0.5 % on the test cases. The standard errors of the means provide an estimate of the variability of results. One way to get a more reliable estimate of predictive is by f-fold –cross- validation. The error rate of a classifier produced from all the cases is estimated as the ratio of the total number of errors on the hold-out cases to the total number of cases. The Boost option with x trials instructs See5 to construct up to x classifiers in this manner. Trials over numerous datasets, large and small, show that on average 10-classifier boosting reduces the error rate for test cases by about 25%.

Keywords: C5.0, Error Ratio, text mining, training data, test data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2438

1830 Analysis of a Population of Diabetic Patients Databases with Classifiers

Authors: Murat Koklu, Yavuz Unal

Abstract:

Data mining can be called as a technique to extract information from data. It is the process of obtaining hidden information and then turning it into qualified knowledge by statistical and artificial intelligence technique. One of its application areas is medical area to form decision support systems for diagnosis just by inventing meaningful information from given medical data. In this study a decision support system for diagnosis of illness that make use of data mining and three different artificial intelligence classifier algorithms namely Multilayer Perceptron, Naive Bayes Classifier and J.48. Pima Indian dataset of UCI Machine Learning Repository was used. This dataset includes urinary and blood test results of 768 patients. These test results consist of 8 different feature vectors. Obtained classifying results were compared with the previous studies. The suggestions for future studies were presented.

Keywords: Artificial Intelligence, Classifiers, Data Mining, Diabetic Patients.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5389

1829 Multi-Dimensional Concerns Mining for Web Applications via Concept-Analysis

Authors: Carlo Bellettini, Alessandro Marchetto, Andrea Trentini

Abstract:

Web applications have become very complex and crucial, especially when combined with areas such as CRM (Customer Relationship Management) and BPR (Business Process Reengineering), the scientific community has focused attention to Web applications design, development, analysis, and testing, by studying and proposing methodologies and tools. This paper proposes an approach to automatic multi-dimensional concern mining for Web Applications, based on concepts analysis, impact analysis, and token-based concern identification. This approach lets the user to analyse and traverse Web software relevant to a particular concern (concept, goal, purpose, etc.) via multi-dimensional separation of concerns, to document, understand and test Web applications. This technique was developed in the context of WAAT (Web Applications Analysis and Testing) project. A semi-automatic tool to support this technique is currently under development.

Keywords: Concepts Analysis, Concerns Mining, Multi-Dimensional Separation of Concerns, Impact Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1430

1828 Ensemble Approach for Predicting Student's Academic Performance

Authors: L. A. Muhammad, M. S. Argungu

Abstract:

Educational data mining (EDM) has recorded substantial considerations. Techniques of data mining in one way or the other have been proposed to dig out out-of-sight knowledge in educational data. The result of the study got assists academic institutions in further enhancing their process of learning and methods of passing knowledge to students. Consequently, the performance of students boasts and the educational products are by no doubt enhanced. This study adopted a student performance prediction model premised on techniques of data mining with Students' Essential Features (SEF). SEF are linked to the learner's interactivity with the e-learning management system. The performance of the student's predictive model is assessed by a set of classifiers, viz. Bayes Network, Logistic Regression, and Reduce Error Pruning Tree (REP). Consequently, ensemble methods of Bagging, Boosting, and Random Forest (RF) are applied to improve the performance of these single classifiers. The study reveals that the result shows a robust affinity between learners' behaviors and their academic attainment. Result from the study shows that the REP Tree and its ensemble record the highest accuracy of 83.33% using SEF. Hence, in terms of the Receiver Operating Curve (ROC), boosting method of REP Tree records 0.903, which is the best. This result further demonstrates the dependability of the proposed model.

Keywords: Ensemble, bagging, Random Forest, boosting, data mining, classifiers, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 677

1827 Identifying Project Delay Factors in the Australian Construction Industry

Authors: Syed Sohaib Bin Hasib, Hiyam Al-Kilidar

Abstract:

Meeting project deadlines is a major challenge for most construction projects. In this study, perceptions of contractors, clients, and consultants are compared relative to a list of factors derived from the review of the extant literature on project delay. 59 causes (categorized into 8 groups) of project delays were identified from the literature. A survey was devised to get insights and ranking of these factors from clients, consultants & contractors in the Australian construction industry. Findings showed that project delays in the Australian construction industry are mainly the result of skill shortages, interference in execution, and poor coordination and communication between the project stakeholders.

Keywords: Construction, delay factors, time delay, Australian construction industry.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1140

1826 Mining Network Data for Intrusion Detection through Naïve Bayesian with Clustering

Authors: Dewan Md. Farid, Nouria Harbi, Suman Ahmmed, Md. Zahidur Rahman, Chowdhury Mofizur Rahman

Abstract:

Network security attacks are the violation of information security policy that received much attention to the computational intelligence society in the last decades. Data mining has become a very useful technique for detecting network intrusions by extracting useful knowledge from large number of network data or logs. Naïve Bayesian classifier is one of the most popular data mining algorithm for classification, which provides an optimal way to predict the class of an unknown example. It has been tested that one set of probability derived from data is not good enough to have good classification rate. In this paper, we proposed a new learning algorithm for mining network logs to detect network intrusions through naïve Bayesian classifier, which first clusters the network logs into several groups based on similarity of logs, and then calculates the prior and conditional probabilities for each group of logs. For classifying a new log, the algorithm checks in which cluster the log belongs and then use that cluster-s probability set to classify the new log. We tested the performance of our proposed algorithm by employing KDD99 benchmark network intrusion detection dataset, and the experimental results proved that it improves detection rates as well as reduces false positives for different types of network intrusions.

Keywords: Clustering, detection rate, false positive, naïveBayesian classifier, network intrusion detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5497

1825 Investigating Crime Hotspot Places and their Implication to Urban Environmental Design: A Geographic Visualization and Data Mining Approach

Authors: Donna R. Tabangin, Jacqueline C. Flores, Nelson F. Emperador

Abstract:

Information is power. Geographical information is an emerging science that is advancing the development of knowledge to further help in the understanding of the relationship of “place" with other disciplines such as crime. The researchers used crime data for the years 2004 to 2007 from the Baguio City Police Office to determine the incidence and actual locations of crime hotspots. Combined qualitative and quantitative research methodology was employed through extensive fieldwork and observation, geographic visualization with Geographic Information Systems (GIS) and Global Positioning Systems (GPS), and data mining. The paper discusses emerging geographic visualization and data mining tools and methodologies that can be used to generate baseline data for environmental initiatives such as urban renewal and rejuvenation. The study was able to demonstrate that crime hotspots can be computed and were seen to be occurring to some select places in the Central Business District (CBD) of Baguio City. It was observed that some characteristics of the hotspot places- physical design and milieu may play an important role in creating opportunities for crime. A list of these environmental attributes was generated. This derived information may be used to guide the design or redesign of the urban environment of the City to be able to reduce crime and at the same time improve it physically.

Keywords: Crime mapping, data mining, environmental design, geographic visualization, GIS.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2561

1824 Knowledge-Driven Decision Support System Based on Knowledge Warehouse and Data Mining by Improving Apriori Algorithm with Fuzzy Logic

Authors: Pejman Hosseinioun, Hasan Shakeri, Ghasem Ghorbanirostam

Abstract:

In recent years, we have seen an increasing importance of research and study on knowledge source, decision support systems, data mining and procedure of knowledge discovery in data bases and it is considered that each of these aspects affects the others. In this article, we have merged information source and knowledge source to suggest a knowledge based system within limits of management based on storing and restoring of knowledge to manage information and improve decision making and resources. In this article, we have used method of data mining and Apriori algorithm in procedure of knowledge discovery one of the problems of Apriori algorithm is that, a user should specify the minimum threshold for supporting the regularity. Imagine that a user wants to apply Apriori algorithm for a database with millions of transactions. Definitely, the user does not have necessary knowledge of all existing transactions in that database, and therefore cannot specify a suitable threshold. Our purpose in this article is to improve Apriori algorithm. To achieve our goal, we tried using fuzzy logic to put data in different clusters before applying the Apriori algorithm for existing data in the database and we also try to suggest the most suitable threshold to the user automatically.

Keywords: Decision support system, data mining, knowledge discovery, data discovery, fuzzy logic.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2085

1823 Discovering Complex Regularities by Adaptive Self Organizing Classification

Authors: A. Faro, D. Giordano, F. Maiorana

Abstract:

Data mining uses a variety of techniques each of which is useful for some particular task. It is important to have a deep understanding of each technique and be able to perform sophisticated analysis. In this article we describe a tool built to simulate a variation of the Kohonen network to perform unsupervised clustering and support the entire data mining process up to results visualization. A graphical representation helps the user to find out a strategy to optmize classification by adding, moving or delete a neuron in order to change the number of classes. The tool is also able to automatically suggest a strategy for number of classes optimization.The tool is used to classify macroeconomic data that report the most developed countries? import and export. It is possible to classify the countries based on their economic behaviour and use an ad hoc tool to characterize the commercial behaviour of a country in a selected class from the analysis of positive and negative features that contribute to classes formation.

Keywords: Unsupervised classification, Kohonen networks, macroeconomics, Visual data mining, cluster interpretation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1520

1822 Hierarchical Clustering Algorithms in Data Mining

Authors: Z. Abdullah, A. R. Hamdan

Abstract:

Clustering is a process of grouping objects and data into groups of clusters to ensure that data objects from the same cluster are identical to each other. Clustering algorithms in one of the area in data mining and it can be classified into partition, hierarchical, density based and grid based. Therefore, in this paper we do survey and review four major hierarchical clustering algorithms called CURE, ROCK, CHAMELEON and BIRCH. The obtained state of the art of these algorithms will help in eliminating the current problems as well as deriving more robust and scalable algorithms for clustering.

Keywords: Clustering, method, algorithm, hierarchical, survey.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3338

1821 A Heuristics Approach for Fast Detecting Suspicious Money Laundering Cases in an Investment Bank

Authors: Nhien-An Le-Khac, Sammer Markos, M-Tahar Kechadi

Abstract:

Today, money laundering (ML) poses a serious threat not only to financial institutions but also to the nation. This criminal activity is becoming more and more sophisticated and seems to have moved from the cliché of drug trafficking to financing terrorism and surely not forgetting personal gain. Most international financial institutions have been implementing anti-money laundering solutions (AML) to fight investment fraud. However, traditional investigative techniques consume numerous man-hours. Recently, data mining approaches have been developed and are considered as well-suited techniques for detecting ML activities. Within the scope of a collaboration project for the purpose of developing a new solution for the AML Units in an international investment bank, we proposed a data mining-based solution for AML. In this paper, we present a heuristics approach to improve the performance for this solution. We also show some preliminary results associated with this method on analysing transaction datasets.

Keywords: data mining, anti money laundering, clustering, heuristics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3542

1820 Demand and Price Evolution Forecasting as Tools for Facilitating the RoadMapping Process of the Photonic Component Industry

Authors: T. Kamalakis, I. Neokosmidis, D. Varoutas, T. Sphicopoulos

Abstract:

The photonic component industry is a highly innovative industry with a large value chain. In order to ensure the growth of the industry much effort must be devoted to road mapping activities. In such activities demand and price evolution forecasting tools can prove quite useful in order to help in the roadmap refinement and update process. This paper attempts to provide useful guidelines in roadmapping of optical components and considers two models based on diffusion theory and the extended learning curve for demand and price evolution forecasting.

Keywords: Roadmapping, Photonic Components, Forecasting, Diffusion Theory.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1337

1819 A Study on Finding Similar Document with Multiple Categories

Authors: R. Saraçoğlu, N. Allahverdi

Abstract:

Searching similar documents and document management subjects have important place in text mining. One of the most important parts of similar document research studies is the process of classifying or clustering the documents. In this study, a similar document search approach that includes discussion of out the case of belonging to multiple categories (multiple categories problem) has been carried. The proposed method that based on Fuzzy Similarity Classification (FSC) has been compared with Rocchio algorithm and naive Bayes method which are widely used in text mining. Empirical results show that the proposed method is quite successful and can be applied effectively. For the second stage, multiple categories vector method based on information of categories regarding to frequency of being seen together has been used. Empirical results show that achievement is increased almost two times, when proposed method is compared with classical approach.

Keywords: Document similarity, Fuzzy classification, Multiple categories, Text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1667

1818 The Research Report of Employment Trends in Printing Industry for Prepress

Authors: Weera Chotithammaporn

Abstract:

This research aims to study employment trends in printing industry for prepress support by Suan Sunandha University Fund. The objectives of this research are to explain the trends of the employment in Thai Printing Industry for prepress in Bangkok and the description of different personnel that prepress entrepreneur need and also the problems of employment. The population of prepress entrepreneurs is about 100 organizations in the area of Bangkok. The questionnaires has been taken and analyzed with SPSS program by using the average percentage and standard deviation. This research is multiple case studies. The conceptual framework is developed on the basis of the open systems theory. The research result show that 1. The most of prepress entrepreneur have trend to choose the employee by any sex, the age 25-29 years old, bachelor degree and have 1-2 years experience. 2. The most problems are the understanding in job, communication/relation and the understanding in new technology. 3. The trends aims to employment in 1-3 years have 57.8% for prepress industry in Bangkok. This research suggests that: 1. Thai printing industry for prepress in Bangkok need quality employee that expert in printing technology. 2. Prepress entrepreneur should have agreement to development with university for practice the employee. 3. Prepress entrepreneur should support personal to fulfill the knowledge.

Keywords: Printing Industry, Prepress.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2210

1817 Lead and Cadmium Spatial Pattern and Risk Assessment around Coal Mine in Hyrcanian Forest, North Iran

Authors: Mahsa Tavakoli, Seyed Mohammad Hojjati, Yahya Kooch

Abstract:

In this study, the effect of coal mining activities on lead and cadmium concentrations and distribution in soil was investigated in Hyrcanian forest, North Iran. 16 plots (20×20 m²) were established by systematic-randomly (60×60 m²) in an area of 4 ha (200×200 m²-mine entrance placed at center). An area adjacent to the mine was not affected by the mining activity; considered as the controlled area. In order to investigate soil lead and cadmium concentration, one sample was taken from the 0-10 cm in each plot. To study the spatial pattern of soil properties and lead and cadmium concentrations in the mining area, an area of 80×80m² (the mine as the center) was considered and 80 soil samples were systematic-randomly taken (10 m intervals). Geostatistical analysis was performed via Kriging method and GS⁺software (version 5.1). In order to estimate the impact of coal mining activities on soil quality, pollution index was measured. Lead and cadmium concentrations were significantly higher in mine area (Pb: 10.97±0.30, Cd: 184.47±6.26 mg.kg^-1) in comparison to control area (Pb: 9.42±0.17, Cd: 131.71±15.77 mg.kg^-1). The mean values of the PI index indicate that Pb (1.16) and Cd (1.77) presented slightly polluted. Results of the NIPI index showed that Pb (1.44) and Cd (2.52) presented slight pollution and moderate pollution respectively. Results of variography and kriging method showed that it is possible to prepare interpolation maps of lead and cadmium around the mining areas in Hyrcanian forest. According to results of pollution and risk assessments, forest soil was contaminated by heavy metals (lead and cadmium); therefore, using reclamation and remediation techniques in these areas is necessary.

Keywords: Traditional coal mining, heavy metals, pollution indicators, geostatistics, caspian forest.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 995

1816 Policy of Tourism and Opportunities of Development of Wellness Industry in Georgia

Authors: G. Erkomaishvili, R. Gvelesiani, E. Kharaishvili, M. Chavleishvili

Abstract:

The topic reviews the situation existing currently in Georgia in the field of tourism in conditions of globalization: Touristic resources, the paces of development of the tourism infrastructure, tourism policy, possibilities of development of the Wellness industry in Georgia, that is the newest direction of the medical tourism. The factors impeding the development of the industry of tourism, namely – existence of the conflict zones, high rates of the bank credits, deficiencies associated with the tax laws, a level of infrastructural development, quality of services, deficit in the competitive staff, increase of prices in the peak seasons, insufficient promotion of the touristic opportunities of Georgia on the international markets are studied and analyzed. Besides, the level of development of tourism in Georgia according to the World Economic Forum, aspects of cooperation with the European Union, etc., is reviewed. As a result of these studies, a strategy of development of tourism and one of its direction – Wellness industry in Georgia, is introduced with the relevant conclusions, on which basis the recommendations are provided.

Keywords: Tourism, Tourism Policy, Wellness Industry.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2817

1815 Parallel and Distributed Mining of Association Rule on Knowledge Grid

Authors: U. Sakthi, R. Hemalatha, R. S. Bhuvaneswaran

Abstract:

In Virtual organization, Knowledge Discovery (KD) service contains distributed data resources and computing grid nodes. Computational grid is integrated with data grid to form Knowledge Grid, which implements Apriori algorithm for mining association rule on grid network. This paper describes development of parallel and distributed version of Apriori algorithm on Globus Toolkit using Message Passing Interface extended with Grid Services (MPICHG2). The creation of Knowledge Grid on top of data and computational grid is to support decision making in real time applications. In this paper, the case study describes design and implementation of local and global mining of frequent item sets. The experiments were conducted on different configurations of grid network and computation time was recorded for each operation. We analyzed our result with various grid configurations and it shows speedup of computation time is almost superlinear.

Keywords: Association rule, Grid computing, Knowledge grid, Mobility prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2134

1814 Solid Waste Management in Steel Industry - Challenges and Opportunities

Authors: Sushovan Sarkar, Debabrata Mazumder

Abstract:

Solid waste management in steel industry is broadly classified in “4 Rs” i.e. reduce, reuse, recycle and restore the materials. Reuse and recycling the entire solid waste generated in the process of steel making is a viable solution in targeting a clean, green and zero waste technology leading to sustainable development of the steel industry. Solid waste management has gained importance in steel industry in view of its uncertainty, volatility and speculation due to world competitive standards, rising input costs, scarcity of raw materials and solid waste generated like in other sectors. The challenges that the steel Industry faces today are the requirement of a sustainable development by meeting the needs of our present generation without compromising the ability of future generations. Technologies are developed not only for gainful utilization of solid wastes in manufacture of conventional products but also for conversion of same in to completely new products.

Keywords: Reuse, recycle, solid waste, sustainable development.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 8490

1813 Analysis of Sequence Moves in Successful Chess Openings Using Data Mining with Association Rules

Authors: R.M.Rani

Abstract:

Chess is one of the indoor games, which improves the level of human confidence, concentration, planning skills and knowledge. The main objective of this paper is to help the chess players to improve their chess openings using data mining techniques. Budding Chess Players usually do practices by analyzing various existing openings. When they analyze and correlate thousands of openings it becomes tedious and complex for them. The work done in this paper is to analyze the best lines of Blackmar- Diemer Gambit(BDG) which opens with White D4... using data mining analysis. It is carried out on the collection of winning games by applying association rules. The first step of this analysis is assigning variables to each different sequence moves. In the second step, the sequence association rules were generated to calculate support and confidence factor which help us to find the best subsequence chess moves that may lead to winning position.

Keywords: Blackmar-Diemer Gambit(BDG), Confidence, sequence Association Rules, Support.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3036

1812 Designing Software Quality Measurement System for Telecommunication Industry Using Object-Oriented Technique

Authors: Nor Fazlina Iryani Abdul Hamid, Mohamad Khatim Hasan

Abstract:

Numbers of software quality measurement system have been implemented over the past few years, but none of them focuses on telecommunication industry. Software quality measurement system for telecommunication industry was a system that could calculate the quality value of the measured software that totally focused in telecommunication industry. Before designing a system, quality factors, quality attributes and quality metrics were identified based on literature review and survey. Then, using the identified quality factors, quality attributes and quality metrics, quality model for telecommunication industry was constructed. Each identified quality metrics had its own formula. Quality value for the system was measured based on the quality metrics and aggregated by referring to the quality model. It would classify the quality level of the software based on Net Satisfaction Index (NSI). The system was designed using object-oriented approach in web-based environment. Thus, existing of software quality measurement system was important to both developers and users in order to produce high quality software product for telecommunication industry.

Keywords: Software Quality, Quality Measurement, Object-oriented Approach, Net satisfaction Index.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2412

1811 Role of Lemna minor Lin. in Treating the Textile Industry Wastewater

Authors: D. Sivakumar

Abstract:

Textile industry processes are among the most environmentally unfriendly industrial processes; because, they produce color wastewater that is heavily polluted the environment. Therefore, textile industry wastewater has to be treated before being discharged into the environment. In this study, experiments were conducted for different process parameters like nutrient dosage and dilution ratio against the pH and contact time to remove COD and color in a textile industrial wastewater using aquatic macrophytes Lemna minor L. The experimental results showed that the maximum percentage reduction of COD and color in a textile industry wastewater by Lemna minor L. was obtained at an optimum nutrient dosage of 50g, dilution ratio of 8, pH of 8 and contact time of 4 days. Similarly, the results of validation experiments showed that the experiments were able to reproduce the obtained optimum process parameters. The maximum removal percentage of color in an aqueous solution (86.35%) is higher than the removal of color in a textile industry wastewater (82.85). Further, the first order kinetic model was fitted well with the experimental data of this present study. Finally, this study concluded that Lemna minor L. may be used for removing all types of parameters in any type of textile industry wastewater.

Keywords: Aquatic Macrophyte, Process Parameters, Textile Industry Wastewater.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2566

1810 Multi-Hazard Risk Assessment and Management in Tourism Industry- A Case Study from the Island of Taiwan

Authors: Chung-Hung Tsai

Abstract:

Global environmental changes lead to increased frequency and scale of natural disaster, Taiwan is under the influence of global warming and extreme weather. Therefore, the vulnerability was increased and variability and complexity of disasters is relatively enhanced. The purpose of this study is to consider the source and magnitude of hazard characteristics on the tourism industry. Using modern risk management concepts, integration of related domestic and international basic research, this goes beyond the Taiwan typhoon disaster risk assessment model and evaluation of loss. This loss evaluation index system considers the impact of extreme weather, in particular heavy rain on the tourism industry in Taiwan. Consider the extreme climate of the compound impact of disaster for the tourism industry; we try to make multi-hazard risk assessment model, strategies and suggestions. Related risk analysis results are expected to provide government department, the tourism industry asset owners, insurance companies and banking include tourist disaster risk necessary information to help its tourism industry for effective natural disaster risk management.

Keywords: Tourism industry, extreme weather, multi-hazard, vulnerability analysis, loss exceeding probability, risk management.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3778

1809 An Educational Data Mining System for Advising Higher Education Students

Authors: Heba Mohammed Nagy, Walid Mohamed Aly, Osama Fathy Hegazy

Abstract:

Educational data mining is a specific data mining field applied to data originating from educational environments, it relies on different approaches to discover hidden knowledge from the available data. Among these approaches are machine learning techniques which are used to build a system that acquires learning from previous data. Machine learning can be applied to solve different regression, classification, clustering and optimization problems.

In our research, we propose a “Student Advisory Framework” that utilizes classification and clustering to build an intelligent system. This system can be used to provide pieces of consultations to a first year university student to pursue a certain education track where he/she will likely succeed in, aiming to decrease the high rate of academic failure among these students. A real case study in Cairo Higher Institute for Engineering, Computer Science and Management is presented using real dataset collected from 2000−2012.The dataset has two main components: pre-higher education dataset and first year courses results dataset. Results have proved the efficiency of the suggested framework.

Keywords: Classification, Clustering, Educational Data Mining (EDM), Machine Learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5171

1808 Using Data Mining Technique for Scholarship Disbursement

Authors: J. K. Alhassan, S. A. Lawal

Abstract:

This work is on decision tree-based classification for the disbursement of scholarship. Tree-based data mining classification technique is used in other to determine the generic rule to be used to disburse the scholarship. The system based on the defined rules from the tree is able to determine the class (status) to which an applicant shall belong whether Granted or Not Granted. The applicants that fall to the class of granted denote a successful acquirement of scholarship while those in not granted class are unsuccessful in the scheme. An algorithm that can be used to classify the applicants based on the rules from tree-based classification was also developed. The tree-based classification is adopted because of its efficiency, effectiveness, and easy to comprehend features. The system was tested with the data of National Information Technology Development Agency (NITDA) Abuja, a Parastatal of Federal Ministry of Communication Technology that is mandated to develop and regulate information technology in Nigeria. The system was found working according to the specification. It is therefore recommended for all scholarship disbursement organizations.

Keywords: Decision tree, classification, data mining, scholarship.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2105

1807 Development of Industry Sector Specific Factory Standards

Authors: Peter Burggräf, Moritz Krunke, Hanno Voet

Abstract:

Due to shortening product and technology lifecycles, many companies use standardization approaches in product development and factory planning to reduce costs and time to market. Unlike large companies, where modular systems are already widely used, small and medium-sized companies often show a much lower degree of standardization due to lower scale effects and missing capacities for the development of these standards. To overcome these challenges, the development of industry sector specific standards in cooperations or by third parties is an interesting approach. This paper analyzes which branches that are mainly dominated by small or medium-sized companies might be especially interesting for the development of factory standards using the example of the German industry. For this, a key performance indicator based approach was developed that will be presented in detail with its specific results for the German industry structure.

Keywords: Factory planning, factory standards, industry sector specific standardization, production planning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1176

1806 Text Mining of Twitter Data Using a Latent Dirichlet Allocation Topic Model and Sentiment Analysis

Authors: Sidi Yang, Haiyi Zhang

Abstract:

Twitter is a microblogging platform, where millions of users daily share their attitudes, views, and opinions. Using a probabilistic Latent Dirichlet Allocation (LDA) topic model to discern the most popular topics in the Twitter data is an effective way to analyze a large set of tweets to find a set of topics in a computationally efficient manner. Sentiment analysis provides an effective method to show the emotions and sentiments found in each tweet and an efficient way to summarize the results in a manner that is clearly understood. The primary goal of this paper is to explore text mining, extract and analyze useful information from unstructured text using two approaches: LDA topic modelling and sentiment analysis by examining Twitter plain text data in English. These two methods allow people to dig data more effectively and efficiently. LDA topic model and sentiment analysis can also be applied to provide insight views in business and scientific fields.

Keywords: Text mining, Twitter, topic model, sentiment analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1743

1805 Using Textual Pre-Processing and Text Mining to Create Semantic Links

Authors: Ricardo Avila, Gabriel Lopes, Vania Vidal, Jose Macedo

Abstract:

This article offers a approach to the automatic discovery of semantic concepts and links in the domain of Oil Exploration and Production (E&P). Machine learning methods combined with textual pre-processing techniques were used to detect local patterns in texts and, thus, generate new concepts and new semantic links. Even using more specific vocabularies within the oil domain, our approach has achieved satisfactory results, suggesting that the proposal can be applied in other domains and languages, requiring only minor adjustments.

Keywords: Semantic links, data mining, linked data, SKOS.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1005

1804 The Relevance of Data Warehousing and Data Mining in the Field of Evidence-based Medicine to Support Healthcare Decision Making

Authors: Nevena Stolba, A Min Tjoa

Abstract:

Evidence-based medicine is a new direction in modern healthcare. Its task is to prevent, diagnose and medicate diseases using medical evidence. Medical data about a large patient population is analyzed to perform healthcare management and medical research. In order to obtain the best evidence for a given disease, external clinical expertise as well as internal clinical experience must be available to the healthcare practitioners at right time and in the right manner. External evidence-based knowledge can not be applied directly to the patient without adjusting it to the patient-s health condition. We propose a data warehouse based approach as a suitable solution for the integration of external evidence-based data sources into the existing clinical information system and data mining techniques for finding appropriate therapy for a given patient and a given disease. Through integration of data warehousing, OLAP and data mining techniques in the healthcare area, an easy to use decision support platform, which supports decision making process of care givers and clinical managers, is built. We present three case studies, which show, that a clinical data warehouse that facilitates evidence-based medicine is a reliable, powerful and user-friendly platform for strategic decision making, which has a great relevance for the practice and acceptance of evidence-based medicine.

Keywords: data mining, data warehousing, decision-support systems, evidence-based medicine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3743

1803 Application of Advanced Remote Sensing Data in Mineral Exploration in the Vicinity of Heavy Dense Forest Cover Area of Jharkhand and Odisha State Mining Area

Authors: Hemant Kumar, R. N. K. Sharma, A. P. Krishna

Abstract:

The study has been carried out on the Saranda in Jharkhand and a part of Odisha state. Geospatial data of Hyperion, a remote sensing satellite, have been used. This study has used a wide variety of patterns related to image processing to enhance and extract the mining class of Fe and Mn ores.Landsat-8, OLI sensor data have also been used to correctly explore related minerals. In this way, various processes have been applied to increase the mineralogy class and comparative evaluation with related frequency done. The Hyperion dataset for hyperspectral remote sensing has been specifically verified as an effective tool for mineral or rock information extraction within the band range of shortwave infrared used. The abundant spatial and spectral information contained in hyperspectral images enables the differentiation of different objects of any object into targeted applications for exploration such as exploration detection, mining.

Keywords: Hyperion, hyperspectral, sensor, Landsat-8.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 548

1802 The Current Awareness of Just-In-Time Techniques within the Libyan Textile Private Industry: A Case Study

Authors: Rajab Abdullah Hokoma

Abstract:

Almost all Libyan industries (both private and public) have struggled with many difficulties during the past three decades due to many problems. These problems have created a strongly negative impact on the productivity and utilization of many companies within Libya. This paper studies the current awareness and implementation levels of Just-In-Time (JIT) within the Libyan Textile private industry. A survey has been applied in this study using an intensive detailed questionnaire. Based on the analysis of the survey responses, the results show that the management body within the surveyed companies has a modest strategy towards most of the areas that are considered as being very crucial in any successful implementation of JIT. The results also show a variation within the implementation levels of the JIT elements as these varies between Low and Acceptable levels. The paper has also identified limitations within the investigated areas within this industry, and has pointed to areas where senior managers within the Libyan textile industry should take immediate actions in order to achieve effective implementation of JIT within their companies.

Keywords: Industry, questionnaire, JIT, textile.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2001