Search results for: data stream mining
7452 A Location Routing Model for the Logistic System in the Mining Collection Centers of the Northern Region of Boyacá-Colombia
Authors: Erika Ruíz, Luis Amaya, Diego Carreño
Abstract:
The main objective of this study is to design a mathematical model for the logistics of mining collection centers in the northern region of the department of Boyacá (Colombia), determining the structure that facilitates the flow of products along the supply chain. In order to achieve this, it is necessary to define a suitable design of the distribution network, taking into account the products, customer’s characteristics and the availability of information. Likewise, some other aspects must be defined, such as number and capacity of collection centers to establish, routes that must be taken to deliver products to the customers, among others. This research will use one of the operation research problems, which is used in the design of distribution networks known as Location Routing Problem (LRP).
Keywords: Location routing problem, logistic, mining collection, model.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7897451 Numerical Simulation of the Flow Field around a Vertical Flat Plate of Infinite Extent
Authors: Marco Raciti Castelli, Paolo Cioppa, Ernesto Benini
Abstract:
This paper presents a CFD analysis of the flow field around a thin flat plate of infinite span inclined at 90° to a fluid stream of infinite extent. Numerical predictions have been compared to experimental measurements, in order to assess the potential of the finite volume code of determining the aerodynamic forces acting on a bluff body invested by a fluid stream of infinite extent. Several turbulence models and spatial node distributions have been tested. Flow field characteristics in the neighborhood of the flat plate have been investigated, allowing the development of a preliminary procedure to be used as guidance in selecting the appropriate grid configuration and the corresponding turbulence model for the prediction of the flow field over a two-dimensional vertical flat plate.Keywords: CFD, vertical flat plate, aerodynamic force
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 26307450 A Keyword-Based Filtering Technique of Document-Centric XML using NFA Representation
Authors: Changwoo Byun, Kyounghan Lee, Seog Park
Abstract:
XML is becoming a de facto standard for online data exchange. Existing XML filtering techniques based on a publish/subscribe model are focused on the highly structured data marked up with XML tags. These techniques are efficient in filtering the documents of data-centric XML but are not effective in filtering the element contents of the document-centric XML. In this paper, we propose an extended XPath specification which includes a special matching character '%' used in the LIKE operation of SQL in order to solve the difficulty of writing some queries to adequately filter element contents using the previous XPath specification. We also present a novel technique for filtering a collection of document-centric XMLs, called Pfilter, which is able to exploit the extended XPath specification. We show several performance studies, efficiency and scalability using the multi-query processing time (MQPT).Keywords: XML Data Stream, Document-centric XML, Filtering Technique, Value-based Predicates.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17607449 Data Mining to Capture User-Experience: A Case Study in Notebook Product Appearance Design
Authors: Rhoann Kerh, Chen-Fu Chien, Kuo-Yi Lin
Abstract:
In the era of rapidly increasing notebook market, consumer electronics manufacturers are facing a highly dynamic and competitive environment. In particular, the product appearance is the first part for user to distinguish the product from the product of other brands. Notebook product should differ in its appearance to engage users and contribute to the user experience (UX). The UX evaluates various product concepts to find the design for user needs; in addition, help the designer to further understand the product appearance preference of different market segment. However, few studies have been done for exploring the relationship between consumer background and the reaction of product appearance. This study aims to propose a data mining framework to capture the user’s information and the important relation between product appearance factors. The proposed framework consists of problem definition and structuring, data preparation, rules generation, and results evaluation and interpretation. An empirical study has been done in Taiwan that recruited 168 subjects from different background to experience the appearance performance of 11 different portable computers. The results assist the designers to develop product strategies based on the characteristics of consumers and the product concept that related to the UX, which help to launch the products to the right customers and increase the market shares. The results have shown the practical feasibility of the proposed framework.
Keywords: Consumers Decision Making, Product Design, Rough Set Theory, User Experience.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 35127448 Geochemistry of Natural Radionuclides Associated with Acid Mine Drainage (AMD) in a Coal Mining Area in Southern Brazil
Authors: Juliana A. Galhardi, Daniel M. Bonotto
Abstract:
Coal is an important non-renewable energy source of and can be associated with radioactive elements. In Figueira city, Paraná state, Brazil, it was recorded high uranium activity near the coal mine that supplies a local thermoelectric power plant. In this context, the radon activity (Rn-222, produced by the Ra-226 decay in the U-238 natural series) was evaluated in groundwater, river water and effluents produced from the acid mine drainage in the coal reject dumps. The samples were collected in August 2013 and in February 2014 and analyzed at LABIDRO (Laboratory of Isotope and Hydrochemistry), UNESP, Rio Claro city, Brazil, using an alpha spectrometer (AlphaGuard) adjusted to evaluate the mean radon activity concentration in five cycles of 10 minutes. No radon activity concentration above 100 Bq.L-1, which was a previous critic value established by the World Health Organization. The average radon activity concentration in groundwater was higher than in surface water and in effluent samples, possibly due to the accumulation of uranium and radium in the aquifer layers that favors the radon trapping. The lower value in the river waters can indicate dilution and the intermediate value in the effluents may indicate radon absorption in the coal particles of the reject dumps. The results also indicate that the radon activities in the effluents increase with the sample acidification, possibly due to the higher radium leaching and the subsequent radon transport to the drainage flow. The water samples of Laranjinha River and Ribeirão das Pedras stream, which, respectively, supply Figueira city and receive the mining effluent, exhibited higher pH values upstream the mine, reflecting the acid mine drainage discharge. The radionuclides transport indicates the importance of monitoring their activity concentration in natural waters due to the risks that the radioactivity can represent to human health.Keywords: Radon, radium, acid mine drainage, coal
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20497447 Dynamical Analysis of Circadian Gene Expression
Authors: Carla Layana Luis Diambra
Abstract:
Microarrays technique allows the simultaneous measurements of the expression levels of thousands of mRNAs. By mining this data one can identify the dynamics of the gene expression time series. By recourse of principal component analysis, we uncover the circadian rhythmic patterns underlying the gene expression profiles from Cyanobacterium Synechocystis. We applied PCA to reduce the dimensionality of the data set. Examination of the components also provides insight into the underlying factors measured in the experiments. Our results suggest that all rhythmic content of data can be reduced to three main components.
Keywords: circadian rhythms, clustering, gene expression, PCA.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15927446 Arsenic Mobility from Mining Tailings of Monte San Nicolas to Presa de Mata in Guanajuato, Mexico
Authors: I. Cano-Aguilera, B. E. Rubio-Campos, G. De la Rosa, A. F. Aguilera-Alvarado
Abstract:
Mining tailings represent a generating source of rich heavy metal material with a potential danger the public health and the environment, since these metals, under certain conditions, can leach and contaminate aqueous systems that serve like supplying potable water sources. The strategy for this work is based on the observation, experimentation and the simulation that can be obtained by binding real answers of the hydrodynamic behavior of metals leached from mining tailings, and the applied mathematics that provides the logical structure to decipher the individual effects of the general physicochemical phenomenon. The case of study presented herein focuses on mining tailings deposits located in Monte San Nicolas, Guanajuato, Mexico, an abandoned mine. This was considered the contamination source that under certain physicochemical conditions can favor the metal leaching, and its transport towards aqueous systems. In addition, the cartography, meteorology, geology and the hydrodynamics and hydrological characteristics of the place, will be helpful in determining the way and the time in which these systems can interact. Preliminary results demonstrated that arsenic presents a great mobility, since this one was identified in several superficial aqueous systems of the micro watershed, as well as in sediments in concentrations that exceed the established maximum limits in the official norms. Also variations in pH and potential oxide-reduction were registered, conditions that favor the presence of different species from this element its solubility and therefore its mobility.
Keywords: Arsenic, mining tailings, transport.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16897445 Conceptual Multidimensional Model
Authors: Manpreet Singh, Parvinder Singh, Suman
Abstract:
The data is available in abundance in any business organization. It includes the records for finance, maintenance, inventory, progress reports etc. As the time progresses, the data keep on accumulating and the challenge is to extract the information from this data bank. Knowledge discovery from these large and complex databases is the key problem of this era. Data mining and machine learning techniques are needed which can scale to the size of the problems and can be customized to the application of business. For the development of accurate and required information for particular problem, business analyst needs to develop multidimensional models which give the reliable information so that they can take right decision for particular problem. If the multidimensional model does not possess the advance features, the accuracy cannot be expected. The present work involves the development of a Multidimensional data model incorporating advance features. The criterion of computation is based on the data precision and to include slowly change time dimension. The final results are displayed in graphical form.Keywords: Multidimensional, data precision.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14587444 Methods for Distinction of Cattle Using Supervised Learning
Authors: Radoslav Židek, Veronika Šidlová, Radovan Kasarda, Birgit Fuerst-Waltl
Abstract:
Machine learning represents a set of topics dealing with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction, based on models derived from existing data. The data can present identification patterns which are used to classify into groups. The result of the analysis is the pattern which can be used for identification of data set without the need to obtain input data used for creation of this pattern. An important requirement in this process is careful data preparation validation of model used and its suitable interpretation. For breeders, it is important to know the origin of animals from the point of the genetic diversity. In case of missing pedigree information, other methods can be used for traceability of animal´s origin. Genetic diversity written in genetic data is holding relatively useful information to identify animals originated from individual countries. We can conclude that the application of data mining for molecular genetic data using supervised learning is an appropriate tool for hypothesis testing and identifying an individual.
Keywords: Genetic data, Pinzgau cattle, supervised learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23187443 Spatial Audio Player Using Musical Genre Classification
Authors: Jun-Yong Lee, Hyoung-Gook Kim
Abstract:
In this paper, we propose a smart music player that combines the musical genre classification and the spatial audio processing. The musical genre is classified based on content analysis of the musical segment detected from the audio stream. In parallel with the classification, the spatial audio quality is achieved by adding an artificial reverberation in a virtual acoustic space to the input mono sound. Thereafter, the spatial sound is boosted with the given frequency gains based on the musical genre when played back. Experiments measured the accuracy of detecting the musical segment from the audio stream and its musical genre classification. A listening test was performed based on the virtual acoustic space based spatial audio processing.
Keywords: Automatic equalization, genre classification, music segment detection, spatial audio processing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16247442 Design and Performance Analysis of a Supersonic Diffuser for Plasma Wing Tunnel
Authors: R.S Pugazenthi, Andy C. McIntosh
Abstract:
Plasma Wind Tunnels (PWT) are extensively used for screening and qualification of re-entry Thermel Protection System (TPS) materials. Proper design of a supersonic diffuser for plasma wind tunnel is of importance for achieving good pressurerecovery (thereby reducing vacuum pumping requirement & run time costs) and isolating downstream stream fluctuations from propagating costs) and isolating downstream stream fluctuationnts the details of a rapid design methodology successfully employed for designing supersonic diffuser for high power (several megawatts)plasma wind tunnels and numerical performance analysis of a diffuser configuration designed for one megawatt power rated plasma wind tunnel(enthalpy ~ 30 MJ/kg) using FLUENT 6.3® solver for different diffuser operating sub-atmospheric back-pressures.Keywords: Compressible flow, plasma wind tunnel, re-entry, supersonic diffuser
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 39227441 A Modified Fuzzy C-Means Algorithm for Natural Data Exploration
Authors: Binu Thomas, Raju G., Sonam Wangmo
Abstract:
In Data mining, Fuzzy clustering algorithms have demonstrated advantage over crisp clustering algorithms in dealing with the challenges posed by large collections of vague and uncertain natural data. This paper reviews concept of fuzzy logic and fuzzy clustering. The classical fuzzy c-means algorithm is presented and its limitations are highlighted. Based on the study of the fuzzy c-means algorithm and its extensions, we propose a modification to the cmeans algorithm to overcome the limitations of it in calculating the new cluster centers and in finding the membership values with natural data. The efficiency of the new modified method is demonstrated on real data collected for Bhutan-s Gross National Happiness (GNH) program.Keywords: Adaptive fuzzy clustering, clustering, fuzzy logic, fuzzy clustering, c-means.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19907440 Word Recognition and Learning based on Associative Memories and Hidden Markov Models
Authors: Zöhre Kara Kayikci, Günther Palm
Abstract:
A word recognition architecture based on a network of neural associative memories and hidden Markov models has been developed. The input stream, composed of subword-units like wordinternal triphones consisting of diphones and triphones, is provided to the network of neural associative memories by hidden Markov models. The word recognition network derives words from this input stream. The architecture has the ability to handle ambiguities on subword-unit level and is also able to add new words to the vocabulary during performance. The architecture is implemented to perform the word recognition task in a language processing system for understanding simple command sentences like “bot show apple".Keywords: Hebbian learning, hidden Markov models, neuralassociative memories, word recognition.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15247439 Estimation of Methane from Hydrocarbon Exploration and Production in India
Authors: A. K. Pathak, K. Ojha
Abstract:
Methane is the second most important greenhouse gas (GHG) after carbon dioxide. Amount of methane emission from energy sector is increasing day by day with various activities. In present work, various sources of methane emission from upstream, middle stream and downstream of oil & gas sectors are identified and categorised as per IPCC-2006 guidelines. Data were collected from various oil & gas sector like (i) exploration & production of oil & gas (ii) supply through pipelines (iii) refinery throughput & production (iv) storage & transportation (v) usage. Methane emission factors for various categories were determined applying Tier-II and Tier-I approach using the collected data. Total methane emission from Indian Oil & Gas sectors was thus estimated for the year 1990 to 2007.Keywords: Carbon credit, Climate change, Methane emission, Oil & Gas production
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21407438 Experimental Parallel Architecture for Rendering 3D Model into MPEG-4 Format
Authors: Ajay Joshi, Surya Ismail
Abstract:
This paper will present the initial findings of a research into distributed computer rendering. The goal of the research is to create a distributed computer system capable of rendering a 3D model into an MPEG-4 stream. This paper outlines the initial design, software architecture and hardware setup for the system. Distributed computing means designing and implementing programs that run on two or more interconnected computing systems. Distributed computing is often used to speed up the rendering of graphical imaging. Distributed computing systems are used to generate images for movies, games and simulations. A topic of interest is the application of distributed computing to the MPEG-4 standard. During the course of the research, a distributed system will be created that can render a 3D model into an MPEG-4 stream. It is expected that applying distributed computing principals will speed up rendering, thus improving the usefulness and efficiency of the MPEG-4 standardKeywords: Cluster, parallel architecture, rendering, MPEG-4.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14627437 Amine Solution Recovery Package and Controlling Corrosion in Regeneration Tower
Authors: A.Atash J ameh
Abstract:
Sarkhoon gas plant, located in south of Iran, has been installed to removal H2S contained in a high pressure natural gas stream. The solvent used for the H2S removal from gaseous stream is 34% by weight (wt%) Di-ethanol amine (DEA) solutions. Due to increasing concentration of heat stable salt (HSS) in solvent, corrosivity of amine solution had been increased. Reports indicated that there was corrosion on the shell of regeneration column. Because source formation of HSS was unknown, we decided to control the amount of HSS at the limit less than 3% wt amine solvent. Therefore, two small columns were filled by strong anionic base and carbon active, and then polluted amine was passed through beds. Finally a temporary amine recovery package on industrial scale was made based on laboratory’s results. From economical point of view we could save $700000 beside corrosion occurrence of the stripping column has been vigorously decreased.
Keywords: Amines, corrosion, heat stable salt, resin anionic.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 26757436 Analysis of Textual Data Based On Multiple 2-Class Classification Models
Authors: Shigeaki Sakurai, Ryohei Orihara
Abstract:
This paper proposes a new method for analyzing textual data. The method deals with items of textual data, where each item is described based on various viewpoints. The method acquires 2- class classification models of the viewpoints by applying an inductive learning method to items with multiple viewpoints. The method infers whether the viewpoints are assigned to the new items or not by using the models. The method extracts expressions from the new items classified into the viewpoints and extracts characteristic expressions corresponding to the viewpoints by comparing the frequency of expressions among the viewpoints. This paper also applies the method to questionnaire data given by guests at a hotel and verifies its effect through numerical experiments.
Keywords: Text mining, Multiple viewpoints, Differential analysis, Questionnaire data
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12907435 Multidimensional Data Mining by Means of Randomly Travelling Hyper-Ellipsoids
Authors: Pavel Y. Tabakov, Kevin Duffy
Abstract:
The present study presents a new approach to automatic data clustering and classification problems in large and complex databases and, at the same time, derives specific types of explicit rules describing each cluster. The method works well in both sparse and dense multidimensional data spaces. The members of the data space can be of the same nature or represent different classes. A number of N-dimensional ellipsoids are used for enclosing the data clouds. Due to the geometry of an ellipsoid and its free rotation in space the detection of clusters becomes very efficient. The method is based on genetic algorithms that are used for the optimization of location, orientation and geometric characteristics of the hyper-ellipsoids. The proposed approach can serve as a basis for the development of general knowledge systems for discovering hidden knowledge and unexpected patterns and rules in various large databases.Keywords: Classification, clustering, data minig, genetic algorithms.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17727434 A Preference-Based Multi-Agent Data Mining Framework for Social Network Service Users' Decision Making
Authors: Ileladewa Adeoye Abiodun, Cheng Wai Khuen
Abstract:
Multi-Agent Systems (MAS) emerged in the pursuit to improve our standard of living, and hence can manifest complex human behaviors such as communication, decision making, negotiation and self-organization. The Social Network Services (SNSs) have attracted millions of users, many of whom have integrated these sites into their daily practices. The domains of MAS and SNS have lots of similarities such as architecture, features and functions. Exploring social network users- behavior through multiagent model is therefore our research focus, in order to generate more accurate and meaningful information to SNS users. An application of MAS is the e-Auction and e-Rental services of the Universiti Cyber AgenT(UniCAT), a Social Network for students in Universiti Tunku Abdul Rahman (UTAR), Kampar, Malaysia, built around the Belief- Desire-Intention (BDI) model. However, in spite of the various advantages of the BDI model, it has also been discovered to have some shortcomings. This paper therefore proposes a multi-agent framework utilizing a modified BDI model- Belief-Desire-Intention in Dynamic and Uncertain Situations (BDIDUS), using UniCAT system as a case study.
Keywords: Distributed Data Mining, Multi-Agent Systems, Preference-Based, SNS.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15027433 Multi-Dimensional Concerns Mining for Web Applications via Concept-Analysis
Authors: Carlo Bellettini, Alessandro Marchetto, Andrea Trentini
Abstract:
Web applications have become very complex and crucial, especially when combined with areas such as CRM (Customer Relationship Management) and BPR (Business Process Reengineering), the scientific community has focused attention to Web applications design, development, analysis, and testing, by studying and proposing methodologies and tools. This paper proposes an approach to automatic multi-dimensional concern mining for Web Applications, based on concepts analysis, impact analysis, and token-based concern identification. This approach lets the user to analyse and traverse Web software relevant to a particular concern (concept, goal, purpose, etc.) via multi-dimensional separation of concerns, to document, understand and test Web applications. This technique was developed in the context of WAAT (Web Applications Analysis and Testing) project. A semi-automatic tool to support this technique is currently under development.Keywords: Concepts Analysis, Concerns Mining, Multi-Dimensional Separation of Concerns, Impact Analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14737432 Mining Implicit Knowledge to Predict Political Risk by Providing Novel Framework with Using Bayesian Network
Authors: Siavash Asadi Ghajarloo
Abstract:
Nowadays predicting political risk level of country has become a critical issue for investors who intend to achieve accurate information concerning stability of the business environments. Since, most of the times investors are layman and nonprofessional IT personnel; this paper aims to propose a framework named GECR in order to help nonexpert persons to discover political risk stability across time based on the political news and events. To achieve this goal, the Bayesian Networks approach was utilized for 186 political news of Pakistan as sample dataset. Bayesian Networks as an artificial intelligence approach has been employed in presented framework, since this is a powerful technique that can be applied to model uncertain domains. The results showed that our framework along with Bayesian Networks as decision support tool, predicted the political risk level with a high degree of accuracy.Keywords: Bayesian Networks, Data mining, GECRframework, Predicting political risk.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21747431 A Hybrid Approach for Thread Recommendation in MOOC Forums
Authors: Ahmad. A. Kardan, Amir Narimani, Foozhan Ataiefard
Abstract:
Recommender Systems have been developed to provide contents and services compatible to users based on their behaviors and interests. Due to information overload in online discussion forums and users diverse interests, recommending relative topics and threads is considered to be helpful for improving the ease of forum usage. In order to lead learners to find relevant information in educational forums, recommendations are even more needed. We present a hybrid thread recommender system for MOOC forums by applying social network analysis and association rule mining techniques. Initial results indicate that the proposed recommender system performs comparatively well with regard to limited available data from users' previous posts in the forum.Keywords: Association rule mining, hybrid recommender system, massive open online courses, MOOCs, social network analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12637430 Learning and Evaluating Possibilistic Decision Trees using Information Affinity
Authors: Ilyes Jenhani, Salem Benferhat, Zied Elouedi
Abstract:
This paper investigates the issue of building decision trees from data with imprecise class values where imprecision is encoded in the form of possibility distributions. The Information Affinity similarity measure is introduced into the well-known gain ratio criterion in order to assess the homogeneity of a set of possibility distributions representing instances-s classes belonging to a given training partition. For the experimental study, we proposed an information affinity based performance criterion which we have used in order to show the performance of the approach on well-known benchmarks.Keywords: Data mining from uncertain data, Decision Trees, Possibility Theory.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15157429 A Study of Lean Principles Implementation in the Libyan Healthcare and Industry Sectors
Authors: Nasser M. Amaitik, Ngwan F. Elsagzli
Abstract:
Lean technique is very important in the service and industrial fields. It is defined as an effective tool to eliminate the wastes. In lean the wastes are defined as anything which does not add value to the end product. There are wastes that can be avoided, but some are unavoidable for many reasons.
The present study aims to apply the principles of lean in two different sectors, healthcare and industry. Two case studies have been selected to apply the experimental work. The first case was Al-Jalaa Hospital, while the second case study was the Technical Company of Aluminum Sections in Benghazi, LIBYA. In both case studies the Value Stream Map (VSM) of the current state has been constructed. The proposed plans have been implemented by merging or eliminating procedures or processes.
The results obtained from both case studies showed improvement in Capacity, Idle time and Utilized time.
Keywords: Healthcare service delivery, Idle time, Lean principles, Utilized time, Value stream mapping, Wastes.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23267428 A Study on Finding Similar Document with Multiple Categories
Authors: R. Saraçoğlu, N. Allahverdi
Abstract:
Searching similar documents and document management subjects have important place in text mining. One of the most important parts of similar document research studies is the process of classifying or clustering the documents. In this study, a similar document search approach that includes discussion of out the case of belonging to multiple categories (multiple categories problem) has been carried. The proposed method that based on Fuzzy Similarity Classification (FSC) has been compared with Rocchio algorithm and naive Bayes method which are widely used in text mining. Empirical results show that the proposed method is quite successful and can be applied effectively. For the second stage, multiple categories vector method based on information of categories regarding to frequency of being seen together has been used. Empirical results show that achievement is increased almost two times, when proposed method is compared with classical approach.
Keywords: Document similarity, Fuzzy classification, Multiple categories, Text mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17077427 Mining User-Generated Contents to Detect Service Failures with Topic Model
Authors: Kyung Bae Park, Sung Ho Ha
Abstract:
Online user-generated contents (UGC) significantly change the way customers behave (e.g., shop, travel), and a pressing need to handle the overwhelmingly plethora amount of various UGC is one of the paramount issues for management. However, a current approach (e.g., sentiment analysis) is often ineffective for leveraging textual information to detect the problems or issues that a certain management suffers from. In this paper, we employ text mining of Latent Dirichlet Allocation (LDA) on a popular online review site dedicated to complaint from users. We find that the employed LDA efficiently detects customer complaints, and a further inspection with the visualization technique is effective to categorize the problems or issues. As such, management can identify the issues at stake and prioritize them accordingly in a timely manner given the limited amount of resources. The findings provide managerial insights into how analytics on social media can help maintain and improve their reputation management. Our interdisciplinary approach also highlights several insights by applying machine learning techniques in marketing research domain. On a broader technical note, this paper illustrates the details of how to implement LDA in R program from a beginning (data collection in R) to an end (LDA analysis in R) since the instruction is still largely undocumented. In this regard, it will help lower the boundary for interdisciplinary researcher to conduct related research.Keywords: Latent Dirichlet allocation, R program, text mining, topic model, user generated contents, visualization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12167426 Lead and Cadmium Spatial Pattern and Risk Assessment around Coal Mine in Hyrcanian Forest, North Iran
Authors: Mahsa Tavakoli, Seyed Mohammad Hojjati, Yahya Kooch
Abstract:
In this study, the effect of coal mining activities on lead and cadmium concentrations and distribution in soil was investigated in Hyrcanian forest, North Iran. 16 plots (20×20 m2) were established by systematic-randomly (60×60 m2) in an area of 4 ha (200×200 m2-mine entrance placed at center). An area adjacent to the mine was not affected by the mining activity; considered as the controlled area. In order to investigate soil lead and cadmium concentration, one sample was taken from the 0-10 cm in each plot. To study the spatial pattern of soil properties and lead and cadmium concentrations in the mining area, an area of 80×80m2 (the mine as the center) was considered and 80 soil samples were systematic-randomly taken (10 m intervals). Geostatistical analysis was performed via Kriging method and GS+ software (version 5.1). In order to estimate the impact of coal mining activities on soil quality, pollution index was measured. Lead and cadmium concentrations were significantly higher in mine area (Pb: 10.97±0.30, Cd: 184.47±6.26 mg.kg-1) in comparison to control area (Pb: 9.42±0.17, Cd: 131.71±15.77 mg.kg-1). The mean values of the PI index indicate that Pb (1.16) and Cd (1.77) presented slightly polluted. Results of the NIPI index showed that Pb (1.44) and Cd (2.52) presented slight pollution and moderate pollution respectively. Results of variography and kriging method showed that it is possible to prepare interpolation maps of lead and cadmium around the mining areas in Hyrcanian forest. According to results of pollution and risk assessments, forest soil was contaminated by heavy metals (lead and cadmium); therefore, using reclamation and remediation techniques in these areas is necessary.
Keywords: Traditional coal mining, heavy metals, pollution indicators, geostatistics, caspian forest.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10517425 Unsupervised Outlier Detection in Streaming Data Using Weighted Clustering
Authors: Yogita, Durga Toshniwal
Abstract:
Outlier detection in streaming data is very challenging because streaming data cannot be scanned multiple times and also new concepts may keep evolving. Irrelevant attributes can be termed as noisy attributes and such attributes further magnify the challenge of working with data streams. In this paper, we propose an unsupervised outlier detection scheme for streaming data. This scheme is based on clustering as clustering is an unsupervised data mining task and it does not require labeled data, both density based and partitioning clustering are combined for outlier detection. In this scheme partitioning clustering is also used to assign weights to attributes depending upon their respective relevance and weights are adaptive. Weighted attributes are helpful to reduce or remove the effect of noisy attributes. Keeping in view the challenges of streaming data, the proposed scheme is incremental and adaptive to concept evolution. Experimental results on synthetic and real world data sets show that our proposed approach outperforms other existing approach (CORM) in terms of outlier detection rate, false alarm rate, and increasing percentages of outliers.
Keywords: Concept Evolution, Irrelevant Attributes, Streaming Data, Unsupervised Outlier Detection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 26377424 Clustering Unstructured Text Documents Using Fading Function
Authors: Pallav Roxy, Durga Toshniwal
Abstract:
Clustering unstructured text documents is an important issue in data mining community and has a number of applications such as document archive filtering, document organization and topic detection and subject tracing. In the real world, some of the already clustered documents may not be of importance while new documents of more significance may evolve. Most of the work done so far in clustering unstructured text documents overlooks this aspect of clustering. This paper, addresses this issue by using the Fading Function. The unstructured text documents are clustered. And for each cluster a statistics structure called Cluster Profile (CP) is implemented. The cluster profile incorporates the Fading Function. This Fading Function keeps an account of the time-dependent importance of the cluster. The work proposes a novel algorithm Clustering n-ary Merge Algorithm (CnMA) for unstructured text documents, that uses Cluster Profile and Fading Function. Experimental results illustrating the effectiveness of the proposed technique are also included.Keywords: Clustering, Text Mining, Unstructured TextDocuments, Fading Function.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19857423 Signing the First Packet in Amortization Scheme for Multicast Stream Authentication
Authors: Mohammed Shatnawi, Qusai Abuein, Susumu Shibusawa
Abstract:
Signature amortization schemes have been introduced for authenticating multicast streams, in which, a single signature is amortized over several packets. The hash value of each packet is computed, some hash values are appended to other packets, forming what is known as hash chain. These schemes divide the stream into blocks, each block is a number of packets, the signature packet in these schemes is either the first or the last packet of the block. Amortization schemes are efficient solutions in terms of computation and communication overhead, specially in real-time environment. The main effictive factor of amortization schemes is it-s hash chain construction. Some studies show that signing the first packet of each block reduces the receiver-s delay and prevents DoS attacks, other studies show that signing the last packet reduces the sender-s delay. To our knowledge, there is no studies that show which is better, to sign the first or the last packet in terms of authentication probability and resistance to packet loss. In th is paper we will introduce another scheme for authenticating multicast streams that is robust against packet loss, reduces the overhead, and prevents the DoS attacks experienced by the receiver in the same time. Our scheme-The Multiple Connected Chain signing the First packet (MCF) is to append the hash values of specific packets to other packets,then append some hashes to the signature packet which is sent as the first packet in the block. This scheme is aspecially efficient in terms of receiver-s delay. We discuss and evaluate the performance of our proposed scheme against those that sign the last packet of the block.Keywords: multicast stream authentication, hash chain construction, signature amortization, authentication probability.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1518