Search results for: distributed data mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8082

Search results for: distributed data mining

7842 A Query Optimization Strategy for Autonomous Distributed Database Systems

Authors: Dina K. Badawy, Dina M. Ibrahim, Alsayed A. Sallam

Abstract:

Distributed database is a collection of logically related databases that cooperate in a transparent manner. Query processing uses a communication network for transmitting data between sites. It refers to one of the challenges in the database world. The development of sophisticated query optimization technology is the reason for the commercial success of database systems, which complexity and cost increase with increasing number of relations in the query. Mariposa, query trading and query trading with processing task-trading strategies developed for autonomous distributed database systems, but they cause high optimization cost because of involvement of all nodes in generating an optimal plan. In this paper, we proposed a modification on the autonomous strategy K-QTPT that make the seller’s nodes with the lowest cost have gradually high priorities to reduce the optimization time. We implement our proposed strategy and present the results and analysis based on those results.

Keywords: Autonomous strategies, distributed database systems, high priority, query optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1004
7841 AniMoveMineR: Animal Behavior Exploratory Analysis Using Association Rules Mining

Authors: Suelane Garcia Fontes, Silvio Luiz Stanzani, Pedro L. Pizzigatti Corrła Ronaldo G. Morato

Abstract:

Environmental changes and major natural disasters are most prevalent in the world due to the damage that humanity has caused to nature and these damages directly affect the lives of animals. Thus, the study of animal behavior and their interactions with the environment can provide knowledge that guides researchers and public agencies in preservation and conservation actions. Exploratory analysis of animal movement can determine the patterns of animal behavior and with technological advances the ability of animals to be tracked and, consequently, behavioral studies have been expanded. There is a lot of research on animal movement and behavior, but we note that a proposal that combines resources and allows for exploratory analysis of animal movement and provide statistical measures on individual animal behavior and its interaction with the environment is missing. The contribution of this paper is to present the framework AniMoveMineR, a unified solution that aggregates trajectory analysis and data mining techniques to explore animal movement data and provide a first step in responding questions about the animal individual behavior and their interactions with other animals over time and space. We evaluated the framework through the use of monitored jaguar data in the city of Miranda Pantanal, Brazil, in order to verify if the use of AniMoveMineR allows to identify the interaction level between these jaguars. The results were positive and provided indications about the individual behavior of jaguars and about which jaguars have the highest or lowest correlation.

Keywords: Data mining, data science, trajectory, animal behavior.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 830
7840 Improved FP-growth Algorithm with Multiple Minimum Supports Using Maximum Constraints

Authors: Elsayeda M. Elgaml, Dina M. Ibrahim, Elsayed A. Sallam

Abstract:

Association rule mining is one of the most important fields of data mining and knowledge discovery. In this paper, we propose an efficient multiple support frequent pattern growth algorithm which we called “MSFP-growth” that enhancing the FPgrowth algorithm by making infrequent child node pruning step with multiple minimum support using maximum constrains. The algorithm is implemented, and it is compared with other common algorithms: Apriori-multiple minimum supports using maximum constraints and FP-growth. The experimental results show that the rule mining from the proposed algorithm are interesting and our algorithm achieved better performance than other algorithms without scarifying the accuracy. 

Keywords: Association Rules, FP-growth, Multiple minimum supports, Weka Tool

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3265
7839 Finding Fuzzy Association Rules Using FWFP-Growth with Linguistic Supports and Confidences

Authors: Chien-Hua Wang, Chin-Tzong Pang

Abstract:

In data mining, the association rules are used to search for the relations of items of the transactions database. Following the data is collected and stored, it can find rules of value through association rules, and assist manager to proceed marketing strategy and plan market framework. In this paper, we attempt fuzzy partition methods and decide membership function of quantitative values of each transaction item. Also, by managers we can reflect the importance of items as linguistic terms, which are transformed as fuzzy sets of weights. Next, fuzzy weighted frequent pattern growth (FWFP-Growth) is used to complete the process of data mining. The method above is expected to improve Apriori algorithm for its better efficiency of the whole association rules. An example is given to clearly illustrate the proposed approach.

Keywords: Association Rule, Fuzzy Partition Methods, FWFP-Growth, Apiroir algorithm

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1603
7838 Distributed Splay Suffix Arrays: A New Structure for Distributed String Search

Authors: Tu Kun, Gu Nai-jie, Bi Kun, Liu Gang, Dong Wan-li

Abstract:

As a structure for processing string problem, suffix array is certainly widely-known and extensively-studied. But if the string access pattern follows the “90/10" rule, suffix array can not take advantage of the fact that we often find something that we have just found. Although the splay tree is an efficient data structure for small documents when the access pattern follows the “90/10" rule, it requires many structures and an excessive amount of pointer manipulations for efficiently processing and searching large documents. In this paper, we propose a new and conceptually powerful data structure, called splay suffix arrays (SSA), for string search. This data structure combines the features of splay tree and suffix arrays into a new approach which is suitable to implementation on both conventional and clustered computers.

Keywords: suffix arrays, splay tree, string search, distributedalgorithm

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1729
7837 Efficient Real-time Remote Data Propagation Mechanism for a Component-Based Approach to Distributed Manufacturing

Authors: V. Barot, S. McLeod, R. Harrison, A. A. West

Abstract:

Manufacturing Industries face a crucial change as products and processes are required to, easily and efficiently, be reconfigurable and reusable. In order to stay competitive and flexible, situations also demand distribution of enterprises globally, which requires implementation of efficient communication strategies. A prototype system called the “Broadcaster" has been developed with an assumption that the control environment description has been engineered using the Component-based system paradigm. This prototype distributes information to a number of globally distributed partners via an adoption of the circular-based data processing mechanism. The work highlighted in this paper includes the implementation of this mechanism in the domain of the manufacturing industry. The proposed solution enables real-time remote propagation of machine information to a number of distributed supply chain client resources such as a HMI, VRML-based 3D views and remote client instances regardless of their distribution nature and/ or their mechanisms. This approach is presented together with a set of evaluation results. Authors- main concentration surrounds the reliability and the performance metric of the adopted approach. Performance evaluation is carried out in terms of the response times taken to process the data in this domain and compared with an alternative data processing implementation such as the linear queue mechanism. Based on the evaluation results obtained, authors justify the benefits achieved from this proposed implementation and highlight any further research work that is to be carried out.

Keywords: Broadcaster, circular buffer, Component-based, distributed manufacturing, remote data propagation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1333
7836 Acute Coronary Syndrome Prediction Using Data Mining Techniques- An Application

Authors: Tahseen A. Jilani, Huda Yasin, Madiha Yasin, C. Ardil

Abstract:

In this paper we use data mining techniques to investigate factors that contribute significantly to enhancing the risk of acute coronary syndrome. We assume that the dependent variable is diagnosis – with dichotomous values showing presence or  absence of disease. We have applied binary regression to the factors affecting the dependent variable. The data set has been taken from two different cardiac hospitals of Karachi, Pakistan. We have total sixteen variables out of which one is assumed dependent and other 15 are independent variables. For better performance of the regression model in predicting acute coronary syndrome, data reduction techniques like principle component analysis is applied. Based on results of data reduction, we have considered only 14 out of sixteen factors.

Keywords: Acute coronary syndrome (ACS), binary logistic regression analyses, myocardial ischemia (MI), principle component analysis, unstable angina (U.A.).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2074
7835 Mining Association Rules from Unstructured Documents

Authors: Hany Mahgoub

Abstract:

This paper presents a system for discovering association rules from collections of unstructured documents called EART (Extract Association Rules from Text). The EART system treats texts only not images or figures. EART discovers association rules amongst keywords labeling the collection of textual documents. The main characteristic of EART is that the system integrates XML technology (to transform unstructured documents into structured documents) with Information Retrieval scheme (TF-IDF) and Data Mining technique for association rules extraction. EART depends on word feature to extract association rules. It consists of four phases: structure phase, index phase, text mining phase and visualization phase. Our work depends on the analysis of the keywords in the extracted association rules through the co-occurrence of the keywords in one sentence in the original text and the existing of the keywords in one sentence without co-occurrence. Experiments applied on a collection of scientific documents selected from MEDLINE that are related to the outbreak of H5N1 avian influenza virus.

Keywords: Association rules, information retrieval, knowledgediscovery in text, text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2398
7834 Data Mining Determination of Sunlight Average Input for Solar Power Plant

Authors: Fl. Loury, P. Sablonière, C. Lamoureux, G. Magnier, Th. Gutierrez

Abstract:

A method is proposed to extract faithful representative patterns from data set of observations when they are suffering from non-negligible fluctuations. Supposing time interval between measurements to be extremely small compared to observation time, it consists in defining first a subset of intermediate time intervals characterizing coherent behavior. Data projection on these intervals gives a set of curves out of which an ideally “perfect” one is constructed by taking the sup limit of them. Then comparison with average real curve in corresponding interval gives an efficiency parameter expressing the degradation consecutive to fluctuation effect. The method is applied to sunlight data collected in a specific place, where ideal sunlight is the one resulting from direct exposure at location latitude over the year, and efficiency is resulting from action of meteorological parameters, mainly cloudiness, at different periods of the year. The extracted information already gives interesting element of decision, before being used for analysis of plant control.

Keywords: Base Input Reconstruction, Data Mining, Efficiency Factor, Information Pattern Operator.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1481
7833 Application of Granular Computing Paradigm in Knowledge Induction

Authors: Iftikhar U. Sikder

Abstract:

This paper illustrates an application of granular computing approach, namely rough set theory in data mining. The paper outlines the formalism of granular computing and elucidates the mathematical underpinning of rough set theory, which has been widely used by the data mining and the machine learning community. A real-world application is illustrated, and the classification performance is compared with other contending machine learning algorithms. The predictive performance of the rough set rule induction model shows comparative success with respect to other contending algorithms.

Keywords: Concept approximation, granular computing, reducts, rough set theory, rule induction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 780
7832 Distributed Generator Placement for Loss Reduction and Improvement in Reliability

Authors: Priyanka Paliwal, N.P. Patidar

Abstract:

Distributed Power generation has gained a lot of attention in recent times due to constraints associated with conventional power generation and new advancements in DG technologies .The need to operate the power system economically and with optimum levels of reliability has further led to an increase in interest in Distributed Generation. However it is important to place Distributed Generator on an optimum location so that the purpose of loss minimization and voltage regulation is dully served on the feeder. This paper investigates the impact of DG units installation on electric losses, reliability and voltage profile of distribution networks. In this paper, our aim would be to find optimal distributed generation allocation for loss reduction subjected to constraint of voltage regulation in distribution network. The system is further analyzed for increased levels of Reliability. Distributed Generator offers the additional advantage of increase in reliability levels as suggested by the improvements in various reliability indices such as SAIDI, CAIDI and AENS. Comparative studies are performed and related results are addressed. An analytical technique is used in order to find the optimal location of Distributed Generator. The suggested technique is programmed under MATLAB software. The results clearly indicate that DG can reduce the electrical line loss while simultaneously improving the reliability of the system.

Keywords: AENS, CAIDI, Distributed Generation, lossreduction, Reliability, SAIDI

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3056
7831 A Novel Hybrid Mobile Agent Based Distributed Intrusion Detection System

Authors: Amir Vahid Dastjerdi, Kamalrulnizam Abu Bakar

Abstract:

The first generation of Mobile Agents based Intrusion Detection System just had two components namely data collection and single centralized analyzer. The disadvantage of this type of intrusion detection is if connection to the analyzer fails, the entire system will become useless. In this work, we propose novel hybrid model for Mobile Agent based Distributed Intrusion Detection System to overcome the current problem. The proposed model has new features such as robustness, capability of detecting intrusion against the IDS itself and capability of updating itself to detect new pattern of intrusions. In addition, our proposed model is also capable of tackling some of the weaknesses of centralized Intrusion Detection System models.

Keywords: Distributed Intrusion Detection System, Mobile Agents, Network Security.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1738
7830 Communication and Quality in Distributed Agile Development: An Empirical Case Study

Authors: R. Green, T. Mazzuchi, S. Sarkani

Abstract:

Through inward perceptions, we intuitively expect distributed software development to increase the risks associated with achieving cost, schedule, and quality goals. To compound this problem, agile software development (ASD) insists one of the main ingredients of its success is cohesive communication attributed to collocation of the development team. The following study identified the degree of communication richness needed to achieve comparable software quality (reduce pre-release defects) between distributed and collocated teams. This paper explores the relevancy of communication richness in various development phases and its impact on quality. Through examination of a large distributed agile development project, this investigation seeks to understand the levels of communication required within each ASD phase to produce comparable quality results achieved by collocated teams. Obviously, a multitude of factors affects the outcome of software projects. However, within distributed agile software development teams, the mode of communication is one of the critical components required to achieve team cohesiveness and effectiveness. As such, this study constructs a distributed agile communication model (DAC-M) for potential application to similar distributed agile development efforts using the measurement of the suitable level of communication. The results of the study show that less rich communication methods, in the appropriate phase, might be satisfactory to achieve equivalent quality in distributed ASD efforts.

Keywords: agile software development (ASD), distributedsoftware teams, media richness theory, software development.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2115
7829 An Innovation of Travel Information Gathering Framework

Authors: Pairaya J., Buddhagarn R., Sukree S., Punthumadee K.

Abstract:

Application of Information Technology (IT) has revolutionized the functioning of business all over the world. Its impact has been felt mostly among the information of dependent industries. Tourism is one of such industry. The conceptual framework in this study represents an innovation of travel information searching system on mobile devices which is used as tools to deliver travel information (such as hotels, restaurants, tourist attractions and souvenir shops) for each user by travelers segmentation based on data mining technique to segment the tourists- behavior patterns then match them with tourism products and services. This system innovation is designed to be a knowledge incremental learning. It is a marketing strategy to support business to respond traveler-s demand effectively.

Keywords: Tourism, Innovation, Information Searching, Data Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1828
7828 A Hybrid Distributed Vision System for Robot Localization

Authors: Hsiang-Wen Hsieh, Chin-Chia Wu, Hung-Hsiu Yu, Shu-Fan Liu

Abstract:

Localization is one of the critical issues in the field of robot navigation. With an accurate estimate of the robot pose, robots will be capable of navigating in the environment autonomously and efficiently. In this paper, a hybrid Distributed Vision System (DVS) for robot localization is presented. The presented approach integrates odometry data from robot and images captured from overhead cameras installed in the environment to help reduce possibilities of fail localization due to effects of illumination, encoder accumulated errors, and low quality range data. An odometry-based motion model is applied to predict robot poses, and robot images captured by overhead cameras are then used to update pose estimates with HSV histogram-based measurement model. Experiment results show the presented approach could localize robots in a global world coordinate system with localization errors within 100mm.

Keywords: Distributed Vision System, Localization, Measurement model, Motion model

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1302
7827 Distributed e-Learning System with Client-Server and P2P Hybrid Architecture

Authors: Kazunari Meguro, Shinichi Motomura, Takao Kawamura, Kazunori Sugahara

Abstract:

We have developed a distributed asynchronous Web based training system. In order to improve the scalability and robustness of this system, all contents and a function are realized on mobile agents. These agents are distributed to computers, and they can use a Peer to Peer network that modified Content-Addressable Network. In this system, all computers offer the function and exercise by themselves. However, the system that all computers do the same behavior is not realistic. In this paper, as a solution of this issue, we present an e-Learning system that is composed of computers of different participation types. Enabling the computer of different participation types will improve the convenience of the system.

Keywords: Distributed Multimedia Systems, e-Learning, P2P, Mobile Agen

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2291
7826 Selection of Relevant Servers in Distributed Information Retrieval System

Authors: Benhamouda Sara, Guezouli Larbi

Abstract:

Nowadays, the dissemination of information touches the distributed world, where selecting the relevant servers to a user request is an important problem in distributed information retrieval. During the last decade, several research studies on this issue have been launched to find optimal solutions and many approaches of collection selection have been proposed. In this paper, we propose a new collection selection approach that takes into consideration the number of documents in a collection that contains terms of the query and the weights of those terms in these documents. We tested our method and our studies show that this technique can compete with other state-of-the-art algorithms that we choose to test the performance of our approach.

Keywords: Distributed information retrieval, relevance, server selection, collection selection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1326
7825 Online Forums Hotspot Detection and Analysis Using Aging Theory

Authors: K. Nirmala Devi, V. Murali Bhaskaran

Abstract:

The exponential growth of social media arouses much attention on public opinion information. The online forums, blogs, micro blogs are proving to be extremely valuable resources and are having bulk volume of information. However, most of the social media data is unstructured and semi structured form. So that it is more difficult to decipher automatically. Therefore, it is very much essential to understand and analyze those data for making a right decision. The online forums hotspot detection is a promising research field in the web mining and it guides to motivate the user to take right decision in right time. The proposed system consist of a novel approach to detect a hotspot forum for any given time period. It uses aging theory to find the hot terms and E-K-means for detecting the hotspot forum. Experimental results demonstrate that the proposed approach outperforms k-means for detecting the hotspot forums with the improved accuracy.

Keywords: Hotspot forums, Micro blog, Blog, Sentiment Analysis, Opinion Mining, Social media, Twitter, Web mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2138
7824 A Study of Soil Heavy Metal Pollution in the Manganese Mining in Drama, Greece

Authors: A. Argiri, A. Molla, Tzouvalekas, E. Skoufogianni, N. Danalatos

Abstract:

The release of heavy metals into the environment has increased over the last years. In this study, 25 soil samples (0-15 cm) from the fields near the mining area in Drama region were selected. The samples were analyzed in the laboratory for their physicochemical properties and for seven “pseudo-total’’ heavy metals content, namely Pb, Zn, Cd, Cr, Cu, Ni, and Mn. The total metal concentrations (Pb, Zn, Cd, Cr, Cu, Ni and Mn) in digests were determined by using the atomic absorption spectrophotometer. According to the results, the mean concentration of the listed heavy metals in 25 soil samples are Cd 1.1 mg/kg, Cr 15 mg/kg, Cu 21.7 mg/kg, Ni 30.1 mg/kg, Pd 50.8 mg/kg, Zn 99.5 mg/kg and Mn 815.3 mg/kg. The results show that the heavy metals remain in the soil even if the mining closed many years ago.

Keywords: Greece, heavy metals, mining, pollution

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 510
7823 Investigation of Chord Protocol in Peer to Peer-Wireless Mesh Network with Mobility

Authors: P. Prasanna Murali Krishna, M. V. Subramanyam, K. Satya Prasad

Abstract:

File sharing in networks is generally achieved using Peer-to-Peer (P2P) applications. Structured P2P approaches are widely used in adhoc networks due to its distributed and scalability features. Efficient mechanisms are required to handle the huge amount of data distributed to all peers. The intrinsic characteristics of P2P system makes for easier content distribution when compared to client-server architecture. All the nodes in a P2P network act as both client and server, thus, distributing data takes lesser time when compared to the client-server method. CHORD protocol is a resource routing based where nodes and data items are structured into a 1- dimensional ring. The structured lookup algorithm of Chord is advantageous for distributed P2P networking applications. However, structured approach improves lookup performance in a high bandwidth wired network it could contribute to unnecessary overhead in overlay networks leading to degradation of network performance. In this paper, the performance of existing CHORD protocol on Wireless Mesh Network (WMN) when nodes are static and dynamic is investigated.

Keywords: Wireless mesh network (WMN), structured P2P networks, peer to peer resource sharing, CHORD protocol, DHT.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2087
7822 A Novel Approach to Improve Users Search Goal in Web Usage Mining

Authors: R. Lokeshkumar, P. Sengottuvelan

Abstract:

Web mining is to discover and extract useful Information. Different users may have different search goals when they search by giving queries and submitting it to a search engine. The inference and analysis of user search goals can be very useful for providing an experience result for a user search query. In this project, we propose a novel approach to infer user search goals by analyzing search web logs. First, we propose a novel approach to infer user search goals by analyzing search engine query logs, the feedback sessions are constructed from user click-through logs and it efficiently reflect the information needed for users. Second we propose a preprocessing technique to clean the unnecessary data’s from web log file (feedback session). Third we propose a technique to generate pseudo-documents to representation of feedback sessions for clustering. Finally we implement k-medoids clustering algorithm to discover different user search goals and to provide a more optimal result for a search query based on feedback sessions for the user.

Keywords: Data Preprocessing, Session Identification, Web log mining, Web Personalization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1983
7821 A Consistency Protocol Multi-Layer for Replicas Management in Large Scale Systems

Authors: Ghalem Belalem, Yahya Slimani

Abstract:

Large scale systems such as computational Grid is a distributed computing infrastructure that can provide globally available network resources. The evolution of information processing systems in Data Grid is characterized by a strong decentralization of data in several fields whose objective is to ensure the availability and the reliability of the data in the reason to provide a fault tolerance and scalability, which cannot be possible only with the use of the techniques of replication. Unfortunately the use of these techniques has a height cost, because it is necessary to maintain consistency between the distributed data. Nevertheless, to agree to live with certain imperfections can improve the performance of the system by improving competition. In this paper, we propose a multi-layer protocol combining the pessimistic and optimistic approaches conceived for the data consistency maintenance in large scale systems. Our approach is based on a hierarchical representation model with tree layers, whose objective is with double vocation, because it initially makes it possible to reduce response times compared to completely pessimistic approach and it the second time to improve the quality of service compared to an optimistic approach.

Keywords: Data Grid, replication, consistency, optimistic approach, pessimistic approach.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1526
7820 Q-Map: Clinical Concept Mining from Clinical Documents

Authors: Sheikh Shams Azam, Manoj Raju, Venkatesh Pagidimarri, Vamsi Kasivajjala

Abstract:

Over the past decade, there has been a steep rise in the data-driven analysis in major areas of medicine, such as clinical decision support system, survival analysis, patient similarity analysis, image analytics etc. Most of the data in the field are well-structured and available in numerical or categorical formats which can be used for experiments directly. But on the opposite end of the spectrum, there exists a wide expanse of data that is intractable for direct analysis owing to its unstructured nature which can be found in the form of discharge summaries, clinical notes, procedural notes which are in human written narrative format and neither have any relational model nor any standard grammatical structure. An important step in the utilization of these texts for such studies is to transform and process the data to retrieve structured information from the haystack of irrelevant data using information retrieval and data mining techniques. To address this problem, the authors present Q-Map in this paper, which is a simple yet robust system that can sift through massive datasets with unregulated formats to retrieve structured information aggressively and efficiently. It is backed by an effective mining technique which is based on a string matching algorithm that is indexed on curated knowledge sources, that is both fast and configurable. The authors also briefly examine its comparative performance with MetaMap, one of the most reputed tools for medical concepts retrieval and present the advantages the former displays over the latter.

Keywords: Information retrieval (IR), unified medical language system (UMLS), Syntax Based Analysis, natural language processing (NLP), medical informatics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 727
7819 Quantification of GHGs Emissions from Electricity and Diesel Fuel Consumption in Basalt Mining Industry in Thailand

Authors: S. Kittipongvises, A. Dubsok

Abstract:

The mineral and mining industry is necessary for countries to have an adequate and reliable supply of materials to meet their socio-economic development. Despite its importance, the environmental impacts from mineral exploration are hugely significant. This study aimed to investigate and quantify the amount of GHGs emissions emitted from both electricity and diesel vehicle fuel consumption in basalt mining in Thailand. Plant A, located in the northeastern region of Thailand, was selected as a case study. Results indicated that total GHGs emissions from basalt mining and operation (Plant A) were approximately 2,501,086 kgCO2e and 1,997,412 kgCO2e in 2014 and 2015, respectively. The estimated carbon intensity ranged between 1.824 kgCO2e to 2.284 kgCO2e per ton of rock product. Scope 1 (direct emissions) was the dominant driver of its total GHGs compared to scope 2 (indirect emissions). As such, transport related combustion of diesel fuels generated the highest GHGs emission (65%) compared to emissions from purchased electricity (35%). Some of the potential implications for mining entities were also presented.

Keywords: Basalt mining, diesel fuel, electricity, GHGs emissions, Thailand.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1014
7818 Composite Kernels for Public Emotion Recognition from Twitter

Authors: Chien-Hung Chen, Yan-Chun Hsing, Yung-Chun Chang

Abstract:

The Internet has grown into a powerful medium for information dispersion and social interaction that leads to a rapid growth of social media which allows users to easily post their emotions and perspectives regarding certain topics online. Our research aims at using natural language processing and text mining techniques to explore the public emotions expressed on Twitter by analyzing the sentiment behind tweets. In this paper, we propose a composite kernel method that integrates tree kernel with the linear kernel to simultaneously exploit both the tree representation and the distributed emotion keyword representation to analyze the syntactic and content information in tweets. The experiment results demonstrate that our method can effectively detect public emotion of tweets while outperforming the other compared methods.

Keywords: Public emotion recognition, natural language processing, composite kernel, sentiment analysis, text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 716
7817 Using the Combined Model of PROMETHEE and Fuzzy Analytic Network Process for Determining Question Weights in Scientific Exams through Data Mining Approach

Authors: Hassan Haleh, Amin Ghaffari, Parisa Farahpour

Abstract:

Need for an appropriate system of evaluating students- educational developments is a key problem to achieve the predefined educational goals. Intensity of the related papers in the last years; that tries to proof or disproof the necessity and adequacy of the students assessment; is the corroborator of this matter. Some of these studies tried to increase the precision of determining question weights in scientific examinations. But in all of them there has been an attempt to adjust the initial question weights while the accuracy and precision of those initial question weights are still under question. Thus In order to increase the precision of the assessment process of students- educational development, the present study tries to propose a new method for determining the initial question weights by considering the factors of questions like: difficulty, importance and complexity; and implementing a combined method of PROMETHEE and fuzzy analytic network process using a data mining approach to improve the model-s inputs. The result of the implemented case study proves the development of performance and precision of the proposed model.

Keywords: Assessing students, Analytic network process, Clustering, Data mining, Fuzzy sets, Multi-criteria decision making, and Preference function.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1538
7816 A Distributed Group Mutual Exclusion Algorithm for Soft Real Time Systems

Authors: Abhishek Swaroop, Awadhesh Kumar Singh

Abstract:

The group mutual exclusion (GME) problem is an interesting generalization of the mutual exclusion problem. Several solutions of the GME problem have been proposed for message passing distributed systems. However, none of these solutions is suitable for real time distributed systems. In this paper, we propose a token-based distributed algorithms for the GME problem in soft real time distributed systems. The algorithm uses the concepts of priority queue, dynamic request set and the process state. The algorithm uses first come first serve approach in selecting the next session type between the same priority levels and satisfies the concurrent occupancy property. The algorithm allows all n processors to be inside their CS provided they request for the same session. The performance analysis and correctness proof of the algorithm has also been included in the paper.

Keywords: Concurrency, Group mutual exclusion, Priority, Request set, Token.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1668
7815 BIDENS: Iterative Density Based Biclustering Algorithm With Application to Gene Expression Analysis

Authors: Mohamed A. Mahfouz, M. A. Ismail

Abstract:

Biclustering is a very useful data mining technique for identifying patterns where different genes are co-related based on a subset of conditions in gene expression analysis. Association rules mining is an efficient approach to achieve biclustering as in BIMODULE algorithm but it is sensitive to the value given to its input parameters and the discretization procedure used in the preprocessing step, also when noise is present, classical association rules miners discover multiple small fragments of the true bicluster, but miss the true bicluster itself. This paper formally presents a generalized noise tolerant bicluster model, termed as μBicluster. An iterative algorithm termed as BIDENS based on the proposed model is introduced that can discover a set of k possibly overlapping biclusters simultaneously. Our model uses a more flexible method to partition the dimensions to preserve meaningful and significant biclusters. The proposed algorithm allows discovering biclusters that hard to be discovered by BIMODULE. Experimental study on yeast, human gene expression data and several artificial datasets shows that our algorithm offers substantial improvements over several previously proposed biclustering algorithms.

Keywords: Machine learning, biclustering, bi-dimensional clustering, gene expression analysis, data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1913
7814 Coordinated Voltage Control using Multiple Regulators in Distribution System with Distributed Generators

Authors: R. Shivarudraswamy, D. N. Gaonkar

Abstract:

The continued interest in the use of distributed generation in recent years is leading to the growth in number of distributed generators connected to distribution networks. Steady state voltage rise resulting from the connection of these generators can be a major obstacle to their connection at lower voltage levels. The present electric distribution network is designed to keep the customer voltage within tolerance limit. This may require a reduction in connectable generation capacity, under utilization of appropriate generation sites. Thus distribution network operators need a proper voltage regulation method to allow the significant integration of distributed generation systems to existing network. In this work a voltage rise problem in a typical distribution system has been studied. A method for voltage regulation of distribution system with multiple DG system by coordinated operation distributed generator, capacitor and OLTC has been developed. A sensitivity based analysis has been carried out to determine the priority for individual generators in multiple DG environment. The effectiveness of the developed method has been evaluated under various cases through simulation results.

Keywords: Distributed generation, voltage control, sensitivity factor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2540
7813 Distributed Architecture of an Autonomous Four Rotor Mini-Rotorcraft based on Multi-Agent System

Authors: H. Ifassiouen, H. Medromi, N. E. Radhy

Abstract:

In this paper, we present the recently implemented approach allowing dynamics systems to plan its actions, taking into account the environment perception changes, and to control their execution when uncertainty and incomplete knowledge are the major characteristics of the situated environment [1],[2],[3],[4]. The control distributed architecture has three modules and the approach is related to hierarchical planning: the plan produced by the planner is further refined at the control layer that in turn supervises its execution by a functional level. We propose a new intelligent distributed architecture constituted by: Multi-Agent subsystem of the sensor, of the interpretation and representation of environment [9], of the dynamic localization and of the action. We tested this distributed architecture with dynamic system in the known environment. The autonomous for Rotor Mini Rotorcraft task is described by the primitive actions. The distributed controlbased on multi-agent system is in charge of achieving each task in the best possible way taking into account the context and sensory feedback.

Keywords: Autonomous four rotors helicopter, Control system, Hierarchical planning, Intelligent Distributed Architecture.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1589