Search results for: mining activities
1813 Auto Classification for Search Intelligence
Authors: Lilac A. E. Al-Safadi
Abstract:
This paper proposes an auto-classification algorithm of Web pages using Data mining techniques. We consider the problem of discovering association rules between terms in a set of Web pages belonging to a category in a search engine database, and present an auto-classification algorithm for solving this problem that are fundamentally based on Apriori algorithm. The proposed technique has two phases. The first phase is a training phase where human experts determines the categories of different Web pages, and the supervised Data mining algorithm will combine these categories with appropriate weighted index terms according to the highest supported rules among the most frequent words. The second phase is the categorization phase where a web crawler will crawl through the World Wide Web to build a database categorized according to the result of the data mining approach. This database contains URLs and their categories.Keywords: Information Processing on the Web, Data Mining, Document Classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16191812 A Note on Metallurgy at Khanak: An Indus Site in Tosham Mining Area, Haryana
Authors: Ravindra N. Singh, Dheerendra P. Singh
Abstract:
Recent discoveries of Bronze Age artefacts, tin slag, furnaces and crucibles, together with new geological evidence on tin deposits in Tosham area of Bhiwani district in Haryana (India) provide the opportunity to survey the evidence for possible sources of tin and the use of bronze in the Harappan sites of north western India. Earlier, Afghanistan emerged as the most promising eastern source of tin utilized by Indus Civilization copper-smiths. Our excavations conducted at Khanak near Tosham mining area during 2014 and 2016 revealed ample evidence of metallurgical activities as attested by the occurrence of slag, ores and evidences of ashes and fragments of furnaces in addition to the bronze objects. We have conducted petrological, XRD, EDAX, TEM, SEM and metallography on the slag, ores, crucible fragments and bronze objects samples recovered from Khanak excavations. This has given positive indication of mining and metallurgy of poly-mettalic Tin at the site; however, it can only be ascertained after the detailed scientific examination of the materials which is underway. In view of the importance of site, we intend to excavate the site horizontally in future so as to obtain more samples for scientific studies.
Keywords: Archaeometallurgy, problem of tin, metallography, Indus civilization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20121811 Semantically Enriched Web Usage Mining for Personalization
Authors: Suresh Shirgave, Prakash Kulkarni, José Borges
Abstract:
The continuous growth in the size of the World Wide Web has resulted in intricate Web sites, demanding enhanced user skills and more sophisticated tools to help the Web user to find the desired information. In order to make Web more user friendly, it is necessary to provide personalized services and recommendations to the Web user. For discovering interesting and frequent navigation patterns from Web server logs many Web usage mining techniques have been applied. The recommendation accuracy of usage based techniques can be improved by integrating Web site content and site structure in the personalization process.
Herein, we propose semantically enriched Web Usage Mining method for Personalization (SWUMP), an extension to solely usage based technique. This approach is a combination of the fields of Web Usage Mining and Semantic Web. In the proposed method, we envisage enriching the undirected graph derived from usage data with rich semantic information extracted from the Web pages and the Web site structure. The experimental results show that the SWUMP generates accurate recommendations and is able to achieve 10-20% better accuracy than the solely usage based model. The SWUMP addresses the new item problem inherent to solely usage based techniques.
Keywords: Prediction, Recommendation, Semantic Web Usage Mining, Web Usage Mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 30231810 An Improved Data Mining Method Applied to the Search of Relationship between Metabolic Syndrome and Lifestyles
Authors: Yi Chao Huang, Yu Ling Liao, Chiu Shuang Lin
Abstract:
A data cutting and sorting method (DCSM) is proposed to optimize the performance of data mining. DCSM reduces the calculation time by getting rid of redundant data during the data mining process. In addition, DCSM minimizes the computational units by splitting the database and by sorting data with support counts. In the process of searching for the relationship between metabolic syndrome and lifestyles with the health examination database of an electronics manufacturing company, DCSM demonstrates higher search efficiency than the traditional Apriori algorithm in tests with different support counts.Keywords: Data mining, Data cutting and sorting method, Apriori algorithm, Metabolic syndrome
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15881809 An Application of the Data Mining Methods with Decision Rule
Authors: Xun Ge, Jianhua Gong
Abstract:
ankings for output of Chinese main agricultural commodity in the world for 1978, 1980, 1990, 2000, 2006, 2007 and 2008 have been released in United Nations FAO Database. Unfortunately, where the ranking of output of Chinese cotton lint in the world for 2008 was missed. This paper uses sequential data mining methods with decision rules filling this gap. This new data mining method will be help to give a further improvement for United Nations FAO Database.
Keywords: Ranking, output of the main agricultural commodity, gross domestic product, decision table, information system, data mining, decision rule
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17101808 Application and Limitation of Parallel Modelingin Multidimensional Sequential Pattern
Authors: Mahdi Esmaeili, Mansour Tarafdar
Abstract:
The goal of data mining algorithms is to discover useful information embedded in large databases. One of the most important data mining problems is discovery of frequently occurring patterns in sequential data. In a multidimensional sequence each event depends on more than one dimension. The search space is quite large and the serial algorithms are not scalable for very large datasets. To address this, it is necessary to study scalable parallel implementations of sequence mining algorithms. In this paper, we present a model for multidimensional sequence and describe a parallel algorithm based on data parallelism. Simulation experiments show good load balancing and scalable and acceptable speedup over different processors and problem sizes and demonstrate that our approach can works efficiently in a real parallel computing environment.Keywords: Sequential Patterns, Data Mining, ParallelAlgorithm, Multidimensional Sequence Data
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14761807 Influence of Non-Structural Elements on Dynamic Response of Multi-Storey Rc Building to Mining Shock
Authors: Joanna M. Dulińska, Maria Fabijańska
Abstract:
In the paper the results of calculations of the dynamic response of a multi-storey reinforced concrete building to a strong mining shock originated from the main region of mining activity in Poland (i.e. the Legnica-Glogow Copper District) are presented. The representative time histories of accelerations registered in three directions were used as ground motion data in calculations of the dynamic response of the structure. Two variants of a numerical model were applied: the model including only structural elements of the building and the model including both structural and non-structural elements (i.e. partition walls and ventilation ducts made of brick). It turned out that non-structural elements of multi-storey RC buildings have a small impact of about 10 % on natural frequencies of these structures. It was also proved that the dynamic response of building to mining shock obtained in case of inclusion of all non-structural elements in the numerical model is about 20 % smaller than in case of consideration of structural elements only. The principal stresses obtained in calculations of dynamic response of multi-storey building to strong mining shock are situated on the level of about 30% of values obtained from static analysis (dead load).Keywords: Dynamic characteristics of buildings, mining shocks, dynamic response of buildings, non-structural elements
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18871806 Artificial Intelligence Applications in Aggregate Quarries: A Reality
Authors: J. E. Ortiz, P. Plaza, J. Herrero, I. Cabria, J. L. Blanco, J. Gavilanes, J. I. Escavy, I. López-Cilla, V. Yagüe, C. Pérez, S. Rodríguez, J. Rico, C. Serrano, J. Bernat
Abstract:
The development of Artificial Intelligence services in mining processes, specifically in aggregate quarries, is facilitating automation and improving numerous aspects of operations. Ultimately, AI is transforming the mining industry by improving efficiency, safety and sustainability. With the ability to analyze large amounts of data and make autonomous decisions, AI offers great opportunities to optimize mining operations and maximize the economic and social benefits of this vital industry. Within the framework of the European DIGIECOQUARRY project, various services were developed for the identification of material quality, production estimation, detection of anomalies and prediction of consumption and production automatically with good results.
Keywords: Aggregates, artificial intelligence, automatization, mining operations.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 261805 Risk-Management by Numerical Pattern Analysis in Data-Mining
Authors: M. Kargar, R. Mirmiran, F. Fartash, T. Saderi
Abstract:
In this paper a new method is suggested for risk management by the numerical patterns in data-mining. These patterns are designed using probability rules in decision trees and are cared to be valid, novel, useful and understandable. Considering a set of functions, the system reaches to a good pattern or better objectives. The patterns are analyzed through the produced matrices and some results are pointed out. By using the suggested method the direction of the functionality route in the systems can be controlled and best planning for special objectives be done.Keywords: Analysis, Data-mining, Pattern, Risk Management.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12701804 Predicting Groundwater Areas Using Data Mining Techniques: Groundwater in Jordan as Case Study
Authors: Faisal Aburub, Wael Hadi
Abstract:
Data mining is the process of extracting useful or hidden information from a large database. Extracted information can be used to discover relationships among features, where data objects are grouped according to logical relationships; or to predict unseen objects to one of the predefined groups. In this paper, we aim to investigate four well-known data mining algorithms in order to predict groundwater areas in Jordan. These algorithms are Support Vector Machines (SVMs), Naïve Bayes (NB), K-Nearest Neighbor (kNN) and Classification Based on Association Rule (CBA). The experimental results indicate that the SVMs algorithm outperformed other algorithms in terms of classification accuracy, precision and F1 evaluation measures using the datasets of groundwater areas that were collected from Jordanian Ministry of Water and Irrigation.Keywords: Classification, data mining, evaluation measures, groundwater.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25951803 A Recommender System Fusing Collaborative Filtering and User’s Review Mining
Authors: Seulbi Choi, Hyunchul Ahn
Abstract:
Collaborative filtering (CF) algorithm has been popularly used for recommender systems in both academic and practical applications. It basically generates recommendation results using users’ numeric ratings. However, the additional use of the information other than user ratings may lead to better accuracy of CF. Considering that a lot of people are likely to share their honest opinion on the items they purchased recently due to the advent of the Web 2.0, user's review can be regarded as the new informative source for identifying user's preference with accuracy. Under this background, this study presents a hybrid recommender system that fuses CF and user's review mining. Our system adopts conventional memory-based CF, but it is designed to use both user’s numeric ratings and his/her text reviews on the items when calculating similarities between users.Keywords: Recommender system, collaborative filtering, text mining, review mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15871802 Business-Intelligence Mining of Large Decentralized Multimedia Datasets with a Distributed Multi-Agent System
Authors: Karima Qayumi, Alex Norta
Abstract:
The rapid generation of high volume and a broad variety of data from the application of new technologies pose challenges for the generation of business-intelligence. Most organizations and business owners need to extract data from multiple sources and apply analytical methods for the purposes of developing their business. Therefore, the recently decentralized data management environment is relying on a distributed computing paradigm. While data are stored in highly distributed systems, the implementation of distributed data-mining techniques is a challenge. The aim of this technique is to gather knowledge from every domain and all the datasets stemming from distributed resources. As agent technologies offer significant contributions for managing the complexity of distributed systems, we consider this for next-generation data-mining processes. To demonstrate agent-based business intelligence operations, we use agent-oriented modeling techniques to develop a new artifact for mining massive datasets.
Keywords: Agent-oriented modeling, business Intelligence management, distributed data mining, multi-agent system.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13741801 A Text Mining Technique Using Association Rules Extraction
Authors: Hany Mahgoub, Dietmar Rösner, Nabil Ismail, Fawzy Torkey
Abstract:
This paper describes text mining technique for automatically extracting association rules from collections of textual documents. The technique called, Extracting Association Rules from Text (EART). It depends on keyword features for discover association rules amongst keywords labeling the documents. In this work, the EART system ignores the order in which the words occur, but instead focusing on the words and their statistical distributions in documents. The main contributions of the technique are that it integrates XML technology with Information Retrieval scheme (TFIDF) (for keyword/feature selection that automatically selects the most discriminative keywords for use in association rules generation) and use Data Mining technique for association rules discovery. It consists of three phases: Text Preprocessing phase (transformation, filtration, stemming and indexing of the documents), Association Rule Mining (ARM) phase (applying our designed algorithm for Generating Association Rules based on Weighting scheme GARW) and Visualization phase (visualization of results). Experiments applied on WebPages news documents related to the outbreak of the bird flu disease. The extracted association rules contain important features and describe the informative news included in the documents collection. The performance of the EART system compared with another system that uses the Apriori algorithm throughout the execution time and evaluating extracted association rules.
Keywords: Text mining, data mining, association rule mining
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 44371800 Using Data Mining for Learning and Clustering FCM
Authors: Somayeh Alizadeh, Mehdi Ghazanfari, Mohammad Fathian
Abstract:
Fuzzy Cognitive Maps (FCMs) have successfully been applied in numerous domains to show relations between essential components. In some FCM, there are more nodes, which related to each other and more nodes means more complex in system behaviors and analysis. In this paper, a novel learning method used to construct FCMs based on historical data and by using data mining and DEMATEL method, a new method defined to reduce nodes number. This method cluster nodes in FCM based on their cause and effect behaviors.Keywords: Clustering, Data Mining, Fuzzy Cognitive Map(FCM), Learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20161799 Data Mining in Oral Medicine Using Decision Trees
Authors: Fahad Shahbaz Khan, Rao Muhammad Anwer, Olof Torgersson, Göran Falkman
Abstract:
Data mining has been used very frequently to extract hidden information from large databases. This paper suggests the use of decision trees for continuously extracting the clinical reasoning in the form of medical expert-s actions that is inherent in large number of EMRs (Electronic Medical records). In this way the extracted data could be used to teach students of oral medicine a number of orderly processes for dealing with patients who represent with different problems within the practice context over time.Keywords: Data mining, Oral Medicine, Decision Trees, WEKA.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25011798 Genetic Mining: Using Genetic Algorithm for Topic based on Concept Distribution
Authors: S. M. Khalessizadeh, R. Zaefarian, S.H. Nasseri, E. Ardil
Abstract:
Today, Genetic Algorithm has been used to solve wide range of optimization problems. Some researches conduct on applying Genetic Algorithm to text classification, summarization and information retrieval system in text mining process. This researches show a better performance due to the nature of Genetic Algorithm. In this paper a new algorithm for using Genetic Algorithm in concept weighting and topic identification, based on concept standard deviation will be explored.Keywords: Genetic Algorithm, Text Mining, Term Weighting, Concept Extraction, Concept Distribution.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 37141797 FCA-based Conceptual Knowledge Discovery in Folksonomy
Authors: Yu-Kyung Kang, Suk-Hyung Hwang, Kyoung-Mo Yang
Abstract:
The tagging data of (users, tags and resources) constitutes a folksonomy that is the user-driven and bottom-up approach to organizing and classifying information on the Web. Tagging data stored in the folksonomy include a lot of very useful information and knowledge. However, appropriate approach for analyzing tagging data and discovering hidden knowledge from them still remains one of the main problems on the folksonomy mining researches. In this paper, we have proposed a folksonomy data mining approach based on FCA for discovering hidden knowledge easily from folksonomy. Also we have demonstrated how our proposed approach can be applied in the collaborative tagging system through our experiment. Our proposed approach can be applied to some interesting areas such as social network analysis, semantic web mining and so on.
Keywords: Folksonomy data mining, formal concept analysis, collaborative tagging, conceptual knowledge discovery, classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20281796 Numerical Modeling of Artisanal and Small-Scale Mining of Coltan in the African Great Lakes Region
Authors: Sergio Perez Rodriguez
Abstract:
Findings of a production model of Artisanal and Small-Scale Mining (ASM) of coltan ore by an average Democratic Republic of Congo (DRC) mineworker are presented in this paper. These can be used as a reference for a similar characterization of the daily labor of counterparts from other countries in the Africa's Great Lakes region. To that end, the Fundamental Equation of Mineral Production has been applied in this paper, considering a miner's average daily output of coltan, estimated in the base of gross statistical data gathered from reputable sources. Results indicate daily yields of individual miners in the order of 300 g of coltan ore, with hourly peaks of production in the range of 30 to 40 g of the mineral. Yields are expected to be in the order of 5 g or less during the least productive hours. These outputs are expected to be achieved during the halves of the eight to 10 hours of daily working sessions that these artisanal laborers can attend during the mining season.
Keywords: Coltan, mineral production, Production to Reserve ratio, artisanal mining, small-scale mining, ASM, human work, Great Lakes region, Democratic Republic of Congo.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1941795 Hybrid Knowledge Approach for Determining Health Care Provider Specialty from Patient Diagnoses
Authors: Erin Lynne Plettenberg, Jeremy Vickery
Abstract:
In an access-control situation, the role of a user determines whether a data request is appropriate. This paper combines vetted web mining and logic modeling to build a lightweight system for determining the role of a health care provider based only on their prior authorized requests. The model identifies provider roles with 100% recall from very little data. This shows the value of vetted web mining in AI systems, and suggests the impact of the ICD classification on medical practice.
Keywords: Ontology, logic modeling, electronic medical records, information extraction, vetted web mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9361794 A Network Traffic Prediction Algorithm Based On Data Mining Technique
Authors: D. Prangchumpol
Abstract:
This paper is a description approach to predict incoming and outgoing data rate in network system by using association rule discover, which is one of the data mining techniques. Information of incoming and outgoing data in each times and network bandwidth are network performance parameters, which needed to solve in the traffic problem. Since congestion and data loss are important network problems. The result of this technique can predicted future network traffic. In addition, this research is useful for network routing selection and network performance improvement.
Keywords: Traffic prediction, association rule, data mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 36691793 A Simplified and Effective Algorithm Used to Mine Similar Processes: An Illustrated Example
Authors: Min-Hsun Kuo, Yun-Shiow Chen
Abstract:
The running logs of a process hold valuable information about its executed activity behavior and generated activity logic structure. Theses informative logs can be extracted, analyzed and utilized to improve the efficiencies of the process's execution and conduction. One of the techniques used to accomplish the process improvement is called as process mining. To mine similar processes is such an improvement mission in process mining. Rather than directly mining similar processes using a single comparing coefficient or a complicate fitness function, this paper presents a simplified heuristic process mining algorithm with two similarity comparisons that are able to relatively conform the activity logic sequences (traces) of mining processes with those of a normalized (regularized) one. The relative process conformance is to find which of the mining processes match the required activity sequences and relationships, further for necessary and sufficient applications of the mined processes to process improvements. One similarity presented is defined by the relationships in terms of the number of similar activity sequences existing in different processes; another similarity expresses the degree of the similar (identical) activity sequences among the conforming processes. Since these two similarities are with respect to certain typical behavior (activity sequences) occurred in an entire process, the common problems, such as the inappropriateness of an absolute comparison and the incapability of an intrinsic information elicitation, which are often appeared in other process conforming techniques, can be solved by the relative process comparison presented in this paper. To demonstrate the potentiality of the proposed algorithm, a numerical example is illustrated.Keywords: process mining, process similarity, artificial intelligence, process conformance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14431792 Mine Production Index (MPI): New Method to Evaluate Effectiveness of Mining Machinery
Authors: Amol Lanke, Hadi Hoseinie, Behzad Ghodrati
Abstract:
OEE has been used in many industries as measure of performance. However due to limitations of original OEE, it has been modified by various researchers. OEE for mining application is special version of classic equation, carries these limitation over. In this paper it has been aimed to modify the OEE for mining application by introducing the weights to the elements of it and termed as Mine Production index (MPi). As a special application of new index MPishovel has been developed by authors. This can be used for evaluating the shovel effectiveness. Based on analysis, utilization followed by performance and availability were ranked in this order. To check the applicability of this index, a case study was done on four electrical and one hydraulic shovel in a Swedish mine. The results shows that MPishovel can evaluate production effectiveness of shovels and can determine effectiveness values in optimistic view compared to OEE. MPi with calculation not only give the effectiveness but also can predict which elements should be focused for improving the productivity.
Keywords: Mining, Overall equipment efficiency (OEE), Mine Production index, Shovels.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 47441791 Web Content Mining: A Solution to Consumer's Product Hunt
Authors: Syed Salman Ahmed, Zahid Halim, Rauf Baig, Shariq Bashir
Abstract:
With the rapid growth in business size, today's businesses orient towards electronic technologies. Amazon.com and e-bay.com are some of the major stakeholders in this regard. Unfortunately the enormous size and hugely unstructured data on the web, even for a single commodity, has become a cause of ambiguity for consumers. Extracting valuable information from such an everincreasing data is an extremely tedious task and is fast becoming critical towards the success of businesses. Web content mining can play a major role in solving these issues. It involves using efficient algorithmic techniques to search and retrieve the desired information from a seemingly impossible to search unstructured data on the Internet. Application of web content mining can be very encouraging in the areas of Customer Relations Modeling, billing records, logistics investigations, product cataloguing and quality management. In this paper we present a review of some very interesting, efficient yet implementable techniques from the field of web content mining and study their impact in the area specific to business user needs focusing both on the customer as well as the producer. The techniques we would be reviewing include, mining by developing a knowledge-base repository of the domain, iterative refinement of user queries for personalized search, using a graphbased approach for the development of a web-crawler and filtering information for personalized search using website captions. These techniques have been analyzed and compared on the basis of their execution time and relevance of the result they produced against a particular search.
Keywords: Data mining, web mining, search engines, knowledge discovery.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20531790 Data Mining in Medicine Domain Using Decision Trees and Vector Support Machine
Authors: Djamila Benhaddouche, Abdelkader Benyettou
Abstract:
In this paper, we used data mining to extract biomedical knowledge. In general, complex biomedical data collected in studies of populations are treated by statistical methods, although they are robust, they are not sufficient in themselves to harness the potential wealth of data. For that you used in step two learning algorithms: the Decision Trees and Support Vector Machine (SVM). These supervised classification methods are used to make the diagnosis of thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.
Keywords: A classifier, Algorithms decision tree, knowledge extraction, Support Vector Machine.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18701789 Social and Economic Effects of Mining Industry Restructuring in Romania -Case Studies
Authors: Andra Costache, Gica Pehoiu
Abstract:
As in other countries from Central and Eastern Europe, the economic restructuring occurred in the last decade of the twentieth century affected the mining industry in Romania, an oversize and heavily subsidized sector before 1989. After more than a decade since the beginning of mining restructuring, an evaluation of current social implications of the process it is required, together with an efficiency analysis of the adaptation mechanisms developed at governmental level. This article aims to provide an insight into these issues through case studies conducted in the most important coal basin of Romania, Petroşani Depression.Keywords: case studies, government programs, miningrestructuring, social effects.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 26601788 Text-Mining Approach for Evaluation of Affective Management Practices
Authors: Masaaki Saito, Qin Tang, Hiroyuki Umemuro
Abstract:
The purpose of this paper is to propose a text mining approach to evaluate companies- practices on affective management. Affective management argues that it is critical to take stakeholders- affects into consideration during decision-making process, along with the traditional numerical and rational indices. CSR reports published by companies were collected as source information. Indices were proposed based on the frequency and collocation of words relevant to affective management concept using text mining approach to analyze the text information of CSR reports. In addition, the relationships between the results obtained using proposed indices and traditional indicators of business performance were investigated using correlation analysis. Those correlations were also compared between manufacturing and non-manufacturing companies. The results of this study revealed the possibility to evaluate affective management practices of companies based on publicly available text documents.Keywords: Affective management, Affect, Stakeholder, Text mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18451787 Mining of Interesting Prediction Rules with Uniform Two-Level Genetic Algorithm
Authors: Bilal Alatas, Ahmet Arslan
Abstract:
The main goal of data mining is to extract accurate, comprehensible and interesting knowledge from databases that may be considered as large search spaces. In this paper, a new, efficient type of Genetic Algorithm (GA) called uniform two-level GA is proposed as a search strategy to discover truly interesting, high-level prediction rules, a difficult problem and relatively little researched, rather than discovering classification knowledge as usual in the literatures. The proposed method uses the advantage of uniform population method and addresses the task of generalized rule induction that can be regarded as a generalization of the task of classification. Although the task of generalized rule induction requires a lot of computations, which is usually not satisfied with the normal algorithms, it was demonstrated that this method increased the performance of GAs and rapidly found interesting rules.
Keywords: Classification rule mining, data mining, genetic algorithms.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15941786 Automata Theory Approach for Solving Frequent Pattern Discovery Problems
Authors: Renáta Iváncsy, István Vajk
Abstract:
The various types of frequent pattern discovery problem, namely, the frequent itemset, sequence and graph mining problems are solved in different ways which are, however, in certain aspects similar. The main approach of discovering such patterns can be classified into two main classes, namely, in the class of the levelwise methods and in that of the database projection-based methods. The level-wise algorithms use in general clever indexing structures for discovering the patterns. In this paper a new approach is proposed for discovering frequent sequences and tree-like patterns efficiently that is based on the level-wise issue. Because the level-wise algorithms spend a lot of time for the subpattern testing problem, the new approach introduces the idea of using automaton theory to solve this problem.Keywords: Frequent pattern discovery, graph mining, pushdownautomaton, sequence mining, state machine, tree mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16281785 Mining Correlated Bicluster from Web Usage Data Using Discrete Firefly Algorithm Based Biclustering Approach
Authors: K. Thangavel, R. Rathipriya
Abstract:
For the past one decade, biclustering has become popular data mining technique not only in the field of biological data analysis but also in other applications like text mining, market data analysis with high-dimensional two-way datasets. Biclustering clusters both rows and columns of a dataset simultaneously, as opposed to traditional clustering which clusters either rows or columns of a dataset. It retrieves subgroups of objects that are similar in one subgroup of variables and different in the remaining variables. Firefly Algorithm (FA) is a recently-proposed metaheuristic inspired by the collective behavior of fireflies. This paper provides a preliminary assessment of discrete version of FA (DFA) while coping with the task of mining coherent and large volume bicluster from web usage dataset. The experiments were conducted on two web usage datasets from public dataset repository whereby the performance of FA was compared with that exhibited by other population-based metaheuristic called binary Particle Swarm Optimization (PSO). The results achieved demonstrate the usefulness of DFA while tackling the biclustering problem.
Keywords: Biclustering, Binary Particle Swarm Optimization, Discrete Firefly Algorithm, Firefly Algorithm, Usage profile Web usage mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21331784 An Approach to Concerns and Aspects Mining for Web Applications
Authors: Carlo Bellettini, Alessandro Marchetto, Andrea Trentini
Abstract:
Web applications have become very complex and crucial, especially when combined with areas such as CRM (Customer Relationship Management) and BPR (Business Process Reengineering), the scientific community has focused attention to Web applications design, development, analysis, and testing, by studying and proposing methodologies and tools. This paper proposes an approach to automatic multi-dimensional concern mining for Web Applications, based on concepts analysis, impact analysis, and token-based concern identification. This approach lets the user to analyse and traverse Web software relevant to a particular concern (concept, goal, purpose, etc.) via multi-dimensional separation of concerns, to document, understand and test Web applications. This technique was developed in the context of WAAT (Web Applications Analysis and Testing) project. A semi-automatic tool to support this technique is currently under development.Keywords: Aspect Mining, Concepts Analysis, Concerns Mining, Multi-Dimensional Separation of Concerns, Impact Analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1513