Search results for: sequence mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1052

902 Video Matting based on Background Estimation

Authors: J.-H. Moon, D.-O Kim, R.-H. Park

Abstract:

This paper presents a video matting method, which extracts the foreground and alpha matte from a video sequence. The objective of video matting is finding the foreground and compositing it with the background that is different from the one in the original image. By finding the motion vectors (MVs) using a sliced block matching algorithm (SBMA), we can extract moving regions from the video sequence under the assumption that the foreground is moving and the background is stationary. In practice, foreground areas are not moving through all frames in an image sequence, thus we accumulate moving regions through the image sequence. The boundaries of moving regions are found by Canny edge detector and the foreground region is separated in each frame of the sequence. Remaining regions are defined as background regions. Extracted backgrounds in each frame are combined and reframed as an integrated single background. Based on the estimated background, we compute the frame difference (FD) of each frame. Regions with the FD larger than the threshold are defined as foreground regions, boundaries of foreground regions are defined as unknown regions and the rest of regions are defined as backgrounds. Segmentation information that classifies an image into foreground, background, and unknown regions is called a trimap. Matting process can extract an alpha matte in the unknown region using pixel information in foreground and background regions, and estimate the values of foreground and background pixels in unknown regions. The proposed video matting approach is adaptive and convenient to extract a foreground automatically and to composite a foreground with a background that is different from the original background.
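
The trimap step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes grayscale frames stored as nested lists, and the function name `build_trimap`, the `threshold` value and the `band` width used to mark boundary pixels as unknown are all hypothetical choices.

```python
def build_trimap(frame, background, threshold, band=1):
    """Label each pixel 'F' (foreground), 'B' (background) or 'U' (unknown)."""
    h, w = len(frame), len(frame[0])
    # Step 1: frame difference (FD) against the estimated background.
    fg = [[abs(frame[y][x] - background[y][x]) > threshold for x in range(w)]
          for y in range(h)]
    trimap = [['F' if fg[y][x] else 'B' for x in range(w)] for y in range(h)]
    # Step 2: any pixel within `band` of a foreground/background boundary
    # is marked unknown (assumed definition of the boundary region).
    for y in range(h):
        for x in range(w):
            for dy in range(-band, band + 1):
                for dx in range(-band, band + 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and fg[ny][nx] != fg[y][x]:
                        trimap[y][x] = 'U'
    return trimap
```

A matting algorithm would then estimate alpha only for the 'U' pixels, using the 'F' and 'B' pixels as known samples.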

Keywords: Background estimation, Object segmentation, Block matching algorithm, Video matting.

Downloads: 1801
901 Web Content Mining: A Solution to Consumer's Product Hunt

Authors: Syed Salman Ahmed, Zahid Halim, Rauf Baig, Shariq Bashir

Abstract:

With the rapid growth in business size, today's businesses orient towards electronic technologies. Amazon.com and e-bay.com are some of the major stakeholders in this regard. Unfortunately, the enormous size and hugely unstructured nature of data on the web, even for a single commodity, has become a cause of ambiguity for consumers. Extracting valuable information from such ever-increasing data is an extremely tedious task and is fast becoming critical to the success of businesses. Web content mining can play a major role in solving these issues. It involves using efficient algorithmic techniques to search and retrieve the desired information from seemingly impossible-to-search unstructured data on the Internet. Application of web content mining can be very encouraging in the areas of Customer Relations Modeling, billing records, logistics investigations, product cataloguing and quality management. In this paper we present a review of some very interesting, efficient yet implementable techniques from the field of web content mining and study their impact in the area specific to business user needs, focusing on both the customer and the producer. The techniques reviewed include mining by developing a knowledge-base repository of the domain, iterative refinement of user queries for personalized search, using a graph-based approach for the development of a web crawler, and filtering information for personalized search using website captions. These techniques have been analyzed and compared on the basis of their execution time and the relevance of the results they produced against a particular search.

Keywords: Data mining, web mining, search engines, knowledge discovery.

Downloads: 2035
900 The Orlicz Space of the Entire Sequence Fuzzy Numbers Defined by Infinite Matrices

Authors: N.Subramanian, C.Murugesan

Abstract:

This paper is devoted to the study of the general properties of Orlicz space of entire sequence of fuzzy numbers by using infinite matrices.

Keywords: Fuzzy numbers, infinite matrix, Orlicz space, entire sequence.

Downloads: 1191
899 Data Mining in Medicine Domain Using Decision Trees and Vector Support Machine

Authors: Djamila Benhaddouche, Abdelkader Benyettou

Abstract:

In this paper, we used data mining to extract biomedical knowledge. In general, complex biomedical data collected in population studies are treated by statistical methods; although these are robust, they are not sufficient in themselves to harness the potential wealth of the data. To that end, we used two learning algorithms: Decision Trees and the Support Vector Machine (SVM). These supervised classification methods are used to diagnose thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.

Keywords: A classifier, Algorithms decision tree, knowledge extraction, Support Vector Machine.

Downloads: 1856
898 Computational Method for Annotation of Protein Sequence According to Gene Ontology Terms

Authors: Razib M. Othman, Safaai Deris, Rosli M. Illias

Abstract:

Annotation of a protein sequence is pivotal for the understanding of its function. The accuracy of manual annotation provided by curators is still questionable because of its lesser evidence strength, and manual annotation remains a hard and time-consuming task. A number of computational methods, including tools, have been developed to tackle this challenging task. However, they require high-cost hardware, are difficult for bioscientists to set up, or depend on time-intensive and blind sequence similarity searches such as the Basic Local Alignment Search Tool. This paper introduces a new method of assigning highly correlated Gene Ontology terms of annotated protein sequences to partially annotated or newly discovered protein sequences. This method is fully based on Gene Ontology data and annotations. Two problems were identified in realizing this method. The first problem relates to splitting the single monolithic Gene Ontology RDF/XML file into a set of smaller files that can be easy to assess and process. These files can then be enriched with protein sequences and Inferred from Electronic Annotation evidence associations. The second problem involves searching for a set of Gene Ontology terms that are semantically similar to a given query. The details of the macro and micro problems involved and their solutions, including the objective of this study, are described. This paper also describes protein sequence annotation and the Gene Ontology. The methodology of this study and a Gene Ontology based protein sequence annotation tool, namely extended UTMGO, are presented. Furthermore, its basic version, which is a Gene Ontology browser based on semantic similarity search, is also introduced.

Keywords: automatic clustering, bioinformatics tool, gene ontology, protein sequence annotation, semantic similarity search

Downloads: 3114
897 Social and Economic Effects of Mining Industry Restructuring in Romania - Case Studies

Authors: Andra Costache, Gica Pehoiu

Abstract:

As in other countries from Central and Eastern Europe, the economic restructuring that occurred in the last decade of the twentieth century affected the mining industry in Romania, an oversized and heavily subsidized sector before 1989. More than a decade after the beginning of mining restructuring, an evaluation of the current social implications of the process is required, together with an efficiency analysis of the adaptation mechanisms developed at governmental level. This article aims to provide an insight into these issues through case studies conducted in the most important coal basin of Romania, the Petroşani Depression.

Keywords: case studies, government programs, mining restructuring, social effects.

Downloads: 2648
896 Fractal Analysis of 16S rRNA Gene Sequences in Archaea Thermophiles

Authors: T. Holden, G. Tremberger, Jr, E. Cheung, R. Subramaniam, R. Sullivan, N. Gadura, P. Schneider, P. Marchese, A. Flamholz, T. Cheung, D. Lieberman

Abstract:

A nucleotide sequence can be expressed as a numerical sequence when each nucleotide is assigned its proton number. The resulting gene numerical sequence can be investigated for its fractal dimension in terms of evolution and chemical properties for comparative studies. We have investigated such nucleotide fluctuation in the 16S rRNA gene of archaea thermophiles. The studied archaea thermophiles were Archaeoglobus fulgidus, Methanothermobacter thermautotrophicus, Methanocaldococcus jannaschii, Pyrococcus horikoshii, and Thermoplasma acidophilum. These five archaea-euryarchaeota thermophiles have fractal dimension values ranging from 1.93 to 1.97. Computer simulation shows that random sequences would have an average of about 2 with a standard deviation of about 0.015. The fractal dimension was found to correlate negatively with the thermophile's optimal growth temperature, with an R² value of 0.90 (N = 5). The inclusion of two archaea-crenarchaeota thermophiles reduces the R² value to 0.66 (N = 7). Further inclusion of two bacterial thermophiles reduces the R² value to 0.50 (N = 9). The fractal dimension is positively correlated with the sequence GC content, with an R² value of 0.89 for the five archaea-euryarchaeota thermophiles (and 0.74 for the entire set of N = 9), although computer simulation shows little correlation. The highest (positive) correlation was found between the fractal dimension and di-nucleotide Shannon entropy. However, Shannon entropy and sequence GC content were observed to correlate with optimal growth temperature, with R² values of 0.8 (negative) and 0.88 (positive), respectively, for the entire set of 9 thermophiles; thus the correlation lacks species specificity. Together with another correlation study of bacterial radiation dosage with RecA repair gene sequence fractal dimension, it is postulated that fractal dimension analysis is a sensitive tool for studying the relationship between genotype and phenotype among closely related sequences.
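
Two of the quantities this abstract correlates with fractal dimension, GC content and di-nucleotide Shannon entropy, are simple to compute. The sketch below is illustrative only: the function names are hypothetical, and counting overlapping di-nucleotides is an assumption, since the abstract does not say how pairs were formed. The fractal dimension calculation itself is not reproduced here.

```python
from collections import Counter
from math import log2


def gc_content(seq):
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def dinucleotide_entropy(seq):
    """Shannon entropy (in bits) of the overlapping di-nucleotide distribution."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = Counter(pairs)
    total = len(pairs)
    return -sum((c / total) * log2(c / total) for c in counts.values())
```

With four bases there are 16 possible di-nucleotides, so under these assumptions the entropy is bounded above by 4 bits.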

Keywords: Fractal dimension, archaea thermophiles, Shannon entropy, GC content

Downloads: 1764
895 Extended Low Power Bus Binding Combined with Data Sequence Reordering

Authors: Jihyung Kim, Taejin Kim, Sungho Park, Jun-Dong Cho

Abstract:

In this paper, we address the problem of reducing the switching activity (SA) in on-chip buses through the use of a bus binding technique in high-level synthesis. While many binding techniques to reduce the SA exist, we present yet another technique for further reducing the switching activity. Our proposed method combines bus binding and data sequence reordering to explore a wider solution space. The problem is formulated as a multiple traveling salesman problem and solved using a simulated annealing technique. The experimental results revealed that a binding solution obtained with the proposed method reduces the switching activity by 5.6-27.2% (18.0% on average) and 2.6-12.7% (6.8% on average) when compared with conventional binding-only and hybrid binding-encoding methods, respectively.
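
To make the role of data sequence reordering concrete, the sketch below counts switching activity as the number of bit toggles between consecutive bus words and then reorders the words with a greedy nearest-neighbour heuristic. The heuristic is a stand-in chosen for brevity; it is not the multiple traveling salesman formulation or the simulated annealing solver used in the paper, and both function names are hypothetical.

```python
def switching_activity(words):
    """Total number of bit toggles between consecutive words on the bus."""
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))


def greedy_reorder(words):
    """Keep the first word, then always send next the remaining word
    with the smallest Hamming distance to the word just sent."""
    remaining = list(words)
    order = [remaining.pop(0)]
    while remaining:
        last = order[-1]
        nxt = min(remaining, key=lambda w: bin(last ^ w).count("1"))
        remaining.remove(nxt)
        order.append(nxt)
    return order
```

For the four words 0b0000, 0b1111, 0b0001, 0b1110 sent in that order, the switching activity is 11 toggles; the greedy order 0b0000, 0b0001, 0b1111, 0b1110 needs 5. This assumes the transfers may legally be reordered, which in practice depends on data dependencies.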

Keywords: low power, bus binding, switching activity, multiple traveling salesman problem, data sequence reordering

Downloads: 1322
894 Forest Risk and Vulnerability Assessment: A Case Study from East Bokaro Coal Mining Area in India

Authors: Sujata Upgupta, Prasoon Kumar Singh

Abstract:

The expansion of large scale coal mining into forest areas is a potential hazard for the local biodiversity and wildlife. The objective of this study is to provide a picture of the threat that coal mining poses to the forests of the East Bokaro landscape. The vulnerable forest areas at risk have been assessed and the priority areas for conservation have been presented. The forested areas at risk in the current scenario have been assessed and compared with the past conditions using classification and buffer based overlay approach. Forest vulnerability has been assessed using an analytical framework based on systematic indicators and composite vulnerability index values. The results indicate that more than 4 km² of forests have been lost from 1973 to 2016. Large patches of forests have been diverted for coal mining projects. Forests in the northern part of the coal field within 1-3 km radius around the coal mines are at immediate risk. The original contiguous forests have been converted into fragmented and degraded forest patches. Most of the collieries are located within or very close to the forests thus threatening the biodiversity and hydrology of the surrounding regions. Based on the vulnerability values estimated, it was concluded that more than 90% of the forested grids in East Bokaro are highly vulnerable to mining. The forests in the sub-districts of Bermo and Chandrapura have been identified as the most vulnerable to coal mining activities. This case study would add to the capacity of the forest managers and mine managers to address the risk and vulnerability of forests at a small landscape level in order to achieve sustainable development.

Keywords: Coal mining, forest, indicators, vulnerability.

Downloads: 1151
893 Text-Mining Approach for Evaluation of Affective Management Practices

Authors: Masaaki Saito, Qin Tang, Hiroyuki Umemuro

Abstract:

The purpose of this paper is to propose a text mining approach to evaluate companies' practices on affective management. Affective management argues that it is critical to take stakeholders' affects into consideration during the decision-making process, along with the traditional numerical and rational indices. CSR reports published by companies were collected as source information. Indices were proposed based on the frequency and collocation of words relevant to the affective management concept, using a text mining approach to analyze the text of the CSR reports. In addition, the relationships between the results obtained using the proposed indices and traditional indicators of business performance were investigated using correlation analysis. Those correlations were also compared between manufacturing and non-manufacturing companies. The results of this study revealed the possibility of evaluating affective management practices of companies based on publicly available text documents.

Keywords: Affective management, Affect, Stakeholder, Text mining.

Downloads: 1832
892 Mining of Interesting Prediction Rules with Uniform Two-Level Genetic Algorithm

Authors: Bilal Alatas, Ahmet Arslan

Abstract:

The main goal of data mining is to extract accurate, comprehensible and interesting knowledge from databases that may be considered as large search spaces. In this paper, a new, efficient type of Genetic Algorithm (GA) called uniform two-level GA is proposed as a search strategy to discover truly interesting, high-level prediction rules, a difficult and relatively little researched problem, rather than discovering classification knowledge as is usual in the literature. The proposed method uses the advantage of the uniform population method and addresses the task of generalized rule induction, which can be regarded as a generalization of the task of classification. Although the task of generalized rule induction requires a lot of computation, which normal algorithms usually cannot satisfy, it was demonstrated that this method increased the performance of GAs and rapidly found interesting rules.

Keywords: Classification rule mining, data mining, genetic algorithms.

Downloads: 1583
891 Finite Element Analysis of Composite Frames in Wheelchair under Upward Loading

Authors: Thomas Jin-Chee Liu, Jin-Wei Liang, Wei-Long Chen, Teng-Hui Chen

Abstract:

The finite element analysis is adopted in this primary study. Using the Tsai-Wu criterion and delamination criterion, the stacking sequence [45/0₄/-45₄/90₄]ₛ is the final optimal design for the wheelchair frame. In contrast, the uni-directional laminates, i.e. [90₁₃]ₛ, [45₁₃]ₛ and [-45₁₃]ₛ, are bad designs due to their higher failure indexes.

Keywords: Wheelchair frame, stacking sequence, failure index, finite element.

Downloads: 3747
890 Adaptive and Personalizing Learning Sequence Using Modified Roulette Wheel Selection Algorithm

Authors: Melvin A. Ballera

Abstract:

Prior literature in the field of adaptive and personalized learning sequences in e-learning has proposed and implemented various mechanisms to improve the learning process, such as individualization and personalization, but these are complex to implement due to expensive algorithmic programming and the need for extensive prior data. The main objective of personalizing a learning sequence is to maximize learning by dynamically selecting the closest teaching operation in order to achieve the learning competency of the learner. In this paper, a revolutionary technique is proposed and tested to perform individualization and personalization using a modified reversed roulette wheel selection algorithm (RWSA) that runs in O(n). The technique is simpler to implement and is algorithmically less expensive compared to other revolutionary algorithms, since it collects the dynamic real-time performance matrix, such as examinations, reviews, and study, to form the RWSA single numerical fitness value. Results show that the implemented system is capable of recommending new learning sequences that lessen study time based on the student's prior knowledge and real performance matrix.
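
For orientation, classic roulette wheel selection can be written as a single O(n) pass over the fitness values, as sketched below. This is the textbook algorithm, not the modified reversed variant the paper proposes, whose details the abstract does not give; the function name and the optional `rng` parameter are hypothetical.

```python
import random


def roulette_wheel_select(fitness, rng=random):
    """Pick an index with probability proportional to its fitness value."""
    total = sum(fitness)
    pick = rng.random() * total      # a point on the wheel in [0, total)
    running = 0.0
    for i, f in enumerate(fitness):
        running += f                 # right edge of slot i on the wheel
        if pick < running:
            return i
    return len(fitness) - 1          # guard against floating-point round-off
```

Here each candidate learning unit would carry one non-negative fitness value, and units with higher fitness are proportionally more likely to be recommended next.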

Keywords: E-learning, fitness value, personalized learning sequence, reversed roulette wheel selection algorithms.

Downloads: 2010
889 Mining Correlated Bicluster from Web Usage Data Using Discrete Firefly Algorithm Based Biclustering Approach

Authors: K. Thangavel, R. Rathipriya

Abstract:

Over the past decade, biclustering has become a popular data mining technique, not only in the field of biological data analysis but also in other applications with high-dimensional two-way datasets, such as text mining and market data analysis. Biclustering clusters both rows and columns of a dataset simultaneously, as opposed to traditional clustering, which clusters either rows or columns of a dataset. It retrieves subgroups of objects that are similar in one subgroup of variables and different in the remaining variables. Firefly Algorithm (FA) is a recently proposed metaheuristic inspired by the collective behavior of fireflies. This paper provides a preliminary assessment of a discrete version of FA (DFA) applied to the task of mining coherent, large-volume biclusters from web usage datasets. The experiments were conducted on two web usage datasets from a public dataset repository, and the performance of FA was compared with that of another population-based metaheuristic, binary Particle Swarm Optimization (PSO). The results demonstrate the usefulness of DFA in tackling the biclustering problem.

Keywords: Biclustering, Binary Particle Swarm Optimization, Discrete Firefly Algorithm, Firefly Algorithm, Usage profile, Web usage mining.

Downloads: 2108
888 An Approach to Concerns and Aspects Mining for Web Applications

Authors: Carlo Bellettini, Alessandro Marchetto, Andrea Trentini

Abstract:

Web applications have become very complex and crucial, especially when combined with areas such as CRM (Customer Relationship Management) and BPR (Business Process Reengineering). The scientific community has therefore focused attention on Web application design, development, analysis, and testing, by studying and proposing methodologies and tools. This paper proposes an approach to automatic multi-dimensional concern mining for Web applications, based on concepts analysis, impact analysis, and token-based concern identification. This approach lets the user analyse and traverse Web software relevant to a particular concern (concept, goal, purpose, etc.) via multi-dimensional separation of concerns, in order to document, understand and test Web applications. This technique was developed in the context of the WAAT (Web Applications Analysis and Testing) project. A semi-automatic tool to support this technique is currently under development.

Keywords: Aspect Mining, Concepts Analysis, Concerns Mining, Multi-Dimensional Separation of Concerns, Impact Analysis.

Downloads: 1497
887 A Sequential Pattern Mining Method Based On Sequential Interestingness

Authors: Shigeaki Sakurai, Youichi Kitahara, Ryohei Orihara

Abstract:

Sequential mining methods efficiently discover all frequent sequential patterns included in sequential data. These methods use the support, which is the previous criterion that satisfies the Apriori property, to evaluate the frequency. However, the discovered patterns do not always correspond to the interests of analysts, because the patterns are common and the analysts cannot get new knowledge from the patterns. The paper proposes a new criterion, namely, the sequential interestingness, to discover sequential patterns that are more attractive for the analysts. The paper shows that the criterion satisfies the Apriori property and how the criterion is related to the support. Also, the paper proposes an efficient sequential mining method based on the proposed criterion. Lastly, the paper shows the effectiveness of the proposed method by applying the method to two kinds of sequential data.
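
As background for the support criterion and the Apriori property mentioned above, the sketch below computes the support of a sequential pattern in a small database. It simplifies each sequence to a list of single items rather than itemsets, and the function names are hypothetical; the paper's sequential interestingness criterion is not reproduced.

```python
def contains(sequence, pattern):
    """True if `pattern` occurs in `sequence` as an order-preserving subsequence."""
    it = iter(sequence)
    # `item in it` advances the iterator, so later items must appear later.
    return all(item in it for item in pattern)


def support(database, pattern):
    """Fraction of sequences in the database that contain the pattern."""
    return sum(contains(seq, pattern) for seq in database) / len(database)
```

The Apriori property says that extending a pattern can never raise its support: every sequence that contains "ad" also contains "a", so support("ad") is at most support("a"). Mining algorithms use this to prune candidate patterns early.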

Keywords: Sequential mining, Support, Confidence, Apriori property

Downloads: 1262
886 Concepts Extraction from Discharge Notes using Association Rule Mining

Authors: Basak Oguz Yolcular

Abstract:

A large amount of valuable information is available in plain text clinical reports. New techniques and technologies are applied to extract information from these reports. In this study, we developed a domain based software system to transform 600 Otorhinolaryngology discharge notes into a structured form for extracting clinical data from the discharge notes. In order to decrease the system process time, discharge notes were transformed into a data table after preprocessing. Several word lists were constituted to identify common sections in the discharge notes, including patient history, age, problems, and diagnosis. The N-gram method was used for discovering term co-occurrences within each section. Using this method, a dataset of concept candidates was generated for the validation step, and then the Predictive Apriori algorithm for Association Rule Mining (ARM) was applied to validate candidate concepts.
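
The N-gram step can be illustrated with a short counting routine. This is a sketch under simple assumptions (lower-casing and whitespace tokenization); the function name is hypothetical and the paper's actual preprocessing is not described in the abstract.

```python
from collections import Counter


def ngram_counts(text, n=2):
    """Count contiguous n-word sequences in a section of text."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
```

On an invented sample such as "chronic otitis media with chronic otitis externa", the bigram ("chronic", "otitis") is counted twice, which would make it a stronger concept candidate than bigrams seen once.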

Keywords: association rule mining, otorhinolaryngology, predictive apriori, text mining

Downloads: 1603
885 Multiple-Level Sequential Pattern Discovery from Customer Transaction Databases

Authors: An Chen, Huilin Ye

Abstract:

Mining sequential patterns from large customer transaction databases has been recognized as a key research topic in database systems. However, previous work has focused more on mining sequential patterns at a single concept level. In this study, we introduce concept hierarchies into this problem and present several algorithms for discovering multiple-level sequential patterns based on the hierarchies. An experiment was conducted to assess the performance of the proposed algorithms. The performance of the algorithms was measured by the relative time spent on completing the mining tasks on two different datasets. The experimental results showed that the performance depends on the characteristics of the datasets and the pre-defined threshold of minimal support for each level of the concept hierarchy. Based on the experimental results, some suggestions are also given on how to select an appropriate algorithm for a given dataset.

Keywords: Data Mining, Multiple-Level Sequential Pattern, Concept Hierarchy, Customer Transaction Database.

Downloads: 1440
884 Generator of Hypotheses an Approach of Data Mining Based on Monotone Systems Theory

Authors: Rein Kuusik, Grete Lind

Abstract:

Generator of hypotheses is a new method for data mining. It makes it possible to classify the source data automatically and produces a particular enumeration of patterns. A pattern is an expression (in a certain language) describing facts in a subset of facts. The goal is to describe the source data via patterns and/or IF...THEN rules. The evaluation criteria used are deterministic (not probabilistic). The search results are trees, a form that is easy to comprehend and interpret. Generator of hypotheses uses a very effective algorithm based on the theory of monotone systems (MS), named MONSA (MONotone System Algorithm).

Keywords: data mining, monotone systems, pattern, rule.

Downloads: 1244
883 Performance of Chaotic Lu System in CDMA Satellites Communications Systems

Authors: K. Kemih, M. Benslama

Abstract:

This paper investigates the problem of spreading sequence and receiver code synchronization techniques for satellite based CDMA communications systems. The performance of a CDMA system depends on the autocorrelation and cross-correlation properties of the spreading sequences used. In this paper we propose the use of the chaotic Lu system to generate binary sequences for spreading codes in a direct sequence spread CDMA system. To minimize multiple access interference (MAI), we propose the use of a genetic algorithm for optimum selection of chaotic spreading sequences. To solve the problem of transmitter-receiver synchronization, we use passivity controls. The concept of semipassivity is defined to find simple conditions which ensure boundedness of the solutions of coupled Lu systems. Numerical results are presented to show the effectiveness of the proposed approach.

Keywords: About Chaotic Lu system, synchronization, Spreading sequence, Genetic Algorithm, Passive System

Downloads: 1727
882 Exons and Introns Classification in Human and Other Organisms

Authors: Benjamin Y. M. Kwan, Jennifer Y. Y. Kwan, Hon Keung Kwan

Abstract:

In this paper, the relative performance of spectral classification of short exon and intron sequences of the human and eleven model organisms is studied. In the simulations, all combinations of sixteen one-sequence numerical representations, four threshold values, and four window lengths are considered. Sequences of 150-base length are chosen and, for each organism, a total of 16,000 sequences are used for training and testing. Results indicate that an appropriate combination of one-sequence numerical representation, threshold value, and window length is essential for arriving at top spectral classification results. For fixed-length sequences, the precisions on exon and intron classification obtained for different organisms are not the same because of their genomic differences. In general, precision increases as sequence length increases.
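
One common form of spectral classification relies on the period-3 component that coding regions tend to show in their Fourier spectrum. The sketch below is an assumption about the general technique, not the paper's method: the integer mapping, the score definition, the 0.5 threshold and all names are hypothetical, and the abstract does not list the sixteen representations or four thresholds it compares.

```python
import cmath

# Hypothetical one-sequence numerical representation (one number per base).
MAPPING = {"A": 0.0, "C": 1.0, "G": 2.0, "T": 3.0}


def period3_score(seq, mapping=MAPPING):
    """Share of spectral power at frequency N/3 (sequence length should be a multiple of 3)."""
    x = [mapping[b] for b in seq.upper()]
    n = len(x)
    mean = sum(x) / n
    x = [v - mean for v in x]  # remove the DC component

    def power(k):
        # Squared magnitude of the k-th DFT coefficient.
        return abs(sum(v * cmath.exp(-2j * cmath.pi * k * t / n)
                       for t, v in enumerate(x))) ** 2

    total = sum(power(k) for k in range(1, n // 2 + 1))
    return power(n // 3) / total if total else 0.0


def looks_like_exon(seq, threshold=0.5):
    """Classify as exon when the period-3 share exceeds the threshold."""
    return period3_score(seq) > threshold
```

A 150-base window, as used in the paper, keeps N/3 an integer frequency bin. The direct DFT here is quadratic in the window length, which is acceptable for windows this short.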

Keywords: Exons and introns classification, Human genome, Model organism genome, Spectral analysis

Downloads: 2042
881 Data Mining Using Learning Automata

Authors: M. R. Aghaebrahimi, S. H. Zahiri, M. Amiri

Abstract:

In this paper, a data miner based on learning automata is proposed, called LA-miner. The LA-miner extracts classification rules from data sets automatically. The proposed algorithm is based on function optimization using learning automata. The experimental results on three benchmarks indicate that the performance of the proposed LA-miner is comparable with (sometimes better than) the Ant-miner (a data miner algorithm based on the Ant Colony optimization algorithm) and CN2 (a well-known data mining algorithm for classification).

Keywords: Data mining, Learning automata, Classification rules, Knowledge discovery.

Downloads: 1921
880 Dose due the Incorporation of Radionuclides Using Teeth as Bioindicators nearby Caetité Uranium Mines

Authors: Viviane S. Guimarães, Ícaro M. M. Brasil, Simara S. Campos, Roseli F. Gennari, Márcia R. P. Attie, Susana O. Souza.

Abstract:

Uranium mining and processing in Brazil occur in a northeastern area near Caetité-BA. Several Non-Governmental Organizations claim that uranium mining in this region is a pollutant causing health risks to the local population, but those in charge of the complex extraction and production of "yellow cake" for generating fuel for the nuclear power plants reject these allegations. This study aimed at identifying potential problems caused by mining to the population of Caetité. In this work, the concentrations of the ²³⁸U, ²³²Th and ⁴⁰K radioisotopes in the teeth of the Caetité population were determined by ICP-MS. Teeth are used as bioindicators of incorporated radionuclides. Cumulative radiation doses in the skeleton were also determined. The concentration values were below 0.008 ppm, and the annual effective doses due to radioisotopes are below the reference values. Therefore, it is not possible to state that the mining process in Caetité increases pollution or radiation exposure in a meaningful way.

Keywords: bioindicators, radiation dose, radioisotopes incorporation, uranium.

Downloads: 4096
879 A Hybrid Approach for Quantification of Novelty in Rule Discovery

Authors: Vasudha Bhatnagar, Ahmed Sultan Al-Hegami, Naveen Kumar

Abstract:

Rule Discovery is an important technique for mining knowledge from large databases. The use of objective measures for discovering interesting rules leads to another data mining problem, although of reduced complexity. Data mining researchers have studied subjective measures of interestingness to reduce the volume of discovered rules and ultimately improve the overall efficiency of the KDD process. In this paper we study novelty of the discovered rules as a subjective measure of interestingness. We propose a hybrid approach that uses objective and subjective measures to quantify novelty of the discovered rules in terms of their deviations from the known rules. We analyze the types of deviation that can arise between two rules and categorize the discovered rules according to a user-specified threshold. We implement the proposed framework and experiment with some public datasets. The experimental results are quite promising.

Keywords: Knowledge Discovery in Databases (KDD), Data Mining, Rule Discovery, Interestingness, Subjective Measures, Novelty Measure.

Downloads: 1342
878 A New blaVIM Gene in a Pseudomonas putida Isolated from ENT Units in Sulaimani Hospitals

Authors: Dalanya Asaad Mohammed, Dara Abdul Razaq

Abstract:

A total of twenty tonsil biopsies were collected from children undergoing tonsillectomy at the teaching hospital ENT department and the Kurdistan private hospital in Sulaimani city. All biopsies were homogenized and cultured; the obtained bacterial isolates were purified and identified by biochemical tests and the VITEK 2 compact system. Among the twenty studied samples, only one Pseudomonas putida isolate, with a probability of 99%, was obtained. Antimicrobial susceptibility testing was carried out by the disk diffusion method; Pseudomonas putida showed resistance to all antibiotics used except vancomycin. The isolate was further subjected to PCR and DNA sequence analysis of the blaVIM gene using different sets of primers for different regions of the VIM gene. The results were found to be PCR positive for the blaVIM gene. To determine the sequence of the blaVIM gene, DNA sequencing was performed. Sequence alignment of the blaVIM gene with previously recorded blaVIM genes in the NCBI database showed that the P. putida isolate has a different blaVIM gene.

Keywords: Clinical isolates, Putida, Sulaimani, Vim gene.

Downloads 1645
877 Introducing Sequence-Order Constraint into Prediction of Protein Binding Sites with Automatically Extracted Templates

Authors: Yi-Zhong Weng, Chien-Kang Huang, Yu-Feng Huang, Chi-Yuan Yu, Darby Tien-Hao Chang

Abstract:

Searching for a tertiary substructure that geometrically matches the 3D pattern of the binding site of a well-studied protein provides a way to predict protein functions. In our previous work, a web server was built to predict protein-ligand binding sites based on automatically extracted templates. However, a drawback of such templates is that the web server is prone to producing many false positive matches. In this study, we present a sequence-order constraint to reduce the false positive matches that arise when automatically extracted templates are used to predict protein-ligand binding sites. The binding site predictor comprises i) an automatically constructed template library and ii) a local structure alignment algorithm for querying the library. The sequence-order constraint is employed to identify the inconsistency between the local regions of the query protein and the templates. Experimental results reveal that the sequence-order constraint can largely reduce the false positive matches and is effective for template-based binding site prediction.
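
The abstract does not spell out how the sequence-order constraint is checked. One plausible reading, sketched below in Python, is that a template match is kept only if the matched residues occur in the same order along the query sequence as they do in the template. The function name `satisfies_sequence_order` and the pair-based match representation are assumptions made for illustration, not the authors' implementation.

```python
def satisfies_sequence_order(matches):
    """Check an (assumed) sequence-order constraint on a template match.

    matches: iterable of (template_index, query_index) pairs, where each
    index is a residue position along the respective protein sequence.
    Returns True when the query positions increase strictly as the
    template positions increase, i.e. both sides share one residue order.
    """
    ordered = sorted(matches)  # sort pairs by template position
    query_positions = [q for _, q in ordered]
    return all(a < b for a, b in zip(query_positions, query_positions[1:]))


# A match whose residues keep their relative order passes the check.
consistent = [(10, 5), (12, 9), (20, 30)]
# A match that swaps the order of two residues is rejected.
inconsistent = [(10, 30), (12, 9), (20, 40)]
```

Under this reading, a geometric match that satisfies the 3D pattern but scrambles residue order would be discarded as a likely false positive.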

Keywords: Protein structure, binding site, functional prediction

Downloads 1452
876 Elimination of Redundant Links in Web Pages – Mathematical Approach

Authors: G. Poonkuzhali, K. Thiagarajan, K. Sarukesi

Abstract:

With the enormous growth of the web, users easily get lost in its rich hyperlink structure. Developing user-friendly, automated tools that provide relevant information without redundant links is therefore the primary task for website owners. Most existing web mining algorithms have concentrated on finding frequent patterns while neglecting the less frequent ones, which are likely to contain outlying data such as noise and irrelevant or redundant data. This paper proposes a new algorithm for mining web content by detecting redundant links in web documents using set-theoretic (classical mathematics) operations such as subset, union and intersection. The redundant links are then removed from the original web content to obtain the information required by the user.
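
The abstract names the set operations but not the rule that marks a link as redundant. The Python sketch below assumes, purely for illustration, that a link is redundant when it appears on every page of a collection (the intersection of the pages' link sets), and removes it with a set difference. The function name `remove_redundant_links` and the dictionary layout are hypothetical.

```python
def remove_redundant_links(pages):
    """Split links into redundant and page-specific ones.

    pages: dict mapping a page name to an iterable of link URLs.
    Assumption (not from the paper): a link is redundant if it occurs
    on every page. With fewer than two pages nothing is flagged.
    Returns (redundant_links, cleaned_pages).
    """
    link_sets = {name: set(links) for name, links in pages.items()}
    if len(link_sets) < 2:
        return set(), link_sets
    # Intersection: links shared by all pages.
    redundant = set.intersection(*link_sets.values())
    # Difference: keep only the links that are not shared by all pages.
    cleaned = {name: links - redundant for name, links in link_sets.items()}
    return redundant, cleaned


example_pages = {
    "a.html": ["/home", "/contact", "/paper-1"],
    "b.html": ["/home", "/contact", "/paper-2"],
}
redundant, cleaned = remove_redundant_links(example_pages)
```

Here `redundant` holds the two navigation links common to both pages, and each entry of `cleaned` keeps only the link specific to that page.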

Keywords: Web documents, Web content mining, redundant link, outliers, set theory.

Downloads 1997
875 A Tree Based Association Rule Approach for XML Data with Semantic Integration

Authors: D. Sasikala, K. Premalatha

Abstract:

The use of eXtensible Markup Language (XML) in web, business and scientific databases has led to the development of methods, techniques and systems to manage and analyze XML data. Semi-structured documents suffer due to their heterogeneity and dimensionality. XML structure and content mining represent a convergence of research in semi-structured data and text mining. As the information available on the internet grows drastically, extracting knowledge from XML documents becomes a harder task. Indeed, documents are often so large that the data set returned as an answer to a query may be too big to convey the required information. To improve query answering, a Semantic Tree Based Association Rule (STAR) mining method is proposed. This method provides intentional information by considering the structure, content and the semantics of the content. The method is applied to the Reuters dataset and the results show that the proposed method performs well.

Keywords: Semi-structured Document, Tree based Association Rule (TAR), Semantic Association Rule Mining.

Downloads 2332
874 Redesigning Business Processes: A Method Based on Simulation and Process Mining Techniques

Authors: Zahra Mohammadnazari, Fateme Rostambeygi, Fatemeh Dehrouyeh, Hwang Ki-Soon, Amir Aghsami

Abstract:

Corporations have always prioritized efforts to examine and improve processes. Various metrics, such as the cost and time required to implement the process, can be specified in this regard. Process improvement can be defined as an improvement of these indicators. This is accomplished by looking at prospective adjustments to the current executive process model or the resources allotted to it. The research in this paper seeks to improve the procurement process and to explore assessment prospects in the project using a combination of process mining and simulation (benefiting from Play-In and Play-Out methodologies). To run the simulation, we need to complete the control flow diagram, institution settings, resource settings, and activity settings. Mining the event logs yields the process control flow. However, both the entry of institutions and the distribution of resources must be modeled. The rate of admission of institutions and the distribution of time for the implementation of activities will be determined in the next step.

Keywords: Business reengineering, Petri net, process-based simulation, process mining.

Downloads 452
873 Actionable Rules: Issues and New Directions

Authors: Harleen Kaur

Abstract:

Knowledge Discovery in Databases (KDD) is the process of extracting previously unknown, hidden and interesting patterns from a huge amount of data stored in databases. Data mining is a stage of the KDD process that aims at selecting and applying a particular data mining algorithm to extract interesting and useful knowledge. Data mining methods are expected to find patterns in databases that are interesting according to some measures. It is of vital importance to define good measures of interestingness that allow the system to discover only the useful patterns. Measures of interestingness are divided into objective and subjective measures. Objective measures are those that depend only on the structure of a pattern and can be quantified by using statistical methods. Subjective measures, by contrast, depend on the subjectivity and understanding of the user who examines the patterns. These subjective measures are further divided into actionable, unexpected and novel. The key issue facing the data mining community is how to take action on the basis of discovered knowledge. For a pattern to be actionable, the user's subjectivity is captured by providing his/her background knowledge about the domain. Here, we consider the actionability of the discovered knowledge as a measure of interestingness and raise important issues which need to be addressed to discover actionable knowledge.
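
As an illustration of what an objective measure looks like in practice, the Python sketch below computes support and confidence, two standard statistical measures for an association rule "antecedent implies consequent". The abstract does not say which objective measures the author has in mind, so this choice, the function name `support_and_confidence`, and the toy transactions are assumptions.

```python
def support_and_confidence(transactions, antecedent, consequent):
    """Compute support and confidence of the rule antecedent -> consequent.

    transactions: list of item collections (one per transaction).
    support    = fraction of transactions containing both sides.
    confidence = fraction of antecedent-containing transactions that
                 also contain the consequent.
    Both values depend only on the data, not on any user knowledge.
    """
    antecedent, consequent = set(antecedent), set(consequent)
    both = antecedent | consequent
    n = len(transactions)
    n_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    n_both = sum(1 for t in transactions if both <= set(t))
    support = n_both / n if n else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


toy_transactions = [
    {"bread", "milk"},
    {"bread"},
    {"milk"},
    {"bread", "milk", "eggs"},
]
support, confidence = support_and_confidence(
    toy_transactions, {"bread"}, {"milk"}
)
```

For the toy data, two of four transactions contain both items and three contain bread, so the rule has support 0.5 and confidence 2/3. A subjective measure such as actionability cannot be computed this way, because it also needs the user's background knowledge.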

Keywords: Data Mining Community, Knowledge Discovery in Databases (KDD), Interestingness, Subjective Measures, Actionability.

Downloads 1933