Search results for: maximal data sets
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7730

7730 A Comparative Study between Discrete Wavelet Transform and Maximal Overlap Discrete Wavelet Transform for Testing Stationarity

Authors: Amel Abdoullah Ahmed Dghais, Mohd Tahir Ismail

Abstract:

The core objective of this paper is to apply the discrete wavelet transform and the maximal overlap discrete wavelet transform, using the Haar, Daubechies2, Symmlet4, Coiflet2 and discrete approximation of the Meyer wavelet functions, to non-stationary financial time series data from the Dow Jones index (DJIA30) of the US stock market. The data consist of 2048 daily closing index values from December 17, 2004 to October 23, 2012. A unit root test affirms that the data are non-stationary in level. A comparison of the results of transforming the non-stationary data to stationary data using the aforesaid transforms is given, which clearly shows that decomposing the stock market index by the discrete wavelet transform is better than by the maximal overlap discrete wavelet transform for the original data.

Keywords: Discrete wavelet transform, maximal overlap discrete wavelet transform, stationarity, autocorrelation function.
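
As a rough illustration of how the two transforms in this abstract differ, the sketch below computes one level of each with the Haar filter only. It is not the authors' code: the function names are hypothetical, the MODWT uses circular boundary handling as an assumption, and the other wavelets named in the abstract are not covered.

```python
import math


def haar_dwt(x):
    """One level of the decimated Haar DWT (hypothetical helper).

    Returns (approximation, detail), each half the length of x.
    """
    if len(x) % 2:
        raise ValueError("signal length must be even")
    r = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / r for i in range(0, len(x), 2)]
    detail = [(x[i + 1] - x[i]) / r for i in range(0, len(x), 2)]
    return approx, detail


def haar_modwt(x):
    """One level of the Haar MODWT with circular boundary (an assumption).

    No downsampling: both outputs have the same length as x, and the
    filters are the DWT filters rescaled by 1/sqrt(2).
    """
    n = len(x)
    # x[t - 1] with t == 0 reads x[-1], which gives the circular wrap-around.
    scaling = [(x[t] + x[t - 1]) / 2.0 for t in range(n)]
    wavelet = [(x[t] - x[t - 1]) / 2.0 for t in range(n)]
    return scaling, wavelet
```

Both transforms preserve the energy of the input, but the MODWT keeps one coefficient per observation at every level, which is why it is not restricted to dyadic sample sizes.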

Downloads: 4670
7729 Eliciting and Confirming Data, Information, Knowledge and Wisdom in a Specialist Health Care Setting: The WICKED Method

Authors: S. Impey, D. Berry, S. Furtado, M. Galvin, L. Grogan, O. Hardiman, L. Hederman, M. Heverin, V. Wade, L. Douris, D. O'Sullivan, G. Stephens

Abstract:

Healthcare is a knowledge-rich environment. This knowledge, while valuable, is not always accessible outside the borders of individual clinics. This research aims to address part of this problem (at a study site) by constructing a maximal data set (knowledge artefact) for motor neurone disease (MND). This data set is proposed as an initial knowledge base for a concurrent project to develop an MND patient data platform. It represents the domain knowledge at the study site for the duration of the research (12 months). A knowledge elicitation method was also developed from the lessons learned during this process - the WICKED method. WICKED is an anagram of the words: eliciting and confirming data, information, knowledge, wisdom. But it is also a reference to the concept of wicked problems, which are complex and challenging, as is eliciting expert knowledge. The method was evaluated at a second site, and benefits and limitations were noted. Benefits include that the method provided a systematic way to manage data, information, knowledge and wisdom (DIKW) from various sources, including healthcare specialists and existing data sets. Limitations surrounded the time required and how the data set produced only represents DIKW known during the research period. Future work is underway to address these limitations.

Keywords: Healthcare, knowledge acquisition, maximal data sets, action design science.

Downloads: 381
7728 Properties and Approximation Distribution Reductions in Multigranulation Rough Set Model

Authors: Properties, Approximation Distribution Reductions in Multigranulation Rough Set Model

Abstract:

Some properties of approximation sets are studied in the multi-granulation optimistic model in rough set theory using maximal compatible classes. The relationships between or among lower and upper approximations in single and multiple granulation are compared and discussed. Through designing Boolean functions and discernibility matrices in incomplete information systems, the lower and upper approximation sets and reduction in multi-granulation environments can be found. The correctness of the computational approach is confirmed with examples. The related conclusions obtained are suitable for further investigation of multiple-granulation RSM.

Keywords: Incomplete information system, maximal compatible class, multi-granulation rough set model, reduction.

Downloads: 811
7727 Multi-labeled Data Expressed by a Set of Labels

Authors: Tetsuya Furukawa, Masahiro Kuzunishi

Abstract:

Collected data must be organized to be utilized efficiently, and hierarchical classification of data is an efficient approach to organizing data. When data is classified into multiple categories or annotated with a set of labels, users request multi-labeled data by giving a set of labels. There are several interpretations of the data expressed by a set of labels. This paper discusses which data is expressed by a set of labels by introducing orders for sets of labels, and shows that there are four types of orders, which are characterized by whether the labels of the expressed data include every label of the given set of labels within the range of the set. Two desirable properties of the orders are discussed: that data is also expressed by the higher set of labels, and that different sets of labels express different data.

Keywords: Classification Hierarchies, Multi-labeled Data, Multiple Classification, Orders of Sets of Labels

Downloads: 1256
7726 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analyzing DNA microarray data sets is a great challenge for bioinformaticians because of the complexity of applying statistical and machine learning techniques. The challenge is doubled if the microarray data sets contain missing data, which happens regularly, because these techniques cannot deal with missing data. One of the most important data analysis processes on a microarray data set is feature selection. This process finds the most important genes that affect a certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.

Keywords: DNA microarray, feature selection, missing data, bioinformatics.

Downloads: 2725
7725 Solution of Fuzzy Maximal Flow Problems Using Fuzzy Linear Programming

Authors: Amit Kumar, Manjot Kaur

Abstract:

In this paper, a fuzzy linear programming formulation of fuzzy maximal flow problems is proposed, and on the basis of this formulation a method is proposed to find the fuzzy optimal solution of fuzzy maximal flow problems. In the proposed method all the parameters are represented by triangular fuzzy numbers. By using the proposed method the fuzzy optimal solution of fuzzy maximal flow problems can be easily obtained. To illustrate the proposed method a numerical example is solved and the obtained results are discussed.

Keywords: Fuzzy linear programming, Fuzzy maximal flow problem, Ranking function, Triangular fuzzy number

Downloads: 1931
7724 Integral Domains and Their Algebras: Topological Aspects

Authors: Shai Sarussi

Abstract:

Let S be an integral domain with field of fractions F and let A be an F-algebra. An S-subalgebra R of A is called S-nice if R∩F = S and the localization of R with respect to S \{0} is A. Denoting by W the set of all S-nice subalgebras of A, and defining a notion of open sets on W, one can view W as a T0-Alexandroff space. Thus, the algebraic structure of W can be viewed from the point of view of topology. It is shown that every nonempty open subset of W has a maximal element in it, which is also a maximal element of W. Moreover, a supremum of an irreducible subset of W always exists. As a notable connection with valuation theory, one considers the case in which S is a valuation domain and A is an algebraic field extension of F; if S is indecomposed in A, then W is an irreducible topological space, and W contains a greatest element.

Keywords: Algebras over integral domains, Alexandroff topology, valuation domains, integral domains.

Downloads: 446
7723 Meta Random Forests

Authors: Praveen Boinee, Alessandro De Angelis, Gian Luca Foresti

Abstract:

Leo Breiman's Random Forests (RF) is a recent development in tree-based classifiers and has quickly proven to be one of the most important algorithms in the machine learning literature. It has shown robust and improved classification results on standard data sets. Ensemble learning algorithms such as AdaBoost and Bagging have been under active research and have shown improvements in classification results for several benchmark data sets, mainly with decision trees as their base classifiers. In this paper we apply these meta-learning techniques to random forests. We examine how ensembles of random forests perform on the standard data sets available among the UCI data sets. We compare the original random forest algorithm with its ensemble counterparts and discuss the results.

Keywords: Random Forests [RF], ensembles, UCI.

Downloads: 2644
7722 REDUCER – An Architectural Design Pattern for Reducing Large and Noisy Data Sets

Authors: Apkar Salatian

Abstract:

To relieve the burden of reasoning on a point to point basis, in many domains there is a need to reduce large and noisy data sets into trends for qualitative reasoning. In this paper we propose and describe a new architectural design pattern called REDUCER for reducing large and noisy data sets that can be tailored for particular situations. REDUCER consists of 2 consecutive processes: Filter which takes the original data and removes outliers, inconsistencies or noise; and Compression which takes the filtered data and derives trends in the data. In this seminal article we also show how REDUCER has successfully been applied to 3 different case studies.

Keywords: Design Pattern, filtering, compression.
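
The two-stage structure described in this abstract (Filter, then Compression) can be sketched as a small pipeline. The abstract does not say which algorithms REDUCER uses in either stage, so everything below is an assumption chosen for illustration: a sliding median for the filter, and a run-merging step that labels intervals as increasing, decreasing or steady for the compression. All names and parameters are hypothetical.

```python
from statistics import median


def filter_stage(values, window=3):
    """Assumed Filter step: a sliding median to suppress isolated outliers."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(median(values[lo:hi]))
    return out


def compression_stage(values, tolerance=0.5):
    """Assumed Compression step: merge consecutive points into labelled trends.

    Returns a list of (label, start_index, end_index) tuples.
    """
    trends = []
    for i in range(1, len(values)):
        delta = values[i] - values[i - 1]
        if abs(delta) <= tolerance:
            label = "steady"
        else:
            label = "increasing" if delta > 0 else "decreasing"
        if trends and trends[-1][0] == label:
            # Extend the current trend instead of starting a new one.
            trends[-1] = (label, trends[-1][1], i)
        else:
            trends.append((label, i - 1, i))
    return trends


def reducer(values, window=3, tolerance=0.5):
    """Run the two stages consecutively, as the abstract describes."""
    return compression_stage(filter_stage(values, window), tolerance)
```

Keeping the stages as separate functions mirrors the pattern's claim that each can be tailored to a particular situation without touching the other.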

Downloads: 1448
7721 Heterogeneous Attribute Reduction in Noisy System based on a Generalized Neighborhood Rough Sets Model

Authors: Siyuan Jing, Kun She

Abstract:

Neighborhood Rough Sets (NRS) has been proven to be an efficient tool for heterogeneous attribute reduction. However, most research has focused on complete and noiseless data. In practice, most information systems are noisy, that is, filled with incomplete and inconsistent data. In this paper, we introduce a generalized neighborhood rough sets model, called VPTNRS, to deal with the problem of heterogeneous attribute reduction in noisy systems. We generalize the classical NRS model with a tolerance neighborhood relation and probabilistic theory. Furthermore, we use the neighborhood dependency to evaluate the significance of a subset of heterogeneous attributes and construct a forward greedy algorithm for attribute reduction based on it. Experimental results show that the model deals efficiently with noisy data.

Keywords: attribute reduction, incomplete data, inconsistent data, tolerance neighborhood relation, rough sets

Downloads: 1544
7720 Structure of Covering-based Rough Sets

Authors: Shiping Wang, Peiyong Zhu, William Zhu

Abstract:

Rough set theory is a very effective tool to deal with granularity and vagueness in information systems. Covering-based rough set theory is an extension of classical rough set theory. In this paper, we first present the characteristics of the reducible element and the minimal description in covering-based rough sets through down-sets. Then we establish lattices and topological spaces in covering-based rough sets through down-sets and up-sets. In this way, one can investigate covering-based rough sets from algebraic and topological points of view.

Keywords: Covering, poset, down-set, lattice, topological space, topological base.

Downloads: 1804
7719 Minimizing Mutant Sets by Equivalence and Subsumption

Authors: Samia Alblwi, Amani Ayad

Abstract:

Mutation testing is the art of generating syntactic variations of a base program and checking whether a candidate test suite can identify all the mutants that are not semantically equivalent to the base; this technique can be used to assess the quality of a test suite. One of the main obstacles to the widespread use of mutation testing is cost, as even small programs (a few dozen lines of code) can give rise to a large number of mutants (up to hundreds); this has created an incentive to seek to reduce the number of mutants while preserving their collective effectiveness. Two criteria have been used to reduce the size of mutant sets: equivalence, which aims to partition the set of mutants into equivalence classes modulo semantic equivalence and to select one representative per class; and subsumption, which aims to define a partial ordering among mutants that ranks mutants by effectiveness and seeks to select maximal elements in this ordering. In this paper, we analyze these two policies using analytical and empirical criteria.

Keywords: Mutation testing, mutant sets, mutant equivalence, mutant subsumption, mutant set minimization.
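
The two reduction criteria named in this abstract can be illustrated with a small sketch. It relies on a test-based approximation that is an assumption here, not necessarily the paper's definition: a mutant is summarized by the set of tests that kill it, mutants with identical kill sets are treated as equivalent, and a mutant subsumes another when its non-empty kill set is contained in the other's. The function and variable names are hypothetical.

```python
def minimal_mutants(kill_sets):
    """Reduce a mutant set by equivalence, then by subsumption.

    kill_sets maps a mutant name to the set of test names that kill it.
    Returns the sorted names of the retained representatives.
    """
    # Mutants killed by no test are set aside: under this approximation they
    # cannot be told apart from the base program.
    killed = {m: frozenset(t) for m, t in kill_sets.items() if t}

    # Equivalence: one representative (first in name order) per kill set.
    representatives = {}
    for name in sorted(killed):
        representatives.setdefault(killed[name], name)

    # Subsumption: keep only the maximal elements, i.e. kill sets that have
    # no strict subset among the other representatives.
    keep = [
        name
        for tests, name in representatives.items()
        if not any(other < tests for other in representatives)
    ]
    return sorted(keep)
```

Any test suite that kills every retained mutant also kills every mutant that was dropped by subsumption, which is the sense in which the reduced set preserves collective effectiveness under this approximation.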

Downloads: 99
7718 A Genetic Algorithm for Clustering on Image Data

Authors: Qin Ding, Jim Gasvoda

Abstract:

Clustering is the process of subdividing an input data set into a desired number of subgroups so that members of the same subgroup are similar and members of different subgroups have diverse properties. Many heuristic algorithms have been applied to the clustering problem, which is known to be NP-hard. Genetic algorithms have been used in a wide variety of fields to perform clustering; however, the technique normally has a long running time in terms of input set size. This paper proposes an efficient genetic algorithm for clustering on very large data sets, especially on image data sets. The genetic algorithm uses the most time-efficient techniques along with preprocessing of the input data set. We test our algorithm on both artificial and real image data sets, both of which are of large size. The experimental results show that our algorithm outperforms the k-means algorithm in terms of running time as well as the quality of the clustering.

Keywords: Clustering, data mining, genetic algorithm, image data.

Downloads: 1997
7717 Regular Generalized Star Star closed sets in Bitopological Spaces

Authors: K. Kannan, D. Narasimhan, K. Chandrasekhara Rao, R. Ravikumar

Abstract:

The aim of this paper is to introduce the concepts of τ1τ2-regular generalized star star closed sets and τ1τ2-regular generalized star star open sets, and to study their basic properties in bitopological spaces.

Keywords: τ1τ2-regular closed sets, τ1τ2-regular open sets, τ1τ2-regular generalized closed sets, τ1τ2-regular generalized star closed sets, τ1τ2-regular generalized star star closed sets.

Downloads: 2164
7716 Covering-based Rough sets Based on the Refinement of Covering-element

Authors: Jianguo Tang, Kun She, William Zhu

Abstract:

Covering-based rough sets is an extension of rough sets, based on a covering instead of a partition of the universe. Therefore it is more powerful than rough sets in describing some practical problems. However, by extending rough sets, covering-based rough sets can increase the roughness of each model in recognizing objects. How to obtain better approximations from the models of covering-based rough sets is an important issue. In this paper, two concepts, determinate elements and indeterminate elements in a universe, are proposed and given precise definitions. This research makes a reasonable refinement of the covering-element from a new viewpoint, and the refinement may generate better approximations of covering-based rough sets models. To test the theory, it is applied to eight major covering-based rough sets models adapted from other literature. The result is that, in all these models, the lower approximation increases effectively. Correspondingly, in all models, the upper approximation decreases, with the exception of two models in some special situations. Therefore, the roughness of recognizing objects is reduced. This research provides a new approach to the study and application of covering-based rough sets.

Keywords: Determinate element, indeterminate element, refinement of covering-element, refinement of covering, covering-based rough sets.

Downloads: 1276
7715 On Fuzzy Weakly-Closed Sets

Authors: J. Mahanta, P.K. Das

Abstract:

A new class of fuzzy closed sets, namely fuzzy weakly closed sets in a fuzzy topological space, is introduced, and it is established that this class lies between fuzzy closed sets and fuzzy generalized closed sets. Along with the study of fundamental results on such closed sets, we define and characterize fuzzy weakly compact spaces and fuzzy weakly closed spaces.

Keywords: Fuzzy weakly-closed set, fuzzy weakly-closed space, fuzzy weakly-compactness, MSC: 54A40, 54D30.

Downloads: 1723
7714 (T1, T2)*- Semi Star Generalized Locally Closed Sets

Authors: M. Sundararaman, K. Chandrasekhara Rao

Abstract:

The aim of this paper is to continue the study of (T1, T2)-semi star generalized closed sets by introducing the concepts of (T1, T2)-semi star generalized locally closed sets and studying their basic properties in bitopological spaces.

Keywords: (T1, T2)*-semi star generalized locally closed sets, T1T2-semi star generalized closed sets.

Downloads: 1425
7713 A New Condition for Conflicting Bifuzzy Sets Based On Intuitionistic Evaluation

Authors: Imran C.T., Syibrah M.N., Mohd Lazim A.

Abstract:

Fuzzy set theory affirms that the linguistic values of every pair of contrary relations are complementary. Intuitionistic fuzzy sets (IFS) stress that the condition for contrary relations, expressed through their fuzzy values, cannot be greater than one. However, complementarity between two contradictory phenomena does not always hold. This paper proposes a new condition for conflicting bifuzzy sets by relaxing the condition of intuitionistic fuzzy sets. We put forward examples using triangular fuzzy numbers in formulating a new condition for conflicting bifuzzy sets (CBFS). Positive and negative evaluations of conflicting phenomena are calculated concurrently by relaxing the condition in IFS. A hypothetical illustration shows the applicability of the new condition in CBFS for solving non-complementary contrary intuitionistic evaluations. This approach can be applied to any decision making in which conflict is present.

Keywords: Conflicting bifuzzy set, conflicting degree, fuzzy sets, fuzzy numbers.

Downloads: 1635
7712 Comparison of Imputation Techniques for Efficient Prediction of Software Fault Proneness in Classes

Authors: Geeta Sikka, Arvinder Kaur Takkar, Moin Uddin

Abstract:

Missing data is a persistent problem in almost all areas of empirical research. Missing data must be treated very carefully, as data plays a fundamental role in every analysis. Improper treatment can distort the analysis or generate biased results. In this paper, we compare and contrast various imputation techniques on missing data sets and make an empirical evaluation of these methods so as to construct quality software models. Our empirical study is based on two public NASA data sets, KC4 and KC1. The actual data sets of 125 cases and 2107 cases respectively, without any missing values, were considered. The data sets were used to create Missing at Random (MAR) data. Listwise Deletion (LD), Mean Substitution (MS), Interpolation, Regression with an error term and Expectation-Maximization (EM) approaches were used to compare the effects of the various techniques.

Keywords: Missing data, Imputation, Missing Data Techniques.
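
Two of the techniques compared in this abstract, Listwise Deletion and Mean Substitution, are simple enough to sketch directly. The snippet below is only an illustration under stated assumptions: missing values are encoded as `None`, rows are plain lists of numbers, and the function names are hypothetical. It does not reproduce the study's setup or the KC1/KC4 data.

```python
def listwise_deletion(rows):
    """Drop every row that contains at least one missing value (None)."""
    return [row for row in rows if all(v is not None for v in row)]


def mean_substitution(rows):
    """Replace each missing value with the mean of its column's observed values."""
    n_cols = len(rows[0])
    means = []
    for c in range(n_cols):
        observed = [row[c] for row in rows if row[c] is not None]
        if not observed:
            raise ValueError(f"column {c} has no observed values")
        means.append(sum(observed) / len(observed))
    return [
        [means[c] if v is None else v for c, v in enumerate(row)]
        for row in rows
    ]
```

Listwise deletion shrinks the sample, while mean substitution keeps every case but reduces the variance of the imputed columns; that trade-off is one reason such techniques are compared empirically.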

Downloads: 1624
7711 Generalized Maximal Ratio Combining as a Supra-optimal Receiver Diversity Scheme

Authors: Jean-Pierre Dubois, Rania Minkara, Rafic Ayoubi

Abstract:

Maximal Ratio Combining (MRC) is considered the most complex combining technique as it requires channel coefficient estimation. It results in the lowest bit error rate (BER) compared to all other combining techniques. However, the BER starts to deteriorate as errors are introduced in the channel coefficient estimation. A novel combining technique, termed Generalized Maximal Ratio Combining (GMRC) with a polynomial kernel, yields the same BER as MRC with perfect channel estimation and a lower BER in the presence of channel estimation errors. We show that GMRC outperforms the optimal MRC scheme in general and we hereinafter introduce it to the scientific community as a new "supra-optimal" algorithm. Since diversity combining is especially effective in small femto- and pico-cells, internet-associated wireless peripheral systems are to benefit most from GMRC. As a result, many spinoff applications can be made to IP-based 4th generation networks.

Keywords: Bit error rate, femto-internet cells, generalized maximal ratio combining, signal-to-scattering noise ratio.
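
For orientation, the baseline that this abstract starts from, conventional MRC, can be sketched in a few lines: each branch is weighted by the complex conjugate of its channel coefficient and the sum is normalized. The sketch assumes a flat-fading model y_i = h_i * s + n_i with known coefficients h_i, and the function name is hypothetical. It does not implement GMRC, whose polynomial kernel is not specified in the abstract.

```python
def mrc_combine(received, channel):
    """Conventional maximal ratio combining of diversity branches.

    received: complex samples y_i, one per branch
    channel:  complex channel coefficients h_i (assumed known or estimated)
    Returns the combined estimate of the transmitted symbol.
    """
    gain = sum(abs(h) ** 2 for h in channel)
    if gain == 0:
        raise ValueError("all channel coefficients are zero")
    # Weight each branch by conj(h_i), then normalize by the total channel gain.
    combined = sum(h.conjugate() * y for h, y in zip(channel, received))
    return combined / gain
```

With perfect channel knowledge and no noise the estimate equals the transmitted symbol. Errors in the `channel` estimates feed directly into the weights, which is the sensitivity the abstract refers to.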

Downloads: 2109
7710 Fuzzy Multiple Criteria Decision Making for Unmanned Combat Aircraft Selection Using Proximity Measure Method

Authors: C. Ardil

Abstract:

Intuitionistic fuzzy sets (IFS), Pythagorean fuzzy sets (PyFS), Picture fuzzy sets (PFS), q-rung orthopair fuzzy sets (q-ROF), Spherical fuzzy sets (SFS), T-spherical FS, and Neutrosophic sets (NS) are reviewed as multidimensional extensions of fuzzy sets in order to more explicitly and informatively describe the opinions of decision-making experts under uncertainty. To handle operations with standard fuzzy sets (SFS), the necessary operators are defined: the weighted arithmetic mean (WAM), the weighted geometric mean (WGM), and the Minkowski distance function. The algorithm of the proposed proximity measure method (PMM) is provided with a multiple criteria group decision making method (MCDM) for use in a standard fuzzy set environment. To demonstrate the feasibility of the proposed method, the problem of selecting the best drone for an Air Force procurement request is used. The proximity measure method (PMM) based on multidimensional standard fuzzy sets (SFS) is introduced to demonstrate its use with an issue involving unmanned combat aircraft selection.

Keywords: standard fuzzy sets (SFS), unmanned combat aircraft selection, multiple criteria decision making (MCDM), proximity measure method (PMM).

Downloads: 279
7709 Effect of Submaximal Eccentric versus Maximal Isometric Contraction on Delayed Onset Muscle Soreness

Authors: Mohamed M. Ragab, Neveen A. Abdel Raoof, Reham H. Diab

Abstract:

Background: Delayed onset muscle soreness (DOMS) is the most common symptom when ordinary individuals and athletes are exposed to unaccustomed physical activity, especially eccentric contraction, which impairs athletic performance, ordinary people's work ability and physical functioning. A multitude of methods have been investigated to reduce DOMS. One of the valuable methods to control DOMS is the repeated bout effect (RBE) as a prophylactic method. Purpose: To compare the repeated bout effect of submaximal eccentric with maximal isometric contraction on induced DOMS. Methods: Sixty normal male volunteers were assigned randomly into three equal groups: Group A (first study group): 20 subjects received submaximal eccentric contraction on non-dominant elbow flexors as a prophylactic exercise. Group B (second study group): 20 subjects received maximal isometric contraction on non-dominant elbow flexors as a prophylactic exercise. Group C (control group): 20 subjects did not receive any prophylactic exercises. Maximal isometric peak torque of elbow flexors and the patient related elbow evaluation (PREE) scale were measured for each subject 3 times: before, immediately after, and 48 hours after induction of DOMS. Results: Post-hoc tests for maximal isometric peak torque and PREE scale immediately and 48 hours after induction of DOMS revealed that group (A) and group (B) showed a significant decrease in maximal isometric strength loss and elbow pain and disability compared with control group (C), but submaximal eccentric group (A) was more effective than maximal isometric group (B), as it showed more rapid recovery of functional strength and lower degrees of elbow pain and disability. Conclusion: Both submaximal eccentric contraction and maximal isometric contraction were effective in prevention of DOMS, but submaximal eccentric contraction produced a greater protective effect against muscle damage induced by maximal eccentric exercise performed 2 days later.

Keywords: Delayed onset muscle soreness, maximal isometric peak torque, patient related elbow evaluation scale, repeated bout effect.

Downloads: 2041
7708 Investigations on Some Operations of Soft Sets

Authors: Xun Ge, Songlin Yang

Abstract:

Soft set theory was initiated by Molodtsov in 1999. In the past years, this theory has been applied to many branches of mathematics, information science and computer science. In 2003, Maji et al. introduced some operations of soft sets and gave some operational rules. Recently, some of these operational rules have been pointed out to be untrue. Furthermore, Ali et al., in their paper, introduced and discussed some new operations of soft sets. In this paper, we further investigate the operational rules given by Maji et al. and Ali et al. We obtain some necessary and sufficient conditions under which the corresponding operational rules hold and give correct forms for some operational rules. These results will help in using the operational rules of soft sets correctly in research on and application of soft set theory.

Keywords: Soft sets, union, intersection, complement.

Downloads: 1649
7707 Comparative Study of Decision Trees and Rough Sets Theory as Knowledge Extraction Tools for Design and Control of Industrial Processes

Authors: Marcin Perzyk, Artur Soroczynski

Abstract:

General requirements for knowledge representation in the form of logic rules, applicable to design and control of industrial processes, are formulated. Characteristic behavior of decision trees (DTs) and rough sets theory (RST) in rules extraction from recorded data is discussed and illustrated with simple examples. The significance of the models' drawbacks was evaluated using simulated and industrial data sets. It is concluded that performance of DTs may be considerably poorer in several important aspects, compared to RST, particularly when not only a characterization of a problem is required, but also detailed and precise rules are needed, according to actual, specific problems to be solved.

Keywords: Knowledge extraction, decision trees, rough sets theory, industrial processes.

Downloads: 1588
7706 Proposing an Efficient Method for Frequent Pattern Mining

Authors: Vaibhav Kant Singh, Vijay Shah, Yogendra Kumar Jain, Anupam Shukla, A.S. Thoke, Vinay Kumar Singh, Chhaya Dule, Vivek Parganiha

Abstract:

Data mining is the exploration of knowledge from the large sets of data generated as a result of various data processing activities. Frequent Pattern Mining is a very important task in data mining. Previous approaches to generating frequent sets generally adopt candidate generation and pruning techniques to meet the desired objective. This paper shows how the different approaches achieve the objective of frequent mining, along with the complexities required to perform the job. This paper also looks at a hardware approach based on cache coherence to improve the efficiency of the above process. The process of data mining is helpful in the generation of support systems that can help in Management, Bioinformatics, Biotechnology, Medical Science, Statistics, Mathematics, Banking, Networking and other computer-related applications. This paper proposes the use of both the upward and downward closure properties for the extraction of frequent item sets, which reduces the total number of scans required for the generation of Candidate Sets.

Keywords: Data Mining, Candidate Sets, Frequent Item set, Pruning.
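
The candidate generation and pruning that this abstract refers to can be sketched with a minimal Apriori-style routine. This shows only the standard downward closure property (every subset of a frequent item set is itself frequent). It is not the method proposed in the paper, which also uses upward closure and is not described in enough detail here to reproduce. The function name and the absolute `min_support` count are assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(items): support_count} for all frequent item sets.

    min_support is an absolute count of transactions (an assumption).
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})
    current = [frozenset([i]) for i in items]  # candidate 1-item sets
    frequent = {}
    k = 1
    while current:
        # One scan of the data per level to count the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)

        # Candidate generation: join frequent k-item sets into (k+1)-item sets.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}

        # Downward closure pruning: drop any candidate that has an
        # infrequent (k-1)-subset, without scanning the data for it.
        current = [
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent
```

Pruning happens before counting, so each discarded candidate saves support counting over the whole transaction list.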

Downloads: 1638
7705 Summarizing Data Sets for Data Mining by Using Statistical Methods in Coastal Engineering

Authors: Yunus Doğan, Ahmet Durap

Abstract:

Coastal regions are among the most heavily used places, both for the natural balance and for the growing population. In coastal engineering, the most valuable data is wave behavior. The amount of this data becomes very large because observations take place over periods of hours, days and months. In this study, statistical methods such as wave spectrum analysis methods and standard statistical methods have been used. The goal of this study is to discover profiles of different coastal areas by using these statistical methods and thus to obtain an instance-based data set from the big data, to be analyzed by data mining algorithms. In the experimental studies, six sample data sets about wave behaviors were obtained from 20 minutes of observations in Mersin Bay, Turkey, and converted to an instance-based form, and different clustering techniques in data mining algorithms were used to discover similar coastal places. Moreover, this study argues that this summarization approach can be used in other fields that collect big data, such as medicine.

Keywords: Clustering algorithms, coastal engineering, data mining, data summarization, statistical methods.

Downloads: 1187
7704 Frequent Itemset Mining Using Rough-Sets

Authors: Usman Qamar, Younus Javed

Abstract:

Frequent pattern mining is the process of finding a pattern (a set of items, subsequences, substructures, etc.) that occurs frequently in a data set. It was proposed in the context of frequent itemsets and association rule mining. Frequent pattern mining is used to find inherent regularities in data, for example, which products were often purchased together. Its applications include basket data analysis, cross-marketing, catalog design, sale campaign analysis, Web log (click stream) analysis, and DNA sequence analysis. However, one of the bottlenecks of frequent itemset mining is that as the data increase, the amount of time and resources required to mine the data increases at an exponential rate. In this investigation a new algorithm is proposed which can be used as a pre-processor for frequent itemset mining. FASTER (FeAture SelecTion using Entropy and Rough sets) is a hybrid pre-processor algorithm which utilizes entropy and rough sets to carry out record reduction and feature (attribute) selection respectively. FASTER for frequent itemset mining can produce a speed-up of 3.1 times when compared to the original algorithm while maintaining an accuracy of 71%.

Keywords: Rough-sets, Classification, Feature Selection, Entropy, Outliers, Frequent itemset mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2390
7703 Effects of Sprint Training on Athletic Performance Related Physiological, Cardiovascular, and Neuromuscular Parameters

Authors: Asim Cengiz, Dede Basturk, Hakan Ozalp

Abstract:

Practicing recurring resistance workouts may cause changes in human muscle. These changes may be due to a combination of several factors that determine physical fitness. Thus, it is important to identify these changes. Several studies were reviewed to investigate them. The changes reported included increased citrate synthase (CS) maximal activity, increased capacity for pyruvate oxidation, improved molecular signaling related to human performance, increased resting muscle glycogen and total GLUT4 protein content, and better health outcomes such as enhanced cardiorespiratory fitness. Sprint training also produces numerous long-term changes in the human body, such as better enzyme action and changes in muscle fiber and oxidative ability. This is important because stroke volume (SV) is the critical factor influencing maximal cardiac output and therefore oxygen delivery and maximal aerobic power.

Keywords: Sprint, training, performance, exercise.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 860
7702 A New Objective Weight on Interval Type-2 Fuzzy Sets

Authors: Nurnadiah Z., Lazim A.

Abstract:

The design of weights is one of the important parts of fuzzy decision making, as it has a deep effect on the evaluation results. Entropy is one of the weight measures based on objective evaluation. Non-probabilistic-type entropy measures for fuzzy sets and interval type-2 fuzzy sets (IT2FS) have been developed and applied to weight measurement. Since entropy for IT2FS in decision making has yet to be explored, this paper proposes a new objective weight method that uses the entropy weight method for multiple attribute decision making (MADM). This paper utilizes the nature of the IT2FS concept in the evaluation process to assess the attribute weights based on the credibility of data. An example is presented to demonstrate the feasibility of the new method in decision making. The entropy measure of interval type-2 fuzzy sets yields flexible judgment and could be applied in a decision making environment.
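
As background, the classical entropy weight method on a crisp (non-fuzzy) decision matrix can be sketched as follows. This is the standard Shannon-entropy version, shown only to illustrate the idea of an objective weight; it is not the IT2FS extension the paper proposes, and the function name and matrix values are hypothetical.

```python
import math

def entropy_weights(matrix):
    """Classical entropy weights for a crisp decision matrix.

    matrix[i][j] is the positive score of alternative i on attribute j.
    Assumes at least two alternatives and at least one attribute whose
    scores are not all identical (otherwise the weights are undefined).
    """
    m = len(matrix)
    n = len(matrix[0])
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        # Share of each alternative within attribute j.
        proportions = [x / total for x in column]
        # Shannon entropy normalized to [0, 1] by dividing by ln(m).
        entropy = -sum(p * math.log(p) for p in proportions if p > 0) / math.log(m)
        # Low entropy means the attribute discriminates well between alternatives.
        divergences.append(1.0 - entropy)
    total_divergence = sum(divergences)
    return [d / total_divergence for d in divergences]

# Hypothetical scores: three alternatives rated on two attributes.
scores = [
    [7.0, 1.0],
    [8.0, 5.0],
    [9.0, 9.0],
]
weights = entropy_weights(scores)
```

In this example the second attribute varies more across alternatives, so it receives the larger weight; an attribute on which all alternatives score the same would receive a weight of zero.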

Keywords: Objective weight, entropy weight, multiple attribute decision making, type-2 fuzzy sets, interval type-2 fuzzy sets.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1616
7701 Lithofacies Classification from Well Log Data Using Neural Networks, Interval Neutrosophic Sets and Quantification of Uncertainty

Authors: Pawalai Kraipeerapun, Chun Che Fung, Kok Wai Wong

Abstract:

This paper proposes a novel approach to lithofacies classification based on an assessment of the uncertainty in the classification results. In the proposed approach, multiple neural networks (NN) and interval neutrosophic sets (INS) are used to classify the input well log data into multiple classes of lithofacies. A pair of n-class neural networks is used to predict n degrees of truth membership and n degrees of false membership. Indeterminacy memberships, or uncertainties in the predictions, are estimated using a multidimensional interpolation method. These three memberships form the INS used to support the confidence in the results of multiclass classification. Based on the experimental data, our approach improves classification performance compared with an existing technique applied only to the truth membership. In addition, our approach can provide a measure of uncertainty in the problem of multiclass classification.
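
One way to picture how truth and false memberships could be combined is sketched below. The averaging rule and the disagreement measure are assumptions chosen for illustration; in particular, the abstract estimates indeterminacy with a multidimensional interpolation method, which is not reproduced here, and the disagreement value is only a simple stand-in for it.

```python
def classify_with_uncertainty(truth, falsity):
    """Pick a class from per-class truth and false memberships in [0, 1].

    Returns (class index, combined score, disagreement of the two outputs).
    """
    # High truth membership and low false membership both support a class.
    scores = [(t + (1.0 - f)) / 2.0 for t, f in zip(truth, falsity)]
    best = max(range(len(scores)), key=scores.__getitem__)
    # Stand-in uncertainty (an assumption, not the paper's interpolation):
    # two perfectly consistent outputs would satisfy t + f == 1.
    disagreement = abs(1.0 - (truth[best] + falsity[best]))
    return best, scores[best], disagreement

# Hypothetical outputs of the truth network and the falsity network
# for one well log sample and three lithofacies classes.
truth_out = [0.9, 0.2, 0.1]
false_out = [0.2, 0.7, 0.8]
label, score, uncertainty = classify_with_uncertainty(truth_out, false_out)
```

Using both outputs, rather than the truth membership alone, lets the false membership veto or reinforce a prediction, and the accompanying uncertainty value flags predictions where the two networks disagree.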

Keywords: Multiclass classification, feed-forward backpropagation neural network, interval neutrosophic sets, uncertainty.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1594