Search results for: Content mining
2000 Applying Fuzzy FP-Growth to Mine Fuzzy Association Rules
Authors: Chien-Hua Wang, Wei-Hsuan Lee, Chin-Tzong Pang
Abstract:
In data mining, the association rules are used to find for the associations between the different items of the transactions database. As the data collected and stored, rules of value can be found through association rules, which can be applied to help managers execute marketing strategies and establish sound market frameworks. This paper aims to use Fuzzy Frequent Pattern growth (FFP-growth) to derive from fuzzy association rules. At first, we apply fuzzy partition methods and decide a membership function of quantitative value for each transaction item. Next, we implement FFP-growth to deal with the process of data mining. In addition, in order to understand the impact of Apriori algorithm and FFP-growth algorithm on the execution time and the number of generated association rules, the experiment will be performed by using different sizes of databases and thresholds. Lastly, the experiment results show FFPgrowth algorithm is more efficient than other existing methods.Keywords: Data mining, association rule, fuzzy frequent patterngrowth.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18001999 Applications of Genetic Programming in Data Mining
Authors: Saleh Mesbah Elkaffas, Ahmed A. Toony
Abstract:
This paper details the application of a genetic programming framework for induction of useful classification rules from a database of income statements, balance sheets, and cash flow statements for North American public companies. Potentially interesting classification rules are discovered. Anomalies in the discovery process merit further investigation of the application of genetic programming to the dataset for the problem domain.Keywords: Genetic programming, data mining classification rule.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15451998 Effect of Moisture Content and Loading Rate on Mechanical Strength of Brown Rice Varieties
Authors: I. Bagheri, M.B. Dehpour
Abstract:
The effect of moisture content and loading rate on mechanical strength of 12 brown rice grain varieties was determined. The results showed that the rupture force of brown rice grain decreased by increasing the moisture content and loading rate. The highest rupture force values was obtained at the moisture content of 8% (w.b.) and loading rate of 10 mm/min; while the lowest rupture force corresponded to the moisture content of 14% (w.b.) and loading rate of 15 mm/min. The 12 varieties were divided into three groups, namely local short grain varieties, local long grain varieties and improved long grain varieties. It was observed that the rupture strength of the three groups were statistically different from each other (P<0.01). It was revealed that the brown rice rupture at lower levels of moisture content was in the form of sudden failure with less deformation; while at higher levels of moisture content the grain rupture was in the form of gradually crushing with more deformation.Keywords: Brown rice, loading rate, moisture content, ruptureforce
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14911997 Novelty as a Measure of Interestingness in Knowledge Discovery
Authors: Vasudha Bhatnagar, Ahmed Sultan Al-Hegami, Naveen Kumar
Abstract:
Rule Discovery is an important technique for mining knowledge from large databases. Use of objective measures for discovering interesting rules leads to another data mining problem, although of reduced complexity. Data mining researchers have studied subjective measures of interestingness to reduce the volume of discovered rules to ultimately improve the overall efficiency of KDD process. In this paper we study novelty of the discovered rules as a subjective measure of interestingness. We propose a hybrid approach based on both objective and subjective measures to quantify novelty of the discovered rules in terms of their deviations from the known rules (knowledge). We analyze the types of deviation that can arise between two rules and categorize the discovered rules according to the user specified threshold. We implement the proposed framework and experiment with some public datasets. The experimental results are promising.Keywords: Knowledge Discovery in Databases (KDD), Interestingness, Subjective Measures, Novelty Index.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18071996 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning
Authors: Walid Cherif
Abstract:
Data mining has, over recent years, seen big advances because of the spread of internet, which generates everyday a tremendous volume of data, and also the immense advances in technologies which facilitate the analysis of these data. In particular, classification techniques are a subdomain of Data Mining which determines in which group each data instance is related within a given dataset. It is used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or machine learning. Each type of these techniques has its own limits. Nowadays, current data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach which is different from actual algorithms by its reliability computations. Results of the proposed approach exceeded most common classification techniques with an f-measure exceeding 97% on the IRIS Dataset.
Keywords: Data mining, knowledge discovery, machine learning, similarity measurement, supervised classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15271995 Proximate and Mineral Composition of Chicken Giblets from Vojvodina (Northern Serbia)
Authors: M. R. Jokanović, V. M. Tomović, M. T. Jović, S. B. Škaljac, B. V. Šojić, P. M. Ikonić, T. A. Tasić
Abstract:
Proximate (moisture, protein, total fat, total ash) and mineral (K, P, Na, Mg, Ca, Zn, Fe, Cu and Mn) composition of chicken giblets (heart, liver and gizzard) were investigated. Phosphorous content, as well as proximate composition, were determined according to recommended ISO methods. The content of all elements, except phosphorus, of the giblets tissues were determined using inductively coupled plasma-optical emission spectrometry (ICP-OES), after dry ashing mineralization. Regarding proximate composition heart was the highest in total fat content, and the lowest in protein content. Liver was the highest in protein and total ash content, while gizzard was the highest in moisture and the lowest in total fat content. Regarding mineral composition liver was the highest for K, P, Ca, Mg, Fe, Zn, Cu, and Mn, while heart was the highest for Na content. The contents of almost all investigated minerals in analysed giblets tissues of chickens from Vojvodina were similar to values reported in the literature, i.e. in national food composition databases of other countries.
Keywords: Chicken giblets, proximate composition, mineral composition.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 27041994 A Cumulative Learning Approach to Data Mining Employing Censored Production Rules (CPRs)
Authors: Rekha Kandwal, Kamal K.Bharadwaj
Abstract:
Knowledge is indispensable but voluminous knowledge becomes a bottleneck for efficient processing. A great challenge for data mining activity is the generation of large number of potential rules as a result of mining process. In fact sometimes result size is comparable to the original data. Traditional data mining pruning activities such as support do not sufficiently reduce the huge rule space. Moreover, many practical applications are characterized by continual change of data and knowledge, thereby making knowledge voluminous with each change. The most predominant representation of the discovered knowledge is the standard Production Rules (PRs) in the form If P Then D. Michalski & Winston proposed Censored Production Rules (CPRs), as an extension of production rules, that exhibit variable precision and supports an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form: If P Then D Unless C, where C (Censor) is an exception to the rule. Such rules are employed in situations in which the conditional statement 'If P Then D' holds frequently and the assertion C holds rarely. By using a rule of this type we are free to ignore the exception conditions, when the resources needed to establish its presence, are tight or there is simply no information available as to whether it holds or not. Thus the 'If P Then D' part of the CPR expresses important information while the Unless C part acts only as a switch changes the polarity of D to ~D. In this paper a scheme based on Dempster-Shafer Theory (DST) interpretation of a CPR is suggested for discovering CPRs from the discovered flat PRs. The discovery of CPRs from flat rules would result in considerable reduction of the already discovered rules. The proposed scheme incrementally incorporates new knowledge and also reduces the size of knowledge base considerably with each episode. Examples are given to demonstrate the behaviour of the proposed scheme. The suggested cumulative learning scheme would be useful in mining data streams.
Keywords: Censored production rules, cumulative learning, data mining, machine learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14851993 Effect of Germination on Proximate, Available Phenol and Flavonoid Content, and Antioxidant Activities of African Yam Bean (Sphenostylis stenocarpa)
Authors: Nneka N. Uchegbu, Ndidi F. Amulu
Abstract:
The work studied the effect of germination on proximate, phenol and flavonoid content and antioxidant activities (AOA) of African Yam been (AYB). Germination was done in controlled dark chamber (100% RH, 28oC). The proximate, phenol and flavonoid content and antioxidant activities before and after germination were investigated. The crude protein, moisture, and crude fiber content of germinated AYB were significantly higher (P<0.05) than that of ungermianated seed, while the fat, Ash and carbohydrate content of ungerminated were higher than the germinated seed. Germination increased the phenol and flavoniod content by 19.14% and 14.53% respectively. The results of AOA assay showed that the DPPH, reducing power and FRAP of germinated AYB seed gave high values: 48.92 ±1.22 μg/ml, 0.75± 0.15μg/ml and 98.60±0.04 μmol/g while that of ungerminated seed were: 31.33μ/ml, 0.56±1.52μg/ml and 96.11±1.13μmol/g respectively. Germinated AYB has phytochemicals with potential AOA for disease prevention.
Keywords: Antioxidant, flovonoid, germination, phenol.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23561992 Dynamical Analysis of Circadian Gene Expression
Authors: Carla Layana Luis Diambra
Abstract:
Microarrays technique allows the simultaneous measurements of the expression levels of thousands of mRNAs. By mining this data one can identify the dynamics of the gene expression time series. By recourse of principal component analysis, we uncover the circadian rhythmic patterns underlying the gene expression profiles from Cyanobacterium Synechocystis. We applied PCA to reduce the dimensionality of the data set. Examination of the components also provides insight into the underlying factors measured in the experiments. Our results suggest that all rhythmic content of data can be reduced to three main components.
Keywords: circadian rhythms, clustering, gene expression, PCA.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15921991 Classifier Based Text Mining for Neural Network
Authors: M. Govindarajan, R. M. Chandrasekaran
Abstract:
Text Mining is around applying knowledge discovery techniques to unstructured text is termed knowledge discovery in text (KDT), or Text data mining or Text Mining. In Neural Network that address classification problems, training set, testing set, learning rate are considered as key tasks. That is collection of input/output patterns that are used to train the network and used to assess the network performance, set the rate of adjustments. This paper describes a proposed back propagation neural net classifier that performs cross validation for original Neural Network. In order to reduce the optimization of classification accuracy, training time. The feasibility the benefits of the proposed approach are demonstrated by means of five data sets like contact-lenses, cpu, weather symbolic, Weather, labor-nega-data. It is shown that , compared to exiting neural network, the training time is reduced by more than 10 times faster when the dataset is larger than CPU or the network has many hidden units while accuracy ('percent correct') was the same for all datasets but contact-lences, which is the only one with missing attributes. For contact-lences the accuracy with Proposed Neural Network was in average around 0.3 % less than with the original Neural Network. This algorithm is independent of specify data sets so that many ideas and solutions can be transferred to other classifier paradigms.Keywords: Back propagation, classification accuracy, textmining, time complexity.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 42181990 Text Mining Technique for Data Mining Application
Authors: M. Govindarajan
Abstract:
Text Mining is around applying knowledge discovery techniques to unstructured text is termed knowledge discovery in text (KDT), or Text data mining or Text Mining. In decision tree approach is most useful in classification problem. With this technique, tree is constructed to model the classification process. There are two basic steps in the technique: building the tree and applying the tree to the database. This paper describes a proposed C5.0 classifier that performs rulesets, cross validation and boosting for original C5.0 in order to reduce the optimization of error ratio. The feasibility and the benefits of the proposed approach are demonstrated by means of medial data set like hypothyroid. It is shown that, the performance of a classifier on the training cases from which it was constructed gives a poor estimate by sampling or using a separate test file, either way, the classifier is evaluated on cases that were not used to build and evaluate the classifier are both are large. If the cases in hypothyroid.data and hypothyroid.test were to be shuffled and divided into a new 2772 case training set and a 1000 case test set, C5.0 might construct a different classifier with a lower or higher error rate on the test cases. An important feature of see5 is its ability to classifiers called rulesets. The ruleset has an error rate 0.5 % on the test cases. The standard errors of the means provide an estimate of the variability of results. One way to get a more reliable estimate of predictive is by f-fold –cross- validation. The error rate of a classifier produced from all the cases is estimated as the ratio of the total number of errors on the hold-out cases to the total number of cases. The Boost option with x trials instructs See5 to construct up to x classifiers in this manner. Trials over numerous datasets, large and small, show that on average 10-classifier boosting reduces the error rate for test cases by about 25%.Keywords: C5.0, Error Ratio, text mining, training data, test data.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24891989 Comparison of Bioactive Compound Content in Egg Yolk Oil Extracted from Eggs Obtained from Different Laying Hen Housing Systems
Authors: Aleksandrs Kovalcuks
Abstract:
Egg yolk oil is a natural source of bioactive compounds such as unsaturated fatty acids, oil soluble vitamins, pigments and others. Bioactive compound content in egg yolk oil depends from its content in eggs, from which oil was extracted. Many studies show that bioactive compound content in egg is correlated to the content of these compounds in hen feed, but there is also an opinion that hen housing systems also have influence on egg chemical content. The aim of this study was to determine which factor, laying hen housing system or hen diet, has a primary influence on bioactive compound content in egg yolk oil. The egg yolk oil was extracted from eggs obtained from 4 different hen housing systems: cage, barn and two groups of free range. All hens were fed with commercially produced compound feed except one group of free range hens which get free diet – pastured hens. Extracted egg yolk oils were analyzed for fatty acids, oil soluble vitamins and β-carotene content. α-tocopherol, ergocalcipherol and polyunsaturated fatty acid content in egg yolk oil was higher from eggs obtained from all housing systems where hens were fed with commercial compound feed. β-carotene and retinol content in egg yolk oils from free range free diet eggs was significantly (p>0.05) higher that from other eggs because hens have access to green forage. Hen physical activity in free range housing systems decreases content of some bioactive compound in egg yolk oil.
Keywords: Egg yolk oil, vitamins, caged eggs, free range.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 28441988 Analysis of a Population of Diabetic Patients Databases with Classifiers
Authors: Murat Koklu, Yavuz Unal
Abstract:
Data mining can be called as a technique to extract information from data. It is the process of obtaining hidden information and then turning it into qualified knowledge by statistical and artificial intelligence technique. One of its application areas is medical area to form decision support systems for diagnosis just by inventing meaningful information from given medical data. In this study a decision support system for diagnosis of illness that make use of data mining and three different artificial intelligence classifier algorithms namely Multilayer Perceptron, Naive Bayes Classifier and J.48. Pima Indian dataset of UCI Machine Learning Repository was used. This dataset includes urinary and blood test results of 768 patients. These test results consist of 8 different feature vectors. Obtained classifying results were compared with the previous studies. The suggestions for future studies were presented.
Keywords: Artificial Intelligence, Classifiers, Data Mining, Diabetic Patients.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 54311987 Multi-Dimensional Concerns Mining for Web Applications via Concept-Analysis
Authors: Carlo Bellettini, Alessandro Marchetto, Andrea Trentini
Abstract:
Web applications have become very complex and crucial, especially when combined with areas such as CRM (Customer Relationship Management) and BPR (Business Process Reengineering), the scientific community has focused attention to Web applications design, development, analysis, and testing, by studying and proposing methodologies and tools. This paper proposes an approach to automatic multi-dimensional concern mining for Web Applications, based on concepts analysis, impact analysis, and token-based concern identification. This approach lets the user to analyse and traverse Web software relevant to a particular concern (concept, goal, purpose, etc.) via multi-dimensional separation of concerns, to document, understand and test Web applications. This technique was developed in the context of WAAT (Web Applications Analysis and Testing) project. A semi-automatic tool to support this technique is currently under development.Keywords: Concepts Analysis, Concerns Mining, Multi-Dimensional Separation of Concerns, Impact Analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14731986 Ensemble Approach for Predicting Student's Academic Performance
Authors: L. A. Muhammad, M. S. Argungu
Abstract:
Educational data mining (EDM) has recorded substantial considerations. Techniques of data mining in one way or the other have been proposed to dig out out-of-sight knowledge in educational data. The result of the study got assists academic institutions in further enhancing their process of learning and methods of passing knowledge to students. Consequently, the performance of students boasts and the educational products are by no doubt enhanced. This study adopted a student performance prediction model premised on techniques of data mining with Students' Essential Features (SEF). SEF are linked to the learner's interactivity with the e-learning management system. The performance of the student's predictive model is assessed by a set of classifiers, viz. Bayes Network, Logistic Regression, and Reduce Error Pruning Tree (REP). Consequently, ensemble methods of Bagging, Boosting, and Random Forest (RF) are applied to improve the performance of these single classifiers. The study reveals that the result shows a robust affinity between learners' behaviors and their academic attainment. Result from the study shows that the REP Tree and its ensemble record the highest accuracy of 83.33% using SEF. Hence, in terms of the Receiver Operating Curve (ROC), boosting method of REP Tree records 0.903, which is the best. This result further demonstrates the dependability of the proposed model.
Keywords: Ensemble, bagging, Random Forest, boosting, data mining, classifiers, machine learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7611985 Arabic Light Stemmer for Better Search Accuracy
Authors: Sahar Khedr, Dina Sayed, Ayman Hanafy
Abstract:
Arabic is one of the most ancient and critical languages in the world. It has over than 250 million Arabic native speakers and more than twenty countries having Arabic as one of its official languages. In the past decade, we have witnessed a rapid evolution in smart devices, social network and technology sector which led to the need to provide tools and libraries that properly tackle the Arabic language in different domains. Stemming is one of the most crucial linguistic fundamentals. It is used in many applications especially in information extraction and text mining fields. The motivation behind this work is to enhance the Arabic light stemmer to serve the data mining industry and leverage it in an open source community. The presented implementation works on enhancing the Arabic light stemmer by utilizing and enhancing an algorithm that provides an extension for a new set of rules and patterns accompanied by adjusted procedure. This study has proven a significant enhancement for better search accuracy with an average 10% improvement in comparison with previous works.Keywords: Arabic data mining, Arabic Information extraction, Arabic Light stemmer, Arabic stemmer.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14961984 Mining Network Data for Intrusion Detection through Naïve Bayesian with Clustering
Authors: Dewan Md. Farid, Nouria Harbi, Suman Ahmmed, Md. Zahidur Rahman, Chowdhury Mofizur Rahman
Abstract:
Network security attacks are the violation of information security policy that received much attention to the computational intelligence society in the last decades. Data mining has become a very useful technique for detecting network intrusions by extracting useful knowledge from large number of network data or logs. Naïve Bayesian classifier is one of the most popular data mining algorithm for classification, which provides an optimal way to predict the class of an unknown example. It has been tested that one set of probability derived from data is not good enough to have good classification rate. In this paper, we proposed a new learning algorithm for mining network logs to detect network intrusions through naïve Bayesian classifier, which first clusters the network logs into several groups based on similarity of logs, and then calculates the prior and conditional probabilities for each group of logs. For classifying a new log, the algorithm checks in which cluster the log belongs and then use that cluster-s probability set to classify the new log. We tested the performance of our proposed algorithm by employing KDD99 benchmark network intrusion detection dataset, and the experimental results proved that it improves detection rates as well as reduces false positives for different types of network intrusions.Keywords: Clustering, detection rate, false positive, naïveBayesian classifier, network intrusion detection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 55361983 Investigating Crime Hotspot Places and their Implication to Urban Environmental Design: A Geographic Visualization and Data Mining Approach
Authors: Donna R. Tabangin, Jacqueline C. Flores, Nelson F. Emperador
Abstract:
Information is power. Geographical information is an emerging science that is advancing the development of knowledge to further help in the understanding of the relationship of “place" with other disciplines such as crime. The researchers used crime data for the years 2004 to 2007 from the Baguio City Police Office to determine the incidence and actual locations of crime hotspots. Combined qualitative and quantitative research methodology was employed through extensive fieldwork and observation, geographic visualization with Geographic Information Systems (GIS) and Global Positioning Systems (GPS), and data mining. The paper discusses emerging geographic visualization and data mining tools and methodologies that can be used to generate baseline data for environmental initiatives such as urban renewal and rejuvenation. The study was able to demonstrate that crime hotspots can be computed and were seen to be occurring to some select places in the Central Business District (CBD) of Baguio City. It was observed that some characteristics of the hotspot places- physical design and milieu may play an important role in creating opportunities for crime. A list of these environmental attributes was generated. This derived information may be used to guide the design or redesign of the urban environment of the City to be able to reduce crime and at the same time improve it physically.Keywords: Crime mapping, data mining, environmental design, geographic visualization, GIS.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 26231982 Knowledge-Driven Decision Support System Based on Knowledge Warehouse and Data Mining by Improving Apriori Algorithm with Fuzzy Logic
Authors: Pejman Hosseinioun, Hasan Shakeri, Ghasem Ghorbanirostam
Abstract:
In recent years, we have seen an increasing importance of research and study on knowledge source, decision support systems, data mining and procedure of knowledge discovery in data bases and it is considered that each of these aspects affects the others. In this article, we have merged information source and knowledge source to suggest a knowledge based system within limits of management based on storing and restoring of knowledge to manage information and improve decision making and resources. In this article, we have used method of data mining and Apriori algorithm in procedure of knowledge discovery one of the problems of Apriori algorithm is that, a user should specify the minimum threshold for supporting the regularity. Imagine that a user wants to apply Apriori algorithm for a database with millions of transactions. Definitely, the user does not have necessary knowledge of all existing transactions in that database, and therefore cannot specify a suitable threshold. Our purpose in this article is to improve Apriori algorithm. To achieve our goal, we tried using fuzzy logic to put data in different clusters before applying the Apriori algorithm for existing data in the database and we also try to suggest the most suitable threshold to the user automatically.
Keywords: Decision support system, data mining, knowledge discovery, data discovery, fuzzy logic.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21321981 Discovering Complex Regularities by Adaptive Self Organizing Classification
Authors: A. Faro, D. Giordano, F. Maiorana
Abstract:
Data mining uses a variety of techniques each of which is useful for some particular task. It is important to have a deep understanding of each technique and be able to perform sophisticated analysis. In this article we describe a tool built to simulate a variation of the Kohonen network to perform unsupervised clustering and support the entire data mining process up to results visualization. A graphical representation helps the user to find out a strategy to optmize classification by adding, moving or delete a neuron in order to change the number of classes. The tool is also able to automatically suggest a strategy for number of classes optimization.The tool is used to classify macroeconomic data that report the most developed countries? import and export. It is possible to classify the countries based on their economic behaviour and use an ad hoc tool to characterize the commercial behaviour of a country in a selected class from the analysis of positive and negative features that contribute to classes formation.
Keywords: Unsupervised classification, Kohonen networks, macroeconomics, Visual data mining, cluster interpretation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15631980 Digital Content Strategy: Detailed Review of the Key Content Components
Authors: Oksana Razina, Shakeel Ahmad, Jessie Qun Ren, Olufemi Isiaq
Abstract:
The modern life of businesses is categorically reliant on their established position online, where digital (and particularly website) content plays a significant role as the first point of information. Digital content, therefore, becomes essential – from making the first impression through to the building and development of client relationships. Despite a number of valuable papers suggesting a strategic approach when dealing with digital data, other sources often do not view or accept the approach to digital content as a holistic or continuous process. Associations are frequently made with merely a one-off marketing campaign or similar. The challenge is in establishing an agreed definition for the notion of Digital Content Strategy (DCS), which currently does not exist, as it is viewed from an excessive number of angles. A strategic approach to content, nonetheless, is required, both practically and contextually. We, therefore, aimed at attempting to identify the key content components, comprising a DCS, to ensure all the aspects were covered and strategically applied – from the company’s understanding of the content value to the ability to display flexibility of content and advances in technology. This conceptual project evaluated existing literature on the topic of DCS and related aspects, using PRISMA Systematic Review Method, Document Analysis, Inclusion and Exclusion Criteria, Scoping Review, Snow-Balling Technique and Thematic Analysis. The data were collected from academic and statistical sources, government and relevant trade publications. Based on the suggestions from academics and trading sources, related to the issues discussed, we revealed the key actions for content creation and attempted to define the notion of DCS. The major finding of the study presented Key Content Components of DCS and can be considered for implementation in a business retail setting.
Keywords: Digital content strategy, digital marketing strategy, key content components, websites.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2281979 Digital Content Strategy: Detailed Review of the Key Content Components
Authors: Oksana Razina, Shakeel Ahmad, Jessie Qun Ren, Olufemi Isiaq
Abstract:
The modern life of businesses is categorically reliant on their established position online, where digital (and particularly website) content plays a significant role as the first point of information. Digital content, therefore, becomes essential – from making the first impression through to the building and development of client relationships. Despite a number of valuable papers suggesting a strategic approach when dealing with digital data, other sources often do not view or accept the approach to digital content as a holistic or continuous process. Associations are frequently made with merely a one-off marketing campaign or similar. The challenge is in establishing an agreed definition for the notion of Digital Content Strategy (DCS), which currently does not exist, as it is viewed from an excessive number of angles. A strategic approach to content, nonetheless, is required, both practically and contextually. We, therefore, aimed at attempting to identify the key content components, comprising a DCS, to ensure all the aspects were covered and strategically applied – from the company’s understanding of the content value to the ability to display flexibility of content and advances in technology. This conceptual project evaluated existing literature on the topic of DCS and related aspects, using PRISMA Systematic Review Method, Document Analysis, Inclusion and Exclusion Criteria, Scoping Review, Snow-Balling Technique and Thematic Analysis. The data were collected from academic and statistical sources, government and relevant trade publications. Based on the suggestions from academics and trading sources, related to the issues discussed, we revealed the key actions for content creation and attempted to define the notion of DCS. The major finding of the study presented Key Content Components of DCS and can be considered for implementation in a business retail setting.
Keywords: Digital content strategy, digital marketing strategy, key content components, websites.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2141978 Crude Protein and Ash Content in Different Coloured Phaseolus coccineus L.
Authors: Liene Strauta, Sandra Muizniece-Brasava, Ina Alsina
Abstract:
Phaseolus coccineus L. is the third most important cultivated Phaseolus species in the world. It is widely grown in Latvia due to its earliness, good taste and uniform and qualitative yield. Experiments were carried out in the laboratories of Department of Food Technology and Agronomical Analysis Scientific Laboratory at Latvia Universityof Agriculture. Beans (Phaseolus coccineus L.) crude protein, crude ash content as well as colour measurements were analyzed. Results show, that brown coloured beans have less crude protein content than others, and ash content have significant differences.
Keywords: Phaseoluscoccineus, protein, ash, colour.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 33351977 Hierarchical Clustering Algorithms in Data Mining
Authors: Z. Abdullah, A. R. Hamdan
Abstract:
Clustering is a process of grouping objects and data into groups of clusters to ensure that data objects from the same cluster are identical to each other. Clustering algorithms in one of the area in data mining and it can be classified into partition, hierarchical, density based and grid based. Therefore, in this paper we do survey and review four major hierarchical clustering algorithms called CURE, ROCK, CHAMELEON and BIRCH. The obtained state of the art of these algorithms will help in eliminating the current problems as well as deriving more robust and scalable algorithms for clustering.Keywords: Clustering, method, algorithm, hierarchical, survey.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 33761976 A Heuristics Approach for Fast Detecting Suspicious Money Laundering Cases in an Investment Bank
Authors: Nhien-An Le-Khac, Sammer Markos, M-Tahar Kechadi
Abstract:
Today, money laundering (ML) poses a serious threat not only to financial institutions but also to the nation. This criminal activity is becoming more and more sophisticated and seems to have moved from the cliché of drug trafficking to financing terrorism and surely not forgetting personal gain. Most international financial institutions have been implementing anti-money laundering solutions (AML) to fight investment fraud. However, traditional investigative techniques consume numerous man-hours. Recently, data mining approaches have been developed and are considered as well-suited techniques for detecting ML activities. Within the scope of a collaboration project for the purpose of developing a new solution for the AML Units in an international investment bank, we proposed a data mining-based solution for AML. In this paper, we present a heuristics approach to improve the performance for this solution. We also show some preliminary results associated with this method on analysing transaction datasets.Keywords: data mining, anti money laundering, clustering, heuristics.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 35851975 A Study on Finding Similar Document with Multiple Categories
Authors: R. Saraçoğlu, N. Allahverdi
Abstract:
Searching similar documents and document management subjects have important place in text mining. One of the most important parts of similar document research studies is the process of classifying or clustering the documents. In this study, a similar document search approach that includes discussion of out the case of belonging to multiple categories (multiple categories problem) has been carried. The proposed method that based on Fuzzy Similarity Classification (FSC) has been compared with Rocchio algorithm and naive Bayes method which are widely used in text mining. Empirical results show that the proposed method is quite successful and can be applied effectively. For the second stage, multiple categories vector method based on information of categories regarding to frequency of being seen together has been used. Empirical results show that achievement is increased almost two times, when proposed method is compared with classical approach.
Keywords: Document similarity, Fuzzy classification, Multiple categories, Text mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17071974 Lead and Cadmium Spatial Pattern and Risk Assessment around Coal Mine in Hyrcanian Forest, North Iran
Authors: Mahsa Tavakoli, Seyed Mohammad Hojjati, Yahya Kooch
Abstract:
In this study, the effect of coal mining activities on lead and cadmium concentrations and distribution in soil was investigated in Hyrcanian forest, North Iran. 16 plots (20×20 m2) were established by systematic-randomly (60×60 m2) in an area of 4 ha (200×200 m2-mine entrance placed at center). An area adjacent to the mine was not affected by the mining activity; considered as the controlled area. In order to investigate soil lead and cadmium concentration, one sample was taken from the 0-10 cm in each plot. To study the spatial pattern of soil properties and lead and cadmium concentrations in the mining area, an area of 80×80m2 (the mine as the center) was considered and 80 soil samples were systematic-randomly taken (10 m intervals). Geostatistical analysis was performed via Kriging method and GS+ software (version 5.1). In order to estimate the impact of coal mining activities on soil quality, pollution index was measured. Lead and cadmium concentrations were significantly higher in mine area (Pb: 10.97±0.30, Cd: 184.47±6.26 mg.kg-1) in comparison to control area (Pb: 9.42±0.17, Cd: 131.71±15.77 mg.kg-1). The mean values of the PI index indicate that Pb (1.16) and Cd (1.77) presented slightly polluted. Results of the NIPI index showed that Pb (1.44) and Cd (2.52) presented slight pollution and moderate pollution respectively. Results of variography and kriging method showed that it is possible to prepare interpolation maps of lead and cadmium around the mining areas in Hyrcanian forest. According to results of pollution and risk assessments, forest soil was contaminated by heavy metals (lead and cadmium); therefore, using reclamation and remediation techniques in these areas is necessary.
Keywords: Traditional coal mining, heavy metals, pollution indicators, geostatistics, caspian forest.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10511973 Parallel and Distributed Mining of Association Rule on Knowledge Grid
Authors: U. Sakthi, R. Hemalatha, R. S. Bhuvaneswaran
Abstract:
In Virtual organization, Knowledge Discovery (KD) service contains distributed data resources and computing grid nodes. Computational grid is integrated with data grid to form Knowledge Grid, which implements Apriori algorithm for mining association rule on grid network. This paper describes development of parallel and distributed version of Apriori algorithm on Globus Toolkit using Message Passing Interface extended with Grid Services (MPICHG2). The creation of Knowledge Grid on top of data and computational grid is to support decision making in real time applications. In this paper, the case study describes design and implementation of local and global mining of frequent item sets. The experiments were conducted on different configurations of grid network and computation time was recorded for each operation. We analyzed our result with various grid configurations and it shows speedup of computation time is almost superlinear.Keywords: Association rule, Grid computing, Knowledge grid, Mobility prediction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21821972 GIS-Based Spatial Distribution and Evaluation of Selected Heavy Metals Contamination in Topsoil around Ecton Mining Area, Derbyshire, UK
Authors: Zahid O. Alibrahim, Craig D. Williams, Clive L. Roberts
Abstract:
The study area (Ecton mining area) is located in the southern part of the Peak District in Derbyshire, England. It is bounded by the River Manifold from the west. This area has been mined for a long period. As a result, huge amounts of potentially toxic metals were released into the surrounding area and are most likely to be a significant source of heavy metal contamination to the local soil, water and vegetation. In order to appraise the potential heavy metal pollution in this area, 37 topsoil samples (5-20 cm depth) were collected and analysed for their total content of Cu, Pb, Zn, Mn, Cr, Ni and V using ICP (Inductively Coupled Plasma) optical emission spectroscopy. Multivariate Geospatial analyses using the GIS technique were utilised to draw geochemical maps of the metals of interest over the study area. A few hotspot points, areas of elevated concentrations of metals, were specified, which are presumed to be the results of anthropogenic activities. In addition, the soil’s environmental quality was evaluated by calculating the Mullers’ Geoaccumulation index (I geo), which suggests that the degree of contamination of the investigated heavy metals has the following trend: Pb > Zn > Cu > Mn > Ni = Cr = V. Furthermore, the potential ecological risk, using the enrichment factor (EF), was also specified. On the basis of the calculated amount or the EF, the levels of pollution for the studied metals in the study area have the following order: Pb>Zn>Cu>Cr>V>Ni>Mn.
Keywords: Heavy metals, GIS, multivariate analysis, geoaccumulation index, enrichment factor.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12411971 Analysis of Sequence Moves in Successful Chess Openings Using Data Mining with Association Rules
Authors: R.M.Rani
Abstract:
Chess is one of the indoor games, which improves the level of human confidence, concentration, planning skills and knowledge. The main objective of this paper is to help the chess players to improve their chess openings using data mining techniques. Budding Chess Players usually do practices by analyzing various existing openings. When they analyze and correlate thousands of openings it becomes tedious and complex for them. The work done in this paper is to analyze the best lines of Blackmar- Diemer Gambit(BDG) which opens with White D4... using data mining analysis. It is carried out on the collection of winning games by applying association rules. The first step of this analysis is assigning variables to each different sequence moves. In the second step, the sequence association rules were generated to calculate support and confidence factor which help us to find the best subsequence chess moves that may lead to winning position.Keywords: Blackmar-Diemer Gambit(BDG), Confidence, sequence Association Rules, Support.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3091