Search results for: Web content mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2176

Search results for: Web content mining

1276 Inferring User Preference Using Distance Dependent Chinese Restaurant Process and Weighted Distribution for a Content Based Recommender System

Authors: Bagher Rahimpour Cami, Hamid Hassanpour, Hoda Mashayekhi

Abstract:

Nowadays websites provide a vast number of resources for users. Recommender systems have been developed as an essential element of these websites to provide a personalized environment for users. They help users to retrieve interested resources from large sets of available resources. Due to the dynamic feature of user preference, constructing an appropriate model to estimate the user preference is the major task of recommender systems. Profile matching and latent factors are two main approaches to identify user preference. In this paper, we employed the latent factor and profile matching to cluster the user profile and identify user preference, respectively. The method uses the Distance Dependent Chines Restaurant Process as a Bayesian nonparametric framework to extract the latent factors from the user profile. These latent factors are mapped to user interests and a weighted distribution is used to identify user preferences. We evaluate the proposed method using a real-world data-set that contains news tweets of a news agency (BBC). The experimental results and comparisons show the superior recommendation accuracy of the proposed approach related to existing methods, and its ability to effectively evolve over time.

Keywords: Content-based recommender systems, dynamic user modeling, extracting user interests, predicting user preference.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 806
1275 Context-aware Recommender Systems using Data Mining Techniques

Authors: Kyoung-jae Kim, Hyunchul Ahn, Sangwon Jeong

Abstract:

This study proposes a novel recommender system to provide the advertisements of context-aware services. Our proposed model is designed to apply a modified collaborative filtering (CF) algorithm with regard to the several dimensions for the personalization of mobile devices – location, time and the user-s needs type. In particular, we employ a classification rule to understand user-s needs type using a decision tree algorithm. In addition, we collect primary data from the mobile phone users and apply them to the proposed model to validate its effectiveness. Experimental results show that the proposed system makes more accurate and satisfactory advertisements than comparative systems.

Keywords: Location-based advertisement, Recommender system, Collaborative filtering, User needs type, Mobile user.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2168
1274 Effect of Pre-drying Treatments on Quality Characteristics of Dehydrated Tomato Slices

Authors: Sharareh Mohseni, Reihaneh Ahmadzadeh Ghavidel

Abstract:

Tomato powder has good potential as substitute of tomato paste and other tomato products. In order to protect physicochemical properties and nutritional quality of tomato during dehydration process, investigation was carried out using different drying methods and pretreatments. Solar drier and continuous conveyor (tunnel) drier were used for dehydration where as calcium chloride (CaCl2), potassium metabisulphite (KMS), calcium chloride and potassium metabisulphite (CaCl2 +KMS), and sodium chloride (NaCl) selected for treatment.. lycopene content, dehydration ratio, rehydration ratio and non-enzymatic browning in addition to moisture, sugar and titrable acidity were studied. Results show that pre-treatment with CaCl2 and NaCl increased water removal and moisture mobility in tomato slices during drying of tomatoes. Where CaCl2 used along with KMS the NEB was recorded the least compared to other treatments and the best results were obtained while using the two chemicals in combination form. Storage studies in LDPE polymeric and metalized polyesters films showed less changes in the products packed in metallized polyester pouches and even after 6 months lycopene content did not decrease more than 20% as compared to the control sample and provide extension of shelf life in acceptable condition for 6 months. In most of the quality characteristics tunnel drier samples presented better values in comparison to solar drier.

Keywords: Dehydration, Tomato powder, Lycopene, Browning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2314
1273 Genetic Programming Approach to Hierarchical Production Rule Discovery

Authors: Basheer M. Al-Maqaleh, Kamal K. Bharadwaj

Abstract:

Automated discovery of hierarchical structures in large data sets has been an active research area in the recent past. This paper focuses on the issue of mining generalized rules with crisp hierarchical structure using Genetic Programming (GP) approach to knowledge discovery. The post-processing scheme presented in this work uses flat rules as initial individuals of GP and discovers hierarchical structure. Suitable genetic operators are proposed for the suggested encoding. Based on the Subsumption Matrix(SM), an appropriate fitness function is suggested. Finally, Hierarchical Production Rules (HPRs) are generated from the discovered hierarchy. Experimental results are presented to demonstrate the performance of the proposed algorithm.

Keywords: Genetic Programming, Hierarchy, Knowledge Discovery in Database, Subsumption Matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1441
1272 Hygrothermal Assessment of Internally Insulated Prefabricated Concrete Wall in Polish Climatic Condition

Authors: D. Kaczorek

Abstract:

Internal insulation of external walls is often problematic due to increased moisture content in the wall and interstitial or surface condensation risk. In this paper, the hygrothermal performance of prefabricated, concrete, large panel, external wall typical for WK70 system, commonly used in Poland in the 70’s, with inside, additional insulation was investigated. Thermal insulation board made out of hygroscopic, natural materials with moisture buffer capacity and extruded polystyrene (EPS) board was used as interior insulation. Experience with this natural insulation is rare in Poland. The analysis was performed using WUFI software. First of all, the impact of various standard boundary conditions on the behavior of the different wall assemblies was tested. The comparison of results showed that the moisture class according to the EN ISO 13788 leads to too high values of total moisture content in the wall since the boundary condition according to the EN 15026 should be usually applied. Then, hygrothermal 1D-simulations were conducted by WUFI Pro for analysis of internally added insulation, and the weak point like the joint of the wall with the concrete ceiling was verified using 2D simulations. Results showed that, in the Warsaw climate and the indoor conditions adopted in accordance with EN 15026, in the tested wall assemblies, regardless of the type of interior insulation, there would not be any problems with moisture - inside the structure and on the interior surface.

Keywords: Concrete large panel wall, hygrothermal simulation, internal insulation, moisture related issues.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 710
1271 Towards Achieving Energy Efficiency in Kazakhstan

Authors: Aigerim Uyzbayeva, Valeriya Tyo, Nurlan Ibrayev

Abstract:

Kazakhstan is currently one of the dynamically developing states in its region. The stable growth in all sectors of the economy leads to a corresponding increase in energy consumption. Thus country consumes significant amount of energy due to the high level of industrialisation and the presence of energy-intensive manufacturing such as mining and metallurgy which in turn leads to low energy efficiency. With allowance for this the Government has set several priorities to adopt a transition of Republic of Kazakhstan to a “green economy”. This article provides an overview of Kazakhstan’s energy efficiency situation in for the period of 1991- 2014. First, the dynamics of production and consumption of conventional energy resources are given. Second, the potential of renewable energy sources is summarised followed by the description of GHG emissions trends in the country. Third, Kazakhstan’ national initiatives, policies and locally implemented projects in the field of energy efficiency are described.

Keywords: Energy efficiency in Kazakhstan, greenhouse gases, renewable energy, sustainable development.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3533
1270 Video Data Mining based on Information Fusion for Tamper Detection

Authors: Girija Chetty, Renuka Biswas

Abstract:

In this paper, we propose novel algorithmic models based on information fusion and feature transformation in crossmodal subspace for different types of residue features extracted from several intra-frame and inter-frame pixel sub-blocks in video sequences for detecting digital video tampering or forgery. An evaluation of proposed residue features – the noise residue features and the quantization features, their transformation in cross-modal subspace, and their multimodal fusion, for emulated copy-move tamper scenario shows a significant improvement in tamper detection accuracy as compared to single mode features without transformation in cross-modal subspace.

Keywords: image tamper detection, digital forensics, correlation features image fusion

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1887
1269 Heat and Mass Transfer Modelling of Industrial Sludge Drying at Different Pressures and Temperatures

Authors: L. Al Ahmad, C. Latrille, D. Hainos, D. Blanc, M. Clausse

Abstract:

A two-dimensional finite volume axisymmetric model is developed to predict the simultaneous heat and mass transfers during the drying of industrial sludge. The simulations were run using COMSOL-Multiphysics 3.5a. The input parameters of the numerical model were acquired from a preliminary experimental work. Results permit to establish correlations describing the evolution of the various parameters as a function of the drying temperature and the sludge water content. The selection and coupling of the equation are validated based on the drying kinetics acquired experimentally at a temperature range of 45-65 °C and absolute pressure range of 200-1000 mbar. The model, incorporating the heat and mass transfer mechanisms at different operating conditions, shows simulated values of temperature and water content. Simulated results are found concordant with the experimental values, only at the first and last drying stages where sludge shrinkage is insignificant. Simulated and experimental results show that sludge drying is favored at high temperatures and low pressure. As experimentally observed, the drying time is reduced by 68% for drying at 65 °C compared to 45 °C under 1 atm. At 65 °C, a 200-mbar absolute pressure vacuum leads to an additional reduction in drying time estimated by 61%. However, the drying rate is underestimated in the intermediate stage. This rate underestimation could be improved in the model by considering the shrinkage phenomena that occurs during sludge drying.

Keywords: Industrial sludge drying, heat transfer, mass transfer, mathematical modelling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 657
1268 Machine Learning Methods for Network Intrusion Detection

Authors: Mouhammad Alkasassbeh, Mohammad Almseidin

Abstract:

Network security engineers work to keep services available all the time by handling intruder attacks. Intrusion Detection System (IDS) is one of the obtainable mechanisms that is used to sense and classify any abnormal actions. Therefore, the IDS must be always up to date with the latest intruder attacks signatures to preserve confidentiality, integrity, and availability of the services. The speed of the IDS is a very important issue as well learning the new attacks. This research work illustrates how the Knowledge Discovery and Data Mining (or Knowledge Discovery in Databases) KDD dataset is very handy for testing and evaluating different Machine Learning Techniques. It mainly focuses on the KDD preprocess part in order to prepare a decent and fair experimental data set. The J48, MLP, and Bayes Network classifiers have been chosen for this study. It has been proven that the J48 classifier has achieved the highest accuracy rate for detecting and classifying all KDD dataset attacks, which are of type DOS, R2L, U2R, and PROBE.

Keywords: IDS, DDoS, MLP, KDD.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 719
1267 Effects of Fermentation Techniques on the Quality of Cocoa Beans

Authors: Monday O. Ale, Adebukola A. Akintade, Olasunbo O. Orungbemi

Abstract:

Fermentation as an important operation in the processing of cocoa beans is now affected by the recent climate change across the globe. The major requirement for effective fermentation is the ability of the material used to retain sufficient heat for the required microbial activities. Apart from the effects of climate on the rate of heat retention, the materials used for fermentation plays an important role. Most Farmers still restrict fermentation activities to the use of traditional methods. Improving on cocoa fermentation in this era of climate change makes it necessary to work on other materials that can be suitable for cocoa fermentation. Therefore, the objective of this study was to determine the effects of fermentation techniques on the quality of cocoa beans. The materials used in this fermentation research were heap-leaves (traditional), stainless steel, plastic tin, plastic basket and wooden box. The period of fermentation varies from zero days to 10 days. Physical and chemical tests were carried out for variables in quality determination in the samples. The weight per bean varied from 1.0-1.2 g after drying across the samples and the major color of the dry beans observed was brown except with the samples from stainless steel. The moisture content varied from 5.5-7%. The mineral content and the heavy metals decreased with increase in the fermentation period. A wooden box can conclusively be used as an alternative to heap-leaves as there was no significant difference in the physical features of the samples fermented with the two methods. The use of a wooden box as an alternative for cocoa fermentation is therefore recommended for cocoa farmers.

Keywords: Effects, fermentation, fermentation materials, period, quality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1011
1266 Retrospective Reconstruction of Time Series Data for Integrated Waste Management

Authors: A. Buruzs, M. F. Hatwágner, A. Torma, L. T. Kóczy

Abstract:

The development, operation and maintenance of Integrated Waste Management Systems (IWMS) affects essentially the sustainable concern of every region. The features of such systems have great influence on all of the components of sustainability. In order to reach the optimal way of processes, a comprehensive mapping of the variables affecting the future efficiency of the system is needed such as analysis of the interconnections among the components and modeling of their interactions. The planning of a IWMS is based fundamentally on technical and economical opportunities and the legal framework. Modeling the sustainability and operation effectiveness of a certain IWMS is not in the scope of the present research. The complexity of the systems and the large number of the variables require the utilization of a complex approach to model the outcomes and future risks. This complex method should be able to evaluate the logical framework of the factors composing the system and the interconnections between them. The authors of this paper studied the usability of the Fuzzy Cognitive Map (FCM) approach modeling the future operation of IWMS’s. The approach requires two input data set. One is the connection matrix containing all the factors affecting the system in focus with all the interconnections. The other input data set is the time series, a retrospective reconstruction of the weights and roles of the factors. This paper introduces a novel method to develop time series by content analysis.

Keywords: Content analysis, factors, integrated waste management system, time series.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2013
1265 Does Practice Reflect Theory? An Exploratory Study of a Successful Knowledge Management System

Authors: Janet L. Kourik, Peter E. Maher

Abstract:

To investigate the correspondence of theory and practice, a successfully implemented Knowledge Management System (KMS) is explored through the lens of Alavi and Leidner-s proposed KMS framework for the analysis of an information system in knowledge management (Framework-AISKM). The applied KMS system was designed to manage curricular knowledge in a distributed university environment. The motivation for the KMS is discussed along with the types of knowledge necessary in an academic setting. Elements of the KMS involved in all phases of capturing and disseminating knowledge are described. As the KMS matures the resulting data stores form the precursor to and the potential for knowledge mining. The findings from this exploratory study indicate substantial correspondence between the successful KMS and the theory-based framework providing provisional confirmation for the framework while suggesting factors that contributed to the system-s success. Avenues for future work are described.

Keywords: Applied KMS, education, knowledge management (KM), KM framework, knowledge management system (KMS).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1028
1264 An Efficient and Generic Hybrid Framework for High Dimensional Data Clustering

Authors: Dharmveer Singh Rajput , P. K. Singh, Mahua Bhattacharya

Abstract:

Clustering in high dimensional space is a difficult problem which is recurrent in many fields of science and engineering, e.g., bioinformatics, image processing, pattern reorganization and data mining. In high dimensional space some of the dimensions are likely to be irrelevant, thus hiding the possible clustering. In very high dimensions it is common for all the objects in a dataset to be nearly equidistant from each other, completely masking the clusters. Hence, performance of the clustering algorithm decreases. In this paper, we propose an algorithmic framework which combines the (reduct) concept of rough set theory with the k-means algorithm to remove the irrelevant dimensions in a high dimensional space and obtain appropriate clusters. Our experiment on test data shows that this framework increases efficiency of the clustering process and accuracy of the results.

Keywords: High dimensional clustering, sub-space, k-means, rough set, discernibility matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1931
1263 Studies on Physiochemical Properties of Tomato Powder as Affected by Different Dehydration Methods and Pretreatments

Authors: Reihaneh Ahmadzadeh Ghavidel, Mehdi Ghiafeh Davoodi

Abstract:

Tomato powder has good potential as substitute of tomato paste and other tomato products. In order to protect physicochemical properties and nutritional quality of tomato during dehydration process, investigation was carried out using different drying methods and pretreatments. Solar drier and continuous conveyor (tunnel) drier were used for dehydration where as calcium chloride (CaCl2), potassium metabisulphite (KMS), calcium chloride and potassium metabisulphite (CaCl2 +KMS), and sodium chloride (NaCl) selected for treatment.. lycopene content, dehydration ratio, rehydration ratio and non-enzymatic browning in addition to moisture, sugar and titrable acidity were studied. Results show that pre-treatment with CaCl2 and NaCl increased water removal and moisture mobility in tomato slices during drying of tomatoes. Where CaCl2 used along with KMS the NEB was recorded the least compared to other treatments and the best results were obtained while using the two chemicals in combination form. Storage studies in LDPE polymeric and metalized polyesters films showed less changes in the products packed in metallized polyester pouches and even after 6 months lycopene content did not decrease more than 20% as compared to the control sample and provide extension of shelf life in acceptable condition for 6 months. In most of the quality characteristics tunnel drier samples presented better values in comparison to solar drier.

Keywords: Dehydration, Tomato powder, Lycopene, Browning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4018
1262 Knowledge Acquisition for the Construction of an Evolving Ontology: Application to Augmented Surgery

Authors: Nora Taleb, Sellami Mokhtar, Michel Simonet

Abstract:

This work concerns the evolution and the maintenance of an ontological resource in relation with the evolution of the corpus of texts from which it had been built. The knowledge forming a text corpus, especially in dynamic domains, is in continuous evolution. When a change in the corpus occurs, the domain ontology must evolve accordingly. Most methods manage ontology evolution independently from the corpus from which it is built; in addition, they treat evolution just as a process of knowledge addition, not considering other knowledge changes. We propose a methodology for managing an evolving ontology from a text corpus that evolves over time, while preserving the consistency and the persistence of this ontology. Our methodology is based on the changes made on the corpus to reflect the evolution of the considered domain - augmented surgery in our case. In this context, the results of text mining techniques, as well as the ARCHONTE method slightly modified, are used to support the evolution process.

Keywords: Corpus, Evolution, Ontology

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1433
1261 Hybridization and Evaluation of Jatropha (Jatropha curcas L.) to Improve High Yield Varieties in Indonesia

Authors: Rully D. Purwati, Tantri D. A. Anggraeni, Bambang Heliyanto, M. Machfud, Joko Hartono

Abstract:

Jatropha curcas L. is one of the crops producing non edible oil which is potential for bio-energy. Jatropha cultivation and development program in Indonesia is facing several problems especially low seed yield resulting in inefficient crop cultivation cost. To cope with the problem, development of high yielding varieties is necessary. Development of varieties to improve seed yield was conducted by hybridization and selection, and resulted in 14 potential genotypes. The yield potential of the 14 genotypes were evaluated and compared with two check varieties. The objective of the evaluation was to find Jatropha hybrids with some characters i.e. productivity higher than check varieties, oil content > 40% and harvesting age ≤ 110 days. Hybridization and individual plant selection were carried out from 2010 to 2014. Evaluation of high yield was conducted in Asembagus experimental station, Situbondo, East Java in three years (2015-2017). The experimental designed was Randomized Complete Block Design with three replication and plot size of 10 m x 8 m. The characters observed were number of capsules per plant, dry seed yield (kg/ha) and seed oil content (%). The results of this experiment indicated that all the hybrids evaluated have higher productivity than check variety IP-3A. There were two superior hybrids i.e. HS-49xSP-65/32 and HS-49xSP-19/28 with highest seed yield per hectare and number of capsules per plant during three years.

Keywords: Jatropha, biodiesel, hybrid, high seed yield.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 758
1260 Using Pattern Search Methods for Minimizing Clustering Problems

Authors: Parvaneh Shabanzadeh, Malik Hj Abu Hassan, Leong Wah June, Maryam Mohagheghtabar

Abstract:

Clustering is one of an interesting data mining topics that can be applied in many fields. Recently, the problem of cluster analysis is formulated as a problem of nonsmooth, nonconvex optimization, and an algorithm for solving the cluster analysis problem based on nonsmooth optimization techniques is developed. This optimization problem has a number of characteristics that make it challenging: it has many local minimum, the optimization variables can be either continuous or categorical, and there are no exact analytical derivatives. In this study we show how to apply a particular class of optimization methods known as pattern search methods to address these challenges. These methods do not explicitly use derivatives, an important feature that has not been addressed in previous studies. Results of numerical experiments are presented which demonstrate the effectiveness of the proposed method.

Keywords: Clustering functions, Non-smooth Optimization, Nonconvex Optimization, Pattern Search Method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1634
1259 Enhanced Planar Pattern Tracking for an Outdoor Augmented Reality System

Authors: L. Yu, W. K. Li, S. K. Ong, A. Y. C. Nee

Abstract:

In this paper, a scalable augmented reality framework for handheld devices is presented. The presented framework is enabled by using a server-client data communication structure, in which the search for tracking targets among a database of images is performed on the server-side while pixel-wise 3D tracking is performed on the client-side, which, in this case, is a handheld mobile device. Image search on the server-side adopts a residual-enhanced image descriptors representation that gives the framework a scalability property. The tracking algorithm on the client-side is based on a gravity-aligned feature descriptor which takes the advantage of a sensor-equipped mobile device and an optimized intensity-based image alignment approach that ensures the accuracy of 3D tracking. Automatic content streaming is achieved by using a key-frame selection algorithm, client working phase monitoring and standardized rules for content communication between the server and client. The recognition accuracy test performed on a standard dataset shows that the method adopted in the presented framework outperforms the Bag-of-Words (BoW) method that has been used in some of the previous systems. Experimental test conducted on a set of video sequences indicated the real-time performance of the tracking system with a frame rate at 15-30 frames per second. The presented framework is exposed to be functional in practical situations with a demonstration application on a campus walk-around.

Keywords: Augmented reality framework, server-client model, vision-based tracking, image search.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1134
1258 Impovement of a Label Extraction Method for a Risk Search System

Authors: Shigeaki Sakurai, Ryohei Orihara

Abstract:

This paper proposes an improvement method of classification efficiency in a classification model. The model is used in a risk search system and extracts specific labels from articles posted at bulletin board sites. The system can analyze the important discussions composed of the articles. The improvement method introduces ensemble learning methods that use multiple classification models. Also, it introduces expressions related to the specific labels into generation of word vectors. The paper applies the improvement method to articles collected from three bulletin board sites selected by users and verifies the effectiveness of the improvement method.

Keywords: Text mining, Risk search system, Corporate reputation, Bulletin board site, Ensemble learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1319
1257 An Improvement of Multi-Label Image Classification Method Based on Histogram of Oriented Gradient

Authors: Ziad Abdallah, Mohamad Oueidat, Ali El-Zaart

Abstract:

Image Multi-label Classification (IMC) assigns a label or a set of labels to an image. The big demand for image annotation and archiving in the web attracts the researchers to develop many algorithms for this application domain. The existing techniques for IMC have two drawbacks: The description of the elementary characteristics from the image and the correlation between labels are not taken into account. In this paper, we present an algorithm (MIML-HOGLPP), which simultaneously handles these limitations. The algorithm uses the histogram of gradients as feature descriptor. It applies the Label Priority Power-set as multi-label transformation to solve the problem of label correlation. The experiment shows that the results of MIML-HOGLPP are better in terms of some of the evaluation metrics comparing with the two existing techniques.

Keywords: Data mining, information retrieval system, multi-label, problem transformation, histogram of gradients.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1309
1256 A Case-Based Reasoning-Decision Tree Hybrid System for Stock Selection

Authors: Yaojun Wang, Yaoqing Wang

Abstract:

Stock selection is an important decision-making problem. Many machine learning and data mining technologies are employed to build automatic stock-selection system. A profitable stock-selection system should consider the stock’s investment value and the market timing. In this paper, we present a hybrid system including both engage for stock selection. This system uses a case-based reasoning (CBR) model to execute the stock classification, uses a decision-tree model to help with market timing and stock selection. The experiments show that the performance of this hybrid system is better than that of other techniques regarding to the classification accuracy, the average return and the Sharpe ratio.

Keywords: Case-based reasoning, decision tree, stock selection, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1695
1255 An Intelligent Approach of Rough Set in Knowledge Discovery Databases

Authors: Hrudaya Ku. Tripathy, B. K. Tripathy, Pradip K. Das

Abstract:

Knowledge Discovery in Databases (KDD) has evolved into an important and active area of research because of theoretical challenges and practical applications associated with the problem of discovering (or extracting) interesting and previously unknown knowledge from very large real-world databases. Rough Set Theory (RST) is a mathematical formalism for representing uncertainty that can be considered an extension of the classical set theory. It has been used in many different research areas, including those related to inductive machine learning and reduction of knowledge in knowledge-based systems. One important concept related to RST is that of a rough relation. In this paper we presented the current status of research on applying rough set theory to KDD, which will be helpful for handle the characteristics of real-world databases. The main aim is to show how rough set and rough set analysis can be effectively used to extract knowledge from large databases.

Keywords: Data mining, Data tables, Knowledge discovery in database (KDD), Rough sets.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2325
1254 Case-Based Reasoning: A Hybrid Classification Model Improved with an Expert's Knowledge for High-Dimensional Problems

Authors: Bruno Trstenjak, Dzenana Donko

Abstract:

Data mining and classification of objects is the process of data analysis, using various machine learning techniques, which is used today in various fields of research. This paper presents a concept of hybrid classification model improved with the expert knowledge. The hybrid model in its algorithm has integrated several machine learning techniques (Information Gain, K-means, and Case-Based Reasoning) and the expert’s knowledge into one. The knowledge of experts is used to determine the importance of features. The paper presents the model algorithm and the results of the case study in which the emphasis was put on achieving the maximum classification accuracy without reducing the number of features.

Keywords: Case based reasoning, classification, expert's knowledge, hybrid model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1411
1253 The Effect of Treated Waste-Water on Compaction and Compression of Fine Soil

Authors: M. Attom, F. Abed, M. Elemam, M. Nazal, N. ElMessalami

Abstract:

—The main objective of this paper is to study the effect of treated waste-water (TWW) on the compaction and compressibility properties of fine soil. Two types of fine soils (clayey soils) were selected for this study and classified as CH soil and Cl type of soil. Compaction and compressibility properties such as optimum water content, maximum dry unit weight, consolidation index and swell index, maximum past pressure and volume change were evaluated using both tap and treated waste water. It was found that the use of treated waste water affects all of these properties. The maximum dry unit weight increased for both soils and the optimum water content decreased as much as 13.6% for highly plastic soil. The significant effect was observed in swell index and swelling pressure of the soils. The swell indexed decreased by as much as 42% and 33% for highly plastic and low plastic soils, respectively, when TWW is used. Additionally, the swelling pressure decreased by as much as 16% for both soil types. The result of this research pointed out that the use of treated waste water has a positive effect on compaction and compression properties of clay soil and promise for potential use of this water in engineering applications. Keywords—Consolidation, proctor compaction, swell index, treated waste-water, volume change.

Keywords: Consolidation, proctor compaction, swell index, treated waste-water, volume change.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1629
1252 Growing Self Organising Map Based Exploratory Analysis of Text Data

Authors: Sumith Matharage, Damminda Alahakoon

Abstract:

Textual data plays an important role in the modern world. The possibilities of applying data mining techniques to uncover hidden information present in large volumes of text collections is immense. The Growing Self Organizing Map (GSOM) is a highly successful member of the Self Organising Map family and has been used as a clustering and visualisation tool across wide range of disciplines to discover hidden patterns present in the data. A comprehensive analysis of the GSOM’s capabilities as a text clustering and visualisation tool has so far not been published. These functionalities, namely map visualisation capabilities, automatic cluster identification and hierarchical clustering capabilities are presented in this paper and are further demonstrated with experiments on a benchmark text corpus.

Keywords: Text Clustering, Growing Self Organizing Map, Automatic Cluster Identification, Hierarchical Clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1987
1251 A Study on the Nostalgia Contents Analysis of Hometown Alumni in the Online Community

Authors: Heejin Yun, Juanjuan Zang

Abstract:

This study aims to analyze the text terms posted on an online community of people from the same hometown and to understand the topic and trend of nostalgia composed online. For this purpose, this study collected 144 writings which the natives of Yeongjong Island, Incheon, South-Korea have posted on an online community. And it analyzed association relations. As a result, online community texts means that just defining nostalgia as ‘a mind longing for hometown’ is not an enough explanation. Second, texts composed online have abstractness rather than persons’ individual stories. This study figured out the relationship that had the most critical and closest mutual association among the terms that constituted nostalgia through literature research and association rule concerning nostalgia. The result of this study has a characteristic that it summed up the core terms and emotions related to nostalgia.

Keywords: Nostalgia, cultural memory, data mining, online community.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1035
1250 Oncogene Identification using Filter based Approaches between Various Cancer Types in Lung

Authors: Michael Netzer, Michael Seger, Mahesh Visvanathan, Bernhard Pfeifer, Gerald H. Lushington, Christian Baumgartner

Abstract:

Lung cancer accounts for the most cancer related deaths for men as well as for women. The identification of cancer associated genes and the related pathways are essential to provide an important possibility in the prevention of many types of cancer. In this work two filter approaches, namely the information gain and the biomarker identifier (BMI) are used for the identification of different types of small-cell and non-small-cell lung cancer. A new method to determine the BMI thresholds is proposed to prioritize genes (i.e., primary, secondary and tertiary) using a k-means clustering approach. Sets of key genes were identified that can be found in several pathways. It turned out that the modified BMI is well suited for microarray data and therefore BMI is proposed as a powerful tool for the search for new and so far undiscovered genes related to cancer.

Keywords: lung cancer, micro arrays, data mining, feature selection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1746
1249 Fuzzy Clustering Analysis in Real Estate Companies in China

Authors: Jianfeng Li, Feng Jin, Xiaoyu Yang

Abstract:

This paper applies fuzzy clustering algorithm in classifying real estate companies in China according to some general financial indexes, such as income per share, share accumulation fund, net profit margins, weighted net assets yield and shareholders' equity. By constructing and normalizing initial partition matrix, getting fuzzy similar matrix with Minkowski metric and gaining the transitive closure, the dynamic fuzzy clustering analysis for real estate companies is shown clearly that different clustered result change gradually with the threshold reducing, and then, it-s shown there is the similar relationship with the prices of those companies in stock market. In this way, it-s great valuable in contrasting the real estate companies- financial condition in order to grasp some good chances of investment, and so on.

Keywords: Fuzzy clustering algorithm, data mining, real estate company, financial analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1909
1248 Engineering of E-Learning Content Creation: Case Study for African Countries

Authors: María-Dolores Afonso-Suárez, Nayra Pumar-Carreras, Juan Ruiz-Alzola

Abstract:

This research addresses the use of an e-Learning creation methodology for learning objects. Throughout the process, indicators are being gathered, to determine if it responds to the main objectives of an engineering discipline. These parameters will also indicate if it is necessary to review the creation cycle and readjust any phase. Within the project developed for this study, apart from the use of structured methods, there has been a central objective: the establishment of a learning atmosphere. A place where all the professionals involved are able to collaborate, plan, solve problems and determine guides to follow in order to develop creative and innovative solutions. It has been outlined as a blended learning program with an assessment plan that proposes face to face lessons, coaching, collaboration, multimedia and web based learning objects as well as support resources. The project has been drawn as a long term task, the pilot teaching actions designed provide the preliminary results object of study. This methodology is been used in the creation of learning content for the African countries of Senegal, Mauritania and Cape Verde. It has been developed within the framework of the MACbioIDi, an Interreg European project for the International cooperation and development. The educational area of this project is focused in the training and advice of professionals of the medicine as well as engineers in the use of applications of medical imaging technology, specifically the 3DSlicer application and the Open Anatomy Browser.

Keywords: Teaching contents engineering, e-learning, blended learning, international cooperation, 3DSlicer, open anatomy browser.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1030
1247 Natural Gas Dehydration Process Simulation and Optimization: A Case Study of Khurmala Field in Iraqi Kurdistan Region

Authors: R. Abdulrahman, I. Sebastine

Abstract:

Natural gas is the most popular fossil fuel in the current era and future as well. Natural gas is existed in underground reservoirs so it may contain many of non-hydrocarbon components for instance, hydrogen sulfide, nitrogen and water vapor. These impurities are undesirable compounds and cause several technical problems for example, corrosion and environment pollution. Therefore, these impurities should be reduce or removed from natural gas stream. Khurmala dome is located in southwest Erbil-Kurdistan region. The Kurdistan region government has paid great attention for this dome to provide the fuel for Kurdistan region. However, the Khurmala associated natural gas is currently flaring at the field. Moreover, nowadays there is a plan to recover and trade this gas and to use it either as feedstock to power station or to sell it in global market. However, the laboratory analysis has showed that the Khurmala sour gas has huge quantities of H2S about (5.3%) and CO2 about (4.4%). Indeed, Khurmala gas sweetening process has been removed in previous study by using Aspen HYSYS. However, Khurmala sweet gas still contents some quintets of water about 23 ppm in sweet gas stream. This amount of water should be removed or reduced. Indeed, water content in natural gas cause several technical problems such as hydrates and corrosion. Therefore, this study aims to simulate the prospective Khurmala gas dehydration process by using Aspen HYSYS V. 7.3 program. Moreover, the simulation process succeeded in reducing the water content to less than 0.1ppm. In addition, the simulation work is also achieved process optimization by using several desiccant types for example, TEG and DEG and it also study the relationship between absorbents type and its circulation rate with HCs losses from glycol regenerator tower.

Keywords: Aspen Hysys, Process simulation, gas dehydration, process optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 8959