Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 1144

Search results for: Web Usage Mining

754 The Development and Future of Hong Kong Typography

Abstract:

Language usage and typography in Hong Kong are unique, as can be seen clearly on the streets of the city. In contrast to many other parts of the world, where there is only one language, in Hong Kong many signs and billboards display two languages: Chinese and English. The language usage on signage, fonts and types used, and the designs in magazines and advertisements all demonstrate the unique features of Hong Kong typographic design, which reflect the multicultural nature of Hong Kong society. This study is the first step in investigating the nature and development of Hong Kong typography. The preliminary research explored how the historical development of Hong Kong is reflected in its unique typography. Following a review of historical development, a quantitative study was designed: Local Hong Kong participants were invited to provide input on what makes the Hong Kong typographic style unique. Their input was collected and analyzed. This provided us with information about the characteristic criteria and features of Hong Kong typography, as recognized by the local people. The most significant typographic designs in Hong Kong were then investigated and the influence of Chinese and other cultures on Hong Kong typography was assessed. The research results provide an indication to local designers on how they can strengthen local design outcomes and promote the values and culture of their mother town.

Keywords: Typography, Hong Kong, historical developments, multiple cultures.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1530

753 The Effects of Sewage Sludge Usage and Manure on Some Heavy Metals Uptake in Savory (Satureja hortensis L.)

Authors: A. Hani

Abstract:

In recent decades with the development of technology and lack of food sources, sewage sludge in production of human foods is inevitable. Various sources of municipal and industrial sewage sludge that is produced can provide the requirement of plant nutrients. Soils in arid, semi-arid climate of central Iran that most affected by water drainage, iron and zinc deficiencies, using of sewage sludge is helpful. Therefore, the aim of this study is investigation of sewage sludge and manure application on Ni, Pb and Cd uptake by Savory. An experiment in a randomized complete block design with three replications was performed. Sewage sludge treatments consisted of four levels, control, 15, 30, 80 tons per hectares; the manure was used in four levels of control, 20, 40 and 80 tons per hectare. Results showed that the wet and dry weights was not affected by sewage sludge using, while, manure has significant effect on them. The effect of sewage sludge on the cadmium and lead concentrations were significant. Interactions of sewage sludge and manure on dry weight values were not significant. Compare mean analysis showed that increasing the amount of sewage sludge had no significant effect on cadmium concentration and it reduced when sewage sludge usage increased. This is probably due to increased plant growth and reduced concentrations of these elements in the plant.

Keywords: Savory, lead, cadmium, sewage sludge, manure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1665

752 Cluster Algorithm for Genetic Diversity

Authors: Manpreet Singh, Keerat Kaur, Bhavdeep Singh

Abstract:

With the hardware technology advancing, the cost of storing is decreasing. Thus there is an urgent need for new techniques and tools that can intelligently and automatically assist us in transferring this data into useful knowledge. Different techniques of data mining are developed which are helpful for handling these large size databases [7]. Data mining is also finding its role in the field of biotechnology. Pedigree means the associated ancestry of a crop variety. Genetic diversity is the variation in the genetic composition of individuals within or among species. Genetic diversity depends upon the pedigree information of the varieties. Parents at lower hierarchic levels have more weightage for predicting genetic diversity as compared to the upper hierarchic levels. The weightage decreases as the level increases. For crossbreeding, the two varieties should be more and more genetically diverse so as to incorporate the useful characters of the two varieties in the newly developed variety. This paper discusses the searching and analyzing of different possible pairs of varieties selected on the basis of morphological characters, Climatic conditions and Nutrients so as to obtain the most optimal pair that can produce the required crossbreed variety. An algorithm was developed to determine the genetic diversity between the selected wheat varieties. Cluster analysis technique is used for retrieving the results.

Keywords: Genetic diversity, pedigree, nutrients.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1756

751 Increasing the Capacity of Plant Bottlenecks by Using of Improving the Ratio of Mean Time between Failures to Mean Time to Repair

Authors: Jalal Soleimannejad, Mohammad Asadizeidabadi, Mahmoud Koorki, Mojtaba Azarpira

Abstract:

A significant percentage of production costs is the maintenance costs, and analysis of maintenance costs could to achieve greater productivity and competitiveness. With this is mind, the maintenance of machines and installations is considered as an essential part of organizational functions and applying effective strategies causes significant added value in manufacturing activities. Organizations are trying to achieve performance levels on a global scale with emphasis on creating competitive advantage by different methods consist of RCM (Reliability-Center-Maintenance), TPM (Total Productivity Maintenance) etc. In this study, increasing the capacity of Concentration Plant of Golgohar Iron Ore Mining & Industrial Company (GEG) was examined by using of reliability and maintainability analyses. The results of this research showed that instead of increasing the number of machines (in order to solve the bottleneck problems), the improving of reliability and maintainability would solve bottleneck problems in the best way. It should be mention that in the abovementioned study, the data set of Concentration Plant of GEG as a case study, was applied and analyzed.

Keywords: Bottleneck, Golgohar Iron Ore Mining and Industrial Company, maintainability, maintenance costs, reliability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 920

750 A Review on the Usage of Ceramic Wastes in Concrete Production

Authors: O. Zimbili, W. Salim, M. Ndambuki

Abstract:

Construction and Demolition (C&D) wastes contribute the highest percentage of wastes worldwide (75%). Furthermore, ceramic materials contribute the highest percentage of wastes within the C&D wastes (54%). The current option for disposal of ceramic wastes is landfill. This is due to unavailability of standards, avoidance of risk, lack of knowledge and experience in using ceramic wastes in construction. The ability of ceramic wastes to act as a pozzolanic material in the production of cement has been effectively explored. The results proved that temperatures used in the manufacturing of these tiles (about 900⁰C) are sufficient to activate pozzolanic properties of clay. They also showed that, after optimization (11-14% substitution); the cement blend performs better, with no morphological difference between the cement blended with ceramic waste, and that blended with other pozzolanic materials. Sanitary ware and electrical insulator porcelain wastes are some wastes investigated for usage as aggregates in concrete production. When optimized, both produced good results, better than when natural aggregates are used. However, the research on ceramic wastes as partial substitute for fine aggregates or cement has not been overly exploited as the other areas. This review has been concluded with focus on investigating whether ceramic wall tile wastes used as partial substitute for cement and fine aggregates could prove to be beneficial since the two materials are the most high-priced during concrete production.

Keywords: Blended, morphological, pozzolanic properties, waste.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 8720

749 Identification of Most Frequently Occurring Lexis in Winnings-announcing Unsolicited Bulke-mails

Authors: Jatinderkumar R. Saini, Apurva A. Desai

Abstract:

e-mail has become an important means of electronic communication but the viability of its usage is marred by Unsolicited Bulk e-mail (UBE) messages. UBE consists of many types like pornographic, virus infected and 'cry-for-help' messages as well as fake and fraudulent offers for jobs, winnings and medicines. UBE poses technical and socio-economic challenges to usage of e-mails. To meet this challenge and combat this menace, we need to understand UBE. Towards this end, the current paper presents a content-based textual analysis of nearly 3000 winnings-announcing UBE. Technically, this is an application of Text Parsing and Tokenization for an un-structured textual document and we approach it using Bag Of Words (BOW) and Vector Space Document Model techniques. We have attempted to identify the most frequently occurring lexis in the winnings-announcing UBE documents. The analysis of such top 100 lexis is also presented. We exhibit the relationship between occurrence of a word from the identified lexisset in the given UBE and the probability that the given UBE will be the one announcing fake winnings. To the best of our knowledge and survey of related literature, this is the first formal attempt for identification of most frequently occurring lexis in winningsannouncing UBE by its textual analysis. Finally, this is a sincere attempt to bring about alertness against and mitigate the threat of such luring but fake UBE.

Keywords: Lexis, Unsolicited Bulk e-mail (UBE), Vector SpaceDocument Model, Winnings, Lottery

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1487

748 An Improved K-Means Algorithm for Gene Expression Data Clustering

Authors: Billel Kenidra, Mohamed Benmohammed

Abstract:

Data mining technique used in the field of clustering is a subject of active research and assists in biological pattern recognition and extraction of new knowledge from raw data. Clustering means the act of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar between themselves and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers and a poor choice of centers may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy to initiate K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results prove how the efficiency has been significantly improved.

Keywords: Microarray data mining, biological pattern recognition, partitional clustering, k-means algorithm, centroid initialization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1223

747 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can either contain categorical or numeric data. Each type of data has its own specific clustering algorithm. In this context, two algorithms are proposed: the k-means for clustering numeric datasets and the k-modes for categorical datasets. The main encountered problem in data mining applications is clustering categorical dataset so relevant in the datasets. One main issue to achieve the clustering process on categorical values is to transform the categorical attributes into numeric measures and directly apply the k-means algorithm instead the k-modes. In this paper, it is proposed to experiment an approach based on the previous issue by transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previously method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are experimented. The obtained results show that our proposed method outperforms the binary method in all cases.

Keywords: Clustering, k-means, categorical datasets, pattern recognition, unsupervised learning, knowledge discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3484

746 Exploring the Correlation between Population Distribution and Urban Heat Island under Urban Data: Taking Shenzhen Urban Heat Island as an Example

Authors: Wang Yang

Abstract:

Shenzhen is a modern city of China's reform and opening-up policy, the development of urban morphology has been established on the administration of the Chinese government. This city`s planning paradigm is primarily affected by the spatial structure and human behavior. The subjective urban agglomeration center is divided into several groups and centers. In comparisons of this effect, the city development law has better to be neglected. With the continuous development of the internet, extensive data technology has been introduced in China. Data mining and data analysis has become important tools in municipal research. Data mining has been utilized to improve data cleaning such as receiving business data, traffic data and population data. Prior to data mining, government data were collected by traditional means, then were analyzed using city-relationship research, delaying the timeliness of urban development, especially for the contemporary city. Data update speed is very fast and based on the Internet. The city's point of interest (POI) in the excavation serves as data source affecting the city design, while satellite remote sensing is used as a reference object, city analysis is conducted in both directions, the administrative paradigm of government is broken and urban research is restored. Therefore, the use of data mining in urban analysis is very important. The satellite remote sensing data of the Shenzhen city in July 2018 were measured by the satellite Modis sensor and can be utilized to perform land surface temperature inversion, and analyze city heat island distribution of Shenzhen. This article acquired and classified the data from Shenzhen by using Data crawler technology. Data of Shenzhen heat island and interest points were simulated and analyzed in the GIS platform to discover the main features of functional equivalent distribution influence. Shenzhen is located in the east-west area of China. The city’s main streets are also determined according to the direction of city development. Therefore, it is determined that the functional area of the city is also distributed in the east-west direction. The urban heat island can express the heat map according to the functional urban area. Regional POI has correspondence. The research result clearly explains that the distribution of the urban heat island and the distribution of urban POIs are one-to-one correspondence. Urban heat island is primarily influenced by the properties of the underlying surface, avoiding the impact of urban climate. Using urban POIs as analysis object, the distribution of municipal POIs and population aggregation are closely connected, so that the distribution of the population corresponded with the distribution of the urban heat island.

Keywords: POI, satellite remote sensing, the population distribution, urban heat island thermal map.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 853

745 Study on the Optimization of Completely Batch Water-using Network with Multiple Contaminants Considering Flow Change

Authors: Jian Du, Shui Hong Hong, Lu Meng, Qing Wei Meng

Abstract:

This work addresses the problem of optimizing completely batch water-using network with multiple contaminants where the flow change caused by mass transfer is taken into consideration for the first time. A mathematical technique for optimizing water-using network is proposed based on source-tank-sink superstructure. The task is to obtain the freshwater usage, recycle assignments among water-using units, wastewater discharge and a steady water-using network configuration by following steps. Firstly, operating sequences of water-using units are determined by time constraints. Next, superstructure is simplified by eliminating the reuse and recycle from water-using units with maximum concentration of key contaminants. Then, the non-linear programming model is solved by GAMS (General Algebra Model System) for minimum freshwater usage, maximum water recycle and minimum wastewater discharge. Finally, numbers of operating periods are calculated to acquire the steady network configuration. A case study is solved to illustrate the applicability of the proposed approach.

Keywords: Completely batch process, flow change, multiple contaminants, water-using network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1393

744 A Preference-Based Multi-Agent Data Mining Framework for Social Network Service Users' Decision Making

Authors: Ileladewa Adeoye Abiodun, Cheng Wai Khuen

Abstract:

Multi-Agent Systems (MAS) emerged in the pursuit to improve our standard of living, and hence can manifest complex human behaviors such as communication, decision making, negotiation and self-organization. The Social Network Services (SNSs) have attracted millions of users, many of whom have integrated these sites into their daily practices. The domains of MAS and SNS have lots of similarities such as architecture, features and functions. Exploring social network users- behavior through multiagent model is therefore our research focus, in order to generate more accurate and meaningful information to SNS users. An application of MAS is the e-Auction and e-Rental services of the Universiti Cyber AgenT(UniCAT), a Social Network for students in Universiti Tunku Abdul Rahman (UTAR), Kampar, Malaysia, built around the Belief- Desire-Intention (BDI) model. However, in spite of the various advantages of the BDI model, it has also been discovered to have some shortcomings. This paper therefore proposes a multi-agent framework utilizing a modified BDI model- Belief-Desire-Intention in Dynamic and Uncertain Situations (BDIDUS), using UniCAT system as a case study.

Keywords: Distributed Data Mining, Multi-Agent Systems, Preference-Based, SNS.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1447

743 EGCL: An Extended G-Code Language with Flow Control, Functions and Mnemonic Variables

Authors: Oscar E. Ruiz, S. Arroyave, J. F. Cardona

Abstract:

In the context of computer numerical control (CNC) and computer aided manufacturing (CAM), the capabilities of programming languages such as symbolic and intuitive programming, program portability and geometrical portfolio have special importance. They allow to save time and to avoid errors during part programming and permit code re-usage. Our updated literature review indicates that the current state of art presents voids in parametric programming, program portability and programming flexibility. In response to this situation, this article presents a compiler implementation for EGCL (Extended G-code Language), a new, enriched CNC programming language which allows the use of descriptive variable names, geometrical functions and flow-control statements (if-then-else, while). Our compiler produces low-level generic, elementary ISO-compliant Gcode, thus allowing for flexibility in the choice of the executing CNC machine and in portability. Our results show that readable variable names and flow control statements allow a simplified and intuitive part programming and permit re-usage of the programs. Future work includes allowing the programmer to define own functions in terms of EGCL, in contrast to the current status of having them as library built-in functions.

Keywords: CNC Programming, Compiler, G-code Language, Numerically Controlled Machine-Tools.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2571

742 Landscape Assessment of the Dam and Motorway Networks that Provide Visual and Recreational Opportunities: Case Study of Artvin, Turkey

Authors: Banu Karaşah, Derya Sarı

Abstract:

Nature constantly changes as a result of human necessities. This change mostly feels in natural water sources which are reconstructed with an effect of dams and motorways. In other respects, visual quality of the landscape gets a new and different character during and after the construction of dams and motorways. Changing and specialization new landscapes will be very important to protection-usage balance to explore sustainable usage facilities. The main cause of the selection of Artvin city is that it has very important geographical location and one of the most attraction points in the World with its biodiversity, conservation areas and natural landscape characteristics. Many hydroelectric station and 7 dams are situated, 3 of them have already been built on the Çoruh River in the province of Artvin. As a result of dams, motorways route were reshaped and the ways which have already changed because of elevation is directly affected several of natural destruction. In contrast, many different reservoirs in Coruh Basin provide new vista point that has high visual quality. In this study, we would like to evaluate with sustainable landscape design in 76 km river corridor, which is mainly based on Deriner, Borçka and Muratlı Dams and determination of their basin-lakes recreational potential and opportunities. Lastly, we are going to give some suggestion about the potential of the corridor.

Keywords: Artvin, dam reservoirs, landscape assessment, river corridor, visual quality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1916

741 A Review: Comparative Analysis of Different Categorical Data Clustering Ensemble Methods

Authors: S. Sarumathi, N. Shanthi, M. Sharmila

Abstract:

Over the past epoch a rampant amount of work has been done in the data clustering research under the unsupervised learning technique in Data mining. Furthermore several algorithms and methods have been proposed focusing on clustering different data types, representation of cluster models, and accuracy rates of the clusters. However no single clustering algorithm proves to be the most efficient in providing best results. Accordingly in order to find the solution to this issue a new technique, called Cluster ensemble method was bloomed. This cluster ensemble is a good alternative approach for facing the cluster analysis problem. The main hope of the cluster ensemble is to merge different clustering solutions in such a way to achieve accuracy and to improve the quality of individual data clustering. Due to the substantial and unremitting development of new methods in the sphere of data mining and also the incessant interest in inventing new algorithms, makes obligatory to scrutinize a critical analysis of the existing techniques and the future novelty. This paper exposes the comparative study of different cluster ensemble methods along with their features, systematic working process and the average accuracy and error rates of each ensemble methods. Consequently this speculative and comprehensive analysis will be very useful for the community of clustering practitioners and also helps in deciding the most suitable one to rectify the problem in hand.

Keywords: Clustering, Cluster Ensemble methods, Co-association matrix, Consensus function, Median partition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2553

740 Validation and Selection between Machine Learning Technique and Traditional Methods to Reduce Bullwhip Effects: a Data Mining Approach

Authors: Hamid R. S. Mojaveri, Seyed S. Mousavi, Mojtaba Heydar, Ahmad Aminian

Abstract:

The aim of this paper is to present a methodology in three steps to forecast supply chain demand. In first step, various data mining techniques are applied in order to prepare data for entering into forecasting models. In second step, the modeling step, an artificial neural network and support vector machine is presented after defining Mean Absolute Percentage Error index for measuring error. The structure of artificial neural network is selected based on previous researchers' results and in this article the accuracy of network is increased by using sensitivity analysis. The best forecast for classical forecasting methods (Moving Average, Exponential Smoothing, and Exponential Smoothing with Trend) is resulted based on prepared data and this forecast is compared with result of support vector machine and proposed artificial neural network. The results show that artificial neural network can forecast more precisely in comparison with other methods. Finally, forecasting methods' stability is analyzed by using raw data and even the effectiveness of clustering analysis is measured.

Keywords: Artificial Neural Networks (ANN), bullwhip effect, demand forecasting, Support Vector Machine (SVM).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1960

739 Development of Total Maximum Daily Load Using Water Quality Modelling as an Approach for Watershed Management in Malaysia

Authors: S. A. Che Osmi, W. M. F. Wan Ishak, H. Kim, M. A. Azman, M. A. Ramli

Abstract:

River is one of important water sources for many activities including industrial and domestic usage such as daily usage, transportation, power supply and recreational activities. However, increasing activities in a river has grown the sources of pollutant enters the water bodies, and degraded the water quality of the river. It becomes a challenge to develop an effective river management to ensure the water sources of the river are well managed and regulated. In Malaysia, several approaches for river management have been implemented such as Integrated River Basin Management (IRBM) program for coordinating the management of resources in a natural environment based on river basin to ensure their sustainability lead by Department of Drainage and Irrigation (DID), Malaysia. Nowadays, Total Maximum Daily Load (TMDL) is one of the best approaches for river management in Malaysia. TMDL implementation is regulated and implemented in the United States. A study on the development of TMDL in Malacca River has been carried out by doing water quality monitoring, the development of water quality model by using Environmental Fluid Dynamic Codes (EFDC), and TMDL implementation plan. The implementation of TMDL will help the stakeholders and regulators to control and improve the water quality of the river. It is one of the good approaches for river management in Malaysia.

Keywords: EFDC, river management, TMDL, water quality modelling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1493

738 A Proposed Optimized and Efficient Intrusion Detection System for Wireless Sensor Network

Authors: Abdulaziz Alsadhan, Naveed Khan

Abstract:

In recent years intrusions on computer network are the major security threat. Hence, it is important to impede such intrusions. The hindrance of such intrusions entirely relies on its detection, which is primary concern of any security tool like Intrusion detection system (IDS). Therefore, it is imperative to accurately detect network attack. Numerous intrusion detection techniques are available but the main issue is their performance. The performance of IDS can be improved by increasing the accurate detection rate and reducing false positive. The existing intrusion detection techniques have the limitation of usage of raw dataset for classification. The classifier may get jumble due to redundancy, which results incorrect classification. To minimize this problem, Principle component analysis (PCA), Linear Discriminant Analysis (LDA) and Local Binary Pattern (LBP) can be applied to transform raw features into principle features space and select the features based on their sensitivity. Eigen values can be used to determine the sensitivity. To further classify, the selected features greedy search, back elimination, and Particle Swarm Optimization (PSO) can be used to obtain a subset of features with optimal sensitivity and highest discriminatory power. This optimal feature subset is used to perform classification. For classification purpose, Support Vector Machine (SVM) and Multilayer Perceptron (MLP) are used due to its proven ability in classification. The Knowledge Discovery and Data mining (KDD’99) cup dataset was considered as a benchmark for evaluating security detection mechanisms. The proposed approach can provide an optimal intrusion detection mechanism that outperforms the existing approaches and has the capability to minimize the number of features and maximize the detection rates.

Keywords: Particle Swarm Optimization (PSO), Principle component analysis (PCA), Linear Discriminant Analysis (LDA), Local Binary Pattern (LBP), Support Vector Machine (SVM), Multilayer Perceptron (MLP).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2703

737 Optimization of Proton Exchange Membrane Fuel Cell Parameters Based on Modified Particle Swarm Algorithms

Authors: M. Dezvarei, S. Morovati

Abstract:

In recent years, increasing usage of electrical energy provides a widespread field for investigating new methods to produce clean electricity with high reliability and cost management. Fuel cells are new clean generations to make electricity and thermal energy together with high performance and no environmental pollution. According to the expansion of fuel cell usage in different industrial networks, the identification and optimization of its parameters is really significant. This paper presents optimization of a proton exchange membrane fuel cell (PEMFC) parameters based on modified particle swarm optimization with real valued mutation (RVM) and clonal algorithms. Mathematical equations of this type of fuel cell are presented as the main model structure in the optimization process. Optimized parameters based on clonal and RVM algorithms are compared with the desired values in the presence and absence of measurement noise. This paper shows that these methods can improve the performance of traditional optimization methods. Simulation results are employed to analyze and compare the performance of these methodologies in order to optimize the proton exchange membrane fuel cell parameters.

Keywords: Clonal algorithm, proton exchange membrane fuel cell, particle swarm optimization, real valued mutation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1135

736 Learning to Order Terms: Supervised Interestingness Measures in Terminology Extraction

Authors: Jérôme Azé, Mathieu Roche, Yves Kodratoff, Michèle Sebag

Abstract:

Term Extraction, a key data preparation step in Text Mining, extracts the terms, i.e. relevant collocation of words, attached to specific concepts (e.g. genetic-algorithms and decisiontrees are terms associated to the concept “Machine Learning" ). In this paper, the task of extracting interesting collocations is achieved through a supervised learning algorithm, exploiting a few collocations manually labelled as interesting/not interesting. From these examples, the ROGER algorithm learns a numerical function, inducing some ranking on the collocations. This ranking is optimized using genetic algorithms, maximizing the trade-off between the false positive and true positive rates (Area Under the ROC curve). This approach uses a particular representation for the word collocations, namely the vector of values corresponding to the standard statistical interestingness measures attached to this collocation. As this representation is general (over corpora and natural languages), generality tests were performed by experimenting the ranking function learned from an English corpus in Biology, onto a French corpus of Curriculum Vitae, and vice versa, showing a good robustness of the approaches compared to the state-of-the-art Support Vector Machine (SVM).

Keywords: Text-mining, Terminology Extraction, Evolutionary algorithm, ROC Curve.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1614

735 Application of a Similarity Measure for Graphs to Web-based Document Structures

Authors: Matthias Dehmer, Frank Emmert Streib, Alexander Mehler, Jürgen Kilian, Max Mühlhauser

Abstract:

Due to the tremendous amount of information provided by the World Wide Web (WWW) developing methods for mining the structure of web-based documents is of considerable interest. In this paper we present a similarity measure for graphs representing web-based hypertext structures. Our similarity measure is mainly based on a novel representation of a graph as linear integer strings, whose components represent structural properties of the graph. The similarity of two graphs is then defined as the optimal alignment of the underlying property strings. In this paper we apply the well known technique of sequence alignments for solving a novel and challenging problem: Measuring the structural similarity of generalized trees. In other words: We first transform our graphs considered as high dimensional objects in linear structures. Then we derive similarity values from the alignments of the property strings in order to measure the structural similarity of generalized trees. Hence, we transform a graph similarity problem to a string similarity problem for developing a efficient graph similarity measure. We demonstrate that our similarity measure captures important structural information by applying it to two different test sets consisting of graphs representing web-based document structures.

Keywords: Graph similarity, hierarchical and directed graphs, hypertext, generalized trees, web structure mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1842

734 Identification of Most Frequently Occurring Lexis in Body-enhancement Medicinal Unsolicited Bulk e-mails

Authors: Jatinderkumar R. Saini, Apurva A. Desai

Abstract:

e-mail has become an important means of electronic communication but the viability of its usage is marred by Unsolicited Bulk e-mail (UBE) messages. UBE consists of many types like pornographic, virus infected and 'cry-for-help' messages as well as fake and fraudulent offers for jobs, winnings and medicines. UBE poses technical and socio-economic challenges to usage of e-mails. To meet this challenge and combat this menace, we need to understand UBE. Towards this end, the current paper presents a content-based textual analysis of more than 2700 body enhancement medicinal UBE. Technically, this is an application of Text Parsing and Tokenization for an un-structured textual document and we approach it using Bag Of Words (BOW) and Vector Space Document Model techniques. We have attempted to identify the most frequently occurring lexis in the UBE documents that advertise various products for body enhancement. The analysis of such top 100 lexis is also presented. We exhibit the relationship between occurrence of a word from the identified lexis-set in the given UBE and the probability that the given UBE will be the one advertising for fake medicinal product. To the best of our knowledge and survey of related literature, this is the first formal attempt for identification of most frequently occurring lexis in such UBE by its textual analysis. Finally, this is a sincere attempt to bring about alertness against and mitigate the threat of such luring but fake UBE.

Keywords: Body Enhancement, Lexis, Medicinal, Unsolicited Bulk e-mail (UBE), Vector Space Document Model, Viagra

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3451

733 Data Mining to Capture User-Experience: A Case Study in Notebook Product Appearance Design

Authors: Rhoann Kerh, Chen-Fu Chien, Kuo-Yi Lin

Abstract:

In the era of rapidly increasing notebook market, consumer electronics manufacturers are facing a highly dynamic and competitive environment. In particular, the product appearance is the first part for user to distinguish the product from the product of other brands. Notebook product should differ in its appearance to engage users and contribute to the user experience (UX). The UX evaluates various product concepts to find the design for user needs; in addition, help the designer to further understand the product appearance preference of different market segment. However, few studies have been done for exploring the relationship between consumer background and the reaction of product appearance. This study aims to propose a data mining framework to capture the user’s information and the important relation between product appearance factors. The proposed framework consists of problem definition and structuring, data preparation, rules generation, and results evaluation and interpretation. An empirical study has been done in Taiwan that recruited 168 subjects from different background to experience the appearance performance of 11 different portable computers. The results assist the designers to develop product strategies based on the characteristics of consumers and the product concept that related to the UX, which help to launch the products to the right customers and increase the market shares. The results have shown the practical feasibility of the proposed framework.

Keywords: Consumers Decision Making, Product Design, Rough Set Theory, User Experience.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3427

732 Exploring Social Impact of Emerging Technologies from Futuristic Data

Authors: Heeyeul Kwon, Yongtae Park

Abstract:

Despite the highly touted benefits, emerging technologies have unleashed pervasive concerns regarding unintended and unforeseen social impacts. Thus, those wishing to create safe and socially acceptable products need to identify such side effects and mitigate them prior to the market proliferation. Various methodologies in the field of technology assessment (TA), namely Delphi, impact assessment, and scenario planning, have been widely incorporated in such a circumstance. However, literatures face a major limitation in terms of sole reliance on participatory workshop activities. They unfortunately missed out the availability of a massive untapped data source of futuristic information flooding through the Internet. This research thus seeks to gain insights into utilization of futuristic data, future-oriented documents from the Internet, as a supplementary method to generate social impact scenarios whilst capturing perspectives of experts from a wide variety of disciplines. To this end, network analysis is conducted based on the social keywords extracted from the futuristic documents by text mining, which is then used as a guide to produce a comprehensive set of detailed scenarios. Our proposed approach facilitates harmonized depictions of possible hazardous consequences of emerging technologies and thereby makes decision makers more aware of, and responsive to, broad qualitative uncertainties.

Keywords: Emerging technologies, futuristic data, scenario, text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2341

731 Weighted Data Replication Strategy for Data Grid Considering Economic Approach

Authors: N. Mansouri, A. Asadi

Abstract:

Data Grid is a geographically distributed environment that deals with data intensive application in scientific and enterprise computing. Data replication is a common method used to achieve efficient and fault-tolerant data access in Grids. In this paper, a dynamic data replication strategy, called Enhanced Latest Access Largest Weight (ELALW) is proposed. This strategy is an enhanced version of Latest Access Largest Weight strategy. However, replication should be used wisely because the storage capacity of each Grid site is limited. Thus, it is important to design an effective strategy for the replication replacement task. ELALW replaces replicas based on the number of requests in future, the size of the replica, and the number of copies of the file. It also improves access latency by selecting the best replica when various sites hold replicas. The proposed replica selection selects the best replica location from among the many replicas based on response time that can be determined by considering the data transfer time, the storage access latency, the replica requests that waiting in the storage queue and the distance between nodes. Simulation results utilizing the OptorSim show our replication strategy achieve better performance overall than other strategies in terms of job execution time, effective network usage and storage resource usage.

Keywords: Data grid, data replication, simulation, replica selection, replica placement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2057

730 Health Hazards Related to Computer Use: Experience of the National Institute for Medical Research in Tanzania

Authors: V. P. Mvungi, J. Mcharo, M. E. Mmbuji, L. E. Mgonja, A. Y. Kitua

Abstract:

This paper is based on a study conducted in 2006 to assess the impact of computer usage on health of National Institute for Medical Research (NIMR) staff. NIMR being a research Institute, most of its staff spend substantial part of their working time on computers. There was notion among NIMR staff on possible prolonged computer usage health hazards. Hence, a study was conducted to establish facts and possible mitigation measures. A total of 144 NIMR staff were involved in the study of whom 63.2% were males and 36.8% females aged between 20 and 59 years. All staff cadres were included in the sample. The functions performed by Institute staff using computers includes; data management, proposal development and report writing, research activities, secretarial duties, accounting and administrative duties, on-line information retrieval and online communication through e-mail services. The interviewed staff had been using computers for 1-8 hours a day and for a period ranging from 1 to 20 years. The study has indicated ergonomic hazards for a significant proportion of interviewees (63%) of various kinds ranging from backache to eyesight related problems. The authors highlighted major issues which are substantially applicable in preventing occurrences of computer related problems and they urged NIMR Management and/or the government of Tanzania opts to adapt their practicability.

Keywords: Computers ergonomic hazards, computer usagehealth hazards.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2796

729 The Investigation of Enzymatic Activity in the Soils under the Impact of Metallurgical Industrial Activity in Lori Marz, Armenia

Authors: T. H. Derdzyan, K. A. Ghazaryan, G. A. Gevorgyan

Abstract:

Beta-glucosidase, chitinase, leucine-aminopeptidase, acid phosphomonoesterase and acetate-esterase enzyme activities in the soils under the impact of metallurgical industrial activity in Lori marz (district) were investigated. The results of the study showed that the activities of the investigated enzymes in the soils decreased with increasing distance from the Shamlugh copper mine, the Chochkan tailings storage facility and the ore transportation road. Statistical analysis revealed that the activities of the enzymes were positively correlated (significant) to each other according to the observation sites which indicated that enzyme activities were affected by the same anthropogenic factor. The investigations showed that the soils were polluted with heavy metals (Cu, Pb, As, Co, Ni, Zn) due to copper mining activity in this territory. The results of Pearson correlation analysis revealed a significant negative correlation between heavy metal pollution degree (Nemerow integrated pollution index) and soil enzyme activity. All of this indicated that copper mining activity in this territory causing the heavy metal pollution of the soils resulted in the inhabitation of the activities of the enzymes which are considered as biological catalysts to decompose organic materials and facilitate the cycling of nutrients.

Keywords: Armenia, metallurgical industrial activity, heavy metal pollutionl, soil enzyme activity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2526

728 Latent Semantic Inference for Agriculture FAQ Retrieval

Authors: Dawei Wang, Rujing Wang, Ying Li, Baozi Wei

Abstract:

FAQ system can make user find answer to the problem that puzzles them. But now the research on Chinese FAQ system is still on the theoretical stage. This paper presents an approach to semantic inference for FAQ mining. To enhance the efficiency, a small pool of the candidate question-answering pairs retrieved from the system for the follow-up work according to the concept of the agriculture domain extracted from user input .Input queries or questions are converted into four parts, the question word segment (QWS), the verb segment (VS), the concept of agricultural areas segment (CS), the auxiliary segment (AS). A semantic matching method is presented to estimate the similarity between the semantic segments of the query and the questions in the pool of the candidate. A thesaurus constructed from the HowNet, a Chinese knowledge base, is adopted for word similarity measure in the matcher. The questions are classified into eleven intension categories using predefined question stemming keywords. For FAQ mining, given a query, the question part and answer part in an FAQ question-answer pair is matched with the input query, respectively. Finally, the probabilities estimated from these two parts are integrated and used to choose the most likely answer for the input query. These approaches are experimented on an agriculture FAQ system. Experimental results indicate that the proposed approach outperformed the FAQ-Finder system in agriculture FAQ retrieval.

Keywords: FAQ, Semantic Inference, Ontology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1336

727 Service Quality and Consumer Behavior on Metered Taxi Services

Authors: Nattapong Techarattanased

Abstract:

The purposes of this research are to make comparisons in respect of the behaviors on the use of the services of metered taxi classified by the demographic factor and to study the influence of the recognition on service quality having the effect on usage behaviors of metered taxi services of consumers in Bangkok Metropolitan Areas. The samples used in this research were 400 metered taxi service users in Bangkok Metropolitan Areas and questionnaire was used as the tool for collecting the data. Analysis statistics are mean and multiple regression analysis. Results of the research revealed that the consumers recognize the overall quality of services in each aspect include tangible aspects of the service, responses to customers, assurance on the confidence, understanding and knowing of customers which is rated at the moderate level except the aspect of the assurance on the confidence and trustworthiness which are rated at a high level. For the result of hypothetical test, it is found that the quality in providing the services on the aspect of the assurance given to the customers has the effect on the usage behaviors of metered taxi services and the aspect of the frequency on the use of the services per month which in this connection. Such variable can forecast at one point nine percent (1.9%). In addition, quality in providing the services and the aspect of the responses to customers have the effect on the behaviors on the use of metered taxi services on the aspect of the expenses on the use of services per month which in this connection, such variable can forecast at two point one percent (2.1%).

Keywords: Consumer behavior, metered taxi, satisfaction, service quality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3493

726 Computable Difference Matrix for Synonyms in the Holy Quran

Authors: Mohamed Ali AlShaari, Khalid M. ElFitori

Abstract:

In the field of Quran Studies known as GHAREEB AL QURAN (The study of the meanings of strange words and structures in Holy Quran), it is difficult to distinguish some pragmatic meanings from conceptual meanings. One who wants to study this subject may need to look for a common usage between any two words or more; to understand general meaning, and sometimes may need to look for common differences between them, even if there are synonyms (word sisters).

Some of the distinguished scholars of Arabic linguistics believe that there are no synonym words, they believe in varieties of meaning and multi-context usage. Based on this viewpoint, our method was designedto look for synonyms of a word, then the differences that distinct the word and their synonyms.

There are many available books that use such a method e.g. synonyms books, dictionaries, glossaries, and some books on the interpretations of strange vocabulary of the Holy Quran, but it is difficult to look up words in these written works.

For that reason, we proposed a logical entity, which we called Differences Matrix (DM).

DM groups the synonyms words to extract the relations between them and to know the general meaning, which defines the skeleton of all word synonyms; this meaning is expressed by a word of its sisters.

In Differences Matrix, we used the sisters(words) as titles for rows and columns, and in the obtained cells we tried to define the row title (word) by using column title (her sister), so the relations between sisters appear, the expected result is well defined groups of sisters for each word. We represented the obtained results formally, and used the defined groups as a base for building the ontology of the Holy Quran synonyms.

Keywords: Quran, synonyms, Differences Matrix, ontology

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2066

725 Using Genetic Algorithms to Outline Crop Rotations and a Cropping-System Model

Authors: Nicolae Bold, Daniel Nijloveanu

Abstract:

The idea of cropping-system is a method used by farmers. It is an environmentally-friendly method, protecting the natural resources (soil, water, air, nutritive substances) and increase the production at the same time, taking into account some crop particularities. The combination of this powerful method with the concepts of genetic algorithms results into a possibility of generating sequences of crops in order to form a rotation. The usage of this type of algorithms has been efficient in solving problems related to optimization and their polynomial complexity allows them to be used at solving more difficult and various problems. In our case, the optimization consists in finding the most profitable rotation of cultures. One of the expected results is to optimize the usage of the resources, in order to minimize the costs and maximize the profit. In order to achieve these goals, a genetic algorithm was designed. This algorithm ensures the finding of several optimized solutions of cropping-systems possibilities which have the highest profit and, thus, which minimize the costs. The algorithm uses genetic-based methods (mutation, crossover) and structures (genes, chromosomes). A cropping-system possibility will be considered a chromosome and a crop within the rotation is a gene within a chromosome. Results about the efficiency of this method will be presented in a special section. The implementation of this method would bring benefits into the activity of the farmers by giving them hints and helping them to use the resources efficiently.

Keywords: Genetic algorithm, chromosomes, genes, cropping, agriculture.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1568