Search results for: sequence mining
188 Spread Spectrum Code Estimation by Particle Swarm Algorithm
Authors: Vahid R. Asghari, Mehrdad Ardebilipour
Abstract:
In the context of spectrum surveillance, a new method to recover the code of a spread spectrum signal is presented for the case where the receiver has no knowledge of the transmitter's spreading sequence. In our previous paper, we used a genetic algorithm (GA) to recover the spreading code. Although GAs are well known for their robustness in solving complex optimization problems, increasing the length of the code often leads to unacceptably slow convergence. To solve this problem, we introduce Particle Swarm Optimization (PSO) into code estimation in spread spectrum communication systems. In the search process for code estimation, the PSO algorithm has the merits of rapid convergence to the global optimum, without being trapped in local suboptima, and good robustness to noise. In this paper we describe how to implement PSO as a component of a search algorithm in code estimation. Swarm intelligence boasts a number of advantages due to the use of mobile agents, including scalability, fault tolerance, adaptation, speed, modularity, autonomy, and parallelism. These properties make swarm intelligence very attractive for spread spectrum code estimation. They also make swarm intelligence suitable for a variety of other kinds of channels. Our results compare swarm-based algorithms with genetic algorithms and also show the performance of the PSO algorithm in the code estimation process.
Keywords: Code estimation, Particle Swarm Optimization (PSO), Spread spectrum.
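The abstract does not give the authors' fitness function or PSO variant, so the following is only a minimal sketch of how a global-best PSO could search for a spreading code. It assumes the received signal has been cut into code-length windows, scores a candidate ±1 code by the sum of squared correlations with those windows, and maps continuous particle positions to chips by sign. All names (`fitness`, `pso_estimate_code`, `windows`) and parameter values are hypothetical.

```python
import random

def fitness(code, windows):
    # Assumed objective: sum of squared correlations between the
    # candidate code and each code-length window of received samples.
    return sum(sum(c * x for c, x in zip(code, w)) ** 2 for w in windows)

def to_chips(position):
    # Map a continuous particle position to a +/-1 chip sequence.
    return [1 if v >= 0 else -1 for v in position]

def pso_estimate_code(windows, n_particles=30, n_iter=100,
                      inertia=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    n = len(windows[0])
    pos = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n_particles)]
    vel = [[0.0] * n for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(to_chips(p), windows) for p in pos]
    best = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[best][:], pbest_fit[best]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(n):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + cognitive + social terms.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep positions inside [-1, 1].
                pos[i][d] = max(-1.0, min(1.0, pos[i][d] + vel[i][d]))
            f = fitness(to_chips(pos[i]), windows)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return to_chips(gbest), gbest_fit

# Illustrative noise-free input: one hypothetical 8-chip code
# modulated by four data bits.
true_code = [1, -1, 1, 1, -1, -1, 1, -1]
windows = [[bit * chip for chip in true_code] for bit in (1, -1, 1, 1)]
code, fit = pso_estimate_code(windows)
```

Because the assumed objective squares the correlation, a code and its negation score the same, so any estimate from this sketch is determined only up to sign.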
Downloads 2136
187 Feature Selection with Kohonen Self Organizing Classification Algorithm
Authors: Francesco Maiorana
Abstract:
In this paper, a one-dimensional Self Organizing Map (SOM) algorithm for feature selection is presented. The algorithm is based on a first classification of the input dataset on a similarity space. From this classification, a set of positive and negative features is computed for each class. This set of features is selected as the result of the procedure. The procedure is evaluated on an in-house dataset from a Knowledge Discovery from Text (KDT) application and on a set of publicly available datasets used in international feature selection competitions. These datasets come from KDT applications, drug discovery, and other applications. The knowledge of the correct classification available for the training and validation datasets is used to optimize the parameters for positive and negative feature extraction. The process becomes feasible for large and sparse datasets, such as those obtained in KDT applications, by using both compression techniques to store the similarity matrix and speed-up techniques for the Kohonen algorithm that take advantage of the sparsity of the input matrix. These improvements make it feasible to apply the methodology to massive datasets by using the grid.
Keywords: Clustering algorithm, Data mining, Feature selection, Grid, Kohonen Self Organizing Map.
Downloads 3052
186 Detection of Legionella pneumophila in Cooling Water Systems of Hospitals and Nursing Homes of Kerman City, Iran by Semi-Nested PCR
Authors: Mohammad Ahmadinejad, Mohammad Reza Shakibaie, Kyvan Shams, Mohammad Khalili
Abstract:
Legionella pneumophila is involved in more than 95% of cases of severe atypical pneumonia. Infection occurs mainly by inhalation of indoor aerosols from water-coolant systems. Because some Legionella strains may be viable but not culturable, DNA amplification with Taq polymerase and semi-nested PCR were carried out to detect the Legionella-specific 16S rDNA sequence. For this purpose, 1.5-liter water samples from 77 water-coolant systems were collected from four different hospitals, two nursing homes and one student hostel in Kerman city, Iran, each in a brand new plastic bottle, during the summer season of 2006 (from April to August). The samples were filtered under sterile conditions through a Millipore membrane filter. DNA was extracted from the membrane and used for PCR to detect Legionella spp. The PCR product was then subjected to semi-nested PCR for detection of L. pneumophila. Out of the 77 water samples tested by PCR, 30 (39%) were positive for most species of Legionella. However, L. pneumophila was detected in 14 (18.2%) water samples by semi-nested PCR. From these results it can be concluded that the water-coolant systems of different hospitals and nursing homes in Kerman city, Iran are highly contaminated with L. pneumophila and pose a serious concern. We therefore recommend avoiding this type of coolant system in hospitals and nursing homes.
Keywords: Legionella pneumophila, water-coolant system, semi-nested PCR.
Downloads 2056
185 Performance of an Improved Fluidized System for Processing Green Tea
Authors: Nickson Kipng’etich Lang’at, Thomas Thoruwa, John Abraham, John Wanyoko
Abstract:
Green tea is made from the top two leaves and buds of the shrub Camellia sinensis, of the family Theaceae and the order Theales. The green tea leaves are picked and immediately sent to be dried or steamed to prevent fermentation. Fluid bed drying is a common method for drying green tea because of its ease of design and construction and its fluidization of fine tea particles. Major problems with this method are significant loss of the chemical content and green appearance of the leaf, retention of high moisture content in the leaves, and bed channeling and defluidization. The energy associated with the drying technology has been shown to be a vital factor in determining the quality of green tea. As part of the implementation, a prototype dryer was built that facilitated a sequence of operations involving steaming, cooling, pre-drying and final drying. The major findings of the project concerned the quality characteristics of tea leaves and energy consumption during processing. The optimal design achieved a moisture content of 4.2 ± 0.84%. At the optimum drying temperature of 100 °C, the specific energy consumption was 1697.8 kJ·kg⁻¹ and the evaporation rate was 4.272 × 10⁻⁴ kg·m⁻²·s⁻¹. The energy consumption in a fluidized system can be further reduced by focusing on energy-saving designs.
Keywords: Evaporation rate, fluid bed dryer, maceration, specific energy consumption.
Downloads 1700
184 Automatic Extraction of Features and Opinion-Oriented Sentences from Customer Reviews
Authors: Khairullah Khan, Baharum B. Baharudin, Aurangzeb Khan, Fazal_e_Malik
Abstract:
Opinion extraction about products from customer reviews is becoming an interesting area of research. Customer reviews about products are nowadays available from blogs and review sites. Tools are also being developed for extracting opinions from these reviews to help users as well as merchants track the most suitable choice of product. Efficient methods and techniques are therefore needed to extract opinions from reviews and blogs. As product reviews mostly contain discussion about features, functions and services, efficient techniques are required to extract user comments about the desired features, functions and services. In this paper we propose a novel idea for finding product features in user reviews efficiently. Our focus is on obtaining features and opinion-oriented words about products from text through auxiliary verbs (AVs) {is, was, are, were, has, have, had}. In our experiments we found that 82% of features and 85% of opinion-oriented sentences include AVs. Thus these AVs are good indicators of features and opinion orientation in customer reviews.
Keywords: Classification, Customer Reviews, Helping Verbs, Opinion Mining.
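The abstract lists the auxiliary verbs but not the extraction procedure, so the following is only a rough sketch of the underlying idea: flag sentences that contain one of the listed AVs as candidates for feature and opinion extraction. The sentence splitter, the tokenizer, and the function names are assumptions.

```python
import re

# The auxiliary verbs listed in the abstract.
AUXILIARY_VERBS = {"is", "was", "are", "were", "has", "have", "had"}

def contains_auxiliary_verb(sentence):
    # Lowercase word tokens; a simple regex stands in for a real tokenizer.
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return any(token in AUXILIARY_VERBS for token in tokens)

def candidate_sentences(review):
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", review.strip())
    return [s for s in sentences if s and contains_auxiliary_verb(s)]

# Hypothetical review text for illustration.
review = "The battery is great. Bought it last week. Screen was dim."
selected = candidate_sentences(review)
```

A real pipeline would presumably follow this filter with part-of-speech tagging to pull the feature and opinion words out of each selected sentence; that step is not sketched here.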
Downloads 2096
183 Topic Modeling Using Latent Dirichlet Allocation and Latent Semantic Indexing on South African Telco Twitter Data
Authors: Phumelele P. Kubheka, Pius A. Owolawi, Gbolahan Aiyetoro
Abstract:
Twitter is one of the most popular social media platforms where users share their opinions on different subjects. Twitter can be considered a great source for mining text due to the high volumes of data generated through the platform daily. Many industries such as telecommunication companies can leverage the availability of Twitter data to better understand their markets and make an appropriate business decision. This study performs topic modeling on Twitter data using Latent Dirichlet Allocation (LDA). The obtained results are benchmarked with another topic modeling technique, Latent Semantic Indexing (LSI). The study aims to retrieve topics on a Twitter dataset containing user tweets on South African Telcos. Results from this study show that LSI is much faster than LDA. However, LDA yields better results with higher topic coherence by 8% for the best-performing model in this experiment. A higher topic coherence score indicates better performance of the model.
Keywords: Big data, latent Dirichlet allocation, latent semantic indexing, Telco, topic modeling, Twitter.
Downloads 459
182 An Intelligent System for Phish Detection, using Dynamic Analysis and Template Matching
Authors: Chinmay Soman, Hrishikesh Pathak, Vishal Shah, Aniket Padhye, Amey Inamdar
Abstract:
Phishing, or the stealing of sensitive information on the web, has dealt a major blow to Internet security in recent times. Most existing anti-phishing solutions fail to handle the fuzziness involved in phish detection, leading to a large number of false positives. This fuzziness is attributed to the use of HTML, a language that is highly flexible and at the same time highly ambiguous. We introduce a new perspective against phishing that tries to systematically prove whether a given page is phished or not, using the corresponding original page as the basis of the comparison. It analyzes the layout of the pages under consideration to determine the percentage distortion between them, which is indicative of any form of malicious alteration. The design represents an intelligent system employing dynamic assessment, which accurately identifies brand new phishing attacks and will prove effective in reducing the number of false positives. This framework could potentially be used as a knowledge base for educating Internet users against phishing.
Keywords: World Wide Web, Phishing, Internet security, data mining.
Downloads 1832
181 A Distance Function for Data with Missing Values and Its Application
Authors: Loai AbdAllah, Ilan Shimshoni
Abstract:
Missing values in data are common in real-world applications. Since the performance of many data mining algorithms depends critically on being given a good metric over the input space, we decided in this paper to define a distance function for unlabeled datasets with missing values. We use the Bhattacharyya distance, which measures the similarity of two probability distributions, to define our new distance function. According to this distance, the distance between two points without missing attribute values is simply the Mahalanobis distance. When, on the other hand, one of the coordinates has a missing value, the distance is computed according to the distribution of the missing coordinate. Our distance is general and can be used as part of any algorithm that computes the distance between data points. Because its performance depends strongly on the chosen distance measure, we opted for the k nearest neighbor (kNN) classifier to evaluate the ability of our distance to accurately reflect object similarity. We experimented on standard numerical datasets from different fields in the UCI repository. On these datasets we simulated missing values and compared the performance of the kNN classifier using our distance to three other basic methods. Our experiments show that kNN using our distance function outperforms kNN using the other methods. Moreover, the runtime of our method is only slightly higher than that of the other methods.
Keywords: Missing values, Distance metric, Bhattacharyya distance.
Downloads 2751
180 A Study on the Average Information Ratio of Perfect Secret-Sharing Schemes for Access Structures Based On Bipartite Graphs
Authors: Hui-Chuan Lu
Abstract:
A perfect secret-sharing scheme is a method to distribute a secret among a set of participants in such a way that only qualified subsets of participants can recover the secret and the joint share of participants in any unqualified subset is statistically independent of the secret. The collection of all qualified subsets is called the access structure of the perfect secret-sharing scheme. In a graph-based access structure, each vertex of a graph G represents a participant and each edge of G represents a minimal qualified subset. The average information ratio of a perfect secret-sharing scheme realizing the access structure based on G is defined as $AR = \bigl(\sum_{v \in V(G)} H(v)\bigr) / \bigl(|V(G)|\,H(s)\bigr)$, where s is the secret and v is the share of participant v, both of which are random variables, and H is the Shannon entropy. The infimum of the average information ratio of all possible perfect secret-sharing schemes realizing a given access structure is called the optimal average information ratio of that access structure. Most known results about the optimal average information ratio give upper bounds or lower bounds on it. In this paper, we study access structures based on bipartite graphs and determine the exact values of the optimal average information ratio of some infinite classes of them.
Keywords: secret-sharing scheme, average information ratio, star covering, core sequence.
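As a small worked illustration of the definition above, the following sketch evaluates the average information ratio from given entropies. The function name and the example entropy values (in bits) are hypothetical; the sketch only evaluates the formula and does not construct a secret-sharing scheme.

```python
def average_information_ratio(share_entropies, secret_entropy):
    """Evaluate AR = (sum over v of H(v)) / (|V(G)| * H(s)).

    share_entropies maps each participant (vertex) to the Shannon
    entropy of its share; secret_entropy is H(s).
    """
    if not share_entropies:
        raise ValueError("at least one participant is required")
    if secret_entropy <= 0:
        raise ValueError("the secret must have positive entropy")
    total = sum(share_entropies.values())
    return total / (len(share_entropies) * secret_entropy)

# Every share is as large as the secret: AR = 1.
ideal = average_information_ratio({"a": 1.0, "b": 1.0, "c": 1.0}, 1.0)

# One participant holds a share twice the size of the secret.
skewed = average_information_ratio({"a": 2.0, "b": 1.0}, 1.0)
```

When every share has the same entropy as the secret, the ratio is exactly 1; any participant whose share is larger than the secret pushes the average above 1.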
Downloads 1579
179 A Group Setting of IED in Microgrid Protection Management System
Authors: Jyh-Cherng Gu, Ming-Ta Yang, Chao-Fong Yan, Hsin-Yung Chung, Yung-Ruei Chang, Yih-Der Lee, Chen-Min Chan, Chia-Hao Hsu
Abstract:
A microgrid contains a number of Distributed Generations (DGs), which may give rise to diverse paths and directions of power flow or fault current. The overcurrent protection scheme of the traditional radial distribution system therefore no longer meets the needs of microgrid protection. Integrating Intelligent Electronic Devices (IEDs) and a Supervisory Control and Data Acquisition (SCADA) system with the IEC 61850 communication protocol, the paper proposes a Microgrid Protection Management System (MPMS) to protect the power system from faults. In the proposed method, the MPMS performs logic programming of each IED to coordinate their tripping sequence. The GOOSE message defined in IEC 61850 is used as the information transmission medium among IEDs. Moreover, to cope with the difference in microgrid fault current between grid-connected mode and islanded mode, the proposed MPMS applies the group setting feature of the IEDs to protect the system and provide robust adaptability. Whenever the microgrid topology changes, the MPMS recalculates the fault current and updates the group setting of the IEDs. If a fault occurs, the IEDs isolate it at once. Finally, Matlab/Simulink and Elipse Power Studio software are used to simulate and demonstrate the feasibility of the proposed method.
Keywords: IEC 61850, IED, Group Setting, Microgrid.
Downloads 2268
178 Application of Artificial Neural Network to Classification Surface Water Quality
Authors: S. Wechmongkhonkon, N. Poomtong, S. Areerachakul
Abstract:
Water quality is a subject of ongoing concern. Deterioration of water quality has initiated serious management efforts in many countries. This study endeavors to classify water quality automatically. The water quality classes are evaluated using six factor indices: pH value (pH), Dissolved Oxygen (DO), Biochemical Oxygen Demand (BOD), Nitrate Nitrogen (NO3N), Ammonia Nitrogen (NH3N) and Total Coliform (TColiform). The methodology involves applying data mining techniques using multilayer perceptron (MLP) neural network models. The data cover 11 canal sites in the Dusit district of Bangkok, Thailand, and were obtained from the Department of Drainage and Sewerage, Bangkok Metropolitan Administration, for 2007-2011. The multilayer perceptron neural network achieves a high accuracy rate of 96.52% in classifying the water quality of the Dusit district canals in Bangkok. This encouraging result could be applied to the planning and management of water quality.
Keywords: artificial neural network, classification, surface water quality
Downloads 3209
177 A Comprehensive CFD Model for Sugar-Cane Bagasse Heterogeneous Combustion in a Grate Boiler System
Authors: Daniel J. O. Ferreira, Juan H. Sosa-Arnao, Bruno C. Moreira, Leonardo P. Rangel, Song W. Park
Abstract:
Comprehensive CFD models have been used to represent and study the heterogeneous combustion of biomass. In the present work, the operation of a global flue gas circuit in sugarcane bagasse combustion is simulated, from the wind boxes below the primary air grate supply, through bagasse insertion in the swirl burners and the boiler furnace, to the boiler bank outlet. The simulation uses five different meshes representing each part of this system located in sequence: wind boxes and grate, boiler furnace, swirl burners, superheaters and boiler bank. The model considers turbulence using standard k-ε, combustion using EDM, radiation heat transfer using DTM with 16 ray directions, and bagasse particle tracking represented by the Schiller-Naumann model. The results showed good agreement with the expected behavior found in the literature and in equipment design. The more detailed view of results in separate parts of the flue gas system allows observing some flow behaviors that cannot be represented by the usual simplifications, such as bagasse supply under homogeneous axial and rotational vectors, and others that can be represented using new considerations, such as the representation of 26 thousand grate orifices by 144 rectangular inlets.
Keywords: Comprehensive CFD model, sugar-cane bagasse combustion, sugar-cane bagasse grate boiler.
Downloads 2726
176 Adaptive Network Intrusion Detection Learning: Attribute Selection and Classification
Authors: Dewan Md. Farid, Jerome Darmont, Nouria Harbi, Nguyen Huu Hoa, Mohammad Zahidur Rahman
Abstract:
In this paper, a new learning approach for network intrusion detection using the naïve Bayesian classifier and the ID3 algorithm is presented, which identifies effective attributes from the training dataset, calculates the conditional probabilities for the best attribute values, and then correctly classifies all the examples of the training and testing datasets. Most current intrusion detection datasets are dynamic and complex and contain a large number of attributes. Some of the attributes may be redundant or contribute little to detection. Tests have shown that selecting significant attributes is important for designing real-world intrusion detection systems (IDS). The purpose of this study is to identify effective attributes from the training dataset to build a classifier for network intrusion detection using data mining algorithms. The experimental results on the KDD99 benchmark intrusion detection dataset demonstrate that this new approach achieves high classification rates and reduces false positives using limited computational resources.
Keywords: Attributes selection, Conditional probabilities, information gain, network intrusion detection.
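The keywords name information gain, the criterion ID3 uses to choose attributes. The abstract does not show how the authors combine it with the naïve Bayesian classifier, so the following is only a sketch of ranking categorical attributes by information gain. The attribute names and records are illustrative placeholders and are not taken from the paper.

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy (in bits) of a list of class labels.
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def information_gain(rows, labels, attribute):
    # Entropy reduction obtained by splitting on one categorical attribute.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(group) / len(labels) * entropy(group)
                    for group in groups.values())
    return entropy(labels) - remainder

def rank_attributes(rows, labels):
    # Attributes sorted from most to least informative.
    return sorted(rows[0].keys(),
                  key=lambda a: information_gain(rows, labels, a),
                  reverse=True)

# Hypothetical connection records for illustration.
rows = [
    {"proto": "tcp", "flag": "S0"},
    {"proto": "tcp", "flag": "SF"},
    {"proto": "udp", "flag": "S0"},
    {"proto": "udp", "flag": "SF"},
]
labels = ["attack", "normal", "attack", "normal"]
ranking = rank_attributes(rows, labels)
```

In this toy data the `flag` attribute separates the classes perfectly while `proto` carries no information, so attributes like `proto` would be the ones a selection step drops.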
Downloads 2698
175 Bayesian Networks for Earthquake Magnitude Classification in a Early Warning System
Authors: G. Zazzaro, F.M. Pisano, G. Romano
Abstract:
During the last decades, researchers worldwide have dedicated efforts to developing machine-based seismic Early Warning systems, aiming to reduce the huge human losses and economic damage caused by earthquakes. The processing time of seismic waveforms must be reduced in order to increase the time interval available for activating safety measures. This paper suggests a Data Mining model able to estimate the danger of an ongoing seismic event correctly and quickly. Several thousand seismic recordings of Japanese and Italian earthquakes were analyzed, and a model was obtained by means of a Bayesian Network (BN). The model was tested on only the first recordings of seismic events in order to reduce the decision time, and the test results were very satisfactory. The model was integrated within an Early Warning System prototype able to collect and process data from a seismic sensor network, estimate the danger of the ongoing earthquake, and decide promptly whether to activate the warning.
Keywords: Bayesian Networks, Decision Support System, Magnitude Classification, Seismic Early Warning System
Downloads 3598
174 Placer Gold Deposits in Madari Gold Mine, Southern Eastern Desert, Egypt: Orientation, Source and Distribution
Authors: Tarek Sedki
Abstract:
The Madari gold mine is delineated by latitudes 22° 30' 29" and 22° 32' 33" N and longitudes 36° 24' 03" and 35° 11' 44" E. Geologically, the Madari rock units are classified into dismembered ophiolites, arc volcanic assemblage, syntectonic metagabbro-diorites, and mineralized quartz diorite and granodiorite. Deposition of gold in the area occurred as a direct result of weathering of nearby gold-bearing veins. The main concentrations of gold are expected to occur close to the bedrock. Nevertheless, the several shallow channel-fill features covering lag deposits, which arise throughout the alluvial fan sequence, would definitely contain a percentage of the finer gold due to the limited washing and sorting capacity of the infrequent flood events. Gold deposits occur as disseminated and separate gold with limited pyrite, arsenopyrite and chalcopyrite around veins in the wall rocks, and as lode gold deposits in quartz veins. In places, the wall rocks in the vicinity of the quartz veins have undergone strong silicification, chloritization and pyritization as a result of metasomatic alteration by external hydrothermal fluids. Quartz veins are mostly steeply dipping, display banding features, and are frequently sheared and brecciated.
Keywords: Madari gold mine, placer deposits, southern eastern desert, gold mineralization, quartz veins.
Downloads 425
173 Classification of Political Affiliations by Reduced Number of Features
Authors: Vesile Evrim, Aliyu Awwal
Abstract:
With the evolution of technology, the expression of opinions has shifted to the digital world. The domain of politics, one of the hottest topics of opinion mining research, has merged with behavior analysis for determining affiliation in texts, which constitutes the subject of this paper. This study aims to classify the text in news and blogs as either Republican or Democrat with the minimum number of features. As an initial set, 68 features, of which 64 were Linguistic Inquiry and Word Count (LIWC) features, were tested against 14 benchmark classification algorithms. In the later experiments, the dimensions of the feature vector were reduced using 7 feature selection algorithms. The results show that the “Decision Tree”, “Rule Induction” and “M5 Rule” classifiers, when used with the “SVM” and “IGR” feature selection algorithms, performed best, with up to 82.5% accuracy on a given dataset. Further tests on a single feature and on the linguistic-based feature sets showed similar results. The feature “Function”, an aggregate feature of the linguistic category, was found to be the most differentiating of the 68 features, with an accuracy of 81% in classifying articles as either Republican or Democrat.
Keywords: Politics, machine learning, feature selection, LIWC.
Downloads 2365
172 Probiotic Properties of Lactic Acid Bacteria Isolated from Fermented Food
Authors: Wilailak Siripornadulsil, Siriyanapat Tasaku, Jutamas Buahorm, Surasak Siripornadulsil
Abstract:
The objectives of this study were to isolate lactic acid bacteria (LAB) from various sources (dietary supplements, Thai traditional fermented food, and freshwater fish) and to characterize their potential as probiotic cultures. Out of 1,558 isolates, 730 were identified as LAB based on isolation on MRS agar supplemented with a bromocresol purple indicator and CaCO3 and on Gram-positive, catalase-negative and oxidase-negative characteristics. Eight isolates showed potential probiotic properties, including tolerance to acid, bile salt and heat; proteolytic, amylolytic and lipolytic activities; and oxalate-degrading capability. They all showed antimicrobial activity against some Gram-negative and Gram-positive pathogenic bacteria. Based on 16S rDNA sequence analysis, they were identified as Enterococcus faecalis BT2 and MG30, Leuconostoc mesenteroides SW64, and Pediococcus pentosaceus BD33, CF32, NP6, PS34 and SW5. Their health benefits and food safety will be further investigated, and they will be developed as a probiotic or protective culture for use in Nile tilapia belly flap meat fermentation.
Keywords: Lactic acid bacteria, pathogen, probiotic, protective culture.
Downloads 3897
171 Lexical Database for Multiple Languages: Multilingual Word Semantic Network
Authors: K. K. Yong, R. Mahmud, C. S. Woo
Abstract:
Data mining and knowledge engineering have become difficult tasks due to the large amount of data now available on the web. The validity and reliability of data have also become a main point of debate in knowledge acquisition. In addition, acquiring knowledge from different languages has become another concern. Many language translators and corpora have been developed, but their function is usually limited to certain languages and domains. Furthermore, search results from engines using the traditional 'keyword' approach are no longer satisfying. More intelligent knowledge engineering agents are needed. To address these problems, a system known as the Multilingual Word Semantic Network is proposed. This system adapts a semantic network to organize words according to concepts and relations. The system also adopts open source as its development philosophy to enable native language speakers and experts to contribute their knowledge to the system. The contributed words are then defined and linked using lexical and semantic relations. Thus, related words and derivatives can be identified and linked. The outcome of the system implementation contributes to the development of the semantic web and knowledge engineering.
Keywords: Multilingual, semantic network, intelligent knowledge engineering.
Downloads 1963
170 Attacks Classification in Adaptive Intrusion Detection using Decision Tree
Authors: Dewan Md. Farid, Nouria Harbi, Emna Bahri, Mohammad Zahidur Rahman, Chowdhury Mofizur Rahman
Abstract:
Recently, information security has become a key issue in information technology as computer systems are exposed to an increasing number of security threats. Over the last decades, a variety of intrusion detection systems (IDS) have been employed to protect computers and networks from malicious network-based or host-based attacks, using approaches that range from traditional statistical methods to new data mining techniques. However, today's commercially available intrusion detection systems are signature-based and are not capable of detecting unknown attacks. In this paper, we present a new learning algorithm for an anomaly-based network intrusion detection system using a decision tree algorithm that distinguishes attacks from normal behavior and identifies different types of intrusions. Experimental results on the KDD99 benchmark network intrusion detection dataset demonstrate that the proposed learning algorithm achieved a 98% detection rate (DR) in comparison with other existing methods.
Keywords: Detection rate, decision tree, intrusion detection system, network security.
Downloads 3629
169 Application of Acinetobacter sp. KKU44 for Cellulase Production from Agricultural Waste
Authors: Surasak Siripornadulsil, Nutt Poomai, Wilailak Siripornadulsil
Abstract:
Due to high ethanol demand, approaches for effective ethanol production are important and have been developed rapidly worldwide. Several agricultural wastes are highly abundant in cellulose, and effective cellulase enzymes exist widely among microorganisms. Accordingly, cellulose degradation using microbial cellulase to produce a low-cost substrate for ethanol production has attracted more attention. In this study, a cellulase-producing bacterial strain was isolated from rice straw and identified by 16S rDNA sequence analysis as Acinetobacter sp. KKU44. This strain is able to grow and exhibit cellulase activity. The optimal temperature for its growth and cellulase production is 37°C. The optimal temperature for bacterial cellulase activity is 60°C. The cellulase enzyme from Acinetobacter sp. KKU44 is a heat-tolerant enzyme. A 36 h bacterial culture showed the highest cellulase activity, 120 U/mL, when grown in LB medium containing 2% (w/v). The capability of Acinetobacter sp. KKU44 to grow on cellulosic agricultural wastes as a sole carbon source and to exhibit high cellulase activity at high temperature suggests that this strain could be developed further as a cellulose-degrading strain for producing a low-cost substrate for ethanol production.
Keywords: Acinetobacter sp. KKU44, bagasse, cellulase enzyme, rice husk.
Downloads 2684
168 Decision Trees for Predicting Risk of Mortality using Routinely Collected Data
Authors: Tessy Badriyah, Jim S. Briggs, Dave R. Prytherch
Abstract:
It is well known that Logistic Regression is the gold standard method for predicting clinical outcome, especially risk of mortality. In this paper, the Decision Tree method is proposed for solving specific problems that commonly use Logistic Regression as a solution. The Biochemistry and Haematology Outcome Model (BHOM) dataset obtained from Portsmouth NHS Hospital from 1 January to 31 December 2001 was divided into four subsets. One subset of training data was used to generate a model, and the model obtained was then applied to three testing datasets. The performance of each model from both methods was then compared using calibration (the χ² test or chi-test) and discrimination (area under the ROC curve or c-index). The experiments showed that both methods give reasonable results in terms of the c-index. However, in some cases the calibration value (χ²) was quite high. After conducting experiments and investigating the advantages and disadvantages of each method, we conclude that Decision Trees can be seen as a worthy alternative to Logistic Regression in the area of Data Mining.
Keywords: Decision Trees, Logistic Regression, clinical outcome, risk of mortality.
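To make the discrimination measure concrete, the following sketch computes the c-index for a binary outcome as the fraction of (death, survivor) pairs in which the patient who died received the higher predicted risk, counting ties as one half. This is the standard pairwise definition and is not code from the paper; the function name and the example data are hypothetical.

```python
def c_index(risks, outcomes):
    """Area under the ROC curve via pairwise comparison.

    risks: predicted probabilities of death.
    outcomes: 1 for death, 0 for survival.
    """
    positives = [r for r, y in zip(risks, outcomes) if y == 1]
    negatives = [r for r, y in zip(risks, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    concordant = 0.0
    for r_pos in positives:
        for r_neg in negatives:
            if r_pos > r_neg:
                concordant += 1.0   # correctly ordered pair
            elif r_pos == r_neg:
                concordant += 0.5   # tie counts as half
    return concordant / (len(positives) * len(negatives))

# Hypothetical predictions for four patients.
score = c_index([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. Calibration, the other criterion in the abstract, is a separate question of whether predicted risks match observed death rates.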
Downloads 2523
167 The Development of the Multi-Agent Classification System (MACS) in Compliance with FIPA Specifications
Authors: Mohamed R. Mhereeg
Abstract:
The paper investigates the feasibility of constructing a software multi-agent based monitoring and classification system and using it to provide automated and accurate classification of end users developing applications in the spreadsheet domain. The agents function autonomously to provide continuous and periodic monitoring of Excel spreadsheet workbooks. The result is the Multi-Agent Classification System (MACS), which complies with the specifications of the Foundation for Intelligent Physical Agents (FIPA). Several technologies have been brought together to build MACS. The strength of the system is the integration of agent technology and the FIPA specifications with other technologies: Windows Communication Foundation (WCF) services, Service Oriented Architecture (SOA), and Oracle Data Mining (ODM). Microsoft .NET Windows service based agents were used to develop the monitoring agents of MACS, while .NET WCF services together with the SOA approach allowed distribution and communication between agents over the WWW, in order to support monitoring and classification of multiple developers. ODM was used to automate the classification phase of MACS.
Keywords: Autonomous, Classification, MACS, Multi-Agent, SOA, WCF.
Downloads 1589
166 Utilization of 3-N-trimethylamino-1-propanol by Rhodococcus sp. strain A4 isolated from Natural Soil
Authors: Isam A. Mohamed Ahmed, Jiro Arima, Tsuyoshi Ichiyanagi, Emi Sakuno, Nobuhiro Mori
Abstract:
The aim of this study was to screen for microorganisms that are able to utilize 3-N-trimethylamino-1-propanol (homocholine) as a sole source of carbon and nitrogen. Aerobic degradation of homocholine was observed in a gram-positive Rhodococcus sp. bacterium isolated from soil. The isolate was identified as Rhodococcus sp. strain A4 based on phenotypic features, physiologic and biochemical characteristics, and phylogenetic analysis. Cells of the isolated strain grown on both basal-TMAP and nutrient agar medium displayed elementary branching mycelia that fragmented into irregular rod and coccoid elements. Comparative 16S rDNA sequencing studies indicated that strain A4 falls into the Rhodococcus erythropolis subclade and forms a monophyletic group with the type strains of R. opacus and R. wratislaviensis. Metabolite analysis by capillary electrophoresis, fast atom bombardment-mass spectrometry, and gas chromatography-mass spectrometry showed trimethylamine (TMA) as the major metabolite, besides β-alanine betaine and trimethylaminopropionaldehyde. Therefore, the possible degradation pathway of trimethylamino propanol in the isolated strain is sequential oxidation of the alcohol group (-OH) to aldehyde (-CHO) and acid (-COOH), followed by cleavage of the β-alanine betaine C-N bond to yield trimethylamine and an alkyl chain.
Keywords: Homocholine, 3-N-trimethylamino-1-propanol, Quaternary ammonium compounds, 16S rDNA gene sequence.
Downloads 1533
165 Using Data Mining Methodology to Build the Predictive Model of Gold Passbook Price
Authors: Chien-Hui Yang, Che-Yang Lin, Ya-Chen Hsu
Abstract:
The gold passbook is an investment tool that is especially suitable for investors who want to make small investments in physical gold. It carries lower risk than other ways of investing in gold, but its price is still affected by the gold price, which in turn is influenced by many factors. Therefore, building a model to predict the price of the gold passbook can both reduce investment risk and increase returns. This study investigates the important factors that influence the gold passbook price and uses the Group Method of Data Handling (GMDH) to build the predictive model. This method not only identifies the significant variables but also performs well in prediction. Finally, the significant variables for the gold passbook price identified by GMDH are the US dollar exchange rate, international petroleum price, unemployment rate, wholesale price index, rediscount rate, foreign exchange reserves, misery index, prosperity coincident index and industrial index.
Keywords: Gold price, Gold passbook price, Group Method of Data Handling (GMDH), Regression.
Downloads 2285
164 Design and Development of 5-DOF Color Sorting Manipulator for Industrial Applications
Authors: Atef A. Ata, Sohair F. Rezeka, Ahmed El-Shenawy, Mohammed Diab
Abstract:
Image processing attracts considerable attention today because it opens up broader applications in many fields of high technology. The real challenge is how to improve existing sorting system applications, which consist of two integrated stations for processing and handling, with a new image processing feature. Existing color sorting techniques use a set of inductive, capacitive, and optical sensors to differentiate object color. This research presents a mechatronic color sorting solution that applies image processing. A 5-DOF robot arm with pick-and-place operation is designed and developed to act as the main part of the color sorting system. The image processing procedure detects circular objects in an image captured in real time by a webcam fixed at the end-effector, then extracts their color and position information. This information is passed as a sequence of sorting commands to the manipulator, which has a pick-and-place mechanism. Performance analysis shows that this color-based object sorting system works accurately under ideal conditions, that is, adequate illumination and circular objects of known shape and color. The circular objects tested for sorting are red, green and blue. Under non-ideal conditions, such as an unspecified color, the accuracy drops to 80%.
Keywords: Robotics manipulator, 5-DOF manipulator, image processing, Color sorting, Pick-and-place.
Downloads 4218
163 Influence of Culture Conditions on the Growth and Fatty Acid Composition of Green Microalgae Oocystis rhomboideus, Scenedesmus obliquus, Dictyochlorella globosa
Authors: Tatyana A. Karpenyuk, Saltanat B. Orazova, Yana S. Tzurkan, Alla V. Goncharova, Bakytzhan K. Kairat, Togzhan D. Mukasheva, Ludmila V. Ignatova, Ramza Z. Berzhanova
Abstract:
Microalgae attract attention as a promising raw material for commercial products because of their ability to accumulate high levels of practically valuable polyunsaturated fatty acids. The growth characteristics of cells of the green protococcal microalgae Oocystis rhomboideus, Scenedesmus obliquus and Dictyochlorella globosa during cultivation in different nutrient media were determined. For rapid accumulation of biomass combined with a high yield of the total lipid fraction, the Fitzgerald medium (Scenedesmus obliquus, Oocystis rhomboideus) and/or Bold medium (Dictyochlorella globosa) is recommended. Lipid productivity decreased in the sequence Dictyochlorella globosa > Scenedesmus obliquus > Oocystis rhomboideus. Unsaturated fatty acids make up the bulk of the fatty acid fraction of the total lipids, accounting for 70 to 83% of the total number of fatty acids. The share of monoenic acids ranges from 18 to 34%, while the share of unsaturated fatty acids ranges from 44 to 62% of the total number of the unsaturated fatty acid fraction. Among the unsaturated acids, α-linolenic acid (C18:3n-3), hexadecatetraenic acid (C16:4) and linoleic acid (C18:2) dominate.
Keywords: Fatty acids, lipids, microalgae.
Downloads 2185
162 Seismic Performance of Slopes Subjected to Earthquake Mainshock Aftershock Sequences
Authors: Alisha Khanal, Gokhan Saygili
Abstract:
It is commonly observed that aftershocks follow the mainshock. Aftershocks continue over a period of time with decreasing frequency, and typically there is not sufficient time for repair and retrofit between a mainshock and its aftershocks. Usually, aftershocks are smaller in magnitude; however, aftershock ground motion characteristics such as intensity and duration can be greater than those of the mainshock because of changes in the earthquake mechanism and location with respect to the site. The seismic performance of slopes is typically evaluated based on the sliding displacement predicted to occur along a critical sliding surface. Various empirical models are available that predict sliding displacement as a function of seismic loading parameters, ground motion parameters, and site parameters, but these models do not include aftershocks. The seismic risks associated with post-mainshock slopes ('damaged slopes') subjected to aftershocks are significant. This paper extends the empirical sliding displacement models to flexible slopes subjected to earthquake mainshock-aftershock sequences (a multi-hazard approach). A dataset was developed from 144 pairs of as-recorded mainshock-aftershock sequences in the Pacific Earthquake Engineering Research Center (PEER) database. The results reveal that the combination of mainshock and aftershock increases the seismic demand on slopes relative to the mainshock alone; thus, seismic risks are underestimated if aftershocks are neglected.
Keywords: Seismic slope stability, sliding displacement, mainshock, aftershock, landslide, earthquake.
Downloads 899
161 A Text Clustering System based on k-means Type Subspace Clustering and Ontology
Authors: Liping Jing, Michael K. Ng, Xinhua Yang, Joshua Zhexue Huang
Abstract:
This paper presents a text clustering system based on a k-means type subspace clustering algorithm for clustering large, high-dimensional and sparse text data. In this algorithm, a new step is added to the k-means clustering process to automatically calculate the weights of keywords in each cluster, so that the important words of a cluster can be identified by their weight values. To aid understanding and interpretation of the clustering results, a few keywords that best represent the semantic topic are extracted from each cluster. Two methods are used to extract the representative words. The candidate words are first selected according to the weights calculated by our new algorithm. Then, the candidates are fed to WordNet to identify the set of nouns and to consolidate synonyms and hyponyms. Experimental results show that the clustering algorithm is superior to other subspace clustering algorithms, such as PROCLUS and HARP, and to k-means type algorithms, e.g., Bisecting-KMeans. Furthermore, the word extraction method is effective in selecting words that represent the topics of the clusters.
Keywords: Subspace Clustering, Text Mining, Feature Weighting, Cluster Interpretation, Ontology
Downloads 2462
160 All Types of Base Pair Substitutions Induced by γ-Rays in Haploid and Diploid Yeast Cells
Authors: Natalia Koltovaya, Nadezhda Zhuchkina, Ksenia Lyubimova
Abstract:
We study the biological effects induced by ionizing radiation in view of therapeutic exposure and the idea of space flights beyond Earth's magnetosphere. In particular, we examine the differences in base pair substitution induction by ionizing radiation between model haploid and diploid cells of the yeast Saccharomyces cerevisiae. Such mutations are difficult to study in higher eukaryotic systems. In our research, we used a collection of six isogenic trp5-strains and 14 isogenic haploid and diploid cyc1-strains that are specific markers of all possible base-pair substitutions. These strains differ from each other only in single base substitutions within codon-50 of the trp5 gene or codon-22 of the cyc1 gene. We obtained different mutation spectra for the two haploid genetic assays (trp5 and cyc1), and different mutation spectra for the same genetic cyc1-system in cells of different ploidy (haploid and diploid). The dose dependence was linear in haploid cells and exponential in diploid cells. We suggest that the differences between haploid yeast strains reflect the dependence on the sequence context, while the differences between haploid and diploid strains reflect different molecular mechanisms of mutation.
Keywords: Base pair substitutions, γ-rays, haploid and diploid cells, yeast Saccharomyces cerevisiae.
Downloads 846
159 DCBOR: A Density Clustering Based on Outlier Removal
Authors: A. M. Fahim, G. Saake, A. M. Salem, F. A. Torkey, M. A. Ramadan
Abstract:
Data clustering is an important data exploration technique with many applications in data mining. We present an enhanced version of the well-known single link clustering algorithm, which we refer to as DCBOR. The proposed algorithm alleviates the chain effect by removing the outliers from the given dataset, so it provides outlier detection and data clustering simultaneously. The algorithm does not need to update the distance matrix, since it merges the k nearest objects in one step and a cluster continues to grow as long as possible under a specified condition. The algorithm consists of two phases: in the first phase, it removes the outliers from the input dataset; in the second phase, it performs the clustering process. The algorithm discovers clusters of different shapes, sizes and densities and requires only one input parameter, which represents a threshold for outlier points. The value of the input parameter ranges from 0 to 1, and the algorithm supports the user in determining an appropriate value for it. We have tested this algorithm on different datasets that contain outliers and clusters connected by chains of density points, and the algorithm discovers the correct clusters. The results of our experiments demonstrate the effectiveness and efficiency of DCBOR.
Keywords: Data Clustering, Clustering Algorithms, Handling Noise, Arbitrary Shape of Clusters.
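The two-phase idea in this abstract (remove outliers first, then run single link clustering) can be illustrated with a small sketch. This is not the DCBOR algorithm itself: the outlier score (mean distance to the `k` nearest neighbours, scaled to 0..1), the extra parameters `k` and `link_dist`, and all function names are assumptions made for illustration, whereas the abstract describes a single input parameter.

```python
import math


def knn_outlier_scores(points, k):
    """Assumed score: mean distance to the k nearest neighbours, scaled to [0, 1]."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    top = max(scores)
    return [s / top if top > 0 else 0.0 for s in scores]


def single_link_clusters(points, link_dist):
    """Single link clustering with a distance cutoff: connected components of
    the graph that joins every pair of points at most link_dist apart."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= link_dist:
                parent[find(i)] = find(j)

    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    return list(groups.values())


def remove_outliers_then_cluster(points, threshold, k, link_dist):
    """Phase 1: drop points whose score exceeds threshold. Phase 2: cluster the rest."""
    scores = knn_outlier_scores(points, k)
    kept = [p for p, s in zip(points, scores) if s <= threshold]
    return single_link_clusters(kept, link_dist)


# Two square clusters joined by one bridging point (toy data).
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11),
        (5.5, 5.5)]
print(len(single_link_clusters(data, link_dist=7.0)))                # 1: the bridge chains both squares
print(len(remove_outliers_then_cluster(data, 0.5, k=2, link_dist=7.0)))  # 2: bridge removed first
```

The toy run shows the chain effect the abstract refers to: with the bridging point present, plain single link merges both groups, while removing high-score points first keeps them apart.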
Downloads 1933