Search results for: stream mining.

183 Observations about the Principal Components Analysis and Data Clustering Techniques in the Study of Medical Data

Authors: Cristina G. Dascâlu, Corina Dima Cozma, Elena Carmen Cotrutz

Abstract:

Statistical analysis of medical data often requires special techniques because of the particularities of these data. Principal components analysis and data clustering are two statistical data mining methods that are very useful in the medical field: the first reduces the number of studied parameters, and the second analyzes the connections between the diagnosis and the data about the patient's condition. In this paper we investigate the implications of a specific data analysis technique: data clustering preceded by a selection of the most relevant parameters, made using principal components analysis. Our assumption was that using principal components analysis before data clustering, in order to select and classify only the most relevant parameters, would improve the clustering accuracy. The practical results showed the opposite: the clustering accuracy decreases by a percentage approximately equal to the percentage of information loss reported by the principal components analysis.

Keywords: Data clustering, medical data, principal components analysis.

Downloads: 1461

182 Forecasting Fraudulent Financial Statements using Data Mining

Authors: S. Kotsiantis, E. Koumanakos, D. Tzelepis, V. Tampakas

Abstract:

This paper explores the effectiveness of machine learning techniques in detecting firms that issue fraudulent financial statements (FFS) and deals with the identification of factors associated with FFS. To this end, a number of experiments have been conducted using representative learning algorithms, which were trained using a data set of 164 fraud and non-fraud Greek firms in the recent period 2001-2002. The decision of which particular method to choose is a complicated problem. A good alternative to choosing only one method is to create a hybrid forecasting system incorporating a number of possible solution methods as components (an ensemble of classifiers). For this purpose, we have implemented a hybrid decision support system that combines the representative algorithms using a stacking variant methodology and achieves better performance than any examined simple and ensemble method. To sum up, this study indicates that the investigation of financial information can be used in the identification of FFS and underlines the importance of financial ratios.

Keywords: Machine learning, stacking, classifier.

Downloads: 3007

181 Post Occupancy Life Cycle Analysis of a Green Building Energy Consumption at the University of Western Ontario in London - Canada

Authors: M. Bittencourt, E. K. Yanful, D. Velasquez, A. E. Jungles

Abstract:

The CMLP building was developed to be a model for sustainability with strategies to reduce water, energy and pollution, and to provide a healthy environment for the building occupants. The aim of this paper is to investigate the environmental effects of the energy used by this building. An LCA (life cycle analysis) was conducted to measure the real environmental effects produced by the use of energy. The impact categories most affected by the energy use were found to be human health effects and ecotoxicity. Natural gas extraction, uranium milling for nuclear energy production, and blasting for mining and infrastructure construction are the processes contributing the most to emissions in the human health effects category. Data comparing the LCA results of the CMLP building with those of a conventional building showed that the energy used by the CMLP building causes less damage to the environment and human health than that of a conventional building.

Keywords: Environmental Impacts, Green buildings, Life Cycle Analysis, Sustainability

Downloads: 1737

180 Correlation-based Feature Selection using Ant Colony Optimization

Authors: M. Sadeghzadeh, M. Teshnehlab

Abstract:

Feature selection has recently been the subject of intensive research in data mining, especially for datasets with a large number of attributes. Recent work has shown that feature selection can have a positive effect on the performance of machine learning algorithms. The success of many learning algorithms in their attempts to construct models of data hinges on the reliable identification of a small set of highly predictive attributes. The inclusion of irrelevant, redundant and noisy attributes in the model building phase can result in poor predictive performance and increased computation. In this paper, a novel feature search procedure that utilizes Ant Colony Optimization (ACO) is presented. ACO is a metaheuristic inspired by the behavior of real ants in their search for the shortest paths to food sources. It looks for optimal solutions by considering both local heuristics and previous knowledge. When applied to two different classification problems, the proposed algorithm achieved very promising results.

Keywords: Ant colony optimization, Classification, Data mining, Feature selection.

Downloads: 2381

179 Walking Hexapod Robot in Disaster Recovery: Developing Algorithm for Terrain Negotiation and Navigation

Authors: Md. Masum Billah, Mohiuddin Ahmed, Soheli Farhana

Abstract:

Disaster recovery missions have become one of the top priorities in any natural disaster management regime. Smart autonomous robots may play a significant role in such missions, including searching for life under earthquake rubble and on tsunami-hit islands, de-mining in war-affected areas, and many other such situations. In this paper the current state of many walking robots is compared, and the advantages of hexapod systems over wheeled robots are described. In our research we have selected a hexapod spider robot, and our development focuses mainly on an efficient navigation method for different terrain using an apposite gait of locomotion, which will make the robot faster and at the same time more energy efficient when navigating and negotiating difficult terrain. This paper describes the method of terrain negotiation and navigation in a hazardous field.

Keywords: Walking robots, locomotion, hexapod robot, gait, hazardous field.

Downloads: 4381

178 Performance Evaluation of an Ontology-Based Arabic Sentiment Analysis

Authors: Salima Behdenna, Fatiha Barigou, Ghalem Belalem

Abstract:

Due to the rapid increase in the volume of Arabic opinions posted on various social media, Arabic sentiment analysis has become one of the most important areas of research. Compared to English, there is very little work on Arabic sentiment analysis, in particular aspect-based sentiment analysis (ABSA). In ABSA, aspect extraction is the most important task. In this paper, we propose a semantic ABSA approach for standard Arabic reviews to extract explicit aspect terms and identify the polarity of the extracted aspects. The proposed approach was evaluated using HAAD datasets. Experiments showed that the proposed approach achieved a good level of performance compared with baseline results. The F-measure was improved by 19% for the aspect term extraction task and by 55% for the aspect term polarity task.

Keywords: Sentiment analysis, opinion mining, Arabic, aspect level, opinion, polarity.

Downloads: 394

177 Analysis of Diverse Cluster Ensemble Techniques

Authors: S. Sarumathi, N. Shanthi, P. Ranjetha

Abstract:

Data mining is the process of discovering interesting patterns in huge amounts of data. Clustering is one of the most important supporting processes for accessing data faster. It is the process of identifying similarity between data according to the characteristics present in the data and grouping associated data objects into clusters. A cluster ensemble combines various runs of different clustering algorithms to obtain a general partition of the original dataset, consolidating the outcomes of a collection of individual clusterings. The performance of clustering ensembles is mainly affected by two principal factors: diversity and quality. This paper presents an overview of different cluster ensemble algorithms and the methods used to improve diversity and quality in several cluster ensemble papers, gives a comparative analysis of the different cluster ensembles, and summarizes various cluster ensemble methods. This analysis should be useful to clustering experts and should help in choosing the most appropriate method for the problem at hand.
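
As a rough illustration of how several base clusterings can be consolidated, the sketch below uses a pairwise co-association matrix: objects that share a cluster in more than half of the runs are linked, and the linked groups form the consensus partition. This is a generic technique, not one of the specific algorithms surveyed in the paper; the function names, the 0.5 threshold and the example labels are all assumptions made for illustration.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of base clusterings that place each pair of objects together."""
    n = len(partitions[0])
    counts = {pair: 0 for pair in combinations(range(n), 2)}
    for labels in partitions:
        for i, j in counts:
            if labels[i] == labels[j]:
                counts[(i, j)] += 1
    return {pair: c / len(partitions) for pair, c in counts.items()}


def consensus_clusters(partitions, threshold=0.5):
    """Link pairs co-clustered in more than `threshold` of the runs and
    return the connected components as the consensus partition."""
    n = len(partitions[0])
    parent = list(range(n))  # union-find forest over object indices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (i, j), score in co_association(partitions).items():
        if score > threshold:
            parent[find(i)] = find(j)

    groups = {}
    for obj in range(n):
        groups.setdefault(find(obj), []).append(obj)
    return sorted(groups.values())


# Three hypothetical base clusterings of six objects (label values are arbitrary).
runs = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 2, 2],
    [0, 0, 0, 1, 1, 0],
]
consensus = consensus_clusters(runs)
```

Diversity enters through how much the rows of `runs` disagree, and quality through how well each row fits the data; the threshold decides how much agreement is required before two objects are merged.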

Keywords: Cluster Ensemble, Consensus Function, CSPA, Diversity, HGPA, MCLA.

Downloads: 1798

176 Case Studies of CSAMT Method Applied to Study of Complex Rock Mass Structure and Hidden Tectonic

Authors: Yuxin Chen, Qingyun Di, C. Dinis da Gama

Abstract:

In projects such as waterpower, transportation and mining, establishing the rock-mass structure and hidden tectonic in order to estimate the geological body's activity is very important. Integrating seismic results with drilling and trenching data, the CSAMT method was carried out at a planned dam site in southwest China to evaluate the stability of a deformation body. 2D and imitated 3D inversion resistivity results of the CSAMT method were analyzed. The results indicated that CSAMT was an effective method for defining the outline of a deformation body to a depth of several hundred meters; the Lung Pan Deformation was stable in natural conditions, but its stability was uncertain after the future reservoir was impounded. This research presents a good case study of fine surveying and research on complex geological structure and hidden tectonic in engineering projects.

Keywords: CSAMT Surveying, Deformation Stability.

Downloads: 2403

175 A Two-Step, Temperature-Staged Direct Coal Liquefaction Process

Authors: Reyna Singh, David Lokhat, Milan Carsky

Abstract:

The world crude oil demand is projected to rise to 108.5 million bbl/d by the year 2035. With reserves estimated at 869 billion tonnes worldwide, coal remains an abundant resource. The aim of this work was to produce a high value hydrocarbon liquid product using a Direct Coal Liquefaction (DCL) process at relatively mild operating conditions. Via hydrogenation, the temperature-staged approach was investigated in a dual reactor lab-scale pilot plant facility. The objectives included maximising thermal dissolution of the coal in the presence of tetralin as the hydrogen donor solvent in the first stage with 2:1 and 3:1 solvent:coal ratios. Subsequently, in the second stage, hydrogen saturation, in particular hydrodesulphurization (HDS) performance, was assessed. Two commercial hydrotreating catalysts were investigated, viz. Nickel-Molybdenum (Ni-Mo) and Cobalt-Molybdenum (Co-Mo). GC-MS results identified 77 compounds and various functional groups present in the first and second stage liquid product. In the first stage, 3:1 ratios and liquid product yields catalysed by magnetite were favoured. The second stage product distribution showed an increase in the BTX (Benzene, Toluene, Xylene) quality of the liquid product and in branched chain alkanes, and a reduction in the sulphur concentration. In terms of HDS performance and selectivity towards the production of long and branched chain alkanes, Ni-Mo outperformed Co-Mo. Co-Mo is selective to a higher concentration of cyclohexane. For 16 days on stream each, Ni-Mo had a higher activity than Co-Mo. The potential to cover the demand for low-sulphur crude diesel and solvents from the production of high value hydrocarbon liquid in the said process is thus demonstrated.

Keywords: Catalyst, coal, liquefaction, temperature-staged.

Downloads: 1603

174 A Relationship Extraction Method from Literary Fiction Considering Korean Linguistic Features

Authors: Hee-Jeong Ahn, Kee-Won Kim, Seung-Hoon Kim

Abstract:

Knowledge of the relationships between characters can help readers understand the overall story or plot of a work of literary fiction. In this paper, we present a method for extracting the specific relationships between characters from Korean literary fiction. Generally, methods for extracting relationships between characters in text are statistical or computational methods based on the sentence distance between characters, without considering Korean linguistic features. Furthermore, it is difficult for such methods to extract a relationship with direction from text, such as one-sided love, because they consider only the weight of the relationship and not its direction. Therefore, in order to identify specific relationships between characters, we propose a statistical method that considers linguistic features, such as syntactic patterns and speech verbs in Korean. The result of our method is represented by a weighted directed graph of the relationships between the characters. Furthermore, we expect that the proposed method could be applied to relationship analysis between characters in other content such as movies or TV dramas.
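
To make the output format concrete, here is a minimal sketch of a weighted directed graph of character relationships. The extraction step itself (syntactic patterns, speech verbs) is not shown; the evidence triples, character names and weights are hypothetical, and `one_sided` is an illustrative helper rather than part of the authors' method.

```python
from collections import defaultdict


def build_relationship_graph(observations):
    """Accumulate (source, target, weight) observations into a weighted digraph."""
    graph = defaultdict(lambda: defaultdict(float))
    for source, target, weight in observations:
        graph[source][target] += weight
    return graph


def one_sided(graph, a, b):
    """True when the edge a -> b has weight but b -> a has none."""
    forward = graph.get(a, {}).get(b, 0.0)
    backward = graph.get(b, {}).get(a, 0.0)
    return forward > 0 and backward == 0


# Hypothetical evidence triples; names and weights are made up.
evidence = [
    ("A", "B", 1.0),
    ("A", "B", 0.5),
    ("B", "C", 2.0),
    ("C", "B", 1.0),
]
graph = build_relationship_graph(evidence)
```

Keeping the two directions as separate edges is what lets a directed relationship such as one-sided affection be told apart from a mutual one, which an undirected weight alone cannot express.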

Keywords: Data mining, Korean linguistic feature, literary fiction, relationship extraction.

Downloads: 1763

173 Value Analysis of Islamic Banking and Conventional Banking to Measure Value Co-creation

Authors: Amna Javed, Hisashi Masuda, Youji Kohda

Abstract:

This study examines the value analysis in Islamic and conventional banking services in Pakistan. Many scholars have focused on co-creation of values in services, but mainly on economic rather than non-economic values. Islamic banking is based on Islamic principles that are more concerned with non-economic values (well-being, partnership, fairness, trustworthiness, and justice) than with economic values such as money in terms of interest. This study is important for understanding the providers' point of view about the co-created values, because it may be more sustainable and appropriate for today’s unpredictable socio-economic environment. Data were collected from 4 banks (2 Islamic and 2 conventional banks). A text mining technique is applied for data analysis, and values with 100% occurrences in Islamic banking are chosen. The results show that Islamic banking is more oriented towards non-economic values than economic values and that it promotes teamwork and partnership by applying the Islamic spirit and the concept of trustworthiness.

Keywords: Economic values, Islamic banking, Non-economic values, Value system.

Downloads: 3209

172 Evolving Knowledge Extraction from Online Resources

Authors: Zhibo Xiao, Tharini Nayanika de Silva, Kezhi Mao

Abstract:

In this paper, we present an evolving knowledge extraction system named AKEOS (Automatic Knowledge Extraction from Online Sources). AKEOS consists of two modules, including a one-time learning module and an evolving learning module. The one-time learning module takes in user input query, and automatically harvests knowledge from online unstructured resources in an unsupervised way. The output of the one-time learning is a structured vector representing the harvested knowledge. The evolving learning module automatically schedules and performs repeated one-time learning to extract the newest information and track the development of an event. In addition, the evolving learning module summarizes the knowledge learned at different time points to produce a final knowledge vector about the event. With the evolving learning, we are able to visualize the key information of the event, discover the trends, and track the development of an event.

Keywords: Evolving learning, knowledge extraction, knowledge graph, text mining.

Downloads: 894

171 Discriminant Analysis as a Function of Predictive Learning to Select Evolutionary Algorithms in Intelligent Transportation System

Authors: Jorge A. Ruiz-Vanoye, Ocotlán Díaz-Parra, Alejandro Fuentes-Penna, Daniel Vélez-Díaz, Edith Olaco García

Abstract:

In this paper, we present the use of discriminant analysis to select the evolutionary algorithms that better solve instances of the vehicle routing problem with time windows. We use indicators as independent variables to obtain the classification criteria, and the best algorithm from the generic genetic algorithm (GA), random search (RS), steady-state genetic algorithm (SSGA), and sexual genetic algorithm (SXGA) as the dependent variable for the classification. The discriminant classification was trained with classic instances of the vehicle routing problem with time windows obtained from the Solomon benchmark. The discriminant analysis achieved a classification rate of 66.7%.

Keywords: Intelligent transportation systems, data-mining techniques, evolutionary algorithms, discriminant analysis, machine learning.

Downloads: 1511

170 Computational Investigation of Secondary Flow Losses in Linear Turbine Cascade by Modified Leading Edge Fence

Authors: K. N. Kiran, S. Anish

Abstract:

It is well known that secondary flow losses account for about one third of the total loss in any axial turbine. Modern gas turbines have a smaller blade height and a longer chord length, which might lead to an increase in secondary flow. In order to improve the efficiency of the turbine, it is important to understand the behavior of secondary flow and devise mechanisms to curtail these losses. The objective of the present work is to understand the effect of a streamwise end-wall fence on the aerodynamics of a linear turbine cascade. The study is carried out computationally using the commercial software ANSYS CFX. The effect of the end-wall on the flow field is calculated from RANS simulations using the SST transition turbulence model. The Durham cascade, which is similar to a high-pressure axial flow turbine, is used for the simulation. The aim of fencing in the blade passage is to obtain the maximum loss reduction from flow deviation and from destroying the passage vortex. It is observed that, for the present analysis, a fence in the blade passage helps reduce the strength of the horseshoe vortex and is capable of restraining the flow along the blade passage. A fence in the blade passage reduces the underturning by 7° in comparison with the base case. A fence on the end-wall is effective in preventing the movement of the pressure side leg of the horseshoe vortex and helps in breaking the passage vortex. Computations are carried out for different fence heights whose curvature is different from the blade camber. The optimum fence geometry and location reduce the loss coefficient by 15.6% in comparison with the base case.

Keywords: Boundary layer fence, horseshoe vortex, linear cascade, passage vortex, secondary flow.

Downloads: 1986

169 Fe, Pb, Mn, and Cd Concentrations in Edible Mushrooms (Agaricus campestris) Grown in Abakaliki, Ebonyi State, Nigeria

Authors: N. O. Omaka, I. F. Offor, R.C. Ehiri

Abstract:

The health and environmental risk of eating mushrooms grown in Abakaliki was evaluated in terms of heavy metal accumulation. Mushroom samples were collected from four different farms located at Izzi, Amajim, Amana and Amudo and analyzed for iron, lead, manganese and cadmium using a Buck Scientific Atomic Absorption Spectrophotometer 205. The results indicate that the mean ranges of concentrations of the trace metals in the mushrooms were Fe (0.22-152.03), Mn (0.74-9.76), Pb (0.01-0.80), and Cd (0.61-0.82) mg/L, respectively. Accumulation of Cd at the four locations under investigation was higher than the UK Government Food Science Surveillance and World Health Organization maximum recommended levels in mushrooms for human consumption. The Fe and Mn contaminants at Amudo were significant and show the impact of anthropogenic/atmospheric pollution. The potential sources of the heavy metals in the mushrooms were urban waste, dust from mining and quarrying activities, the natural geochemistry of the area, and the use of inorganic fertilizers.

Keywords: Agaricus campestris, edible, health implication, heavy metal, mushroom.

Downloads: 2525

168 Providing a Practical Model to Reduce Maintenance Costs: A Case Study in Golgohar Company

Authors: Iman Atighi, Jalal Soleimannejad, Ahmad Akbarinasab, Saeid Moradpour

Abstract:

In the past, we could increase profit by increasing product prices. In the new decade, however, a competitive market does not let us increase profit by increasing prices. Therefore, the only way to increase profit is to reduce costs. A significant percentage of production costs are maintenance costs, and analysis of these costs could yield more profit. Most maintenance strategies, such as RCM (Reliability-Centered Maintenance), TPM (Total Productive Maintenance), and PM (Preventive Maintenance), try to reduce maintenance costs. In this paper, decreasing the maintenance costs of the Concentration Plant of Golgohar Company (GEG) was examined using MTBF (Mean Time Between Failures) and MTTR (Mean Time to Repair) analyses. These analyses showed that, instead of buying new machines and increasing costs in order to raise capacity, improving the MTBF and MTTR indexes would solve capacity problems in the best way and decrease costs.
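
The link between these two indexes and capacity can be seen from the standard steady-state availability relation, availability = MTBF / (MTBF + MTTR). The sketch below applies it to made-up figures; the operating hours, repair hours and failure count are hypothetical and are not data from the Golgohar plant.

```python
def mtbf(operating_hours, failures):
    """Mean time between failures: operating time per failure."""
    return operating_hours / failures


def mttr(repair_hours, failures):
    """Mean time to repair: repair time per failure."""
    return repair_hours / failures


def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: share of time the equipment is usable."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical month for one machine: 660 h running, 60 h under repair, 12 failures.
plant_mtbf = mtbf(operating_hours=660.0, failures=12)  # 55.0 h
plant_mttr = mttr(repair_hours=60.0, failures=12)      # 5.0 h
plant_availability = availability(plant_mtbf, plant_mttr)

# Halving repair time raises availability without buying new machines.
improved_availability = availability(plant_mtbf, plant_mttr / 2)
```

Raising MTBF or lowering MTTR both push availability towards 1, which is the sense in which improving the indexes adds usable capacity from existing equipment.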

Keywords: Golgohar Iron Ore Mining & Industrial Company, maintainability, maintenance costs, reliability-centered maintenance.

Downloads: 601

167 Off-Line Hand Written Thai Character Recognition using Ant-Miner Algorithm

Authors: P. Phokharatkul, K. Sankhuangaw, S. Somkuarnpanit, S. Phaiboon, C. Kimpan

Abstract:

Many approaches to handwritten Thai character recognition have been proposed, such as comparing heads of characters, fuzzy logic, and structure trees. This paper presents a system for handwritten Thai character recognition based on the Ant-miner algorithm (data mining based on ant colony optimization). Zoning is initially used to determine each character. Then three distinct features (also called attributes) of each character in each zone are extracted. The attributes are Head zone, End point, and Feature code. All attributes are used to construct the classification rules with the Ant-miner algorithm in order to classify 112 Thai characters. For this experiment, the Ant-miner algorithm is adapted, with a small change to increase the recognition rate. The result of this experiment is a 97% recognition rate on the training set (11200 characters) and an 82.7% recognition rate on the unseen test data (22400 characters).

Keywords: Hand written, Thai character recognition, Ant-miner algorithm, distinct feature.

Downloads: 1890

166 Modeling of Pulsatile Blood Flow in a Weak Magnetic Field

Authors: Chee Teck Phua, Gaëlle Lissorgues

Abstract:

Blood pulse is an important human physiological signal commonly used for understanding an individual's physical health. Current methods of non-invasive blood pulse sensing require direct contact with or access to the human skin. As such, the performance of these devices tends to vary with time and is subject to human body fluids (e.g. blood, perspiration and skin oil) and environmental contaminants (e.g. mud, water, etc.). This paper proposes a simulation model for a novel method of non-invasive acquisition of blood pulse using the disturbance created by blood flowing through a localized magnetic field. The simulation model geometry represents a blood vessel, a permanent magnet, a magnetic sensor, surrounding tissues and air in two dimensions. In this model, the velocity and pressure fields in the blood stream are described by the Navier-Stokes equations, and the walls of the blood vessel are assumed to have a no-slip condition. The blood is assumed to have a parabolic velocity profile, considering laminar flow in a major artery near the skin, and the inlet velocity follows a sinusoidal equation. This allows the computational software to compute the interactions between the magnetic vector potential generated by the permanent magnet and the magnetic nanoparticles in the blood. These interactions are simulated based on Maxwell's equations at the location where the magnetic sensor is placed. The simulated magnetic field at the sensor location is found to have sinusoidal waveform characteristics similar to those of the inlet velocity of the blood. The amplitudes of the simulated waveforms at the sensor location are compared with physical measurements on human subjects and found to be highly correlated.
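
The inlet condition described here (parabolic across the vessel, sinusoidal in time, zero at the walls) can be sketched as a small function. The abstract does not give the actual equation or its parameters, so the mean velocity, amplitude, pulse frequency and vessel half-width below are placeholder assumptions chosen only to make the example run.

```python
import math


def inlet_peak_velocity(t, mean=0.3, amplitude=0.1, pulse_hz=1.2):
    """Centerline velocity in m/s, varying sinusoidally in time (assumed form)."""
    return mean + amplitude * math.sin(2 * math.pi * pulse_hz * t)


def axial_velocity(y, t, half_width=1.5e-3):
    """Parabolic profile across a 2D vessel of the given half-width (m).

    y is the distance from the centerline; the velocity falls to zero at
    y = +/- half_width, which matches the no-slip wall condition.
    """
    if abs(y) > half_width:
        raise ValueError("y lies outside the vessel")
    return inlet_peak_velocity(t) * (1 - (y / half_width) ** 2)


centerline = axial_velocity(0.0, 0.0)  # peak of the parabola at t = 0
at_wall = axial_velocity(1.5e-3, 0.0)  # no-slip: zero at the wall
```

In a full simulation this function would only supply the inlet boundary condition; the velocity and pressure fields inside the domain would still come from solving the Navier-Stokes equations.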

Keywords: Blood pulse, magnetic sensing, non-invasive measurement, magnetic disturbance.

Downloads: 2573

165 Effective Keyword and Similarity Thresholds for the Discovery of Themes from the User Web Access Patterns

Authors: Haider A Ramadhan, Khalil Shihab

Abstract:

Clustering techniques have been used by many intelligent software agents to group similar access patterns of Web users into high level themes which express users' intentions and interests. However, such techniques have mostly focused on one salient feature of the Web document visited by the user, namely the extracted keywords. The major aim of these techniques is to come up with an optimal threshold for the number of keywords needed to produce more focused themes. In this paper we focus on both keyword and similarity thresholds to generate more concentrated themes, and hence build a sounder model of user behavior. The purpose of this paper is twofold: to use distance based clustering methods to recognize overall themes from the proxy log file, and to suggest efficient cut-off levels for the keyword and similarity thresholds which tend to produce better clusters with better focus and efficient size.
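
The interplay of the two thresholds can be illustrated with a small single-pass clustering sketch: the keyword threshold limits how many top terms represent each visited page, and the similarity threshold decides whether a page joins an existing theme. Jaccard similarity, the single-pass strategy and the sample pages are assumptions for illustration; the abstract does not specify the distance measure or algorithm the authors used.

```python
def top_keywords(term_counts, keyword_threshold):
    """Keep the `keyword_threshold` most frequent terms (ties broken by name)."""
    ranked = sorted(term_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {term for term, _ in ranked[:keyword_threshold]}


def jaccard(a, b):
    """Overlap of two keyword sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_pages(pages, keyword_threshold, similarity_threshold):
    """Single pass: a page joins the first theme whose seed is similar enough."""
    themes = []  # each theme is (seed keyword set, list of page ids)
    for page_id, counts in pages.items():
        keywords = top_keywords(counts, keyword_threshold)
        for seed, members in themes:
            if jaccard(keywords, seed) >= similarity_threshold:
                members.append(page_id)
                break
        else:
            themes.append((keywords, [page_id]))
    return [members for _, members in themes]


# Hypothetical term counts for three pages taken from a proxy log.
pages = {
    "p1": {"python": 5, "pandas": 3, "numpy": 2, "news": 1},
    "p2": {"python": 4, "numpy": 3, "pandas": 1},
    "p3": {"football": 6, "league": 2, "news": 2},
}
loose = cluster_pages(pages, keyword_threshold=3, similarity_threshold=0.5)
strict = cluster_pages(pages, keyword_threshold=2, similarity_threshold=1.0)
```

With three keywords and a 0.5 cut-off the two programming pages merge into one theme, while two keywords and a cut-off of 1.0 split every page into its own theme, which is the trade-off between focus and cluster size that the thresholds control.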

Keywords: Data mining, knowledge discovery, clustering, data analysis, Web log analysis, theme based searching.

Downloads: 1413

164 Sentiment Analysis: Comparative Analysis of Multilingual Sentiment and Opinion Classification Techniques

Authors: Sannikumar Patel, Brian Nolan, Markus Hofmann, Philip Owende, Kunjan Patel

Abstract:

Sentiment analysis and opinion mining have become emerging topics of research in recent years, but most of the work is focused on data in the English language. Comprehensive research and analysis that considers multiple languages, machine translation techniques, and different classifiers is essential. This paper presents a comparative analysis of different approaches for multilingual sentiment analysis. These approaches are divided into two parts: one classifies text without language translation, and the other translates the testing data to a target language, such as English, before classification. The presented research and results are useful for understanding whether machine translation should be used for multilingual sentiment analysis or whether building language specific sentiment classification systems is a better approach. The effects of language translation techniques, features, and the accuracy of various classifiers for multilingual sentiment analysis are also discussed in this study.

Keywords: Cross-language analysis, machine learning, machine translation, sentiment analysis.

Downloads: 1607

163 Rock Thickness Measurement by Using Self-Excited Acoustical System

Authors: Janusz Kwaśniewski, Ireneusz Dominik, Krzysztof Lalik

Abstract:

Knowledge of rock layer thickness, especially above drilled mining pavements, is crucial for workers' safety. The measuring systems used nowadays are generally imperfect and there is a strong demand for improvement. The application of a new type of measurement system, called the Self-excited Acoustical System, is presented in the paper. Until now the system has been applied to monitor stress changes in metal and concrete constructions. A change in measurement methodology made it possible to measure the thickness of the rocks above tunnels as well as the thickness of a single rock layer. The idea is to find two resonance frequencies of the self-excited system, which consists of a vibration exciter and a vibration receiver placed at a distance, coupled with a proper power amplifier and operating in a closed loop with positive feedback. The resonance with the higher amplitude determines the thickness of the whole rock, whereas the lower amplitude resonance indicates the thickness of a single layer. The results of the laboratory tests conducted on a group of different rock materials are also presented.

Keywords: Autooscillator, non-destructive testing, rock thickness measurement.

Downloads: 2031

162 Performance Analysis of Proprietary and Non-Proprietary Tools for Regression Testing Using Genetic Algorithm

Authors: K. Hema Shankari, R. Thirumalaiselvi, N. V. Balasubramanian

Abstract:

The present paper addresses research in the area of regression testing, with emphasis on automated tools as well as prioritization of test cases. The uniqueness of regression testing and its cyclic nature are pointed out. The difference in approach between industry, with the business model as its basis, and academia, with its focus on data mining, is highlighted. Test metrics are discussed as a prelude to our formula for prioritization; a case study is further discussed to illustrate this methodology. An industrial case study is also described in the paper, where the number of test cases is so large that they have to be grouped as Test Suites. In such situations, a genetic algorithm proposed by us can be used to reconfigure these Test Suites in each cycle of regression testing. A comparison is made between a proprietary tool and an open source tool using the above-mentioned metrics. Our approach is clarified through several tables.
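
The listing does not reproduce the authors' own prioritization formula, but the APFD metric named in the keywords has a standard definition: for n tests and m faults, APFD = 1 - (TF1 + ... + TFm) / (n * m) + 1 / (2n), where TFi is the position of the first test that reveals fault i. The sketch below computes it for a made-up fault matrix; the test names and fault assignments are hypothetical.

```python
def apfd(test_order, faults_detected_by):
    """Average Percentage of Faults Detected for one test ordering.

    test_order: list of test ids in execution order.
    faults_detected_by: maps each fault id to the set of tests that reveal it.
    """
    n = len(test_order)
    position = {test: idx for idx, test in enumerate(test_order, start=1)}
    first_positions = [
        min(position[test] for test in tests)
        for tests in faults_detected_by.values()
    ]
    m = len(first_positions)
    return 1 - sum(first_positions) / (n * m) + 1 / (2 * n)


# Hypothetical fault matrix: which tests expose which faults.
faults = {
    "F1": {"T3"},
    "F2": {"T1", "T4"},
    "F3": {"T5"},
}
original_order = ["T1", "T2", "T3", "T4", "T5"]
prioritized_order = ["T3", "T5", "T1", "T2", "T4"]

apfd_original = apfd(original_order, faults)
apfd_prioritized = apfd(prioritized_order, faults)
```

A higher APFD means faults are revealed earlier in the run, so a reordering produced by any prioritization scheme, including a genetic algorithm, can be scored by comparing its value against the original order.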

Keywords: APFD metric, genetic algorithm, regression testing, RFT tool, test case prioritization, selenium tool.

Downloads: 882

161 Annotations of Gene Pathways Images in Biomedical Publications Using Siamese Network

Authors: Micheal Olaolu Arowolo, Muhammad Azam, Fei He, Mihail Popescu, Dong Xu

Abstract:

As the number of biological articles rises, so does the number of biological pathway figures. Each pathway figure shows gene names and relationships. Manually annotating pathway diagrams is time-consuming. Advanced image understanding models could speed up curation, but they must be more precise. There is rich information in biological pathway figures. The first step in image understanding of these figures is to recognize gene names automatically. Classical optical character recognition methods have been employed for gene name recognition, but they are not optimized for literature mining data. This study devised a method that recognizes the bounding-box image of a gene name as a photo, using deep Siamese neural network models to outperform the existing methods. Using ResNet, DenseNet and Inception architectures, the results reached about 84% accuracy.

Keywords: Biological pathway, gene identification, object detection, Siamese network, ResNet.

Downloads 175

160 Message Framework for Disaster Management: An Application Model for Mines

Authors: A. Baloğlu, A. Çınar

Abstract:

Different tools and technologies have been implemented for Crisis Response and Management (CRM), and they generally use the available network infrastructure for information exchange. Depending on the type of disaster or crisis, the network infrastructure could be affected and might not be able to provide reliable connectivity. Any tool or technology that depends on connectivity would then be unable to fulfill its functions. As a solution, a new message exchange framework has been developed. The framework provides an offline/online information exchange platform for CRM Information Systems (CRMIS); it uses XML compression and packet prioritization algorithms and is based on open-source web technologies. By introducing offline capabilities to web technologies, the framework is able to perform message exchange on unreliable networks. Experiments in a simulation environment gave promising results on low-bandwidth networks (56 kbps and 28.8 kbps) with up to 50% packet loss: the solution successfully transferred all the information on these low-quality networks, where traditional 2- and 3-tier applications failed.
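The abstract does not describe the framework's actual algorithms, so the sketch below only illustrates the general idea of compressing XML messages and sending them in priority order once a connection is available. The class `OutboundQueue`, its methods, the use of `zlib`, and the `send` callback are all hypothetical choices for this example, not the authors' design.

```python
import heapq
import itertools
import zlib


class OutboundQueue:
    """Holds compressed XML messages while offline; sends most urgent first."""

    def __init__(self):
        self._heap = []
        self._sequence = itertools.count()  # keeps FIFO order within a priority

    def enqueue(self, xml_text, priority):
        """Lower `priority` numbers are sent first (an assumption of this sketch)."""
        payload = zlib.compress(xml_text.encode("utf-8"))
        heapq.heappush(self._heap, (priority, next(self._sequence), payload))

    def flush(self, send):
        """Call send(payload) -> bool for each message; stop at the first failure.

        Messages that were not sent stay queued for the next attempt.
        """
        sent = 0
        while self._heap:
            _, _, payload = self._heap[0]
            if not send(payload):
                break
            heapq.heappop(self._heap)
            sent += 1
        return sent

    def __len__(self):
        return len(self._heap)


def decode(payload):
    """Recover the XML text from a compressed payload."""
    return zlib.decompress(payload).decode("utf-8")
```

A caller would enqueue messages regardless of connectivity and call `flush` whenever the link comes back; a failed `send` leaves the remaining messages in place.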

Keywords: Crisis Response and Management, XML Messaging, Web Services, XML compression, Mining.

Downloads 1863

159 Comparison between Associative Classification and Decision Tree for HCV Treatment Response Prediction

Authors: Enas M. F. El Houby, Marwa S. Hassan

Abstract:

Combined therapy using Interferon and Ribavirin is the standard treatment for patients with chronic hepatitis C. However, the number of responders to this treatment is low, whereas its cost and side effects are high. Therefore, there is a clear need to predict a patient's response to the treatment from clinical information, in order to protect patients from intolerable side effects and wasted money. Different machine learning techniques have been developed for this purpose, among them Associative Classification (AC) and Decision Tree (DT). The aim of this research is to compare the performance of these two techniques in predicting virological response to the standard treatment of HCV from clinical information. Data from 200 patients treated with Interferon and Ribavirin were analyzed using AC and DT: 150 cases were used to train the classifiers and 50 cases to test them. The experimental results showed that both techniques gave acceptable results; however, the best accuracy reached 92% for AC and 80% for DT.

Keywords: Associative Classification, Data mining, Decision tree, HCV, interferon.

Downloads 1849

158 Comparative Analysis of Different Page Ranking Algorithms

Authors: S. Prabha, K. Duraiswamy, J. Indhumathi

Abstract:

Search engines play an important role on the internet, retrieving relevant documents from among the huge number of web pages. However, they retrieve a large number of documents, all of which are relevant to the search topics. To retrieve the most meaningful documents related to the search topics, a ranking algorithm is used in information retrieval. One of the issues in data mining is ranking the retrieved documents, and in information retrieval ranking is one of the practical problems. This paper surveys various page ranking algorithms and page segmentation algorithms and compares those used for information retrieval. Diverse Page Rank based algorithms such as Page Rank (PR), Weighted Page Rank (WPR), Weight Page Content Rank (WPCR), Hyperlink-Induced Topic Search (HITS), Distance Rank, Eigen Rumor, Distance Rank Time Rank, Tag Rank, Relational Based Page Rank and Query Dependent Ranking algorithms are discussed and compared.
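As a point of reference for the basic Page Rank (PR) algorithm listed above, here is a minimal power-iteration sketch. It is not taken from the paper: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the toy link graph are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Basic PageRank by power iteration.

    links: dict mapping a page to the list of pages it links to.
    Pages with no outgoing links spread their rank evenly over all pages.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by pages without outlinks is redistributed uniformly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1 - damping) / n + damping * dangling / n for node in nodes
        }
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Toy graph (illustrative only): both "b" and "c" link to "a".
toy_links = {"a": ["b"], "b": ["a"], "c": ["a"]}
toy_ranks = pagerank(toy_links)
```

In the toy graph, page "a" ends up with the highest score because it receives links from both other pages, while "c", which nothing links to, keeps only the baseline share.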

Keywords: Information Retrieval, Web Page Ranking, search engine, web mining, page segmentations.

Downloads 4247

157 Application of a New Hybrid Optimization Algorithm on Cluster Analysis

Authors: T. Niknam, M. Nayeripour, B. Bahmani Firouzi

Abstract:

Clustering techniques have received attention in many areas, including engineering, medicine, biology and data mining. The purpose of clustering is to group together data points that are close to one another. The K-means algorithm is one of the most widely used techniques for clustering. However, K-means has two shortcomings: it depends on the initial state and may converge to local optima, and global solutions of large problems cannot be found with a reasonable amount of computational effort. Many studies in clustering have been carried out to overcome the local optima problem. This paper presents an efficient hybrid evolutionary optimization algorithm that combines Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO), called PSO-ACO, for optimally clustering N objects into K clusters. The new PSO-ACO algorithm is tested on several data sets, and its performance is compared with that of ACO, PSO and K-means clustering. The simulation results show that the proposed evolutionary optimization algorithm is robust and suitable for handling data clustering.
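The dependence of K-means on its initial state can be seen in a small example. The sketch below is plain K-means (Lloyd's algorithm), not the PSO-ACO method of the paper; the function names, the four sample points, and the two sets of starting centroids are assumptions picked to make the effect visible.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(points, initial_centroids, max_iterations=100):
    """Plain K-means; returns (centroids, sum of squared errors)."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        updated = [mean(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, sse


# Four corners of a wide rectangle (illustrative data).
corners = [(0, 0), (0, 1), (4, 0), (4, 1)]
_, sse_good = kmeans(corners, [(0, 0.5), (4, 0.5)])  # splits left / right
_, sse_bad = kmeans(corners, [(2, 0), (2, 1)])       # stuck splitting top / bottom
```

With the second set of starting centroids the algorithm stops immediately at a top/bottom split whose error is far larger than the left/right split, which is the kind of local optimum the abstract refers to.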

Keywords: Ant Colony Optimization (ACO), Data clustering, Hybrid evolutionary optimization algorithm, K-means clustering, Particle Swarm Optimization (PSO).

Downloads 2165

156 Sounds Alike Name Matching for Myanmar Language

Authors: Yuzana, Khin Marlar Tun

Abstract:

A personal name matching system is at the core of essential tasks in national citizen databases, text and web mining, information retrieval, online library systems, e-commerce and record linkage systems. This has prompted extensive research in the area of name matching. Traditional name matching methods are suitable for English and other Latin-based languages. Asian languages that have no word boundaries, such as the Myanmar language, still require a sounds-alike matching system for Unicode-based applications. Hence we propose a matching algorithm that produces analogous sounds-alike (phonetic) patterns suited to Myanmar character spelling. Given the nature of Myanmar characters, we consider word boundary fragmentation and character collation, and we use a pattern conversion algorithm that builds word patterns from the fragmented and collated characters. We create Myanmar sounds-alike phonetic groups to help in phonetic matching. The experimental results show a fragmentation accuracy of 99.32% and a processing time of 1.72 ms.

Keywords: natural language processing, name matching, phonetic matching

Downloads 1760

155 Automatic Clustering of Gene Ontology by Genetic Algorithm

Authors: Razib M. Othman, Safaai Deris, Rosli M. Illias, Zalmiyah Zakaria, Saberi M. Mohamad

Abstract:

The Gene Ontology is now widely used by researchers for biological data mining and information retrieval, integration of biological databases, finding genes, and incorporating knowledge in the Gene Ontology for gene clustering. However, the growth in the size of the Gene Ontology has caused problems in maintaining and processing it. One way to keep it accessible is to cluster it into fragmented groups. Clustering the Gene Ontology is a difficult combinatorial problem and can be modeled as a graph partitioning problem. Additionally, deciding the number k of clusters to use is not straightforward and is a hard algorithmic problem. Therefore, an approach for automatic clustering of the Gene Ontology is proposed that incorporates a cohesion-and-coupling metric into a hybrid algorithm consisting of a genetic algorithm and a split-and-merge algorithm. Experimental results and an example of a modularized Gene Ontology in RDF/XML format are given to illustrate the effectiveness of the algorithm.

Keywords: Automatic clustering, cohesion-and-coupling metric, gene ontology, genetic algorithm, split-and-merge algorithm.

Downloads 1921

154 Technical Aspects of Closing the Loop in Depth-of-Anesthesia Control

Authors: Gorazd Karer

Abstract:

When performing a diagnostic procedure or surgery under general anesthesia (GA), the proper introduction and dosing of anesthetic agents is one of the main tasks of the anesthesiologist. That being said, depth of anesthesia (DoA) also seems to be a suitable process for closed-loop control implementation. To implement such a system, one must be able to acquire the relevant signals online and in real time, as well as stream the calculated control signal to the infusion pump. However, during a procedure, patient monitors and infusion pumps are purposely unable to connect to an external (possibly medically unapproved) device for safety reasons, thus preventing closed-loop control. This paper proposes a conceptual solution to the aforementioned problem. First, it presents some important aspects of contemporary clinical practice. Next, it introduces the closed-loop-control-system structure and the relevant information flow. Focusing on transferring the data from the patient to the computer, it presents a non-invasive image-based system for signal acquisition from a patient monitor for online depth-of-anesthesia assessment. Furthermore, it introduces a User-Datagram-Protocol-based (UDP-based) communication method that can be used for transmitting the calculated anesthetic inflow to the infusion pump. The proposed system is independent of the medical-device manufacturer and is implemented in MATLAB-Simulink, which can be conveniently used for DoA control implementation. The proposed scheme has been tested in a simulated GA setting and is ready to be evaluated in an operating theatre. However, the proposed system is only a step towards a proper closed-loop control system for DoA, which could routinely be used in clinical practice.
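The paper implements its UDP-based link in MATLAB-Simulink and the abstract gives no packet format, so the following Python sketch only illustrates the general pattern of sending one numeric control value per UDP datagram. The 8-byte network-order double encoding, the function names, the unit in the parameter name, and the loopback demonstration are all assumptions for this example.

```python
import socket
import struct


def encode_rate(rate_ml_per_h):
    """Pack one infusion rate as an 8-byte big-endian double (assumed format)."""
    return struct.pack("!d", rate_ml_per_h)


def decode_rate(datagram):
    """Unpack a rate from a datagram produced by encode_rate."""
    return struct.unpack("!d", datagram)[0]


def send_rate(sock, address, rate_ml_per_h):
    """Send a single rate value; UDP gives no delivery guarantee."""
    sock.sendto(encode_rate(rate_ml_per_h), address)


def loopback_roundtrip(rate_ml_per_h, timeout_s=1.0):
    """Send a rate to a receiver on 127.0.0.1 and return what it received."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # let the OS pick a free port
        receiver.settimeout(timeout_s)
        send_rate(sender, receiver.getsockname(), rate_ml_per_h)
        datagram, _ = receiver.recvfrom(64)
        return decode_rate(datagram)
```

Because UDP is connectionless and may drop datagrams, a real controller would need to handle missing or stale values on the receiving side; this sketch leaves that out.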

Keywords: Closed-loop control, Depth of Anesthesia, DoA, optical signal acquisition, Patient State index, PSi, UDP communication protocol.

Downloads 435