Search results for: Frequent itemset mining.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 721

Search results for: Frequent itemset mining.

211 Performance Evaluation of an Ontology-Based Arabic Sentiment Analysis

Authors: Salima Behdenna, Fatiha Barigou, Ghalem Belalem

Abstract:

Due to the quick increase in the volume of Arabic opinions posted on various social media, Arabic sentiment analysis has become one of the most important areas of research. Compared to English, there is very little works on Arabic sentiment analysis, in particular aspect-based sentiment analysis (ABSA). In ABSA, aspect extraction is the most important task. In this paper, we propose a semantic ABSA approach for standard Arabic reviews to extract explicit aspect terms and identify the polarity of the extracted aspects. The proposed approach was evaluated using HAAD datasets. Experiments showed that the proposed approach achieved a good level of performance compared with baseline results. The F-measure was improved by 19% for the aspect term extraction tasks and 55% aspect term polarity task.

Keywords: Sentiment analysis, opinion mining, Arabic, aspect level, opinion, polarity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 382
210 Analysis of Diverse Cluster Ensemble Techniques

Authors: S. Sarumathi, N. Shanthi, P. Ranjetha

Abstract:

Data mining is the procedure of determining interesting patterns from the huge amount of data. With the intention of accessing the data faster the most supporting processes needed is clustering. Clustering is the process of identifying similarity between data according to the individuality present in the data and grouping associated data objects into clusters. Cluster ensemble is the technique to combine various runs of different clustering algorithms to obtain a general partition of the original dataset, aiming for consolidation of outcomes from a collection of individual clustering outcomes. The performances of clustering ensembles are mainly affecting by two principal factors such as diversity and quality. This paper presents the overview about the different cluster ensemble algorithm along with their methods used in cluster ensemble to improve the diversity and quality in the several cluster ensemble related papers and shows the comparative analysis of different cluster ensemble also summarize various cluster ensemble methods. Henceforth this clear analysis will be very useful for the world of clustering experts and also helps in deciding the most appropriate one to determine the problem in hand.

Keywords: Cluster Ensemble, Consensus Function, CSPA, Diversity, HGPA, MCLA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1794
209 VHL, PBRM1 and SETD2 Genes in Kidney Cancer: A Molecular Investigation

Authors: Rozhgar A. Khailany, Mehri Igci, Emine Bayraktar, Sakip Erturhan, Metin Karakok, Ahmet Arslan

Abstract:

Kidney cancer is the most lethal urological cancer accounting for 3% of adult malignancies. VHL, a tumor-suppressor gene, is best known to be associated with renal cell carcinoma (RCC). The VHL functions as negative regulator of hypoxia inducible factors. Recent sequencing efforts have identified several novel frequent mutations of histone modifying and chromatin remodeling genes in ccRCC (clear cell RCC) including PBRM1 and SETD2. The PBRM1 gene encodes the BAF180 protein, which involved in transcriptional activation and repression of selected genes. SETD2 encodes a histone methyltransferase, which may play a role in suppressing tumor development. In this study, RNAs of 30 paired tumor and normal samples that were grouped according to the types of kidney cancer and clinical characteristics of patients, including gender and average age were examined by RT-PCR, SSCP and sequencing techniques. VHL, PBRM1 and SETD2 expressions were relatively down-regulated. However, statistically no significance was found (Wilcoxon signed rank test, p>0.05). Interestingly, no mutation was observed on the contrary of previous studies. Understanding the molecular mechanisms involved in the pathogenesis of RCC has aided the development of molecular-targeted drugs for kidney cancer. Further analysis is required to identify the responsible genes rather than VHL, PBRM1 and SETD2 in kidney cancer.

Keywords: Kidney cancer, molecular biomarker, expression analysis, mutation screening.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1968
208 Case Studies of CSAMT Method Applied to Study of Complex Rock Mass Structure and Hidden Tectonic

Authors: Yuxin Chen, Qingyun Di, C. Dinis da Gama

Abstract:

In projects like waterpower, transportation and mining, etc., proving up the rock-mass structure and hidden tectonic to estimate the geological body-s activity is very important. Integrating the seismic results, drilling and trenching data, CSAMT method was carried out at a planning dame site in southwest China to evaluate the stability of a deformation. 2D and imitated 3D inversion resistivity results of CSAMT method were analyzed. The results indicated that CSAMT was an effective method for defining an outline of deformation body to several hundred meters deep; the Lung Pan Deformation was stable in natural conditions; but uncertain after the future reservoir was impounded. This research presents a good case study of the fine surveying and research on complex geological structure and hidden tectonic in engineering project.

Keywords: CSAMT Surveying, Deformation Stability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2394
207 Classification of Potential Biomarkers in Breast Cancer Using Artificial Intelligence Algorithms and Anthropometric Datasets

Authors: Aref Aasi, Sahar Ebrahimi Bajgani, Erfan Aasi

Abstract:

Breast cancer (BC) continues to be the most frequent cancer in females and causes the highest number of cancer-related deaths in women worldwide. Inspired by recent advances in studying the relationship between different patient attributes and features and the disease, in this paper, we have tried to investigate the different classification methods for better diagnosis of BC in the early stages. In this regard, datasets from the University Hospital Centre of Coimbra were chosen, and different machine learning (ML)-based and neural network (NN) classifiers have been studied. For this purpose, we have selected favorable features among the nine provided attributes from the clinical dataset by using a random forest algorithm. This dataset consists of both healthy controls and BC patients, and it was noted that glucose, BMI, resistin, and age have the most importance, respectively. Moreover, we have analyzed these features with various ML-based classifier methods, including Decision Tree (DT), K-Nearest Neighbors (KNN), eXtreme Gradient Boosting (XGBoost), Logistic Regression (LR), Naive Bayes (NB), and Support Vector Machine (SVM) along with NN-based Multi-Layer Perceptron (MLP) classifier. The results revealed that among different techniques, the SVM and MLP classifiers have the most accuracy, with amounts of 96% and 92%, respectively. These results divulged that the adopted procedure could be used effectively for the classification of cancer cells, and also it encourages further experimental investigations with more collected data for other types of cancers.

Keywords: Breast cancer, health diagnosis, Machine Learning, biomarker classification, Neural Network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 220
206 The Taiwanese Institutional Arrangement for Coastal Management Due to Climate Change

Authors: Wen-Hong Liu, Hao-Tang Jhan, Kun-Lung Lin, Meng-Tsung Lee

Abstract:

Weather disaster events were frequent and caused loss of lives and property in Taiwan recently. Excessive concentration of population and lacking of integrated planning led to Taiwanese coastal zone face the impacts of climate change directly. Comparing to many countries which have already set up legislation, competent authorities and national adaptation strategies, the ability of coastal management adapting to climate change is still insufficient in Taiwan. Therefore, it is necessary to establish a complete institutional arrangement for coastal management due to climate change in order to protect environment and sustain socio-economic development. This paper firstly reviews the impact of climate change on Taiwanese coastal zone. Secondly, development of Taiwanese institutional arrangement of coastal management is introduced. Followed is the analysis of four dimensions of legal basis, competent authority, scientific and financial support and international cooperations of institutional arrangement. The results show that Taiwanese government shall: 1) integrate climate change issue into Coastal Act, Wetland Act and territorial planning Act and pass them; 2) establish the high level competent authority for coastal management; 3) set up the climate change disaster coordinate platform; 4) link scientific information and decision markers; 5) establish the climate change adjustment fund; 6) participate in international climate change organizations and meetings actively; 7) cooperate with near countries to exchange experiences.

Keywords: Climate Change, Coastal Zone Management, Institution Arrangement, Adaptation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1919
205 A Relationship Extraction Method from Literary Fiction Considering Korean Linguistic Features

Authors: Hee-Jeong Ahn, Kee-Won Kim, Seung-Hoon Kim

Abstract:

The knowledge of the relationship between characters can help readers to understand the overall story or plot of the literary fiction. In this paper, we present a method for extracting the specific relationship between characters from a Korean literary fiction. Generally, methods for extracting relationships between characters in text are statistical or computational methods based on the sentence distance between characters without considering Korean linguistic features. Furthermore, it is difficult to extract the relationship with direction from text, such as one-sided love, because they consider only the weight of relationship, without considering the direction of the relationship. Therefore, in order to identify specific relationships between characters, we propose a statistical method considering linguistic features, such as syntactic patterns and speech verbs in Korean. The result of our method is represented by a weighted directed graph of the relationship between the characters. Furthermore, we expect that proposed method could be applied to the relationship analysis between characters of other content like movie or TV drama.

Keywords: Data mining, Korean linguistic feature, literary fiction, relationship extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1755
204 Value Analysis of Islamic Banking and Conventional Banking to Measure Value Co-creation

Authors: Amna Javed, Hisashi Masuda, Youji Kohda

Abstract:

This study examines the value analysis in Islamic and conventional banking services in Pakistan. Many scholars have focused on co-creation of values in services but mainly economic values not non-economic. As Islamic banking is based on Islamic principles that are more concerned with non-economic values (well-being, partnership, fairness, trust worthy, and justice) than economic values as money in terms of interest.  This study is important to know the providers point of view about the co-created values, because, it may be more sustainable and appropriate for today’s unpredictable socio-economic environment. Data were collected from 4 banks (2 Islamic and 2 conventional banks). Text mining technique is applied for data analysis, and values with 100% occurrences in Islamic banking are chosen. The results reflect that Islamic banking is more centric towards non-economic values than economic values and it promotes team work and partnership concept by applying Islamic spirit and trust worthiness concept.

Keywords: Economic values, Islamic banking, Non-economic values, Value system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3199
203 Effect of Reynolds Number on Wall-normal Turbulence Intensity in a Smooth and Rough Open Channel Using both Outer and Inner Scaling

Authors: Md Abdullah Al Faruque, Ram Balachandar

Abstract:

Sudden change of bed condition is frequent in open channel flow. Change of bed condition affects the turbulence characteristics in both streamwise and wall-normal direction. Understanding the turbulence intensity in open channel flow is of vital importance to the modeling of sediment transport and resuspension, bed formation, entrainment, and the exchange of energy and momentum. A comprehensive study was carried out to understand the extent of the effect of Reynolds number and bed roughness on different turbulence characteristics in an open channel flow. Four different bed conditions (impervious smooth bed, impervious continuous rough bed, pervious rough sand bed, and impervious distributed roughness) and two different Reynolds numbers were adopted for this cause. The effect of bed roughness on different turbulence characteristics is seen to be prevalent for most of the flow depth. Effect of Reynolds number on different turbulence characteristics is also evident for flow over different bed, but the extent varies on bed condition. Although the same sand grain is used to create the different rough bed conditions, the difference in turbulence characteristics is an indication that specific geometry of the roughness has an influence on turbulence characteristics. Roughness increases the contribution of the extreme turbulent events which produces very large instantaneous Reynolds shear stress and can potentially influence the sediment transport, resuspension of pollutant from bed and alter the nutrient composition, which eventually affect the sustainability of benthic organisms.

Keywords: Open channel flow, Reynolds Number, roughness, turbulence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1039
202 Evolving Knowledge Extraction from Online Resources

Authors: Zhibo Xiao, Tharini Nayanika de Silva, Kezhi Mao

Abstract:

In this paper, we present an evolving knowledge extraction system named AKEOS (Automatic Knowledge Extraction from Online Sources). AKEOS consists of two modules, including a one-time learning module and an evolving learning module. The one-time learning module takes in user input query, and automatically harvests knowledge from online unstructured resources in an unsupervised way. The output of the one-time learning is a structured vector representing the harvested knowledge. The evolving learning module automatically schedules and performs repeated one-time learning to extract the newest information and track the development of an event. In addition, the evolving learning module summarizes the knowledge learned at different time points to produce a final knowledge vector about the event. With the evolving learning, we are able to visualize the key information of the event, discover the trends, and track the development of an event.

Keywords: Evolving learning, knowledge extraction, knowledge graph, text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 890
201 Discriminant Analysis as a Function of Predictive Learning to Select Evolutionary Algorithms in Intelligent Transportation System

Authors: Jorge A. Ruiz-Vanoye, Ocotlán Díaz-Parra, Alejandro Fuentes-Penna, Daniel Vélez-Díaz, Edith Olaco García

Abstract:

In this paper, we present the use of the discriminant analysis to select evolutionary algorithms that better solve instances of the vehicle routing problem with time windows. We use indicators as independent variables to obtain the classification criteria, and the best algorithm from the generic genetic algorithm (GA), random search (RS), steady-state genetic algorithm (SSGA), and sexual genetic algorithm (SXGA) as the dependent variable for the classification. The discriminant classification was trained with classic instances of the vehicle routing problem with time windows obtained from the Solomon benchmark. We obtained a classification of the discriminant analysis of 66.7%.

Keywords: Intelligent transportation systems, data-mining techniques, evolutionary algorithms, discriminant analysis, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1505
200 Fe, Pb, Mn, and Cd Concentrations in Edible Mushrooms (Agaricus campestris) Grown in Abakaliki, Ebonyi State, Nigeria

Authors: N. O. Omaka, I. F. Offor, R.C. Ehiri

Abstract:

The health and environmental risk of eating mushrooms grown in Abakaliki were evaluated in terms of heavy metals accumulation. Mushroom samples were collected from four different farms located at Izzi, Amajim, Amana and Amudo and analyzed for (iron, lead, manganese and cadmium) using Bulk Scientific Atomic Absorption Spectrophotometer 205. Results indicates mean range of concentrations of the trace metals in the mushrooms were Fe (0.22-152. 03), Mn (0.74-9.76), Pb (0.01.0.80), Cd (0.61-0.82) mg/L respectively. Accumulation of Cd on the four locations under investigation was higher than the UK Government Food Science Surveillance and World Health Organization maximum recommended levels in mushroom for human consumption. The Fe and Mn contaminants of Amudo were significant and show the impact of anthropogenic/atmospheric pollution. The potential sources of the heavy metals in the mushrooms were from urban waste, dust from mining and quarrying activities, natural geochemistry of the area, and use of inorganic fertilizers

Keywords: Agaricus campestris, edible, health implication heavy metal, mushroom.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2522
199 Providing a Practical Model to Reduce Maintenance Costs: A Case Study in Golgohar Company

Authors: Iman Atighi, Jalal Soleimannejad, Ahmad Akbarinasab, Saeid Moradpour

Abstract:

In the past, we could increase profit by increasing product prices. But in the new decade, a competitive market does not let us to increase profit with increase prices. Therefore, the only way to increase profit will be reduce costs. A significant percentage of production costs are the maintenance costs, and analysis of these costs could achieve more profit. Most maintenance strategies such as RCM (Reliability-Center-Maintenance), TPM (Total Productivity Maintenance), PM (Preventive Maintenance) etc., are trying to reduce maintenance costs. In this paper, decreasing the maintenance costs of Concentration Plant of Golgohar Company (GEG) was examined by using of MTBF (Mean Time between Failures) and MTTR (Mean Time to Repair) analyses. These analyses showed that instead of buying new machines and increasing costs in order to promote capacity, the improving of MTBF and MTTR indexes would solve capacity problems in the best way and decrease costs.

Keywords: Golgohar Iron Ore Mining & Industrial Company, maintainability, maintenance costs, reliability-center-maintenance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 591
198 Estimation of the Bit Side Force by Using Artificial Neural Network

Authors: Mohammad Heidari

Abstract:

Horizontal wells are proven to be better producers because they can be extended for a long distance in the pay zone. Engineers have the technical means to forecast the well productivity for a given horizontal length. However, experiences have shown that the actual production rate is often significantly less than that of forecasted. It is a difficult task, if not impossible to identify the real reason why a horizontal well is not producing what was forecasted. Often the source of problem lies in the drilling of horizontal section such as permeability reduction in the pay zone due to mud invasion or snaky well patterns created during drilling. Although drillers aim to drill a constant inclination hole in the pay zone, the more frequent outcome is a sinusoidal wellbore trajectory. The two factors, which play an important role in wellbore tortuosity, are the inclination and side force at bit. A constant inclination horizontal well can only be drilled if the bit face is maintained perpendicular to longitudinal axis of bottom hole assembly (BHA) while keeping the side force nil at the bit. This approach assumes that there exists no formation force at bit. Hence, an appropriate BHA can be designed if bit side force and bit tilt are determined accurately. The Artificial Neural Network (ANN) is superior to existing analytical techniques. In this study, the neural networks have been employed as a general approximation tool for estimation of the bit side forces. A number of samples are analyzed with ANN for parameters of bit side force and the results are compared with exact analysis. Back Propagation Neural network (BPN) is used to approximation of bit side forces. Resultant low relative error value of the test indicates the usability of the BPN in this area.

Keywords: Artificial Neural Network, BHA, Horizontal Well, Stabilizer.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1937
197 A Comparison of Air Quality in Arid and Temperate Climatic Conditions – A Case Study of Leeds and Makkah

Authors: Turki M. Habeebullah, Said Munir, Karl Ropkins, Essam A. Morsy, Atef M. F. Mohammed, Abdulaziz R. Seroji

Abstract:

In this paper air quality conditions in Makkah and Leeds are compared. These two cities have totally different climatic conditions. Makkah climate is characterised as hot and dry (arid) whereas that of Leeds is characterised as cold and wet (temperate). This study uses air quality data from 2012 collected in Makkah, Saudi Arabia and Leeds, UK. The concentrations of all pollutants, except NO are higher in Makkah. Most notable, the concentrations of PM10 are much higher in Makkah than in Leeds. This is probably due to the arid nature of climatic conditions in Makkah and not solely due to anthropogenic emission sources, otherwise like PM10 some of the other pollutants, such as CO, NO, and SO2 would have shown much greater difference between Leeds and Makkah. Correlation analysis is performed between different pollutants at the same site and the same pollutants at different sites. In Leeds the correlation between PM10 and other pollutants is significantly stronger than in Makkah. Weaker correlation in Makkah is probably due to the fact that in Makkah most of the gaseous pollutants are emitted by combustion processes, whereas most of the PM10 is generated by other sources, such as windblown dust, re-suspension, and construction activities. This is in contrast to Leeds where all pollutants including PM10 are predominantly emitted by combustions, such as road traffic. Furthermore, in Leeds frequent rains wash out most of the atmospheric particulate matter and suppress re-suspension of dust. Temporal trends of various pollutants are compared and discussed. This study emphasises the role of climatic conditions in managing air quality, and hence the need for region-specific controlling strategies according to the local climatic and meteorological conditions.

Keywords: Air pollution, climatic conditions, particulate matter, Makkah, Leeds.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2519
196 Unattended Crowdsensing Method to Monitor the Quality Condition of Dirt Roads

Authors: Matías Micheletto, Rodrigo Santos, Sergio F. Ochoa

Abstract:

In developing countries, most roads in rural areas are dirt road. They require frequent maintenance since they are affected by erosive events, such as rain or wind, and the transit of heavy-weight trucks and machinery. Early detection of damages on the road condition is a key aspect, since it allows to reduce the maintenance time and cost, and also the limitations for other vehicles to travel through. Most proposals that help address this problem require the explicit participation of drivers, a permanent internet connection, or important instrumentation in vehicles or roads. These constraints limit the suitability of these proposals when applied into developing regions, like Latin America. This paper proposes an alternative method, based on unattended crowdsensing, to determine the quality of dirt roads in rural areas. This method involves the use of a mobile application that complements the road condition surveys carried out by organizations in charge of the road network maintenance, giving them early warnings about road areas that could be requiring maintenance. Drivers can also take advantage of the early warnings while they move through these roads. The method was evaluated using information from a public dataset. Although they are preliminary, the results indicate the proposal is potentially suitable to provide awareness about dirt roads condition to drivers, transportation authority and road maintenance companies.

Keywords: Dirt roads automatic quality assessment, collaborative system, unattended crowdsensing method, roads quality awareness provision.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 453
195 A Study on Abnormal Behavior Detection in BYOD Environment

Authors: Dongwan Kang, Joohyung Oh, Chaetae Im

Abstract:

Advancement of communication technologies and smart devices in the recent times is leading to changes into the integrated wired and wireless communication environments. Since early days, businesses had started introducing environments for mobile device application to their operations in order to improve productivity (efficiency) and the closed corporate environment gradually shifted to an open structure. Recently, individual user's interest in working environment using mobile devices has increased and a new corporate working environment under the concept of BYOD is drawing attention. BYOD (bring your own device) is a concept where individuals bring in and use their own devices in business activities. Through BYOD, businesses can anticipate improved productivity (efficiency) and also a reduction in the cost of purchasing devices. However, as a result of security threats caused by frequent loss and theft of personal devices and corporate data leaks due to low security, companies are reluctant about adopting BYOD system. In addition, without considerations to diverse devices and connection environments, there are limitations in detecting abnormal behaviors, such as information leaks, using the existing network-based security equipment. This study suggests a method to detect abnormal behaviors according to individual behavioral patterns, rather than the existing signature-based malicious behavior detection, and discusses applications of this method in BYOD environment.

Keywords: BYOD, Security, Anomaly Behavior Detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2021
194 Off-Line Hand Written Thai Character Recognition using Ant-Miner Algorithm

Authors: P. Phokharatkul, K. Sankhuangaw, S. Somkuarnpanit, S. Phaiboon, C. Kimpan

Abstract:

Much research into handwritten Thai character recognition have been proposed, such as comparing heads of characters, Fuzzy logic and structure trees, etc. This paper presents a system of handwritten Thai character recognition, which is based on the Ant-minor algorithm (data mining based on Ant colony optimization). Zoning is initially used to determine each character. Then three distinct features (also called attributes) of each character in each zone are extracted. The attributes are Head zone, End point, and Feature code. All attributes are used for construct the classification rules by an Ant-miner algorithm in order to classify 112 Thai characters. For this experiment, the Ant-miner algorithm is adapted, with a small change to increase the recognition rate. The result of this experiment is a 97% recognition rate of the training set (11200 characters) and 82.7% recognition rate of unseen data test (22400 characters).

Keywords: Hand written, Thai character recognition, Ant-mineralgorithm, distinct feature.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1887
193 Effective Keyword and Similarity Thresholds for the Discovery of Themes from the User Web Access Patterns

Authors: Haider A Ramadhan, Khalil Shihab

Abstract:

Clustering techniques have been used by many intelligent software agents to group similar access patterns of the Web users into high level themes which express users intentions and interests. However, such techniques have been mostly focusing on one salient feature of the Web document visited by the user, namely the extracted keywords. The major aim of these techniques is to come up with an optimal threshold for the number of keywords needed to produce more focused themes. In this paper we focus on both keyword and similarity thresholds to generate themes with concentrated themes, and hence build a more sound model of the user behavior. The purpose of this paper is two fold: use distance based clustering methods to recognize overall themes from the Proxy log file, and suggest an efficient cut off levels for the keyword and similarity thresholds which tend to produce more optimal clusters with better focus and efficient size.

Keywords: Data mining, knowledge discovery, clustering, dataanalysis, Web log analysis, theme based searching.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1409
192 Sentiment Analysis: Comparative Analysis of Multilingual Sentiment and Opinion Classification Techniques

Authors: Sannikumar Patel, Brian Nolan, Markus Hofmann, Philip Owende, Kunjan Patel

Abstract:

Sentiment analysis and opinion mining have become emerging topics of research in recent years but most of the work is focused on data in the English language. A comprehensive research and analysis are essential which considers multiple languages, machine translation techniques, and different classifiers. This paper presents, a comparative analysis of different approaches for multilingual sentiment analysis. These approaches are divided into two parts: one using classification of text without language translation and second using the translation of testing data to a target language, such as English, before classification. The presented research and results are useful for understanding whether machine translation should be used for multilingual sentiment analysis or building language specific sentiment classification systems is a better approach. The effects of language translation techniques, features, and accuracy of various classifiers for multilingual sentiment analysis is also discussed in this study.

Keywords: Cross-language analysis, machine learning, machine translation, sentiment analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1602
191 Rock Thickness Measurement by Using Self-Excited Acoustical System

Authors: JanuszKwaśniewski, IreneuszDominik, KrzysztofLalik

Abstract:

The knowledge about rock layers thickness,especially above drilled mining pavements is crucial for workers safety. The measuring systems used nowadays are generally imperfect and there is a strong demand for improvement. The application of a new type of a measurement system called Self-excited Acoustical System is presentedin the paper. The system was applied until now to monitor stress changes in metal and concrete constructions. The change in measurement methodology resulted in possibility of measuring the thickness of the rocks above the tunnels as well as thickness of a singular rocklayer. The idea is to find two resonance frequencies of the self-exited system,which consists of a vibration exciter and vibration receiver placed at a distance, which are coupled with a proper power amplifier, and which operate in a closed loop with a positive feedback. The resonance with the higher amplitude determines thickness of the whole rock, whereas the lower amplitude resonance indicates thickness of a singular layer. The results of the laboratory tests conducted on a group of different rock materials are also presented.

Keywords: Autooscillator, non-destructive testing, rock thickness measurement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2024
190 Rescue Emergency Drone for Fast Response to Medical Emergencies Due to Traffic Accidents

Authors: Anders S. Kristensen, Dewan Ahsan, Saqib Mehmood, Shakeel Ahmed

Abstract:

Traffic accidents are a result of the convergence of hazards, malfunctioning of vehicles and human negligence that have adverse economic and health impacts and effects. Unfortunately, avoiding them completely is very difficult, but with quick response to rescue and first aid, the mortality rate of inflicted persons can be reduced significantly. Smart and innovative technologies can play a pivotal role to respond faster to traffic crash emergencies comparing conventional means of transportation. For instance, Rescue Emergency Drone (RED) can provide faster and real-time crash site risk assessment to emergency medical services, thereby helping them to quickly and accurately assess a situation, dispatch the right equipment and assist bystanders to treat inflicted person properly. To conduct a research in this regard, the case of a traffic roundabout that is prone to frequent traffic accidents on the outskirts of Esbjerg, a town located on western coast of Denmark is hypothetically considered. Along with manual calculations, Emergency Disaster Management Simulation (EDMSIM) has been used to verify the response time of RED from a fire station of the town to the presumed crash site. The results of the study demonstrate the robustness of RED into emergency services to help save lives. 

Keywords: Automated external defibrillator, medical emergency, fire and rescue services, response time, unmanned aerial system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1698
189 Performance Analysis of Proprietary and Non-Proprietary Tools for Regression Testing Using Genetic Algorithm

Authors: K. Hema Shankari, R. Thirumalaiselvi, N. V. Balasubramanian

Abstract:

The present paper addresses to the research in the area of regression testing with emphasis on automated tools as well as prioritization of test cases. The uniqueness of regression testing and its cyclic nature is pointed out. The difference in approach between industry, with business model as basis, and academia, with focus on data mining, is highlighted. Test Metrics are discussed as a prelude to our formula for prioritization; a case study is further discussed to illustrate this methodology. An industrial case study is also described in the paper, where the number of test cases is so large that they have to be grouped as Test Suites. In such situations, a genetic algorithm proposed by us can be used to reconfigure these Test Suites in each cycle of regression testing. The comparison is made between a proprietary tool and an open source tool using the above-mentioned metrics. Our approach is clarified through several tables.

Keywords: APFD metric, genetic algorithm, regression testing, RFT tool, test case prioritization, selenium tool.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 879
188 Annotations of Gene Pathways Images in Biomedical Publications Using Siamese Network

Authors: Micheal Olaolu Arowolo, Muhammad Azam, Fei He, Mihail Popescu, Dong Xu

Abstract:

As the quantity of biological articles rises, so does the number of biological route figures. Each route figure shows gene names and relationships. Manually annotating pathway diagrams is time-consuming. Advanced image understanding models could speed up curation, but they must be more precise. There is rich information in biological pathway figures. The first step to performing image understanding of these figures is to recognize gene names automatically. Classical optical character recognition methods have been employed for gene name recognition, but they are not optimized for literature mining data. This study devised a method to recognize an image bounding box of gene name as a photo using deep Siamese neural network models to outperform the existing methods using ResNet, DenseNet and Inception architectures, the results obtained about 84% accuracy.

Keywords: Biological pathway, gene identification, object detection, Siamese network, ResNet.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 162
187 Message Framework for Disaster Management: An Application Model for Mines

Authors: A. Baloğlu, A. Çınar

Abstract:

Different tools and technologies were implemented for Crisis Response and Management (CRM) which is generally using available network infrastructure for information exchange. Depending on type of disaster or crisis, network infrastructure could be affected and it could not be able to provide reliable connectivity. Thus any tool or technology that depends on the connectivity could not be able to fulfill its functionalities. As a solution, a new message exchange framework has been developed. Framework provides offline/online information exchange platform for CRM Information Systems (CRMIS) and it uses XML compression and packet prioritization algorithms and is based on open source web technologies. By introducing offline capabilities to the web technologies, framework will be able to perform message exchange on unreliable networks. The experiments done on the simulation environment provide promising results on low bandwidth networks (56kbps and 28.8 kbps) with up to 50% packet loss and the solution is to successfully transfer all the information on these low quality networks where the traditional 2 and 3 tier applications failed.

Keywords: Crisis Response and Management, XML Messaging, Web Services, XML compression, Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1859
186 Comparison between Associative Classification and Decision Tree for HCV Treatment Response Prediction

Authors: Enas M. F. El Houby, Marwa S. Hassan

Abstract:

Combined therapy using Interferon and Ribavirin is the standard treatment in patients with chronic hepatitis C. However, the number of responders to this treatment is low, whereas its cost and side effects are high. Therefore, there is a clear need to predict patient’s response to the treatment based on clinical information to protect the patients from the bad drawbacks, Intolerable side effects and waste of money. Different machine learning techniques have been developed to fulfill this purpose. From these techniques are Associative Classification (AC) and Decision Tree (DT). The aim of this research is to compare the performance of these two techniques in the prediction of virological response to the standard treatment of HCV from clinical information. 200 patients treated with Interferon and Ribavirin; were analyzed using AC and DT. 150 cases had been used to train the classifiers and 50 cases had been used to test the classifiers. The experiment results showed that the two techniques had given acceptable results however the best accuracy for the AC reached 92% whereas for DT reached 80%.

Keywords: Associative Classification, Data mining, Decision tree, HCV, interferon.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1846
185 Comparative Analysis of Different Page Ranking Algorithms

Authors: S. Prabha, K. Duraiswamy, J. Indhumathi

Abstract:

Search engine plays an important role in internet, to retrieve the relevant documents among the huge number of web pages. However, it retrieves more number of documents, which are all relevant to your search topics. To retrieve the most meaningful documents related to search topics, ranking algorithm is used in information retrieval technique. One of the issues in data miming is ranking the retrieved document. In information retrieval the ranking is one of the practical problems. This paper includes various Page Ranking algorithms, page segmentation algorithms and compares those algorithms used for Information Retrieval. Diverse Page Rank based algorithms like Page Rank (PR), Weighted Page Rank (WPR), Weight Page Content Rank (WPCR), Hyperlink Induced Topic Selection (HITS), Distance Rank, Eigen Rumor, Distance Rank Time Rank, Tag Rank, Relational Based Page Rank and Query Dependent Ranking algorithms are discussed and compared.

Keywords: Information Retrieval, Web Page Ranking, search engine, web mining, page segmentations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4237
184 Application of a New Hybrid Optimization Algorithm on Cluster Analysis

Authors: T. Niknam, M. Nayeripour, B.Bahmani Firouzi

Abstract:

Clustering techniques have received attention in many areas including engineering, medicine, biology and data mining. The purpose of clustering is to group together data points, which are close to one another. The K-means algorithm is one of the most widely used techniques for clustering. However, K-means has two shortcomings: dependency on the initial state and convergence to local optima and global solutions of large problems cannot found with reasonable amount of computation effort. In order to overcome local optima problem lots of studies done in clustering. This paper is presented an efficient hybrid evolutionary optimization algorithm based on combining Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO), called PSO-ACO, for optimally clustering N object into K clusters. The new PSO-ACO algorithm is tested on several data sets, and its performance is compared with those of ACO, PSO and K-means clustering. The simulation results show that the proposed evolutionary optimization algorithm is robust and suitable for handing data clustering.

Keywords: Ant Colony Optimization (ACO), Data clustering, Hybrid evolutionary optimization algorithm, K-means clustering, Particle Swarm Optimization (PSO).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2157
183 Sounds Alike Name Matching for Myanmar Language

Authors: Yuzana, Khin Marlar Tun

Abstract:

Personal name matching system is the core of essential task in national citizen database, text and web mining, information retrieval, online library system, e-commerce and record linkage system. It has necessitated to the all embracing research in the vicinity of name matching. Traditional name matching methods are suitable for English and other Latin based language. Asian languages which have no word boundary such as Myanmar language still requires sounds alike matching system in Unicode based application. Hence we proposed matching algorithm to get analogous sounds alike (phonetic) pattern that is convenient for Myanmar character spelling. According to the nature of Myanmar character, we consider for word boundary fragmentation, collation of character. Thus we use pattern conversion algorithm which fabricates words in pattern with fragmented and collated. We create the Myanmar sounds alike phonetic group to help in the phonetic matching. The experimental results show that fragmentation accuracy in 99.32% and processing time in 1.72 ms.

Keywords: natural language processing, name matching, phonetic matching

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1751
182 Automatic Clustering of Gene Ontology by Genetic Algorithm

Authors: Razib M. Othman, Safaai Deris, Rosli M. Illias, Zalmiyah Zakaria, Saberi M. Mohamad

Abstract:

Nowadays, Gene Ontology has been used widely by many researchers for biological data mining and information retrieval, integration of biological databases, finding genes, and incorporating knowledge in the Gene Ontology for gene clustering. However, the increase in size of the Gene Ontology has caused problems in maintaining and processing them. One way to obtain their accessibility is by clustering them into fragmented groups. Clustering the Gene Ontology is a difficult combinatorial problem and can be modeled as a graph partitioning problem. Additionally, deciding the number k of clusters to use is not easily perceived and is a hard algorithmic problem. Therefore, an approach for solving the automatic clustering of the Gene Ontology is proposed by incorporating cohesion-and-coupling metric into a hybrid algorithm consisting of a genetic algorithm and a split-and-merge algorithm. Experimental results and an example of modularized Gene Ontology in RDF/XML format are given to illustrate the effectiveness of the algorithm.

Keywords: Automatic clustering, cohesion-and-coupling metric, gene ontology; genetic algorithm, split-and-merge algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1916