Search results for: Frequent itemset mining.
218 An Efficient Protocol for Cyclic Somatic Embryogenesis in Neem (Azadirachta indica A Juss.)
Authors: Mithilesh Singh, Rakhi Chaturvedi
Abstract:
Neem is a highly heterozygous and commercially important perennial plant. Conventionally, it is propagated by seeds, which lose viability within two weeks. The strictly cross-pollinating nature of the plant poses a serious barrier to genetic improvement by conventional methods. Alternative methods of tree improvement such as somatic hybridization, mutagenesis and genetic transformation require an efficient in vitro plant regeneration system. In this regard, somatic embryogenesis, particularly secondary somatic embryogenesis, may offer an effective system for large-scale plant propagation without affecting the clonal fidelity of the regenerants. It can be used for synthetic seed production, which further bolsters conservation of this tree species, which is otherwise very difficult. The present report describes the culture conditions necessary to induce and maintain repetitive somatic embryogenesis, for the first time, in neem. Out of the various treatments tested, somatic embryos were induced directly from immature zygotic embryos of neem on MS + TDZ (0.1 μM) + ABA (4 μM) in more than 76% of cultures. Direct secondary somatic embryogenesis occurred from primary somatic embryos on MS + IAA (5 μM) + GA3 (5 μM) in 12.5% of cultures. Embryogenic competence of the explant as well as of the primary embryos was maintained for a long period by repeated subcultures at frequent intervals. A maximum of 10% of these somatic embryos were converted into plantlets.
Keywords: Azadirachta indica A. Juss., Cytokinin, Somatic embryogenesis, zygotic embryo culture.
Downloads 1467
217 A Relationship Extraction Method from Literary Fiction Considering Korean Linguistic Features
Authors: Hee-Jeong Ahn, Kee-Won Kim, Seung-Hoon Kim
Abstract:
Knowledge of the relationships between characters can help readers understand the overall story or plot of a work of literary fiction. In this paper, we present a method for extracting specific relationships between characters from Korean literary fiction. Generally, methods for extracting relationships between characters in text are statistical or computational methods based on the sentence distance between characters, and they do not consider Korean linguistic features. Furthermore, it is difficult to extract directed relationships, such as one-sided love, from text, because these methods consider only the weight of a relationship and not its direction. Therefore, in order to identify specific relationships between characters, we propose a statistical method that considers linguistic features, such as syntactic patterns and speech verbs in Korean. The result of our method is represented by a weighted directed graph of the relationships between the characters. Furthermore, we expect that the proposed method could be applied to the analysis of relationships between characters in other content such as movies or TV dramas.
Keywords: Data mining, Korean linguistic feature, literary fiction, relationship extraction.
Downloads 1795
216 Value Analysis of Islamic Banking and Conventional Banking to Measure Value Co-creation
Authors: Amna Javed, Hisashi Masuda, Youji Kohda
Abstract:
This study examines value analysis in Islamic and conventional banking services in Pakistan. Many scholars have focused on the co-creation of value in services, but mainly on economic values rather than non-economic ones.
Keywords: Economic values, Islamic banking, Non-economic values, Value system.
Downloads 3257
215 Investigating the Effect of Uncertainty on a LP Model of a Petrochemical Complex: Stability Analysis Approach
Authors: Abdallah Al-Shammari
Abstract:
This study discusses the effect of uncertainty on the production levels of a petrochemical complex. Uncertainty or variations in some model parameters, such as prices and the supply and demand of materials, can affect the optimality or the efficiency of any chemical process. For any petrochemical complex with many plants, there are many sources of uncertainty and frequent variations which require more attention. Many optimization approaches have been proposed in the literature to incorporate uncertainty within the model in order to obtain a robust solution. In this work, a stability analysis approach is applied to a deterministic LP model of a petrochemical complex consisting of ten plants to investigate the effect of such variations on the obtained optimal production levels. The proposed approach can determine the allowable variation ranges of some parameters, mainly objective or RHS coefficients, before the system loses its optimality. Parameters with a relatively narrow range of variation, i.e. stability limits, are classified as sensitive parameters or constraints that need accurate estimation or intensive monitoring. These stability limits offer easy-to-use information to the decision maker and help in understanding the interaction between some model parameters and in deciding when the system needs to be re-optimized. The study shows that the maximum production of ethylene and the prices of intermediate products are the most sensitive factors that affect the stability of the optimum solution.
Keywords: Linear programming, Petrochemicals, stability analysis, uncertainty
Downloads 1956
214 Evolving Knowledge Extraction from Online Resources
Authors: Zhibo Xiao, Tharini Nayanika de Silva, Kezhi Mao
Abstract:
In this paper, we present an evolving knowledge extraction system named AKEOS (Automatic Knowledge Extraction from Online Sources). AKEOS consists of two modules, including a one-time learning module and an evolving learning module. The one-time learning module takes in user input query, and automatically harvests knowledge from online unstructured resources in an unsupervised way. The output of the one-time learning is a structured vector representing the harvested knowledge. The evolving learning module automatically schedules and performs repeated one-time learning to extract the newest information and track the development of an event. In addition, the evolving learning module summarizes the knowledge learned at different time points to produce a final knowledge vector about the event. With the evolving learning, we are able to visualize the key information of the event, discover the trends, and track the development of an event.
Keywords: Evolving learning, knowledge extraction, knowledge graph, text mining.
Downloads 942
213 Discriminant Analysis as a Function of Predictive Learning to Select Evolutionary Algorithms in Intelligent Transportation System
Authors: Jorge A. Ruiz-Vanoye, Ocotlán Díaz-Parra, Alejandro Fuentes-Penna, Daniel Vélez-Díaz, Edith Olaco García
Abstract:
In this paper, we present the use of the discriminant analysis to select evolutionary algorithms that better solve instances of the vehicle routing problem with time windows. We use indicators as independent variables to obtain the classification criteria, and the best algorithm from the generic genetic algorithm (GA), random search (RS), steady-state genetic algorithm (SSGA), and sexual genetic algorithm (SXGA) as the dependent variable for the classification. The discriminant classification was trained with classic instances of the vehicle routing problem with time windows obtained from the Solomon benchmark. We obtained a classification of the discriminant analysis of 66.7%.
Keywords: Intelligent transportation systems, data-mining techniques, evolutionary algorithms, discriminant analysis, machine learning.
Downloads 1547
212 Fe, Pb, Mn, and Cd Concentrations in Edible Mushrooms (Agaricus campestris) Grown in Abakaliki, Ebonyi State, Nigeria
Authors: N. O. Omaka, I. F. Offor, R.C. Ehiri
Abstract:
The health and environmental risks of eating mushrooms grown in Abakaliki were evaluated in terms of heavy metal accumulation. Mushroom samples were collected from four different farms located at Izzi, Amajim, Amana and Amudo and analyzed for iron, lead, manganese and cadmium using a Bulk Scientific Atomic Absorption Spectrophotometer 205. The results indicate that the mean concentration ranges of the trace metals in the mushrooms were Fe (0.22-152.03), Mn (0.74-9.76), Pb (0.01-0.80) and Cd (0.61-0.82) mg/L, respectively. Accumulation of Cd at the four locations under investigation was higher than the UK Government Food Science Surveillance and World Health Organization maximum recommended levels in mushrooms for human consumption. The Fe and Mn contamination at Amudo was significant and shows the impact of anthropogenic/atmospheric pollution. The potential sources of the heavy metals in the mushrooms were urban waste, dust from mining and quarrying activities, the natural geochemistry of the area, and the use of inorganic fertilizers.
Keywords: Agaricus campestris, edible, health implication, heavy metal, mushroom.
Downloads 2564
211 Providing a Practical Model to Reduce Maintenance Costs: A Case Study in Golgohar Company
Authors: Iman Atighi, Jalal Soleimannejad, Ahmad Akbarinasab, Saeid Moradpour
Abstract:
In the past, we could increase profit by increasing product prices. But in the new decade, a competitive market does not let us increase profit by increasing prices. Therefore, the only way to increase profit is to reduce costs. A significant percentage of production costs are maintenance costs, and analysis of these costs could yield more profit. Most maintenance strategies, such as RCM (Reliability-Centered Maintenance), TPM (Total Productive Maintenance) and PM (Preventive Maintenance), try to reduce maintenance costs. In this paper, reducing the maintenance costs of the Concentration Plant of Golgohar Company (GEG) was examined using MTBF (Mean Time between Failures) and MTTR (Mean Time to Repair) analyses. These analyses showed that, instead of buying new machines and increasing costs in order to raise capacity, improving the MTBF and MTTR indexes would solve capacity problems in the best way and decrease costs.
Keywords: Golgohar Iron Ore Mining & Industrial Company, maintainability, maintenance costs, reliability-centered maintenance.
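As a rough illustration of the MTBF and MTTR indexes named in this abstract, the sketch below computes both from a failure log and derives steady-state availability with the standard relation MTBF / (MTBF + MTTR). The function names and the figures are hypothetical and are not taken from the paper.

```python
def mtbf(total_uptime_hours, failure_count):
    """Mean time between failures: operating hours per failure."""
    if failure_count <= 0:
        raise ValueError("failure_count must be positive")
    return total_uptime_hours / failure_count


def mttr(total_repair_hours, failure_count):
    """Mean time to repair: repair hours per failure."""
    if failure_count <= 0:
        raise ValueError("failure_count must be positive")
    return total_repair_hours / failure_count


def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: fraction of time the equipment is usable."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical log: 900 h of operation, 100 h of repair, 10 failures.
mean_between = mtbf(900.0, 10)   # 90 h between failures
mean_repair = mttr(100.0, 10)    # 10 h per repair
print(round(availability(mean_between, mean_repair), 3))
```

Under these assumed figures, raising MTBF or lowering MTTR increases availability without adding machines, which is the kind of trade-off the abstract describes.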
Downloads 658
210 Biological Diagnosis and Physiopathology of von Willebrand's Disease in a Part of the Algerian Population in the East and the South
Authors: H. Djaara, M. Yahia, H. Bousselsela, N Khelif, A. Zidani, S. Benbia.
Abstract:
Von Willebrand's disease is the most common inherited bleeding disorder in humans; it is caused by qualitative abnormalities of the von Willebrand factor (vWF). Our objective is to determine the prevalence of this disease in part of the Algerian population in the East and the South by a biological diagnosis based on specific biological tests (automated platelet count, bleeding time (TS), time of cephalin + activator (TCA), measurement of the prothrombin rate (TP), vWF rate and factor VIII rate, and molecular electrophoresis of vWF multimers in agarose gel in the presence of SDS). Four patients with type III or severe von Willebrand's disease were found among 200 suspected cases. All cases showed a deficit in vWF rate (< 5%) and factor VIII (P < 0.0001), a highly significant lengthening of the TCA (P < 0.0001) and of the bleeding time (P < 0.0001), a normal blood platelet rate (P = 0.7433), a normal prothrombin rate (P = 0.5808), and an absence of all vWF multimers in patient plasma. Severe von Willebrand's disease is not only a pathology of primary haemostasis; it can also be accompanied by a coagulation anomaly due to a deficit in factor VIII. In the studied population, von Willebrand's disease is less frequent (2%) than other hemorrhagic syndromes identified by the differential diagnosis, such as thrombocytopenia (36%).
Keywords: Von Willebrand's disease, differential diagnosis, von Willebrand factor, factor VIII, biological diagnosis, thrombocytopenia.
Downloads 1737
209 Off-Line Hand Written Thai Character Recognition using Ant-Miner Algorithm
Authors: P. Phokharatkul, K. Sankhuangaw, S. Somkuarnpanit, S. Phaiboon, C. Kimpan
Abstract:
Much research into handwritten Thai character recognition has been proposed, such as comparing heads of characters, fuzzy logic and structure trees. This paper presents a system for handwritten Thai character recognition based on the Ant-Miner algorithm (data mining based on ant colony optimization). Zoning is initially used to determine each character. Then three distinct features (also called attributes) of each character in each zone are extracted. The attributes are Head zone, End point, and Feature code. All attributes are used to construct the classification rules with the Ant-Miner algorithm in order to classify 112 Thai characters. For this experiment, the Ant-Miner algorithm is adapted, with a small change to increase the recognition rate. The result of this experiment is a 97% recognition rate on the training set (11200 characters) and an 82.7% recognition rate on the unseen test data (22400 characters).
Keywords: Hand written, Thai character recognition, Ant-miner algorithm, distinct feature.
Downloads 1933
208 Effective Keyword and Similarity Thresholds for the Discovery of Themes from the User Web Access Patterns
Authors: Haider A Ramadhan, Khalil Shihab
Abstract:
Clustering techniques have been used by many intelligent software agents to group similar access patterns of Web users into high-level themes which express users' intentions and interests. However, such techniques have mostly focused on one salient feature of the Web document visited by the user, namely the extracted keywords. The major aim of these techniques is to come up with an optimal threshold for the number of keywords needed to produce more focused themes. In this paper we focus on both keyword and similarity thresholds to generate more concentrated themes, and hence build a sounder model of user behavior. The purpose of this paper is twofold: to use distance-based clustering methods to recognize overall themes from the proxy log file, and to suggest efficient cut-off levels for the keyword and similarity thresholds which tend to produce better clusters with sharper focus and efficient size.
Keywords: Data mining, knowledge discovery, clustering, data analysis, Web log analysis, theme based searching.
Downloads 1455
207 Sentiment Analysis: Comparative Analysis of Multilingual Sentiment and Opinion Classification Techniques
Authors: Sannikumar Patel, Brian Nolan, Markus Hofmann, Philip Owende, Kunjan Patel
Abstract:
Sentiment analysis and opinion mining have become emerging topics of research in recent years, but most of the work is focused on data in the English language. Comprehensive research and analysis that consider multiple languages, machine translation techniques, and different classifiers are essential. This paper presents a comparative analysis of different approaches for multilingual sentiment analysis. These approaches are divided into two parts: one classifies text without language translation, and the second translates the testing data to a target language, such as English, before classification. The presented research and results are useful for understanding whether machine translation should be used for multilingual sentiment analysis or whether building language-specific sentiment classification systems is a better approach. The effects of language translation techniques, features, and the accuracy of various classifiers for multilingual sentiment analysis are also discussed in this study.
Keywords: Cross-language analysis, machine learning, machine translation, sentiment analysis.
Downloads 1667
206 Rock Thickness Measurement by Using Self-Excited Acoustical System
Authors: Janusz Kwaśniewski, Ireneusz Dominik, Krzysztof Lalik
Abstract:
Knowledge of rock layer thickness, especially above drilled mining pavements, is crucial for workers' safety. The measuring systems used nowadays are generally imperfect and there is a strong demand for improvement. The application of a new type of measurement system, called the Self-excited Acoustical System, is presented in the paper. Until now, the system has been applied to monitor stress changes in metal and concrete constructions. A change in measurement methodology made it possible to measure the thickness of the rocks above the tunnels as well as the thickness of a single rock layer. The idea is to find two resonance frequencies of the self-excited system, which consists of a vibration exciter and a vibration receiver placed at a distance, coupled with a proper power amplifier and operating in a closed loop with positive feedback. The resonance with the higher amplitude determines the thickness of the whole rock, whereas the lower-amplitude resonance indicates the thickness of a single layer. The results of laboratory tests conducted on a group of different rock materials are also presented.
Keywords: Autooscillator, non-destructive testing, rock thickness measurement.
Downloads 2071
205 Information Requirements for Vessel Traffic Service Operations
Authors: Fan Li, Chun-Hsien Chen, Li Pheng Khoo
Abstract:
Operators of a vessel traffic service (VTS) center provide three different types of services to vessels: information service, navigational assistance and traffic organization. To provide these services, operators monitor vessel traffic through a computer interface and provide navigational advice based on information integrated from multiple sources, including the automatic identification system (AIS), radar system, and closed circuit television (CCTV) system. Therefore, this information is crucial in VTS operation. However, what information the VTS operator actually needs to offer services efficiently and properly is unclear. The aim of this study is to investigate the information requirements for VTS operation. To achieve this aim, field observation was carried out to elicit the information requirements for VTS operation. The study revealed that the most frequent and important tasks were handling arrival vessel reports, potential conflict control and abeam vessel reports. Current location and vessel name were used in all tasks. Hazard cargo information was particularly required when operators handled arrival vessel reports. The speed, the course, and the distance of two or several vessels were only used in potential conflict control. The information requirements identified in this study can be utilized in designing a human-computer interface that takes into consideration what information should be displayed and when, and might be further used to build the foundation of a decision support system for VTS.
Keywords: Vessel traffic service, information requirements, hierarchy task analysis, field observation.
Downloads 1595
204 Microservices-Based Provisioning and Control of Network Services for Heterogeneous Networks
Authors: Shameemraj M. Nadaf, Sipra Behera, Hemant K. Rath, Garima Mishra, Raja Mukhopadhyay, Sumanta Patro
Abstract:
Microservices architecture has been widely embraced for rapid, frequent, and reliable delivery of complex applications. It enables organizations to evolve their technology stack in various domains. Today, the networking domain is flooded with a plethora of devices and software solutions which address different functionalities, ranging from elementary operations, viz., switching, routing, firewall etc., to intelligent services based on complex analytics and insights. In this paper, we attempt to bring in a microservices-based approach for agile and adaptive delivery of network services for any underlying networking technology. We discuss the life cycle management of each individual microservice and a distributed control approach with emphasis on dynamic provisioning, management, and orchestration in an automated fashion, which can provide seamless operations in large-scale networks. We have conducted validations of the system in a lab testbed comprising Traditional/Legacy and Software Defined Wireless Local Area networks.
Keywords: Microservices architecture, software defined wireless networks, traditional wireless networks, automation, orchestration, intelligent networks, network analytics, seamless management, single pane control, fine-grain control.
Downloads 900
203 Performance Analysis of Proprietary and Non-Proprietary Tools for Regression Testing Using Genetic Algorithm
Authors: K. Hema Shankari, R. Thirumalaiselvi, N. V. Balasubramanian
Abstract:
The present paper addresses research in the area of regression testing, with emphasis on automated tools as well as prioritization of test cases. The uniqueness of regression testing and its cyclic nature are pointed out. The difference in approach between industry, with the business model as its basis, and academia, with its focus on data mining, is highlighted. Test metrics are discussed as a prelude to our formula for prioritization; a case study is further discussed to illustrate this methodology. An industrial case study is also described in the paper, where the number of test cases is so large that they have to be grouped into Test Suites. In such situations, a genetic algorithm proposed by us can be used to reconfigure these Test Suites in each cycle of regression testing. A comparison is made between a proprietary tool and an open source tool using the above-mentioned metrics. Our approach is clarified through several tables.
Keywords: APFD metric, genetic algorithm, regression testing, RFT tool, test case prioritization, selenium tool.
Downloads 918
202 Annotations of Gene Pathways Images in Biomedical Publications Using Siamese Network
Authors: Micheal Olaolu Arowolo, Muhammad Azam, Fei He, Mihail Popescu, Dong Xu
Abstract:
As the quantity of biological articles rises, so does the number of biological pathway figures. Each pathway figure shows gene names and relationships. Manually annotating pathway diagrams is time-consuming. Advanced image understanding models could speed up curation, but they must be more precise. There is rich information in biological pathway figures. The first step in performing image understanding of these figures is to recognize gene names automatically. Classical optical character recognition methods have been employed for gene name recognition, but they are not optimized for literature mining data. This study devised a method that recognizes the image bounding box of a gene name as a photo, using deep Siamese neural network models with ResNet, DenseNet and Inception architectures to outperform the existing methods; the results showed about 84% accuracy.
Keywords: Biological pathway, gene identification, object detection, Siamese network, ResNet.
Downloads 249
201 Message Framework for Disaster Management: An Application Model for Mines
Authors: A. Baloğlu, A. Çınar
Abstract:
Different tools and technologies have been implemented for Crisis Response and Management (CRM), which generally uses the available network infrastructure for information exchange. Depending on the type of disaster or crisis, the network infrastructure could be affected and might not be able to provide reliable connectivity. Thus any tool or technology that depends on connectivity might not be able to fulfill its functions. As a solution, a new message exchange framework has been developed. The framework provides an offline/online information exchange platform for CRM Information Systems (CRMIS); it uses XML compression and packet prioritization algorithms and is based on open source web technologies. By introducing offline capabilities to the web technologies, the framework is able to perform message exchange on unreliable networks. The experiments done in the simulation environment provide promising results on low-bandwidth networks (56 kbps and 28.8 kbps) with up to 50% packet loss, and the solution successfully transfers all the information on these low-quality networks where traditional 2- and 3-tier applications failed.
Keywords: Crisis Response and Management, XML Messaging, Web Services, XML compression, Mining.
Downloads 1904
200 Comparison between Associative Classification and Decision Tree for HCV Treatment Response Prediction
Authors: Enas M. F. El Houby, Marwa S. Hassan
Abstract:
Combined therapy using Interferon and Ribavirin is the standard treatment in patients with chronic hepatitis C. However, the number of responders to this treatment is low, whereas its cost and side effects are high. Therefore, there is a clear need to predict a patient's response to the treatment based on clinical information, to protect patients from its drawbacks, intolerable side effects and waste of money. Different machine learning techniques have been developed to fulfill this purpose. Among these techniques are Associative Classification (AC) and Decision Tree (DT). The aim of this research is to compare the performance of these two techniques in the prediction of virological response to the standard treatment of HCV from clinical information. 200 patients treated with Interferon and Ribavirin were analyzed using AC and DT. 150 cases were used to train the classifiers and 50 cases were used to test them. The experimental results showed that both techniques gave acceptable results; however, the best accuracy for AC reached 92%, whereas for DT it reached 80%.
Keywords: Associative Classification, Data mining, Decision tree, HCV, interferon.
Downloads 1900
199 Comparative Analysis of Different Page Ranking Algorithms
Authors: S. Prabha, K. Duraiswamy, J. Indhumathi
Abstract:
Search engines play an important role on the internet in retrieving the relevant documents from among a huge number of web pages. However, a search engine retrieves a large number of documents, all of which are relevant to the search topics. To retrieve the most meaningful documents related to the search topics, a ranking algorithm is used in the information retrieval technique. One of the issues in data mining is ranking the retrieved documents, and in information retrieval ranking is one of the practical problems. This paper covers various Page Ranking algorithms and page segmentation algorithms, and compares those algorithms used for Information Retrieval. Diverse Page Rank based algorithms such as Page Rank (PR), Weighted Page Rank (WPR), Weight Page Content Rank (WPCR), Hyperlink-Induced Topic Search (HITS), Distance Rank, Eigen Rumor, Distance Rank Time Rank, Tag Rank, Relational Based Page Rank and Query Dependent Ranking algorithms are discussed and compared.
Keywords: Information Retrieval, Web Page Ranking, search engine, web mining, page segmentations.
Downloads 4289
198 Application of a New Hybrid Optimization Algorithm on Cluster Analysis
Authors: T. Niknam, M. Nayeripour, B. Bahmani Firouzi
Abstract:
Clustering techniques have received attention in many areas including engineering, medicine, biology and data mining. The purpose of clustering is to group together data points which are close to one another. The K-means algorithm is one of the most widely used techniques for clustering. However, K-means has two shortcomings: it depends on the initial state and converges to local optima, and global solutions of large problems cannot be found with a reasonable amount of computational effort. Many studies have been done in clustering to overcome the local optima problem. This paper presents an efficient hybrid evolutionary optimization algorithm based on combining Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO), called PSO-ACO, for optimally clustering N objects into K clusters. The new PSO-ACO algorithm is tested on several data sets, and its performance is compared with those of ACO, PSO and K-means clustering. The simulation results show that the proposed evolutionary optimization algorithm is robust and suitable for handling data clustering.
Keywords: Ant Colony Optimization (ACO), Data clustering, Hybrid evolutionary optimization algorithm, K-means clustering, Particle Swarm Optimization (PSO).
Downloads 2200
197 Sounds Alike Name Matching for Myanmar Language
Authors: Yuzana, Khin Marlar Tun
Abstract:
A personal name matching system is at the core of essential tasks in national citizen databases, text and web mining, information retrieval, online library systems, e-commerce and record linkage systems. This has necessitated extensive research in the area of name matching. Traditional name matching methods are suitable for English and other Latin-based languages. Asian languages that have no word boundaries, such as the Myanmar language, still require a sounds-alike matching system in Unicode-based applications. Hence we propose a matching algorithm to obtain analogous sounds-alike (phonetic) patterns that are convenient for Myanmar character spelling. In keeping with the nature of Myanmar characters, we consider word boundary fragmentation and collation of characters. Thus we use a pattern conversion algorithm which builds words into patterns after fragmentation and collation. We create Myanmar sounds-alike phonetic groups to help in the phonetic matching. The experimental results show a fragmentation accuracy of 99.32% and a processing time of 1.72 ms.
Keywords: natural language processing, name matching, phonetic matching
Downloads 1799
196 Automatic Clustering of Gene Ontology by Genetic Algorithm
Authors: Razib M. Othman, Safaai Deris, Rosli M. Illias, Zalmiyah Zakaria, Saberi M. Mohamad
Abstract:
The Gene Ontology is now widely used by researchers for biological data mining and information retrieval, integration of biological databases, finding genes, and incorporating its knowledge into gene clustering. However, the growth in size of the Gene Ontology has made it difficult to maintain and process. One way to keep it accessible is to cluster it into fragmented groups. Clustering the Gene Ontology is a difficult combinatorial problem and can be modeled as a graph partitioning problem. Additionally, deciding the number k of clusters to use is not obvious and is itself a hard algorithmic problem. Therefore, an approach to the automatic clustering of the Gene Ontology is proposed that incorporates a cohesion-and-coupling metric into a hybrid algorithm consisting of a genetic algorithm and a split-and-merge algorithm. Experimental results and an example of a modularized Gene Ontology in RDF/XML format are given to illustrate the effectiveness of the algorithm.
Keywords: Automatic clustering, cohesion-and-coupling metric, gene ontology, genetic algorithm, split-and-merge algorithm.
Downloads 1956
195 Road Accidents Bigdata Mining and Visualization Using Support Vector Machines
Authors: Usha Lokala, Srinivas Nowduri, Prabhakar K. Sharma
Abstract:
Useful information has been extracted from road accident data in the United Kingdom (UK), using data analytics methods, to help avoid possible accidents in rural and urban areas. This analysis makes use of several methodologies such as data integration, support vector machines (SVM), correlation machines and multinomial goodness. The entire datasets have been imported from the traffic department of the UK with due permission. The information extracted from these huge datasets forms a basis for several predictions, which in turn avoid unnecessary memory lapses. Since the data is expected to grow continuously over time, this work primarily proposes a new framework model that can be trained, adapt itself to new data and make accurate predictions. This work also sheds some light on the use of SVM methodology for text classifiers built from the obtained traffic data. Finally, it emphasizes the uniqueness and adaptability of SVM methodology for this kind of research work.
Keywords: Road accident, machine learning, support vector machines.
Downloads 1130
194 Time Series Regression with Meta-Clusters
Authors: Monika Chuchro
Abstract:
This paper presents a preliminary attempt to apply classification of time series using meta-clusters in order to improve the quality of regression models. In this case, clustering was used to obtain normally distributed subgroups of time series data from wastewater treatment plant inflow data, which is composed of several groups differing by mean value. Two simple algorithms, K-means and EM, were chosen as clustering methods. The Rand index was used to measure the similarity. After simple meta-clustering, a regression model was fitted for each subgroup. The final model was the sum of the subgroup models. The quality of the obtained model was compared with a regression model built using the same explanatory variables, but with no clustering of the data. Results were compared using the coefficient of determination (R²), a measure of prediction accuracy, the mean absolute percentage error (MAPE), and a comparison on a linear chart. Preliminary results allow us to foresee the potential of the presented technique.
Keywords: Clustering, Data analysis, Data mining, Predictive models.
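The two numeric criteria named in the abstract above, R² and MAPE, can be illustrated with a minimal sketch. The function names and the sample values are assumptions made for illustration only; they are not taken from the paper, and the inflow figures below are invented placeholders rather than wastewater data.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical observed values and model predictions (placeholders only).
observed = [100.0, 200.0, 300.0, 400.0]
modelled = [110.0, 190.0, 310.0, 390.0]

r2_value = r_squared(observed, modelled)
mape_value = mape(observed, modelled)
print(f"R2 = {r2_value:.4f}, MAPE = {mape_value:.2f}%")
```

A higher R² and a lower MAPE for the clustered model than for the unclustered one would correspond to the kind of improvement the abstract describes.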
Downloads 1951
193 A Study of Growth Factors on Sustainable Manufacturing in Small and Medium-Sized Enterprises: Case Study of Japan Manufacturing
Authors: Tadayuki Kyoutani, Shigeyuki Haruyama, Ken Kaminishi, Zefry Darmawan
Abstract:
Japan’s semiconductor industries have developed greatly in recent years. Many started as small and medium-sized enterprises (SMEs) that found favorable circumstances and have now become prosperous industries worldwide. Sustainable growth factors that support the creation of spirit value inside Japanese companies were strongly embedded through performance. Those factors were not clearly defined in each company. A series of literature studies was conducted, using quantitative text mining, to explore the definition of sustainable growth factors. Sustainability criteria were developed from previous research to verify the definition of the factors. A typical framework was proposed as a systematic approach to developing sustainable growth factors in a specific company. The results of the approach, reviewed over a certain period, show that the factors influencing sustainable growth are important for the company to achieve its goal.
Keywords: SME, manufacture, sustainable, growth factor.
Downloads 638
192 Methods for Distinction of Cattle Using Supervised Learning
Authors: Radoslav Židek, Veronika Šidlová, Radovan Kasarda, Birgit Fuerst-Waltl
Abstract:
Machine learning represents a set of topics dealing with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction, based on models derived from existing data. The data can exhibit identification patterns that are used to classify it into groups. The result of the analysis is a pattern that can be used to identify a data set without the need to obtain the input data used to create the pattern. Important requirements in this process are careful data preparation, validation of the model used, and its suitable interpretation. For breeders, it is important to know the origin of animals from the point of view of genetic diversity. Where pedigree information is missing, other methods can be used to trace an animal's origin. Genetic diversity recorded in genetic data holds relatively useful information for identifying animals originating from individual countries. We can conclude that the application of data mining to molecular genetic data using supervised learning is an appropriate tool for hypothesis testing and for identifying an individual.
Keywords: Genetic data, Pinzgau cattle, supervised learning.
Downloads 2320
191 Evaluating Hurst Parameters and Fractal Dimensions of Surveyed Dataset of Tailings Dam Embankment
Authors: I. Yakubu, Y. Y. Ziggah, C. Yeboah
Abstract:
In the mining environment, the tailings dam embankment is among the hazard and risk areas. The tailings dam embankment could fail and result in damage to facilities, human injuries or even fatalities. Periodic monitoring of the dam embankment is needed to help assess the safety of the tailings dam embankment. Artificial intelligence techniques such as fractals can be used to analyse the stability of the monitored dataset obtained from survey measurement techniques. In this paper, the fractal dimension (D) was determined using D = 2 - H. The Hurst parameter (H) of each monitored prism was determined using time-domain rescaled range programming in MATLAB software. The fractal dimension of each monitored prism was determined from the value of H. The results reveal that the values of the determined H were all within the threshold of 0 ≤ H ≤ 1 m. The smaller the H, the bigger the fractal dimension is. Fractal dimension values ranging from 1.359 × 10⁻⁴ m to 1.8843 × 10⁻³ m were obtained from the monitored prisms, based on the tailings dam embankment dataset used. The ranges of values obtained indicate that the tailings dam embankment is stable.
Keywords: Hurst parameter, fractal dimension, tailings dam embankment, surveyed dataset.
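As a rough illustration of the relation D = 2 - H, the sketch below estimates H by a basic rescaled range (R/S) analysis and then derives D. It is written in Python rather than the MATLAB used in the paper, and the function names, the window sizes and the synthetic input series are all assumptions made for this example, not the authors' procedure or data.

```python
import math
import random


def rescaled_range(chunk):
    """R/S statistic of one window: range of cumulative mean deviations over the std."""
    mean = sum(chunk) / len(chunk)
    cumulative = 0.0
    lowest = highest = 0.0
    for value in chunk:
        cumulative += value - mean
        lowest = min(lowest, cumulative)
        highest = max(highest, cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / len(chunk))
    return (highest - lowest) / std if std > 0 else None


def hurst_exponent(series, window_sizes=(8, 16, 32, 64, 128)):
    """Slope of log(mean R/S) against log(window size), fitted by least squares."""
    xs, ys = [], []
    for n in window_sizes:
        values = []
        for start in range(0, len(series) - n + 1, n):
            rs = rescaled_range(series[start:start + n])
            if rs is not None:
                values.append(rs)
        if values:
            xs.append(math.log(n))
            ys.append(math.log(sum(values) / len(values)))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


def fractal_dimension(hurst):
    """Fractal dimension from the Hurst parameter, using D = 2 - H."""
    return 2.0 - hurst


# Synthetic uncorrelated noise stands in for a monitored series (placeholder data).
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(2048)]
h_value = hurst_exponent(noise)
d_value = fractal_dimension(h_value)
print(f"H = {h_value:.3f}, D = {d_value:.3f}")
```

Because D is computed as 2 - H, a smaller H always gives a larger D, which matches the inverse relation stated in the abstract.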
Downloads 760
190 Hybrid Weighted Multiple Attribute Decision Making Handover Method for Heterogeneous Networks
Authors: Mohanad Alhabo, Li Zhang, Naveed Nawaz
Abstract:
Small cell deployment in 5G networks is a promising technology to enhance capacity and coverage. However, unplanned deployment may cause high interference levels and a high number of unnecessary handovers, which in turn increase the signalling overhead. To guarantee service continuity, minimize unnecessary handovers and reduce signalling overhead in heterogeneous networks, it is essential to model the handover decision problem properly. In this paper, we model the handover decision problem using a Multiple Attribute Decision Making (MADM) method, specifically the Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS), and propose a hybrid TOPSIS method to control handovers in heterogeneous networks. The proposed method adopts a hybrid weighting policy, which is a combination of entropy and standard deviation weighting. A hybrid weighting control parameter is introduced to balance the impact of the standard deviation and entropy weighting on the network selection process and the overall performance. Our proposed method shows better performance, in terms of the number of frequent handovers and the mean user throughput, compared to existing methods.
Keywords: Handover, HetNets, interference, MADM, small cells, TOPSIS, weight.
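The idea of TOPSIS with a hybrid entropy and standard deviation weighting can be sketched as follows. This is an illustrative sketch only: the abstract does not give the authors' formulas, so the convex combination controlled by `alpha`, the normalisation choices, the attribute set and the candidate values are all assumptions introduced here.

```python
import math


def column(matrix, j):
    return [row[j] for row in matrix]


def shares(col):
    """Normalise a positive column so that its entries sum to 1."""
    total = sum(col)
    return [x / total for x in col]


def entropy_weights(matrix):
    """Attributes whose values differ more across candidates get more weight."""
    m, n = len(matrix), len(matrix[0])
    divergences = []
    for j in range(n):
        p = shares(column(matrix, j))
        entropy = -sum(q * math.log(q) for q in p if q > 0) / math.log(m)
        divergences.append(1.0 - entropy)
    total = sum(divergences)
    return [d / total for d in divergences]


def std_weights(matrix):
    """Weight each attribute by the standard deviation of its normalised column."""
    spreads = []
    for j in range(len(matrix[0])):
        p = shares(column(matrix, j))
        mean = sum(p) / len(p)
        spreads.append(math.sqrt(sum((q - mean) ** 2 for q in p) / len(p)))
    total = sum(spreads)
    return [s / total for s in spreads]


def hybrid_weights(matrix, alpha):
    """Assumed hybrid rule: alpha * entropy weight + (1 - alpha) * std weight."""
    we, ws = entropy_weights(matrix), std_weights(matrix)
    return [alpha * e + (1.0 - alpha) * s for e, s in zip(we, ws)]


def topsis(matrix, weights, benefit):
    """Closeness of each candidate to the ideal solution (1 is best, 0 is worst)."""
    n = len(matrix[0])
    norms = [math.sqrt(sum(x * x for x in column(matrix, j))) for j in range(n)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    best = [max(column(weighted, j)) if benefit[j] else min(column(weighted, j))
            for j in range(n)]
    worst = [min(column(weighted, j)) if benefit[j] else max(column(weighted, j))
             for j in range(n)]
    scores = []
    for row in weighted:
        d_best, d_worst = math.dist(row, best), math.dist(row, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical candidate cells: [signal quality, bandwidth, load]; load is a cost.
cells = [[0.9, 50.0, 0.2], [0.6, 30.0, 0.5], [0.3, 10.0, 0.8]]
benefit = [True, True, False]
weights = hybrid_weights(cells, alpha=0.5)
scores = topsis(cells, weights, benefit)
print(scores)
```

In this sketch, setting `alpha` to 1 uses entropy weighting alone and setting it to 0 uses standard deviation weighting alone, which is one simple way to realise a control parameter that balances the two.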
Downloads 579
189 Identification of Conserved Domains and Motifs for GRF Gene Family
Authors: Jafar Ahmadi, Nafiseh Noormohammadi, Sedigheh Fabriki Ourang
Abstract:
Growth regulating factor (GRF) genes encode a novel class of plant-specific transcription factors. The GRF proteins play a role in the regulation of cell numbers in young and growing tissues and may act as transcription activators in the growth and development of plants. Identifying GRF genes and their expression is important for understanding the growth and development of various plant organs. In this study, to better understand the structural and functional differences within the GRF family, 45 GRF protein sequences from A. thaliana, Z. mays, O. sativa, B. napus, B. rapa, H. vulgare and S. bicolor were collected and analyzed through bioinformatics data mining. In the secondary structure of the GRFs, alpha helices outnumbered beta sheets, and in all of them the QLQ domain lay entirely within the largest alpha helix. In all GRFs, the QLQ and WRC domains were fully conserved except in AtGRF9. These proteins have no trans-membrane domain; because they carry nuclear localization signals they act in the nucleus, and they are classified as unstable proteins in the test tube.
Keywords: Domain, Gene Family, GRF, Motif.
Downloads 2330