Search results for: dependency tree.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 521

431 Implementation of Heuristics for Solving Travelling Salesman Problem Using Nearest Neighbour and Minimum Spanning Tree Algorithms

Authors: Fatma A. Karkory, Ali A. Abudalmola

Abstract:

The travelling salesman problem (TSP) is a combinatorial optimization problem in which the goal is to find the shortest route the salesman can take through a set of cities. In other words, the problem deals with finding a route covering all cities so that total distance and execution time are minimized. This paper adopts the nearest neighbor and minimum spanning tree algorithms to solve the well-known travelling salesman problem. The algorithms were implemented in the Java programming language. The approach is tested on three graphs forming TSP tour instances of 5, 10, and 229 cities. The computational results validate the performance of the proposed algorithms.

Keywords: Heuristics, minimum spanning tree algorithm, Nearest Neighbor, Travelling Salesman Problem (TSP).
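The nearest neighbor heuristic named in the abstract can be illustrated with a minimal sketch. It is written in Python rather than the authors' Java, the function names are hypothetical, and the 5-city coordinates are invented for illustration; it is not the paper's implementation.

```python
import math

def nearest_neighbour_tour(points, start=0):
    """Greedy tour: from the current city, always move to the closest unvisited city."""
    unvisited = set(range(len(points)))
    unvisited.remove(start)
    tour = [start]
    while unvisited:
        last = points[tour[-1]]
        # Ties are broken by the lower city index so the result is deterministic.
        nxt = min(unvisited, key=lambda j: (math.dist(last, points[j]), j))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour

def tour_length(points, tour):
    """Length of the closed tour, including the return leg to the start city."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

# Hypothetical 5-city instance (coordinates are made up for this example).
cities = [(0, 0), (3, 0), (3, 4), (0, 4), (1, 1)]
tour = nearest_neighbour_tour(cities)
length = tour_length(cities, tour)
```

The heuristic is fast but gives no optimality guarantee, which is why such tours are usually compared against other heuristics such as the minimum spanning tree approach.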

Downloads: 7821
430 A Hybrid Classification Method using Artificial Neural Network Based Decision Tree for Automatic Sleep Scoring

Authors: Haoyu Ma, Bin Hu, Mike Jackson, Jingzhi Yan, Wen Zhao

Abstract:

In this paper we propose a new classification method for automatic sleep scoring using an artificial neural network based decision tree. It attempts to treat the sleep scoring process as a series of two-class problems and solves them with a decision tree made up of a group of neural network classifiers, each of which uses a special feature set and is aimed at only one specific sleep stage in order to maximize the classification effect. A single electroencephalogram (EEG) signal is used for our analysis rather than depending on multiple biological signals, which greatly simplifies the data acquisition process. Experimental results demonstrate that the average epoch-by-epoch agreement between the visual and the proposed method in separating 30 s wakefulness+S1, REM, S2 and SWS epochs was 88.83%. This study shows that the proposed method performed well in all four stages and can effectively limit error propagation at the same time. It could, therefore, be an efficient method for automatic sleep scoring. Additionally, since it requires only a small volume of data, it could be suited to pervasive applications.

Keywords: Sleep, Sleep stage, Automatic sleep scoring, Electroencephalography, Decision tree, Artificial neural network

Downloads: 2070
429 Distributed Splay Suffix Arrays: A New Structure for Distributed String Search

Authors: Tu Kun, Gu Nai-jie, Bi Kun, Liu Gang, Dong Wan-li

Abstract:

As a structure for processing string problems, the suffix array is widely known and extensively studied. But if the string access pattern follows the “90/10” rule, the suffix array cannot take advantage of the fact that we often look for something that we have just found. Although the splay tree is an efficient data structure for small documents when the access pattern follows the “90/10” rule, it requires many structures and an excessive amount of pointer manipulation for efficiently processing and searching large documents. In this paper, we propose a new and conceptually powerful data structure, called splay suffix arrays (SSA), for string search. This data structure combines the features of splay trees and suffix arrays into a new approach that is suitable for implementation on both conventional and clustered computers.

Keywords: suffix arrays, splay tree, string search, distributed algorithm
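For readers unfamiliar with the baseline structure, the following is a minimal sketch of a plain suffix array with binary search. It does not implement the splay suffix array (SSA) of the paper, whose details the abstract does not give; the function names are hypothetical and the naive construction is chosen for brevity, not speed.

```python
def build_suffix_array(text):
    """Naive construction: sort all suffix start positions by the suffix text."""
    return sorted(range(len(text)), key=lambda i: text[i:])

def find_occurrences(text, suffix_array, pattern):
    """Return sorted start positions of pattern, located with two binary searches."""
    m = len(pattern)

    def prefix(rank):
        start = suffix_array[rank]
        return text[start:start + m]

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(suffix_array[first:lo])

text = "banana"
sa = build_suffix_array(text)
hits = find_occurrences(text, sa, "ana")
```

Every query here costs the same binary search regardless of how recently the pattern was looked up, which is the limitation under skewed access patterns that the abstract describes.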

Downloads: 1776
428 Balancing Strategies for Parallel Content-based Data Retrieval Algorithms in a k-tree Structured Database

Authors: Radu Dobrescu, Matei Dobrescu, Daniela Hossu

Abstract:

The paper proposes a unified model for multimedia data retrieval which includes data representatives, content representatives, index structure, and search algorithms. The multimedia data are defined as k-dimensional signals indexed in a multidimensional k-tree structure. The benefits of using the k-tree unified model were demonstrated by running the data retrieval application on a test bed cluster of six networked nodes. The tests were performed with two retrieval algorithms: one that allows parallel searching using a single feature, and a second that performs a weighted cascade search for multiple-feature querying. The experiments show a significant reduction of retrieval time while maintaining the quality of results.

Keywords: balancing strategies, multimedia databases, parallel processing, retrieval algorithms

Downloads: 1422
427 The Knapsack Sharing Problem: A Tree Search Exact Algorithm

Authors: Mhand Hifi, Hedi Mhalla

Abstract:

In this paper, we study the knapsack sharing problem, a variant of the well-known NP-hard single knapsack problem. We investigate the use of a tree search for optimally solving the problem. The method used combines two complementary phases: a reduction interval search phase and a branch and bound procedure. First, the reduction phase applies a polynomial reduction strategy that is used for decomposing the problem into a series of knapsack problems. Second, the tree search procedure is applied in order to attain a set of optimal capacities characterizing the knapsack problems. Finally, the performance of the proposed optimal algorithm is evaluated on a set of instances from the literature and its runtime is compared to that of the best exact algorithm from the literature.

Keywords: Branch and bound, combinatorial optimization, knapsack, knapsack sharing, heuristics, interval reduction.

Downloads: 1558
426 Ranking Genes from DNA Microarray Data of Cervical Cancer by a local Tree Comparison

Authors: Frank Emmert-Streib, Matthias Dehmer, Jing Liu, Max Muhlhauser

Abstract:

The major objective of this paper is to introduce a new method to select genes from DNA microarray data. As a criterion to select genes, we suggest measuring the local changes in the correlation graph of each gene and selecting those genes whose local changes are largest. More precisely, we calculate the correlation networks from DNA microarray data of cervical cancer, where each network represents a tissue of a certain tumor stage and each node in the network represents a gene. From these networks we extract one tree for each gene by a local decomposition of the correlation network. The interpretation of a tree is that it represents the n-nearest neighbor genes on the n-th level of a tree, measured by the Dijkstra distance, and, hence, gives the local embedding of a gene within the correlation network. For the obtained trees we measure the pairwise similarity between trees rooted by the same gene from normal to cancerous tissues. This evaluates the modification of the tree topology due to tumor progression. Finally, we rank the obtained similarity values from all tissue comparisons and select the top ranked genes. For these genes the local neighborhood in the correlation networks changes most between normal and cancerous tissues. As a result we find that the top ranked genes are candidates suspected to be involved in tumor growth. This indicates that our method captures essential information from the underlying DNA microarray data of cervical cancer.

Keywords: Graph similarity, generalized trees, graph alignment, DNA microarray data, cervical cancer.

Downloads: 1752
425 A Case-Based Reasoning-Decision Tree Hybrid System for Stock Selection

Authors: Yaojun Wang, Yaoqing Wang

Abstract:

Stock selection is an important decision-making problem. Many machine learning and data mining technologies are employed to build automatic stock-selection systems. A profitable stock-selection system should consider the stock’s investment value and the market timing. In this paper, we present a hybrid system that takes both into account for stock selection. This system uses a case-based reasoning (CBR) model to execute the stock classification and a decision-tree model to help with market timing and stock selection. The experiments show that the performance of this hybrid system is better than that of other techniques with regard to classification accuracy, average return, and the Sharpe ratio.

Keywords: Case-based reasoning, decision tree, stock selection, machine learning.

Downloads: 1705
424 Performance Analysis of Artificial Neural Network with Decision Tree in Prediction of Diabetes Mellitus

Authors: J. K. Alhassan, B. Attah, S. Misra

Abstract:

Human beings have the ability to make logical decisions. Although human decision-making is often optimal, it is insufficient when a huge amount of data is to be classified. Medical datasets are a vital ingredient in predicting a patient’s health condition. In order to obtain the best prediction, the most suitable machine learning algorithms are needed. This work compared the performance of Artificial Neural Network (ANN) and Decision Tree Algorithms (DTA) with regard to several performance metrics using diabetes data. WEKA software was used for the implementation of the algorithms. Multilayer Perceptron (MLP) and Radial Basis Function (RBF) were the two algorithms used for ANN, while RepTree and LADTree were the DTA models used. From the results obtained, DTA performed better than ANN. The Root Mean Squared Error (RMSE) of MLP is 0.3913, that of RBF is 0.3625, that of RepTree is 0.3174, and that of LADTree is 0.3206.

Keywords: Artificial neural network, classification, decision tree, diabetes mellitus.

Downloads: 2416
423 A Numerical Model for Simulation of Blood Flow in Vascular Networks

Authors: Houman Tamaddon, Mehrdad Behnia, Masud Behnia

Abstract:

An accurate study of blood flow is associated with an accurate vascular pattern and geometrical properties of the organ of interest. Due to the complexity of vascular networks and poor accessibility in vivo, it is challenging to reconstruct the entire vasculature of any organ experimentally. The objective of this study is to introduce an innovative approach for the reconstruction of a full vascular tree from available morphometric data. Our method consists of implementing morphometric data on those parts of the vascular tree that are smaller than the resolution of medical imaging methods. This technique reconstructs the entire arterial tree down to the capillaries. Vessels greater than 2 mm are obtained from direct volume and surface analysis using contrast-enhanced computed tomography (CT). Vessels smaller than 2 mm are reconstructed from available morphometric and distensibility data and rearranged by applying Murray’s Laws. Implementing morphometric data to reconstruct the branching pattern and simultaneously applying Murray’s Laws to every vessel bifurcation lead to an accurate vascular tree reconstruction. The reconstruction algorithm generates full arterial tree topography down to the first capillary bifurcation. The geometry of each order of the vascular tree is generated separately to minimize the construction and simulation time. The node-to-node connectivity along with the diameter and length of every vessel segment is established, and order numbers are assigned according to the diameter-defined Strahler system. During the simulation, we used the averaged flow rate for each order to predict the pressure drop; once the pressure drop is predicted, the flow rate is corrected to match the computed pressure drop for each vessel. The final results for 3 cardiac cycles are presented and compared to the clinical data.

Keywords: Blood flow, Morphometric data, Vascular tree, Strahler ordering system.
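As a brief illustration of the bifurcation rule mentioned in the abstract: in its classic form, Murray's law states that the cube of the parent vessel diameter equals the sum of the cubes of the daughter diameters. The sketch below assumes that classic cubic form; the exponent is left as a parameter because the abstract does not say which variant the authors applied, and the function names are hypothetical.

```python
def parent_diameter(daughter_diameters, exponent=3.0):
    """Parent diameter implied by Murray's law: d_parent**k = sum(d_i**k)."""
    return sum(d ** exponent for d in daughter_diameters) ** (1.0 / exponent)

def symmetric_daughter_diameter(parent, branches=2, exponent=3.0):
    """Diameter of each daughter when the parent splits into equal branches."""
    return parent / branches ** (1.0 / exponent)

# Two 1.0 mm daughters imply a parent of 2**(1/3) mm, roughly 1.26 mm.
d_parent = parent_diameter([1.0, 1.0])

# A 2.0 mm parent splitting symmetrically gives two daughters of about 1.59 mm.
d_daughter = symmetric_daughter_diameter(2.0)
```

Applying such a rule at every bifurcation ties each generation's diameters to the one above it, which is what makes reconstruction below imaging resolution possible in principle.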

Downloads: 2101
422 The Determinants and Outcomes of Pathological Internet use (PIU) among Urban Millennial Teens: A Theoretical Framework

Authors: Pressca Neging, Rosidah Musa, Rabiah Abdul Wahab

Abstract:

The rapid adoption of the Internet has changed the lives of Millennial Teens at lightning speed. Empirical evidence has illustrated that Pathological Internet Use (PIU) among them ensures long-term success for market players in the children's industry. However, it creates concerns among their caretakers, as it generates mental disorders among some of them. The purpose of this paper is to examine the determinants of PIU and identify its outcomes among urban Millennial Teens. It aims to develop a theoretical framework based on a modified Media System Dependency (MSD) Theory that integrates important systems and components that determine and result from PIU.

Keywords: Internet, media system dependency theory, millennial, pathological internet use.

Downloads: 2422
421 Clustering Mixed Data Using Non-normal Regression Tree for Process Monitoring

Authors: Youngji Yoo, Cheong-Sool Park, Jun Seok Kim, Young-Hak Lee, Sung-Shick Kim, Jun-Geol Baek

Abstract:

In the semiconductor manufacturing process, large amounts of data are collected from various sensors of multiple facilities. The collected data from sensors have several different characteristics due to variables such as types of products, former processes, and recipes. In general, Statistical Quality Control (SQC) methods assume the normality of the data to detect out-of-control states of processes. Because the collected data have different characteristics, using them directly as inputs to SQC will increase the variation of the data, require wide control limits, and decrease the ability to detect out-of-control states. Therefore, it is necessary to separate similar data groups from mixed data for more accurate process control. In this paper, we propose a regression tree using a split algorithm based on the Pearson distribution to handle non-normal distributions in a parametric method. The regression tree finds similar properties of data from different variables. The experiments using real semiconductor manufacturing process data show improved fault detection performance.

Keywords: Semiconductor, non-normal mixed process data, clustering, Statistical Quality Control (SQC), regression tree, Pearson distribution system.

Downloads: 1779
420 Optimizing Mobile Agents Migration Based on Decision Tree Learning

Authors: Yasser k. Ali, Hesham N. Elmahdy, Sanaa El Olla Hanfy Ahmed

Abstract:

Mobile agents are a powerful approach to developing distributed systems, since they migrate to hosts on which they have the resources to execute individual tasks. In a dynamic environment like a peer-to-peer network, agents have to be generated frequently and dispatched to the network, so they consume a certain amount of bandwidth on each link. If too many agents migrate through one or several links at the same time, they introduce too much transfer overhead; eventually these links become busy and indirectly block network traffic. There is therefore a need to develop routing algorithms that take traffic load into account. In this paper we seek to create cooperation between a probabilistic approach based on a quality measure of the network traffic situation and the agent's decision on migrating to the next hop, based on decision tree learning algorithms.

Keywords: Agent Migration, Decision Tree learning, ID3 algorithm, Naive Bayes Classifier

Downloads: 1990
419 A Green Method for Selective Spectrophotometric Determination of Hafnium(IV) with Aqueous Extract of Ficus carica Tree Leaves

Authors: A. Boveiri Monji, H. Yousefnia, M. Haji Hosseini, S. Zolghadri

Abstract:

A clean spectrophotometric method for the determination of hafnium using a green reagent, an acidic extract of Ficus carica tree leaves, is developed. In 6 M hydrochloric acid, hafnium reacts with this reagent to form a yellow product. The product shows maximum absorbance at 421 nm with a molar absorptivity of 0.28 × 10⁴ L mol⁻¹ cm⁻¹, and the method was linear in the 2-11 µg ml⁻¹ concentration range. The detection limit was found to be 0.312 µg ml⁻¹. Except for zirconium and iron, the selectivity was good, and most ions did not show any significant spectral interference at concentrations up to several hundred times higher. The proposed method was green, simple, low cost, and selective.

Keywords: Spectrophotometric determination, Ficus carica tree leaves, synthetic reagents, hafnium.

Downloads: 737
418 A 25-year Monitoring of the Air Pollution Depicted by Plane Tree Species in Tehran

Authors: S. A. A. Korori, H. Valipour K., S. Shabestani, A. shirvany, M. Matinizadeh

Abstract:

Tehran, one of the heavily populated capitals, is suffering severely from increasing air pollution. To document the trend of such pollutants over recent years, plane trees (Platanus orientalis) were chosen as indicators, since the species was planted throughout the city many years ago. Two areas (Saadatabad and Narmak districts) with different levels of crowding and traffic but the same ecological characteristics were selected. Twelve sample individuals were cored twice perpendicularly in each area. Tree rings of each core were measured with a binocular microscope and separated annually for the last 25 years. Two heavy metals, Cd and Pb, together with a mineral element (Ca) were analyzed using the Hatch method. Tree-ring analysis of the two areas showed different groups in terms of physiological ability: growth plunged during the last 10 years in Saadatabad district and showed a slight decrease in the same period in the other study area. In direct contrast to the decreasing growth trend in Saadatabad, all three elements increased sharply during the last 25 years in the same area. In Narmak district, the trend was completely different from that in Saadatabad: the absorption of trace elements fluctuated, as did tree-ring widths, yet calcium showed an upward trend throughout the last 25 years. The results of the study demonstrated the possibility of using the tree species of each region to monitor its past air pollution trends, and hence to assess the pollution of a populated city over recent years and make appropriate decisions for the future once the trend is known. On the other hand, elevated values of calcium (as the stress-indicator element) accompanied by increased trace elements suggest a non-sustainable state of the trees.

Keywords: Air pollution, Platanus orientalis, Tehran, Trace elements, Tree rings.

Downloads: 1679
417 Using Single Decision Tree to Assess the Impact of Cutting Conditions on Vibration

Authors: S. Ghorbani, N. I. Polushin

Abstract:

Vibration during the machining process is crucial since it affects the cutting tool, machine, and workpiece, leading to tool wear, tool breakage, and unacceptable surface roughness. This paper applies a nonparametric statistical method, the single decision tree (SDT), to identify factors affecting vibration in the machining process. Workpiece material (AISI 1045 Steel, AA2024 Aluminum alloy, A48-class30 Gray Cast Iron), cutting tool (conventional, cutting tool with holes in toolholder, cutting tool filled up with epoxy-granite), tool overhang (41-65 mm), spindle speed (630-1000 rpm), feed rate (0.05-0.075 mm/rev) and depth of cut (0.05-0.15 mm) were used as input variables, while vibration was the output parameter. It is concluded that workpiece material is the most important parameter for natural frequency, followed by cutting tool and overhang.

Keywords: Cutting condition, vibration, natural frequency, decision tree, CART algorithm.

Downloads: 1433
416 Dynamic Routing to Multiple Destinations in IP Networks using Hybrid Genetic Algorithm (DRHGA)

Authors: K. Vijayalakshmi, S. Radhakrishnan

Abstract:

In this paper we propose a novel dynamic least-cost multicast routing protocol using a hybrid genetic algorithm for IP networks. Our protocol finds the multicast tree with minimum cost subject to delay, degree, and bandwidth constraints. The proposed protocol has the following features: i. a heuristic local search function has been devised and embedded in the normal genetic operation to increase speed and obtain the optimized tree; ii. it efficiently handles dynamic situations arising from either a change in multicast group membership or a node/link failure; iii. two different crossover and mutation probabilities are used to maintain the diversity of solutions and achieve quick convergence. The simulation results show that our proposed protocol generates dynamic multicast trees with lower cost. Results also show that the proposed algorithm has a better convergence rate, a better dynamic request success rate, and less execution time than other existing algorithms. The effects of degree and delay constraints on the multicast tree have also been analyzed in terms of search success rate.

Keywords: Dynamic Group membership change, Hybrid Genetic Algorithm, Link / node failure, QoS Parameters.

Downloads: 1447
415 Fusion of ETM+ Multispectral and Panchromatic Texture for Remote Sensing Classification

Authors: Mahesh Pal

Abstract:

This paper proposes to use ETM+ multispectral data and the panchromatic band, as well as texture features derived from the panchromatic band, for land cover classification. Four texture features, including one 'internal texture' and three GLCM-based textures, namely correlation, entropy, and inverse difference moment, were used in combination with ETM+ multispectral data. Two data sets involving combinations of multispectral data, the panchromatic band, and its texture were used, and results were compared with those obtained by using multispectral data alone. A decision tree classifier with and without boosting was used to classify the different datasets. Results from this study suggest that the dataset consisting of the panchromatic band, four of its texture features, and multispectral data was able to increase the classification accuracy by about 2%. In comparison, a boosted decision tree was able to increase the classification accuracy by about 3% with the same dataset.

Keywords: Internal texture; GLCM; decision tree; boosting; classification accuracy.

Downloads: 1735
414 A Relative Analysis of Carbon and Dust Uptake by Important Tree Species in Tehran, Iran

Authors: Sahar Elkaee Behjati

Abstract:

Air pollution, particularly with dust, is one of the biggest issues Tehran is dealing with, and the city's green space, which consists of trees, has a critical role in absorbing it. The question this study aimed to investigate was which tree species have the highest uptake capacity for the dust and carbon suspended in the air. On this basis, 30 samples of trees from two different districts in Tehran were collected, and after washing and centrifuging, the samples were oven dried. The results of the study revealed that Ulmus minor had the highest amount of deposited dust in both districts. In addition, it was found that Ailanthus altissima in Chamran district and Ulmus minor in Gandi district had the highest absorption of deposited carbon. Therefore, it could be argued that decision making on the selection of species for urban green spaces should take the above-mentioned parameters into account.

Keywords: Dust, leaves, uptake total carbon, Tehran, tree species.

Downloads: 731
413 A Systems Approach to Gene Ranking from DNA Microarray Data of Cervical Cancer

Authors: Frank Emmert-Streib, Matthias Dehmer, Jing Liu, Max Mühlhauser

Abstract:

In this paper we present a method for gene ranking from DNA microarray data. More precisely, we calculate the correlation networks, which are unweighted and undirected graphs, from microarray data of cervical cancer, where each network represents a tissue of a certain tumor stage and each node in the network represents a gene. From these networks we extract one tree for each gene by a local decomposition of the correlation network. The interpretation of a tree is that it represents the n-nearest neighbor genes on the n-th level of a tree, measured by the Dijkstra distance, and, hence, gives the local embedding of a gene within the correlation network. For the obtained trees we measure the pairwise similarity between trees rooted by the same gene from normal to cancerous tissues. This evaluates the modification of the tree topology due to progression of the tumor. Finally, we rank the obtained similarity values from all tissue comparisons and select the top ranked genes. For these genes the local neighborhood in the correlation networks changes most between normal and cancerous tissues. As a result we find that the top ranked genes are candidates suspected to be involved in tumor growth, which indicates that our method captures essential information from the underlying DNA microarray data of cervical cancer.

Keywords: Graph similarity, DNA microarray data, cancer.

Downloads: 1755
412 A Patricia-Tree Approach for Frequent Closed Itemsets

Authors: Moez Ben Hadj Hamida, Yahya Slimani

Abstract:

In this paper, we propose an adaptation of the Patricia-Tree for sparse datasets to generate non-redundant association rules. Using this adaptation, we can generate frequent closed itemsets, which are more compact than the frequent itemsets used in the Apriori approach. This adaptation has been tested on a set of benchmark datasets.

Keywords: Data mining, Frequent itemsets, Frequent closed itemsets, Sparse datasets.
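To make the compactness claim concrete: an itemset is closed when no proper superset has the same support. The brute-force sketch below illustrates that definition only; it is not the Patricia-Tree method of the paper, the function names are hypothetical, and the four transactions are invented for the example.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Enumerate every itemset whose support count reaches min_support (brute force)."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                result[itemset] = support
    return result

def closed_itemsets(frequent):
    """Keep itemsets that have no proper superset with the same support."""
    return {
        itemset: support
        for itemset, support in frequent.items()
        if not any(
            itemset < other and support == other_support
            for other, other_support in frequent.items()
        )
    }

# Hypothetical toy database of four transactions.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
```

In this toy database five itemsets are frequent but only three are closed, and the supports of the other two can be recovered from their closed supersets.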

Downloads: 1883
411 A High-Speed Multiplication Algorithm Using Modified Partial Product Reduction Tree

Authors: P. Asadee

Abstract:

Multiplication algorithms have a considerable effect on processor performance. A new high-speed, low-power multiplication algorithm is presented using a modified Dadda tree structure. Three important modifications have been implemented in the inner product generation step, the inner product reduction step, and the final addition step. Optimized algorithms have to be used in basic computation components, such as multiplication algorithms. In this paper, we propose a new algorithm to reduce the power, delay, and transistor count of a multiplication algorithm implemented using a low-power modified counter. This work presents a novel design for Dadda multiplication algorithms. The proposed multiplication algorithm includes structured parts, which have an important effect on the inner product reduction tree. In this paper, a 1.3 V, 64-bit carry hybrid adder is presented for fast, low-voltage applications. The new 64-bit adder uses a new circuit to implement the proposed carry hybrid adder. The new adder has been implemented in 80 nm CMOS technology at a 700 MHz clock frequency. The proposed multiplication algorithm has achieved a 14 percent improvement in transistor count, a 13 percent reduction in delay, and a 12 percent reduction in power consumption compared with conventional designs.

Keywords: adder, CMOS, counter, Dadda tree, encoder.
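For context on the reduction tree: in the standard Dadda scheme, the maximum column height allowed after each reduction stage follows the sequence d_1 = 2, d_(j+1) = floor(1.5 · d_j). The sketch below computes that sequence for an n-bit multiplier. It describes the conventional Dadda tree only, not the modified tree of the paper, and the function name is hypothetical.

```python
def dadda_stage_heights(n_bits):
    """Target column heights, in the order the stages are applied, for n-bit operands.

    The partial product matrix of an n x n multiplier has a maximum column
    height of n; each stage reduces it to the next smaller value in the
    sequence 2, 3, 4, 6, 9, 13, ...
    """
    heights = []
    d = 2
    while d < n_bits:
        heights.append(d)
        d = d * 3 // 2  # floor(1.5 * d)
    return heights[::-1]

stages_8 = dadda_stage_heights(8)
stages_64 = dadda_stage_heights(64)
```

For 64-bit operands the sequence has ten entries, so a conventional Dadda reduction needs ten stages before the final two-row addition.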

Downloads: 2302
410 Hybrid Machine Learning Approach for Text Categorization

Authors: Nerijus Remeikis, Ignas Skucas, Vida Melninkaite

Abstract:

Text categorization - the assignment of natural language documents to one or more predefined categories based on their semantic content - is an important component in many information organization and management tasks. The performance of neural network learning is known to be sensitive to the initial weights and architecture. This paper discusses the use of multilayer neural network initialization with a decision tree classifier for improving text categorization accuracy. An adaptation of the algorithm is proposed in which a decision tree, from the root node to a final leaf, is used for initialization of a multilayer neural network. The experimental evaluation demonstrates that this approach provides better classification accuracy on the Reuters-21578 corpus, one of the standard benchmarks for text categorization tasks. We present results comparing the accuracy of this approach with a multilayer neural network initialized with the traditional random method and with decision tree classifiers.

Keywords: Text categorization, decision trees, neural networks, machine learning.

Downloads: 1805
409 Inferring Hierarchical Pronunciation Rules from a Phonetic Dictionary

Authors: Erika Pigliapoco, Valerio Freschi, Alessandro Bogliolo

Abstract:

This work presents a new phonetic transcription system based on a tree of hierarchical pronunciation rules expressed as context-specific grapheme-phoneme correspondences. The tree is automatically inferred from a phonetic dictionary by incrementally analyzing deeper context levels, eventually representing a minimum set of exhaustive rules that pronounce without errors all the words in the training dictionary and that can be applied to out-of-vocabulary words. The proposed approach improves upon existing rule-tree-based techniques in that it makes use of graphemes, rather than letters, as elementary orthographic units. A new linear algorithm for the segmentation of a word into graphemes is introduced to enable out-of-vocabulary grapheme-based phonetic transcription. Exhaustive rule trees provide a canonical representation of the pronunciation rules of a language that can be used not only to pronounce out-of-vocabulary words, but also to analyze and compare the pronunciation rules inferred from different dictionaries. The proposed approach has been implemented in C and tested on Oxford British English and Basic English. Experimental results show that grapheme-based rule trees represent phonetically sound rules and provide better performance than letter-based rule trees.

Keywords: Automatic phonetic transcription, pronunciation rules, hierarchical tree inference.

Downloads: 1924
408 A Minimum Spanning Tree-Based Method for Initializing the K-Means Clustering Algorithm

Authors: J. Yang, Y. Ma, X. Zhang, S. Li, Y. Zhang

Abstract:

The traditional k-means algorithm has been widely used as a simple and efficient clustering method. However, the algorithm often converges to local minima because it is sensitive to the initial cluster centers. In this paper, an algorithm for selecting initial cluster centers on the basis of the minimum spanning tree (MST) is presented. The set of vertices in the MST with the same degree is regarded as a whole, which is used to find the skeleton data points. Furthermore, a distance measure between the skeleton data points that considers both degree and Euclidean distance is presented. Finally, an MST-based initialization method for the k-means algorithm is presented, and the corresponding time complexity is analyzed as well. The presented algorithm is tested on five data sets from the UCI Machine Learning Repository. The experimental results illustrate the effectiveness of the presented algorithm compared to three existing initialization methods.

Keywords: Degree, initial cluster center, k-means, minimum spanning tree.

Downloads 1552
407 Biodiversity and Climate Change: Consequences for Norway Spruce Mountain Forests in Slovakia

Authors: Jozef Mindas, Jaroslav Skvarenina, Jana Skvareninova

Abstract:

The study of the effects of climate change on Norway spruce (Picea abies) forests has mainly focused on the diversity of tree species as a result of the ability of species to tolerate temperature and moisture changes, as well as on some effects of changes in disturbance regime. Changes in tree species diversity in spruce forests due to climate change have been analyzed with a gap model. A forest gap model is a dynamic model for calculating basic characteristics of individual forest trees. Input ecological data for the model calculations were taken from permanent research plots located in primeval forests in mountainous regions of Slovakia. The results of regional climate change scenarios for the territory of Slovakia were used, with values taken from the CGCM3.1 (global) model and the KNMI and MPI (regional) models. Model results under the climate change scenarios suggest a shift of the upper forest limit to the region of the present subalpine zone, in the supramontane zone. The representation of Norway spruce will decrease in favor of beech and precious broadleaved species (Acer sp., Sorbus sp., Fraxinus sp.). The most significant changes in tree species diversity were identified for the upper tree line and the current belt of dwarf pine (Pinus mugo) occurrence. The results are also discussed in relation to the most important disturbances (wind storms, snow and ice storms) and to phenological changes whose consequences are little known. Particular attention is given to changes in biomass production in relation to carbon storage diversity in different carbon pools.

Keywords: Biodiversity, climate change, Norway spruce forests, gap model.

Downloads 1643
406 A Decision Tree Approach to Estimate Permanent Residents Using Remote Sensing Data in Lebanese Municipalities

Authors: K. Allaw, J. Adjizian Gerard, M. Chehayeb, A. Raad, W. Fahs, A. Badran, A. Fakherdin, H. Madi, N. Badaro Saliba

Abstract:

Population estimation using Geographic Information Systems (GIS) and remote sensing faces many obstacles, such as the determination of permanent residents. A permanent resident is an individual who stays and works in his village during all four seasons, so all those who move to other cities or villages are excluded from this category. The aim of this study is to identify the factors affecting the percentage of permanent residents in a village and to determine the weight attributed to each factor. To do so, six factors were chosen (slope, precipitation, temperature, number of services, time to the Central Business District (CBD) and proximity to conflict zones), and each of them was evaluated using one of the following data sources: the 50 m contour line map, the precipitation map, four temperature maps and data collected through surveys. The weighting was done using the decision tree method. As a result of this procedure, temperature (50.8%) and percentage of precipitation (46.5%) are the most influential factors.

Keywords: Remote sensing and GIS, permanent residence, decision tree, Lebanon.

Downloads 1010
405 Comparison between Associative Classification and Decision Tree for HCV Treatment Response Prediction

Authors: Enas M. F. El Houby, Marwa S. Hassan

Abstract:

Combined therapy using Interferon and Ribavirin is the standard treatment for patients with chronic hepatitis C. However, the number of responders to this treatment is low, whereas its cost and side effects are high. Therefore, there is a clear need to predict a patient's response to the treatment from clinical information, in order to protect patients from intolerable side effects and wasted money. Different machine learning techniques have been developed for this purpose, among them Associative Classification (AC) and Decision Tree (DT). The aim of this research is to compare the performance of these two techniques in predicting virological response to the standard treatment of HCV from clinical information. Data from 200 patients treated with Interferon and Ribavirin were analyzed using AC and DT: 150 cases were used to train the classifiers and 50 cases were used to test them. The experimental results showed that both techniques gave acceptable results; however, the best accuracy reached 92% for AC and 80% for DT.

Keywords: Associative Classification, Data mining, Decision tree, HCV, interferon.

Downloads 1899
404 Vibration Analysis of the Gas Turbine Considering Dependency of Stiffness and Damping on Frequency

Authors: Hamed Jamshidi, Pooya Djamshidi

Abstract:

In this paper the complete rotor system is modeled, including an elastic shaft with distributed mass and allowing for the effects of the oil film in the bearings. The flexibility of the foundation is also modeled. As a whole, this article is a relatively complete study of the modeling and vibration analysis of a rotor, considering the gyroscopic effect, damping, and the dependency of the stiffness and damping coefficients on frequency, and solving the vibration equations that include these parameters. A computer program is written in MATLAB on the basis of the finite element method, using four element types: shaft, disk, bearing and foundation. The responses in several cases, considering different effects, are then obtained. Finally, the results are compared with each other, with exact solutions and with results from other papers.

Keywords: Damping coefficients, Finite element method, Modeling, Rotor vibration

Downloads 2489
403 A Tree Based Association Rule Approach for XML Data with Semantic Integration

Authors: D. Sasikala, K. Premalatha

Abstract:

The use of eXtensible Markup Language (XML) in web, business and scientific databases has led to the development of methods, techniques and systems to manage and analyze XML data. Semi-structured documents suffer from heterogeneity and high dimensionality. XML structure and content mining represent a convergence of research in semi-structured data and text mining. As the information available on the internet grows drastically, extracting knowledge from XML documents becomes a harder task. Indeed, documents are often so large that the data set returned as the answer to a query may be too big to convey the required information. To improve query answering, a Semantic Tree Based Association Rule (STAR) mining method is proposed. This method provides intentional information by considering the structure, the content and the semantics of the content. The method is applied to the Reuters dataset, and the results show that the proposed method performs well.

Keywords: Semi-structured Document, Tree based Association Rule (TAR), Semantic Association Rule Mining.

Downloads 2351
402 Experiments on Element and Document Statistics for XML Retrieval

Authors: Mohamed Ben Aouicha, Mohamed Tmar, Mohand Boughanem, Mohamed Abid

Abstract:

This paper presents an information retrieval model for XML documents based on tree matching. Queries and documents are represented by extended trees. An extended tree is built from the original tree by adding weighted virtual links between each node and its indirect descendants, so that each descendant can be reached directly. As a result, only one level separates each node from its indirect descendants. This makes it possible to compare the user query and the document flexibly while respecting the structural constraints of the query. The content of each node is very important in deciding whether a document element is relevant or not, so the content should be taken into account in the retrieval process. We separate the structure-based and the content-based retrieval processes. The content-based score of each node is commonly based on the well-known Tf × Idf criterion. In this paper, we compare this criterion with another one that we call Tf × Ief. The comparison is based on experiments on a dataset provided by INEX, to show the effectiveness of our approach on the one hand and that of both weighting functions on the other.
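
The abstract does not define Tf × Ief. A common reading in XML retrieval, assumed here, is that Ief is an inverse element frequency: it counts XML elements that contain a term instead of whole documents. The sketch below contrasts the two weights on a toy collection; the data layout, the function names and the logarithmic formulas are illustrative assumptions, not the authors' definitions.

```python
import math


def idf(term, docs):
    """Inverse document frequency: log(N / df), with df counted over documents."""
    df = sum(1 for doc in docs if any(term in element for element in doc))
    return math.log(len(docs) / df) if df else 0.0


def ief(term, docs):
    """Inverse element frequency (assumed): log(|E| / ef), counted over elements."""
    elements = [element for doc in docs for element in doc]
    ef = sum(1 for element in elements if term in element)
    return math.log(len(elements) / ef) if ef else 0.0


def content_score(term, element, weight):
    """Term frequency in one element multiplied by a collection-level weight."""
    return element.count(term) * weight


# Toy collection: each document is a list of elements, each a list of tokens.
docs = [
    [["xml", "tree"], ["tree", "match"]],
    [["xml", "query"], ["index"]],
]

element = docs[0][0]
for term in ("xml", "tree"):
    print(term,
          content_score(term, element, idf(term, docs)),
          content_score(term, element, ief(term, docs)))
```

In this toy collection "xml" occurs in every document, so its Idf weight is zero, while its Ief weight is still positive because only half of the elements contain it. That is the kind of difference a comparison between the two criteria would expose.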

Keywords: XML retrieval, INEX, Tf × Idf, Tf × Ief

Downloads 1335