Search results for: Similarity Index
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1392

1242 Application of KL Divergence for Estimation of Each Metabolic Pathway Genes

Authors: Shohei Maruyama, Yasuo Matsuyama, Sachiyo Aburatani

Abstract:

Developing methods to estimate gene function is an important task in bioinformatics. One approach to annotation is to identify the metabolic pathway in which a gene is involved. Since gene expression data reflect various intracellular phenomena, they are considered to be related to gene function. However, it has been difficult to estimate gene function with high accuracy. The low accuracy is thought to stem from the difficulty of measuring gene expression accurately: even when measured under the same conditions, expression levels usually vary. In this study, we propose a feature extraction method that focuses on the variability of gene expression to estimate the genes' metabolic pathways accurately. First, we estimated the distribution of each gene's expression from replicate data. Next, we calculated the similarity between all gene pairs using KL divergence, which measures how much one distribution differs from another. Finally, we used the similarity vectors as feature vectors and trained a multiclass SVM to identify the genes' metabolic pathways. To evaluate the method, we applied it to budding yeast and trained the multiclass SVM to identify seven metabolic pathways. As a result, the accuracy obtained with our method was higher than that obtained from the raw gene expression data. Thus, our method based on KL divergence is useful for identifying genes' metabolic pathways.
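
The feature extraction steps in this abstract can be illustrated with a minimal sketch. The abstract does not say which distribution family was fitted, so the code assumes each gene's replicates follow a univariate Gaussian and uses the closed-form KL divergence between two Gaussians. The function names and the toy data are hypothetical, and the SVM training step is not shown.

```python
import math
from statistics import mean, stdev


def gaussian_kl(mu_p, sd_p, mu_q, sd_q):
    """KL(P || Q) for two univariate Gaussians (closed form)."""
    return (math.log(sd_q / sd_p)
            + (sd_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sd_q ** 2)
            - 0.5)


def kl_feature_vectors(replicates):
    """Map each gene to its vector of KL divergences against all genes.

    replicates: dict of gene name -> list of replicate expression values
    (assumption: at least two replicates with non-zero spread per gene).
    """
    params = {gene: (mean(vals), stdev(vals)) for gene, vals in replicates.items()}
    genes = sorted(params)
    return {g: [gaussian_kl(*params[g], *params[h]) for h in genes] for g in genes}


# Hypothetical toy data: three replicate measurements per gene.
toy = {"geneA": [1.0, 1.2, 0.8], "geneB": [2.0, 2.1, 1.9]}
features = kl_feature_vectors(toy)
```

KL divergence is asymmetric, so the vector for a gene depends on whether it is used as the first or second argument; the sketch uses it as the first.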

Keywords: Metabolic pathways, gene expression data, microarray, Kullback–Leibler divergence, KL divergence, support vector machines, SVM, machine learning.

Downloads: 2271
1241 Fighter Aircraft Selection Using Technique for Order Preference by Similarity to Ideal Solution with Multiple Criteria Decision Making Analysis

Authors: C. Ardil

Abstract:

This paper presents a multiple criteria decision making analysis technique for selecting fighter aircraft for the national air force. The selection of military aircraft is a process consisting of contradictory goals and objectives. When a modern air force needs to choose fighter aircraft to upgrade existing fleets, a multiple criteria decision making analysis and scenario planning for defense acquisition has been put forward. The selection of fighter aircraft for the air defense force is a strategic decision making process, since the purchase or lease of fighter jets, maintenance and operating costs and having a fleet is the biggest cost for the air force. Multiple criteria decision making analysis methods are effectively applied to facilitate decision making from various available options. The selection criteria were determined using the literature on the problem of fighter aircraft selection. The selection of fighter aircraft to be purchased for the air defense forces is handled using a multiple criteria decision making analysis technique that also determines a suitable methodological approach for the defense procurement and fleet upgrade planning process. The aim of this study is to originate an approach to evaluate fighter aircraft alternatives, Su-35, F-35, and TF-X (MMU), based on technique for order preference by similarity to ideal solution (TOPSIS).
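
For readers unfamiliar with TOPSIS, the following is a minimal sketch of the standard procedure (vector normalization, weighting, distances to the ideal and anti-ideal solutions, closeness score). The criteria, weights and decision matrix in the usage lines are hypothetical placeholders, not the values used in the paper.

```python
import math


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] for each alternative (row).

    matrix:  list of rows, one per alternative, one column per criterion
    weights: one weight per criterion
    benefit: one bool per criterion, True if larger values are better
    Assumes no criterion column is all zeros and alternatives are not all identical.
    """
    n_crit = len(weights)
    # Vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    cols = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)  # distance to the ideal solution
        d_neg = math.dist(row, anti)   # distance to the anti-ideal solution
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical example: two alternatives, criteria = (performance, cost).
scores = topsis([[9.0, 80.0], [7.0, 60.0]], weights=[0.6, 0.4], benefit=[True, False])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```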

Keywords: Fighter Aircraft, Fighter Aircraft Selection, Technique for Order Preference by Similarity to Ideal Solution, TOPSIS, Multiple Criteria Decision Making, Multiple Criteria Decision Making Analysis, MCDMA, Su-35, F-35, TF-X (MMU)

Downloads: 564
1240 Computing Entropy for Ortholog Detection

Authors: Hsing-Kuo Pao, John Case

Abstract:

Biological sequences from different species are called orthologs if they evolved from a sequence of a common ancestor species and they have the same biological function. Approximations of Kolmogorov complexity or entropy of biological sequences are already well known to be useful in extracting similarity information between such sequences, in the interest, for example, of ortholog detection. As is well known, the exact Kolmogorov complexity is not algorithmically computable. In practice one can approximate it by computable compression methods. However, such compression methods do not provide a good approximation to Kolmogorov complexity for short sequences. Herein is suggested a new approach to overcome the problem that compression approximations may not work well on short sequences. This approach is inspired by new, conditional computations of Kolmogorov entropy. A main contribution of the empirical work described shows the new set of entropy-based machine learning attributes provides good separation between positive (ortholog) and negative (non-ortholog) data, better than with good, previously known alternatives (which do not employ some means to handle short sequences well). Also empirically compared are the new entropy-based attribute set and a number of other, more standard similarity attribute sets commonly used in genomic analysis. The various similarity attributes are evaluated by cross validation, through boosted decision tree induction C5.0, and by Receiver Operating Characteristic (ROC) analysis. The results point to the conclusion: the new, entropy-based attribute set by itself is not the one giving the best prediction; however, it is the best attribute set for use in improving the other, standard attribute sets when conjoined with them.
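
The compression-based baseline that the abstract describes as unreliable on short sequences can be sketched as follows. This is the well-known normalized compression distance with zlib standing in for the compressor; it is not the conditional-entropy method the paper proposes, and the function names are illustrative.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, a computable stand-in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: lower values suggest more shared structure."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

For very short inputs the compressor's fixed overhead dominates the compressed length, which is one way to see why such approximations degrade on short sequences.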

Keywords: compression, decision tree, entropy, ortholog, ROC.

Downloads: 1789
1239 Heterogenous Dimensional Super Resolution of 3D CT Scans Using Transformers

Authors: Helen Zhang

Abstract:

Accurate segmentation of the airways from CT scans is crucial for early diagnosis of lung cancer. However, the existing airway segmentation algorithms often rely on thin-slice CT scans, which can be inconvenient and costly. This paper presents a set of machine learning-based 3D super-resolution algorithms along heterogenous dimensions to improve the resolution of thicker CT scans to reduce the reliance on thin-slice scans. To evaluate the efficacy of the super-resolution algorithms, quantitative assessments using PSNR (Peak Signal to Noise Ratio) and SSIM (Structural SIMilarity index) were performed. The impact of super-resolution on airway segmentation accuracy is also studied. The proposed approach has the potential to make airway segmentation more accessible and affordable, thereby facilitating early diagnosis and treatment of lung cancer.
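
As a reminder of how the first of the two quality metrics is computed, here is a minimal PSNR sketch over flattened intensity lists. The `max_value` default of 255 assumes 8-bit images; CT volumes usually have a wider dynamic range, so the appropriate value is an assumption the reader must set. SSIM is more involved and is not shown.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length intensity sequences."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)
```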

Keywords: 3D super-resolution, airway segmentation, thin-slice CT scans, machine learning.

Downloads: 214
1238 Development of Groundwater Management Model Using Groundwater Sustainability Index

Authors: S. S. Rwanga, J. M. Ndambuki, Y. Woyessa

Abstract:

Development of a groundwater management model is an important step in the exploitation and management of any groundwater aquifer as it assists in the long-term sustainable planning of the resource. The current study was conducted in Central Limpopo province of South Africa with the overall objective of determining how much water can be withdrawn from the aquifer without producing nonreversible impacts on the groundwater quantity, hence developing a model which can sustainably protect the aquifer. The development was done through the computation of Groundwater Sustainability Index (GSI). Values of GSI close to unity and above indicated overexploitation. In this study, an index of 0.8 was considered as overexploitation. The results indicated that there is potential for higher abstraction rates compared to the current abstraction rates. GSI approach can be used in the management of groundwater aquifer to sustainably develop the resource and also provides water managers and policy makers with fundamental information on where future water developments can be carried out.

Keywords: Development, groundwater, groundwater sustainability index, model.

Downloads: 804
1237 Ranking Genes from DNA Microarray Data of Cervical Cancer by a local Tree Comparison

Authors: Frank Emmert-Streib, Matthias Dehmer, Jing Liu, Max Muhlhauser

Abstract:

The major objective of this paper is to introduce a new method to select genes from DNA microarray data. As a criterion for selecting genes, we suggest measuring the local changes in the correlation graph of each gene and selecting those genes whose local changes are largest. More precisely, we calculate the correlation networks from DNA microarray data of cervical cancer, where each network represents a tissue of a certain tumor stage and each node in the network represents a gene. From these networks we extract one tree for each gene by a local decomposition of the correlation network. The interpretation of a tree is that it represents the n-nearest neighbor genes on the n-th level of a tree, measured by the Dijkstra distance, and, hence, gives the local embedding of a gene within the correlation network. For the obtained trees we measure the pairwise similarity between trees rooted by the same gene from normal to cancerous tissues. This evaluates the modification of the tree topology due to tumor progression. Finally, we rank the obtained similarity values from all tissue comparisons and select the top ranked genes. For these genes the local neighborhood in the correlation networks changes most between normal and cancerous tissues. As a result we find that the top ranked genes are candidates suspected to be involved in tumor growth. This indicates that our method captures essential information from the underlying DNA microarray data of cervical cancer.

Keywords: Graph similarity, generalized trees, graph alignment, DNA microarray data, cervical cancer.

Downloads: 1714
1236 Fast Database Indexing for Large Protein Sequence Collections Using Parallel N-Gram Transformation Algorithm

Authors: Jehad A. H. Hammad, Nur'Aini binti Abdul Rashid

Abstract:

With the rapid development in the field of life sciences and the flooding of genomic information, the need for faster and scalable searching methods has become urgent. One of the approaches that has been investigated is indexing. Indexing methods have been grouped into three categories: length-based index algorithms, transformation-based algorithms and mixed techniques-based algorithms. In this research, we focused on the transformation-based methods. We embedded the N-gram method into the transformation-based method to build an inverted index table. We then applied parallel methods to speed up the index building time and to reduce the overall retrieval time when querying the genomic database. Our experiments show that the N-gram transformation algorithm is an economical solution that saves both time and space. The results show that the size of the index is smaller than the size of the dataset when the size of the N-gram is 5 and 6. The results of the parallel N-gram transformation algorithm indicate that parallel programming with large datasets is promising and can be improved further.
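
The idea of an N-gram inverted index can be shown with a short serial sketch. The paper's transformation step and its parallel implementation are not described in enough detail to reproduce, so the code below only illustrates the general data structure; the function names and the toy protein strings are hypothetical.

```python
from collections import defaultdict


def build_ngram_index(sequences, n=5):
    """Inverted index: each n-gram maps to the set of sequence ids containing it."""
    index = defaultdict(set)
    for seq_id, seq in sequences.items():
        for i in range(len(seq) - n + 1):
            index[seq[i:i + n]].add(seq_id)
    return index


def candidates(index, query, n=5):
    """Ids of sequences that contain every n-gram of the query."""
    grams = [query[i:i + n] for i in range(len(query) - n + 1)]
    if not grams:
        return set()
    result = set(index.get(grams[0], set()))
    for gram in grams[1:]:
        result &= index.get(gram, set())
    return result


# Hypothetical toy collection.
proteins = {"p1": "MKTAYIAKQR", "p2": "MKTAYLLKQR"}
idx = build_ngram_index(proteins, n=5)
```

Containing every n-gram of the query is a necessary condition for containing the query itself, so the candidate set still needs a final exact check.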

Keywords: Biological sequence, Database index, N-gram indexing, Parallel computing, Sequence retrieval.

Downloads: 2082
1235 A Design for Supply Chain Model by Integrated Evaluation of Design Value and Supply Chain Cost

Authors: Yuan-Jye Tseng, Jia-Shu Li

Abstract:

To design a product with the given product requirement and design objective, there can be alternative ways to propose the detailed design specifications of the product. In the design modeling stage, alternative design cases with detailed specifications can be modeled to fulfill the product requirement and design objective. Therefore, in the design evaluation stage, it is required to perform an evaluation of the alternative design cases for deciding the final design. The purpose of this research is to develop a product evaluation model for evaluating the alternative design cases through an integrated evaluation of the criteria of functional design, Kansei design, and design for supply chain. The criteria in the functional design group include primary function, expansion function, improved function, and new function. The criteria in the Kansei group include geometric shape, dimension, surface finish, and layout. The criteria in the design for supply chain group include material, manufacturing process, assembly, and supply chain operation. From the point of view of value and cost, the criteria in the functional design group and Kansei design group represent the design value of the product. The criteria in the design for supply chain group represent the supply chain and manufacturing cost of the product. It is required to evaluate the design value and the supply chain cost to determine the final design. For the purpose of evaluating the criteria in the three criteria groups, a fuzzy analytic network process (FANP) method is presented to evaluate a weighted index by calculating the total relational values among the three groups. A method using the technique for order preference by similarity to ideal solution (TOPSIS) is used to compare and rank the design alternative cases according to the weighted index using the total relational values of the criteria. The final decision of a design case can be determined by using the ordered ranking. For example, the design case with the top ranking can be selected as the final design case. Based on the criteria in the evaluation, the design objective can be achieved with a combined and weighted effect of the design value and manufacturing cost. An example product is demonstrated and illustrated in the presentation. It shows that the design evaluation model is useful for integrated evaluation of functional design, Kansei design, and design for supply chain to determine the best design case and achieve the design objective.

Keywords: Design evaluation, functional design, Kansei design, supply chain, design value, manufacturing cost, fuzzy analytic network process, technique for order preference by similarity to ideal solution.

Downloads: 749
1234 Assessing and Visualizing the Stability of Feature Selectors: A Case Study with Spectral Data

Authors: R. Guzman-Martinez, Oscar Garcia-Olalla, R. Alaiz-Rodriguez

Abstract:

Feature selection plays an important role in applications with high dimensional data. The assessment of the stability of feature selection/ranking algorithms becomes an important issue when the dataset is small and the aim is to gain insight into the underlying process by analyzing the most relevant features. In this work, we propose a graphical approach that makes it possible to analyze the similarity between feature ranking techniques as well as their individual stability. Moreover, it works with any stability metric (Canberra distance, Spearman's rank correlation coefficient, Kuncheva's stability index, etc.). We illustrate this visualization technique by evaluating the stability of several feature selection techniques on a spectral binary dataset. Experimental results with a neural-based classifier show that stability and ranking quality may not be linked together and both issues have to be studied jointly in order to offer answers to the domain experts.
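
One of the stability metrics named in the abstract, Kuncheva's stability index, compares two selected feature subsets of equal size and corrects for the overlap expected by chance. The sketch below follows the commonly cited formula (r·n − k²) / (k·(n − k)), where r is the overlap, k the subset size and n the total number of features; the function name is illustrative.

```python
def kuncheva_index(subset_a, subset_b, n_features):
    """Kuncheva's consistency index for two equal-size feature subsets.

    Returns 1.0 for identical subsets; values near 0 indicate chance-level overlap.
    """
    a, b = set(subset_a), set(subset_b)
    k = len(a)
    if len(b) != k or not 0 < k < n_features:
        raise ValueError("subsets must have equal size k with 0 < k < n_features")
    r = len(a & b)  # number of features selected in both runs
    return (r * n_features - k * k) / (k * (n_features - k))
```

Averaging this index over all pairs of runs (for example, over resampled versions of the dataset) gives a single stability score for a selector.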

Keywords: Feature Selection Stability, Spectral data, Data visualization

Downloads: 1480
1233 Web Proxy Detection via Bipartite Graphs and One-Mode Projections

Authors: Zhipeng Chen, Peng Zhang, Qingyun Liu, Li Guo

Abstract:

With the Internet becoming the dominant channel for business and life, many IPs are increasingly masked using web proxies for illegal purposes such as propagating malware, impersonating pages for phishing to steal sensitive data, or redirecting victims to other malicious targets. Moreover, as Internet traffic continues to grow in size and complexity, it has become increasingly challenging to detect proxy services due to their dynamic updates and high anonymity. In this paper, we present an approach based on behavioral graph analysis to study the behavior similarity of web proxy users. Specifically, we use bipartite graphs to model host communications from network traffic and build one-mode projections of the bipartite graphs to discover the social-behavior similarity of web proxy users. Based on the similarity matrices of end-users from the derived one-mode projection graphs, we apply a simple yet effective spectral clustering algorithm to discover the inherent behavior clusters of web proxy users. The web proxy URL may vary from time to time, but the inherent interest does not. Based on this intuition, and using our private tools implemented with WebDriver, we examine whether the top URLs visited by the web proxy users are web proxies. Our experimental results on real datasets show that the behavior clusters not only reduce the number of URLs to analyze but also provide an effective way to detect web proxies, especially unknown ones.
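
The one-mode projection step can be sketched in a few lines. The code assumes the bipartite graph is given as (host, destination) edges and weights each pair of hosts by the number of destinations they share; the paper's actual similarity weighting is not stated in the abstract, and the spectral clustering step is not shown.

```python
from collections import defaultdict
from itertools import combinations


def one_mode_projection(edges):
    """Project a host-destination bipartite graph onto hosts.

    edges: iterable of (host, destination) pairs.
    Returns {(host1, host2): number of shared destinations}, with host1 < host2.
    """
    hosts_by_dest = defaultdict(set)
    for host, dest in edges:
        hosts_by_dest[dest].add(host)
    weights = defaultdict(int)
    for hosts in hosts_by_dest.values():
        for h1, h2 in combinations(sorted(hosts), 2):
            weights[(h1, h2)] += 1
    return dict(weights)
```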

Keywords: Bipartite graph, clustering, one-mode projection, web proxy detection.

Downloads: 696
1232 Defect-Based Urgency Index for Bridge Maintenance Ranking and Prioritization

Authors: Saleh Abu Dabous, Khaled Hamad, Rami Al-Ruzouq

Abstract:

Bridge condition assessment and rating provide essential information needed for bridge management. This paper reviews bridge inspection and condition rating practices and introduces a defect-based urgency index. The index is estimated at the element-level based on the extent and severity of the different defects typical to the bridge element. The urgency index approach has the following advantages: (1) It facilitates judgment submission, i.e. instead of rating the bridge element with a specific linguistic overall expression (which can be subjective and used differently by different people), the approach is based on assessing the defects; (2) It captures multiple defects that can be present within a deteriorated element; and (3) It reflects how critical the element is through quantifying critical defects and their severity. The approach can be further developed and validated. It is expected to be useful for practical purposes as an early-warning system for critical bridge elements.

Keywords: Condition rating, deterioration, inspection, maintenance.

Downloads: 1844
1231 Influence of Drought on Yield and Yield Components in White Bean

Authors: Gholamreza Habibi

Abstract:

To study seed yield and its components in bean under reduced irrigation and to assess the drought tolerance of genotypes, 15 lines of white bean were evaluated in two separate RCB designs with 3 replications under stress and non-stress conditions. Analysis of variance showed significant differences among varieties for the traits under study, indicating the existence of genetic variation among varieties. The results indicate that drought stress reduced seed yield, number of seeds per plant, biological yield and number of pods in white bean. Under non-stress conditions, yield was highly correlated with biological yield, whereas under stress conditions it was highly correlated with harvest index. Stepwise regression showed that selection can be done based on biological yield, harvest index, number of seeds per pod, seed length and 100-seed weight. Path analysis showed that the highest direct effect, which was positive, was related to biological yield under non-stress conditions and to harvest index under stress conditions. Factor analysis was carried out under stress and non-stress conditions; 4 factors explained more than 76 percent of the total variation. We used several selection indices, namely Stress Susceptibility Index (SSI), Geometric Mean Productivity (GMP), Mean Productivity (MP), Stress Tolerance Index (STI) and Tolerance Index (TOL), to study the drought tolerance of genotypes. The best indices for selecting tolerant genotypes were STI, GMP and MP, which showed the greatest correlations with seed yield under stress and non-stress conditions. In the classification of genotypes based on phenotypic characteristics using cluster analysis (UPGMA), all alleles were classified into 5 separate groups under stress and non-stress conditions.
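
The abstract names five selection indices without giving their formulas. The sketch below uses the definitions commonly cited in the drought-tolerance literature, which are assumed here rather than taken from the paper: `yp` and `ys` are a genotype's yield under non-stress and stress conditions, and `mean_yp` and `mean_ys` are the corresponding means over all genotypes.

```python
import math


def drought_indices(yp, ys, mean_yp, mean_ys):
    """Commonly cited drought-tolerance indices for one genotype (assumed definitions)."""
    stress_intensity = 1.0 - mean_ys / mean_yp
    return {
        "SSI": (1.0 - ys / yp) / stress_intensity,  # Stress Susceptibility Index
        "TOL": yp - ys,                             # Tolerance Index
        "MP": (yp + ys) / 2.0,                      # Mean Productivity
        "GMP": math.sqrt(yp * ys),                  # Geometric Mean Productivity
        "STI": (yp * ys) / mean_yp ** 2,            # Stress Tolerance Index
    }
```

Lower SSI and TOL indicate less yield loss under stress, whereas higher MP, GMP and STI indicate higher productivity across both environments.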

Keywords: Cluster analysis, factor analysis, path analysis, selection index, White bean

Downloads: 2097
1230 Improving Topic Quality of Scripts by Using Scene Similarity Based Word Co-Occurrence

Authors: Yunseok Noh, Chang-Uk Kwak, Sun-Joong Kim, Seong-Bae Park

Abstract:

Scripts are one of the basic text resources for understanding broadcasting contents. Topic modeling is a method for obtaining a summary of broadcasting contents from their scripts. Generally, scripts represent contents descriptively with directions and speeches, and provide scene segments that can be seen as semantic units. Therefore, a script can be topic modeled by treating a scene segment as a document. However, because scene segments consist mainly of speeches, relatively few word co-occurrences are observed within them. This inevitably degrades the quality of the topics learned by statistical methods. To tackle this problem, we propose a method to improve topic quality with additional word co-occurrence information obtained using scene similarities. The main idea is that knowing that two or more texts are topically related helps to learn high-quality topics. In turn, more accurate topical representations give more accurate information about whether two texts are related. In this paper, we regard two scene segments as related if their topical similarity is high enough. We also consider words to co-occur if they appear together in topically related scene segments. By iteratively inferring topics and determining semantically neighboring scene segments, we obtain a topic space that represents broadcasting contents well. In the experiments, we showed that the proposed method generates higher-quality topics from Korean drama scripts than the baselines.

Keywords: Broadcasting contents, generalized Pólya urn model, scripts, text similarity, topic model.

Downloads: 1775
1229 Error-Robust Nature of Genome Profiling Applied for Clustering of Species Demonstrated by Computer Simulation

Authors: Shamim Ahmed, Koichi Nishigaki

Abstract:

Genome profiling (GP), a genotype-based technology that exploits random PCR and temperature gradient gel electrophoresis, has been successful in the identification/classification of organisms. In this technology, spiddos (species identification dots) and PaSS (pattern similarity score) are employed to measure the closeness (or distance) between genomes. Based on the closeness (PaSS), we can build phylogenetic trees of the organisms. We noticed that the topology of the tree is rather robust against the experimental fluctuation conveyed by spiddos. This fact was confirmed quantitatively in this study by computer simulation, which also provided the limit of the reliability of this highly powerful methodology. As a result, we could demonstrate the effectiveness of the GP approach for the identification/classification of organisms.

Keywords: Fluctuation, Genome profiling (GP), Pattern similarity score (PaSS), Robustness, Spiddos-shift.

Downloads: 1493
1228 The Result of Suggestion for Low Energy Diet (1,000-1,200 kcal) in Obese Women to the Effect on Body Weight, Waist Circumference, and BMI

Authors: S. Kumchoo

Abstract:

This experiment examined the effect of recommending a low energy diet (1,000-1,200 kcal) to obese women on body weight, waist circumference and body mass index (BMI). Quasi-experimental research with a one-group pretest-posttest design was used. The aim of the study was to reduce body weight, waist circumference and BMI by using a low energy diet (1,000-1,200 kcal) in obese women. The participants were 15 obese women with BMI ≥ 30 who received the low energy diet (1,000-1,200 kcal) within 2 weeks. Data were collected before and after the intervention. The results showed that, on average, body weight decreased by 3.4 kilograms, waist circumference by 6.1 centimeters and BMI by 1.3 kg/m² compared with the values before the experiment started. After the study, the volunteers were healthier and were able to choose suitable foods for themselves. The study can be extended with further data collection in future research.

Keywords: Body weight, waist circumference, BMI, low energy diet.

Downloads: 935
1227 Effect of Rollers Differential Speed and Paddy Moisture Content on Performance of Rubber Roll Husker

Authors: S. Firouzi, M.R. Alizadeh, S. Minaei

Abstract:

A study was carried out at the Rice Research Institute of Iran (RRII) to investigate the effect of the rollers differential peripheral speed of a commercial rubber roll husker and of paddy moisture content on the husking index and the percentage of broken rice. The experiment was conducted at six levels of rollers differential speed (1.5, 2.2, 2.9, 3.6, 4.3 and 5 m/s) and three levels of paddy moisture content (8-9, 10-11 and 12-13% w.b.). Two common paddy varieties, namely Binam and Khazar, were selected for this study. Results revealed that rollers differential speed and moisture content significantly (P<0.01) affected the percentage of broken brown rice and the paddy husking index. Average broken kernel percentage increased from 13 to 14.61% while husking index decreased from 71.64 to 61.81% as paddy moisture content increased from 8-9 to 12-13%. The amount of broken rice decreased from 18.83 to 9.97% when rollers differential speed varied from 1.5 to 5 m/s, while the husking index initially increased and then started to decrease. The mean value of husking index for the Khazar variety (64.71%) was significantly lower than that for the Binam variety (69.2%). It was concluded that a rollers differential speed of 2.9 m/s and a moisture content of 8-9% was the most appropriate combination for husking paddy of the Binam and Khazar varieties in a rubber roll husker.

Keywords: husking index, moisture content, paddy, rubber roll husker.

Downloads: 3243
1226 Classification Influence Index and its Application for k-Nearest Neighbor Classifier

Authors: Sejong Oh

Abstract:

Classification is an important topic in machine learning and bioinformatics. Many datasets have been introduced for classification tasks. A dataset contains multiple features, and the quality of features influences the classification accuracy of the dataset. The power of classification for each feature differs. In this study, we suggest the Classification Influence Index (CII) as an indicator of classification power for each feature. CII enables evaluation of the features in a dataset and improved classification accuracy by transformation of the dataset. By conducting experiments using CII and the k-nearest neighbor classifier to analyze real datasets, we confirmed that the proposed index provided meaningful improvement of the classification accuracy.

Keywords: accuracy, classification, dataset, data preprocessing

Downloads: 1452
1225 Heat Transfer in a Parallel-Plate Enclosure with Graded-Index Coatings on its Walls

Authors: Jiun-Wei Chen, Chih-Yang Wu, Ming-Feng Hou

Abstract:

A numerical study on the heat transfer in the thermal barrier coatings and the substrates of a parallel-plate enclosure is carried out. Some of the thermal barrier coatings, such as ceramics, are semitransparent and are of interest for high-temperature applications where radiation effects are significant. The radiative transfer equations and the energy equations are solved by using the discrete ordinates method and the finite difference method. Illustrative results are presented for temperature distributions in the coatings and the opaque walls under various heating conditions. The results show that the temperature distribution is more uniform in the interior portion of each coating away from its boundary for the case with a larger average of varying refractive index and a positive gradient of refractive index enhances radiative transfer to the substrates.

Keywords: Radiative transfer, parallel-plate enclosure, coatings, varying refractive index

Downloads: 1422
1224 An Index based Forward Backward Multiple Pattern Matching Algorithm

Authors: Raju Bhukya, DVLN Somayajulu

Abstract:

Pattern matching is one of the fundamental applications in molecular biology. Searching DNA-related data is a common activity for molecular biologists. In this paper we explore the applicability of a new pattern matching technique, the Index based Forward Backward Multiple Pattern Matching algorithm (IFBMPM), to DNA sequences. Our approach avoids unnecessary comparisons in the DNA sequence; as a result, the number of comparisons made by the proposed algorithm is much lower than that of other popular existing methods. As the number of comparisons decreases, the execution time decreases accordingly, giving better performance.

Keywords: Comparisons, DNA Sequence, Index.

Downloads: 2331
1223 Destination Port Detection for Vessels: An Analytic Tool for Optimizing Port Authorities Resources

Authors: Lubna Eljabu, Mohammad Etemad, Stan Matwin

Abstract:

Port authorities in congested ports face many challenges in allocating their resources to provide a safe and secure loading/unloading procedure for cargo vessels. Selecting a destination port is the decision of a vessel master based on many factors such as weather, wavelength and changes of priorities. Having access to a tool which leverages Automatic Identification System (AIS) messages to monitor vessels' movements and accurately predict their next destination port promotes an effective resource allocation process for port authorities. In this research, we propose a method, namely Reference Route of Trajectory (RRoT), to assist port authorities in predicting inflow and outflow traffic in their local environment by monitoring AIS messages. Our RRoT method creates a reference route based on historical AIS messages. It utilizes some of the best trajectory similarity measures to identify the destination of a vessel using its recent movement. We evaluated five different similarity measures: Discrete Fréchet Distance (DFD), Dynamic Time Warping (DTW), Partial Curve Mapping (PCM), Area between two curves (Area) and Curve length (CL). Our experiments show that our method identifies the destination port with an accuracy of 98.97% and an f-measure of 99.08% using the Dynamic Time Warping (DTW) similarity measure.
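
The best-performing measure in the abstract, Dynamic Time Warping, can be sketched with the textbook dynamic program. The code assumes each trajectory is a list of (x, y) points in a planar projection and that one reference route per candidate port is already available; how RRoT builds those routes from historical AIS messages is not shown, and the port names in the usage lines are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Dynamic Time Warping cost between two point sequences (Euclidean point cost)."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def predict_destination(track, reference_routes):
    """Return the port whose reference route is closest to the track under DTW."""
    return min(reference_routes, key=lambda port: dtw_distance(track, reference_routes[port]))


# Hypothetical reference routes and a recent vessel track.
routes = {
    "PortA": [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],
    "PortB": [(0.0, 0.0), (1.0, -1.0), (2.0, -2.0)],
}
guess = predict_destination([(0.0, 0.0), (1.0, 0.9), (2.0, 2.1)], routes)
```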

Keywords: Spatial temporal data mining, trajectory mining, trajectory similarity, resource optimization.
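As an illustration of the best-performing measure named in the abstract, here is a minimal sketch of textbook Dynamic Time Warping between two point sequences, plus a nearest-reference lookup. It is not the RRoT implementation: treating positions as planar `(x, y)` pairs and the helper `nearest_route` are assumptions made for the example.

```python
import math


def dtw_distance(a, b):
    """Dynamic Time Warping distance between two sequences of (x, y) points."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def nearest_route(track, reference_routes):
    """Label of the reference route with the smallest DTW distance to `track`."""
    return min(reference_routes,
               key=lambda label: dtw_distance(track, reference_routes[label]))
```

Here `reference_routes` would be a dictionary mapping a destination label to one representative trajectory; a real AIS pipeline would also need geodesic distances and resampling, which this sketch leaves out.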

Downloads 625
1222 Applications of Rough Set Decompositions in Information Retrieval

Authors: Chen Wu, Xiaohua Hu

Abstract:

This paper proposes rough set models with knowledge granules at three different levels for incomplete information systems, under a tolerance relation defined by the similarity between objects according to their attribute values. By introducing a dominance relation on the universe of discourse, it decomposes similarity classes into three subclasses (a little better subclass, a little worse subclass and a vague subclass) and thereby splits the lower and upper approximations into three components. These components make it possible to retrieve information effectively, both by finding naturally hierarchical expansions of queries and by constructing answers to elaborative queries. The approach is illustrated by applying the rough set models to the design of an information retrieval system that accesses documents expanded at different granularities. The proposed method makes rough set models more flexible for query expansion and elaborative queries in information retrieval.

Keywords: Incomplete information system, Rough set model, tolerance relation, dominance relation, approximation, decomposition, elaborative query.
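The starting point of the abstract, lower and upper approximations under a tolerance relation in an incomplete information system, can be sketched as follows. This shows only the standard tolerance-based approximations, not the paper's three-way decomposition by dominance. The `"*"` marker for missing values and the function names are assumptions.

```python
MISSING = "*"  # assumed marker for an unknown attribute value


def tolerant(x, y):
    """Two objects are tolerant if every attribute agrees or is missing in either."""
    return all(a == b or a == MISSING or b == MISSING for a, b in zip(x, y))


def approximations(objects, target):
    """Lower and upper approximations of `target`, a set of object names.

    `objects` maps an object name to a tuple of attribute values.
    """
    lower, upper = set(), set()
    for name, values in objects.items():
        # Tolerance class: every object indistinguishable from `name`.
        similar = {other for other, v in objects.items() if tolerant(values, v)}
        if similar <= target:
            lower.add(name)  # certainly in the target
        if similar & target:
            upper.add(name)  # possibly in the target
    return lower, upper
```

In a retrieval setting the lower approximation would correspond to documents that certainly match a query concept and the upper approximation to those that possibly match.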

Downloads 1572
1221 The Use of Thermal Infrared Wavelengths to Determine the Volcanic Soils

Authors: Levent Basayigit, Mert Dedeoglu, Fadime Ozogul

Abstract:

In this study, remote sensing was applied to identify volcanic soils. The study area was located on the Golcuk formation in Isparta, Turkey. The thermal bands of a Landsat 7 image were used for processing. A climate model based on the water index was implemented in ERDAS Imagine software together with pixel-based image classification. The Soil Moisture Index (SMI) was modeled using the surface temperature (Ts) obtained from the thermal bands and the vegetation index (NDVI) derived from Landsat 7. Surface moisture values were grouped and classified using a scoring system. Thematic layers were compared with the field studies. Consequently, different moisture levels served as indicators for determining and separating volcanic soils. Thermal wavelengths are therefore preferable bands for separating volcanic soils using moisture and temperature models.

Keywords: Landsat 7, soil moisture index, temperature models, volcanic soils.
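The two quantities named in the abstract can be sketched per pixel as below. NDVI is the standard normalized difference of near-infrared and red reflectance. The abstract does not give the SMI formula, so the temperature-scaling form used here, with `ts_min` and `ts_max` as the wet and dry limits for the pixel, is an assumption about one common formulation and not necessarily the authors' model.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def soil_moisture_index(ts, ts_min, ts_max):
    """Assumed SMI form: 1 at the wet (cool) limit, 0 at the dry (hot) limit."""
    if ts_max == ts_min:
        raise ValueError("ts_max and ts_min must differ")
    smi = (ts_max - ts) / (ts_max - ts_min)
    return min(1.0, max(0.0, smi))  # clamp to [0, 1]
```

In practice `ts_min` and `ts_max` are often estimated as functions of NDVI from the scatter of surface temperature against NDVI; here they are simply passed in.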

Downloads 1046
1220 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has seen big advances in recent years because of the spread of the internet, which generates a tremendous volume of data every day, and because of immense advances in the technologies that facilitate the analysis of these data. In particular, classification techniques are a subdomain of data mining that determine to which group each data instance in a given dataset belongs. They are used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or based on machine learning, and each type has its own limits. Data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach that differs from existing algorithms in its reliability computations. The proposed approach outperformed most common classification techniques, with an f-measure exceeding 97% on the IRIS dataset.

Keywords: Data mining, knowledge discovery, machine learning, similarity measurement, supervised classification.

Downloads 1479
1219 Image Retrieval Using Fused Features

Authors: K. Sakthivel, R. Nallusamy, C. Kavitha

Abstract:

The system is designed to show images that are related to the query image. Extracting color, texture, and shape features from an image plays a vital role in content-based image retrieval (CBIR). Initially, the RGB image is converted into HSV color space because of its perceptual uniformity. From the HSV image, color features are extracted using a block color histogram, texture features using the Haar transform, and shape features using the Fuzzy C-means algorithm. Then, the characteristics of the global and local color histograms, of texture features obtained through the co-occurrence matrix and the Haar wavelet transform, and of shape are compared and analyzed for CBIR. Finally, the best method for each feature is fused in the similarity measure to improve image retrieval effectiveness and accuracy.

Keywords: Color Histogram, Haar Wavelet Transform, Fuzzy C-means, Co-occurrence matrix, Similarity measure.
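The color part of such a pipeline can be sketched with the standard library: convert RGB pixels to HSV, build a normalized hue histogram, and compare two histograms. This is a simplified illustration only. Using the hue channel alone, the bin count, and histogram intersection as the similarity measure are assumptions, and the paper's block-wise histograms and feature fusion are not reproduced.

```python
import colorsys


def hue_histogram(pixels, bins=8):
    """Normalized hue histogram of a list of (r, g, b) pixels with 0-255 channels."""
    hist = [0] * bins
    for r, g, b in pixels:
        h, _s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        hist[min(int(h * bins), bins - 1)] += 1  # h lies in [0, 1]
    total = len(pixels)
    return [count / total for count in hist] if total else hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for two normalized histograms of equal length."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

A block color histogram would apply `hue_histogram` to each block of the image separately and compare corresponding blocks of the query and candidate images.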

Downloads 2093
1218 A Numerical Solution Based On Operational Matrix of Differentiation of Shifted Second Kind Chebyshev Wavelets for a Stefan Problem

Authors: Rajeev, N. K. Raigar

Abstract:

In this study, a one-dimensional phase change problem (a Stefan problem) is considered and a numerical solution of this problem is discussed. First, we use a similarity transformation to convert the governing equations and their boundary conditions into ordinary differential equations. The solutions of the ordinary differential equations with the associated boundary conditions and interface condition (Stefan condition) are obtained using a numerical approach based on the operational matrix of differentiation of shifted second kind Chebyshev wavelets. The obtained results are compared with the existing exact solution and are found to be sufficiently accurate.

Keywords: Operational matrix of differentiation, Similarity transformation, Shifted second kind Chebyshev wavelets, Stefan problem.
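The abstract does not state the governing equations, so the following worked reduction assumes the classical one-phase Stefan problem with constant properties, purely to illustrate what a similarity transformation does here. The symbols (diffusivity \alpha, conductivity k, density \rho, latent heat L, surface temperature u_0, interface position s(t), constant \lambda) are assumptions and may differ from the paper's model.

```latex
% Assumed model: classical one-phase Stefan problem
u_t = \alpha\, u_{xx}, \quad 0 < x < s(t), \qquad
u(0,t) = u_0, \quad u(s(t),t) = 0, \qquad
-k\, u_x(s(t),t) = \rho L\, s'(t)

% Similarity variable and ansatz
\xi = \frac{x}{2\sqrt{\alpha t}}, \qquad u(x,t) = f(\xi), \qquad s(t) = 2\lambda\sqrt{\alpha t}

% Resulting boundary value problem for an ordinary differential equation
f''(\xi) + 2\xi f'(\xi) = 0, \qquad
f(0) = u_0, \quad f(\lambda) = 0, \qquad
f'(\lambda) = -\frac{2\lambda \rho L \alpha}{k}
```

In this simple case the ODE integrates in closed form, f(\xi) = u_0 (1 - \operatorname{erf}\xi / \operatorname{erf}\lambda), which is the kind of exact solution a wavelet-based numerical scheme can be checked against.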

Downloads 1957
1217 Analysis of Food Security Situation among Nigerian Rural Farmers

Authors: Victoria A. Okwoche, Benjamin C. Asogwa

Abstract:

This paper analysed the food security situation among Nigerian rural farmers. Data collected on 202 rural farmers from Benue State were analysed using descriptive and inferential statistics. The study revealed that the majority of the respondents (60.83%) had medium dietary diversity. For the food secure households, the household daily calorie requirement was 10,723 and the household daily calorie consumption was 12,598, with a surplus index of 0.04. The food security index was 1.16 and the household daily per capita calorie consumption was 3,221.2. For the food insecure households, the household daily calorie requirement was 20,213 and the household daily calorie consumption was 17,393. The shortfall index was 0.14, the food security index was 0.88, and the household daily per capita calorie consumption was 2,432.8. The most commonly used coping strategies during food stress included intercropping (99.2%), reliance on less preferred food (98.1%), limiting portion size at meal times (85.8%) and crop diversification (70.8%).

Keywords: Analysis, food security, rural areas, farmers, Nigeria.

Downloads 2851
1216 Economic Factorial Analysis of CO2 Emissions: The Divisia Index with Interconnected Factors Approach

Authors: Alexander Y. Vaninsky

Abstract:

This paper presents a method of economic factorial analysis of CO2 emissions based on an extension of the Divisia index to interconnected factors. In contrast to the Kaya identity, this approach treats the three main factors of CO2 emissions (gross domestic product, energy consumption, and population) as equally important and accounts for all of them simultaneously. The three factors are included in the analysis together with their carbon intensities, which gives a comprehensive picture of the change in CO2 emissions. A computer program in the R language, available for free download, automates the calculations. A case study of U.S. carbon dioxide emissions is used as an example.

Keywords: CO2 emissions, Economic analysis, Factorial analysis, Divisia index, Interconnected factors.

Downloads 2469
1215 Generation of Photo-Mosaic Images through Block Matching and Color Adjustment

Authors: Hae-Yeoun Lee

Abstract:

Mosaic refers to a technique that makes an image by gathering many small pieces of material in various colors. This paper presents an automatic algorithm that builds a photo-mosaic image from photos. The algorithm is composed of 4 steps: partition and feature extraction, block matching, redundancy removal, and color adjustment. The input image is partitioned into small blocks from which features are extracted. Each block is matched to a similar photo in the database by comparing similarity using the Euclidean difference between blocks. The intensity of the block is adjusted to enhance the similarity of the image by replacing its light and dark values with those of the relevant block. Further, the quality of the image is improved by minimizing the redundancy of tiles in adjacent blocks. Experimental results show that the proposed algorithm performs well in both quantitative and qualitative analysis.

Keywords: Photo-mosaic, Euclidean distance, Block matching, Intensity adjustment.
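The block matching step can be sketched as a nearest-neighbor search under Euclidean distance. The abstract does not say which features are extracted per block, so using the mean RGB color as the feature is an assumption, and the names `mean_color` and `best_tile` are hypothetical.

```python
import math


def mean_color(block):
    """Average (r, g, b) of a block given as a non-empty list of (r, g, b) pixels."""
    n = len(block)
    return tuple(sum(pixel[c] for pixel in block) / n for c in range(3))


def best_tile(block, tiles):
    """Name of the tile whose mean color is closest (Euclidean) to the block's."""
    target = mean_color(block)
    return min(tiles, key=lambda name: math.dist(target, mean_color(tiles[name])))
```

Redundancy removal could then be layered on top, for instance by excluding tiles already used in adjacent blocks before calling `best_tile`.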

Downloads 3527
1214 Assessing Organizational Resilience Capacity to Flooding: Index Development and Application to Greek Small and Medium-Sized Enterprises

Authors: A. Skouloudis, K. Evangelinos, W. Leal-Filho, P. Vouros, I. Nikolaou, T. Tsalis

Abstract:

In this study, a composite index of factors linked to the resilience capacity of small and medium-sized enterprises (SMEs) to flooding is proposed and tested. A sample of SMEs located in flood-prone areas (n = 391) was administered a structured questionnaire pertaining to cognitive, managerial and contextual factors that affect the ability to prepare for, withstand, and recover from flooding events. Through the proposed index, a bottom-up, self-assessment approach is set forth that could assist in standardizing such assessments, with an overarching aim of reducing the vulnerability of SMEs to floods. This is achieved by examining critical internal and external parameters affecting SMEs’ resilience capacity, which is particularly important given the limited resources these enterprises tend to have at their disposal and the fact that they can generate single points of failure in dense supply chain networks.

Keywords: Floods, SMEs, organizational resilience capacity, index development, Greece.

Downloads 389
1213 Time Series Regression with Meta-Clusters

Authors: Monika Chuchro

Abstract:

This paper presents a preliminary attempt to apply classification of time series using meta-clusters in order to improve the quality of regression models. In this case, clustering was used to obtain normally distributed subgroups of time series data from data on the inflow into a wastewater treatment plant, which are composed of several groups differing in mean value. Two simple algorithms, K-means and EM, were chosen as the clustering methods. The Rand index was used to measure the similarity. After simple meta-clustering, a regression model was fitted for each subgroup. The final model was the sum of the subgroup models. The quality of the obtained model was compared with that of a regression model built using the same explanatory variables but without clustering the data. Results were compared using the coefficient of determination (R2), a measure of prediction accuracy, the mean absolute percentage error (MAPE), and a comparison on a line chart. Preliminary results point to the potential of the presented technique.

Keywords: Clustering, Data analysis, Data mining, Predictive models.
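The Rand index mentioned in the abstract measures agreement between two clusterings as the fraction of item pairs on which they agree, that is, pairs placed together in both or apart in both. A minimal sketch, with `rand_index` as a hypothetical helper name:

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Rand index between two clusterings given as equal-length label lists."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0  # fewer than two items: nothing to disagree on
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)
```

Because only co-membership matters, relabeling the clusters does not change the score, which makes the index suitable for comparing the K-means and EM partitions directly.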

Downloads 1900