Search results for: Data Structure Normalization
9546 Image Retrieval Based on Multi-Feature Fusion for Heterogeneous Image Databases
Authors: N. W. U. D. Chathurani, Shlomo Geva, Vinod Chandran, Proboda Rajapaksha
Abstract:
Selecting an appropriate image representation is the most important factor in implementing an effective Content-Based Image Retrieval (CBIR) system. This paper presents a multi-feature fusion approach for efficient CBIR, based on the distance distribution of features and relative feature weights at the time of query processing. It is a simple yet effective approach, which is free from the effect of features' dimensions, ranges, internal feature normalization and the distance measure. This approach can easily be adopted in any feature combination to improve retrieval quality. The proposed approach is empirically evaluated using two benchmark datasets for image classification (a subset of the Corel dataset and Oliva and Torralba) and compared with existing approaches. The performance of the proposed approach is confirmed with the significantly improved performance in comparison with the independently evaluated baseline of the previously proposed feature fusion approaches.
Keywords: Feature fusion, image retrieval, membership function, normalization.
Downloads 1342
9545 Data Extraction of XML Files using Searching and Indexing Techniques
Authors: Sushma Satpute, Vaishali Katkar, Nilesh Sahare
Abstract:
XML files contain data in a well-formatted manner. Studying the format or semantics of the grammar helps in retrieving the data quickly. Many algorithms describe how to search data in XML files. A number of approaches use the data structure or are related to the contents of the document. In these cases the user must know the structure of the document, and information retrieval techniques using NLP are related to the content of the document. Hence the results may be irrelevant or not very successful, and the search may take more time. This paper presents fast XML retrieval techniques using a new indexing technique and the concept of RXML. When indexing an XML document, the system takes into account both the document content and the document structure and assigns a value to each tag in the file. To query the system, the user is not constrained to a fixed query format.
Keywords: XML Retrieval, Indexed Search, Information Retrieval.
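The abstract does not describe the RXML index itself, so the following is only a minimal sketch of the general idea of indexing content together with structure: each word is mapped to the tag paths under which it occurs. The function names `build_index` and `search` and the sample document are assumptions made for illustration, not the authors' method.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_index(xml_text):
    """Map each lower-cased word to the set of tag paths whose text contains it."""
    root = ET.fromstring(xml_text)
    index = defaultdict(set)

    def visit(elem, path):
        here = f"{path}/{elem.tag}"
        # Index the element's own text under its structural path.
        for word in (elem.text or "").split():
            index[word.lower()].add(here)
        for child in elem:
            visit(child, here)

    visit(root, "")
    return index


def search(index, word):
    """Return the sorted tag paths where the word occurs (free-form, single-word query)."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical sample document.
doc = "<library><book><title>XML Retrieval</title><author>Smith</author></book></library>"
idx = build_index(doc)
print(search(idx, "retrieval"))  # ['/library/book/title']
```

A query then needs only a word, not knowledge of the document layout; the returned paths tell the user where in the structure the match was found.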
Downloads 1782
9544 On CR-Structure and F-Structure Satisfying Polynomial Equation
Authors: Manisha Kankarej
Abstract:
The purpose of this paper is to show a relation between CR-structure and F-structure satisfying a polynomial equation. In this paper, we have checked the significance of CR-structure and F-structure for integrability conditions and the Nijenhuis tensor. It was proved that all the properties of integrability conditions and the Nijenhuis tensor are satisfied by CR-structures and F-structures satisfying a polynomial equation.
Keywords: CR-submanifolds, CR-structure, Integrability condition & Nijenhuis tensor.
Downloads 792
9543 CompPSA: A Component-Based Pairwise RNA Secondary Structure Alignment Algorithm
Authors: Ghada Badr, Arwa Alturki
Abstract:
The biological function of an RNA molecule depends on its structure. The objective of the alignment is finding the homology between two or more RNA secondary structures. Knowing the common functionalities between two RNA structures allows a better understanding and a discovery of other relationships between them. Besides, identifying non-coding RNAs (those that are not translated into a protein) is a popular application in which RNA structural alignment is the first step. A few methods for RNA structure-to-structure alignment have been developed. Most of these methods are partial structure-to-structure, sequence-to-structure, or structure-to-sequence alignment. Less attention is given in the literature to the use of efficient RNA structure representation, and structure-to-structure alignment methods are lacking. In this paper, we introduce an O(N^2) Component-based Pairwise RNA Structure Alignment (CompPSA) algorithm, where structures are given as a component-based representation and where N is the maximum number of components in the two structures. The proposed algorithm compares the two RNA secondary structures based on their weighted component features rather than on their base-pair details. Extensive experiments are conducted illustrating the efficiency of the CompPSA algorithm when compared to other approaches and on different real and simulated datasets. The CompPSA algorithm shows an accurate similarity measure between components. The algorithm gives the user the flexibility to align the two RNA structures based on their weighted features (position, full length, and/or stem length). Moreover, the algorithm proves scalable and efficient in time and memory performance.
Keywords: Alignment, RNA secondary structure, pairwise, component-based, data mining.
Downloads 973
9542 First Studies of the Influence of Single Gene Perturbations on the Inference of Genetic Networks
Authors: Frank Emmert-Streib, Matthias Dehmer
Abstract:
Inferring the network structure from time series data is a hard problem, especially if the time series is short and noisy. DNA microarrays are a technology that allows the mRNA concentration of thousands of genes to be monitored simultaneously and that produces data with these characteristics. In this study we investigate the influence of the experimental design on the quality of the result. More precisely, we investigate the influence of two different types of random single gene perturbations on the inference of genetic networks from time series data. To obtain an objective quality measure for this influence we simulate gene expression values with a biologically plausible model of a known network structure. Within this framework we study the influence of single gene knock-outs, as opposed to linearly controlled expression for single genes, on the quality of the inferred network structure.
Keywords: Dynamic Bayesian networks, microarray data, structure learning, Markov chain Monte Carlo.
Downloads 1549
9541 Topological Queries on Graph-structured XML Data: Models and Implementations
Authors: Hongzhi Wang, Jianzhong Li, Jizhou Luo
Abstract:
In many applications, data is in graph structure, which can be naturally represented as graph-structured XML. Existing queries defined on tree-structured and graph-structured XML data mainly focus on subgraph matching, which cannot cover all the requirements of querying on graphs. In this paper, a new kind of query, the topological query on graph-structured XML, is presented. This kind of query considers not only the structure of subgraphs but also the topological relationship between subgraphs. Building on existing subgraph query processing algorithms, efficient algorithms for topological query processing are designed. Experimental results show the efficiency of the implemented algorithms.
Keywords: XML, Graph Structure, Topological query.
Downloads 1414
9540 DIFFER: A Propositionalization approach for Learning from Structured Data
Authors: Thashmee Karunaratne, Henrik Böstrom
Abstract:
Logic-based methods for learning from structured data are limited w.r.t. handling large search spaces, preventing large-sized substructures from being considered by the resulting classifiers. A novel approach to learning from structured data is introduced that employs a structure transformation method, called finger printing, for addressing these limitations. The method, which generates features corresponding to arbitrarily complex substructures, is implemented in a system called DIFFER. The method is demonstrated to perform comparably to an existing state-of-the-art method on some benchmark data sets without requiring restrictions on the search space. Furthermore, learning from the union of features generated by finger printing and the previous method outperforms learning from each individual set of features on all benchmark data sets, demonstrating the benefit of developing complementary, rather than competing, methods for structure classification.
Keywords: Machine learning, Structure classification, Propositionalization.
Downloads 1221
9539 Semantic Spatial Objects Data Structure for Spatial Access Method
Authors: Kalum Priyanath Udagepola, Zuo Decheng, Wu Zhibo, Yang Xiaozong
Abstract:
Modern spatial database management systems require a unique Spatial Access Method (SAM) in order to solve complex spatial queries efficiently. In this case the spatial data structure takes a prominent place in the SAM. An inadequate data structure leads to poor algorithmic choices and a deficient understanding of algorithm behavior on the spatial database. A key step in developing a better semantic spatial object data structure is to quantify the performance effects of semantic and outlier detections that are not reflected in the previous tree structures (R-Tree and its variants). This paper explores a novel SSRO-Tree on SAM to the Topo-Semantic approach. The paper shows how to identify and handle the semantic spatial objects with outlier objects during page overflow/underflow, using gain/loss metrics. We introduce a new SSRO-Tree algorithm which facilitates the achievement of better performance in practice over algorithms that are superior in the R*-Tree and RO-Tree by considering selection queries.
Keywords: Outlier, semantic spatial object, spatial objects, SSRO-Tree, topo-semantic.
Downloads 1693
9538 Foundation of the Information Model for Connected-Cars
Authors: Hae-Won Seo, Yong-Gu Lee
Abstract:
Recent progress in the next generation of automobile technology is geared towards incorporating information technology into cars. Collectively called smart cars, these vehicles bring intelligence to cars, providing comfort, convenience and safety. One branch of smart cars is the connected-car system. The key concept in connected-cars is the sharing of driving information among cars in a decentralized manner, enabling collective intelligence. This paper proposes a foundation of the information model that is necessary to define the driving information for smart-cars. Road conditions are modeled through a unique data structure that unambiguously represents the time-variant traffic in the streets. Additionally, the modeled data structure is exemplified in a navigational scenario, and its usage is described using UML. Optimal driving route searching under dynamically changing road conditions is also discussed using the proposed data structure.
Keywords: Connected-car, data modeling, route planning, navigation system.
Downloads 1962
9537 Fragility Analysis of Weir Structure Subjected to Flooding Water Damage
Authors: Oh Hyeon Jeon, WooYoung Jung
Abstract:
In this study, seepage analysis was performed using the water level difference between the upstream and downstream sides of a weir structure to evaluate its safety against flooding. The Monte Carlo simulation method was employed, considering the probability distribution of the adjacent ground parameter, i.e., the permeability coefficient of the weir structure. Moreover, the weir structure was modeled using a commercially available finite element program (ABAQUS). Based on this model, the characteristics of water seepage during flooding were determined at each water level, taking into account the uncertainty of the corresponding permeability coefficient. Subsequently, a fragility function was constructed from this numerical response; the fragility function results can be used to determine the weakness of a weir structure subjected to a flooding disaster. They can also be used as reference data for comprehensively predicting the probability of failure and the degree of damage of a weir structure.
Keywords: Weir structure, seepage, flood disaster fragility, probabilistic risk assessment, Monte-Carlo Simulation, permeability coefficient.
Downloads 1159
9536 XML Data Management in Compressed Relational Database
Authors: Hongzhi Wang, Jianzhong Li, Hong Gao
Abstract:
XML is an important standard for data exchange and representation. Because relational databases are mature systems, using them to support XML data may bring some advantages. But storing XML in a relational database introduces obvious redundancy that wastes disk space, bandwidth and disk I/O when querying XML data. For efficient storage and querying of XML, it is necessary to use compressed XML data in the relational database. In this paper, a compressed relational database technology supporting XML data is presented. The original relational storage structure is suited to XPath query processing, and the compression method keeps this feature. Besides traditional relational database techniques, additional query processing technologies for compressed relations and for the special structure of XML are presented. Technologies for XQuery processing in a compressed relational database are also presented.
Keywords: XML, compression, query processing
Downloads 1804
9535 Soil Resistivity Data Computations; Single and Two - Layer Soil Resistivity Structure and Its Implication on Earthing Design
Authors: M. Nassereddine, J. Rizk, G. Nasserddine
Abstract:
Performing High Voltage (HV) tasks with a multi-craft work force creates a special set of safety circumstances. This paper aims to present vital information on when it is acceptable to use a single or a two-layer soil structure. It also discusses the implication of the high voltage infrastructure for the earth grid and the safety of this implication under a single or a two-layer soil structure. A multiple case study is investigated to show the importance of using the right soil resistivity structure during the earthing system design.
Keywords: Earth Grid, EPR, High Voltage, Soil Resistivity Structure, Step Voltage, Touch Voltage.
Downloads 8822
9534 A New Model for Discovering XML Association Rules from XML Documents
Authors: R. AliMohammadzadeh, M. Rahgozar, A. Zarnani
Abstract:
The inherent flexibilities of XML in both structure and semantics make mining from XML data a complex task with more challenges compared to traditional association rule mining in relational databases. In this paper, we propose a new model for the effective extraction of generalized association rules from an XML document collection. We directly use frequent subtree mining techniques in the discovery process and do not ignore the tree structure of data in the final rules. The frequent subtrees based on the user-provided support are split into complementary subtrees to form the rules. We explain our model in multiple steps, from data preparation to rule generation.
Keywords: XML, Data Mining, Association Rule Mining.
Downloads 1630
9533 Social Structure, Involuntary Relations, and Urban Poverty
Authors: Mahmood Niroobakhsh
Abstract:
This article deals with special structuralism approaches to explain a certain kind of social problem. Widespread presence of poverty is a reminder of deep-rooted unresolved problems of social relations. The expected role from an individual for the social system recognizes poverty derived from an interrelated social structure. By the time, enabled to act on his role in the course of social interaction, reintegration of the poor in society may take place. Poverty and housing type are reflections of the underlying social structure, primarily structure’s elements, systemic interrelations, and the overall strength or weakness of that structure. Poverty varies based on social structure in that the stronger structures are less likely to produce poverty.
Keywords: Absolute poverty, relative poverty, social structure, urban poverty.
Downloads 1684
9532 Generating Concept Trees from Dynamic Self-organizing Map
Authors: Norashikin Ahmad, Damminda Alahakoon
Abstract:
The self-organizing map (SOM) provides both clustering and visualization capabilities in mining data. Dynamic self-organizing maps such as the Growing Self-organizing Map (GSOM) have been developed to overcome the problem of fixed structure in SOM and to enable better representation of the discovered patterns. However, in mining large datasets or historical data, the hierarchical structure of the data is also useful for viewing the cluster formation at different levels of abstraction. In this paper, we present a technique to generate concept trees from the GSOM. The formation of trees from different spread factor values of GSOM is also investigated and the quality of the trees analyzed. The results show that concept trees can be generated from GSOM, thus eliminating the need to re-cluster the data from scratch to obtain a hierarchical view of the data under study.
Keywords: dynamic self-organizing map, concept formation, clustering.
Downloads 1457
9531 Data-organization Before Learning Multi-Entity Bayesian Networks Structure
Authors: H. Bouhamed, A. Rebai, T. Lecroq, M. Jaoua
Abstract:
The objective of our work is to develop a new approach for discovering knowledge from a large mass of data. The result of applying this approach will be an expert system that serves as a diagnostic tool for a phenomenon related to a huge information system. We first recall the general problem of learning Bayesian network structure from data and suggest a solution for reducing the complexity by using data organization and optimization methods. Afterward, we propose a new heuristic for learning the structure of Multi-Entity Bayesian Networks. We have applied our approach to biological facts concerning hereditary complex illnesses, where the biological literature identifies the variables responsible for those diseases. Finally, we conclude on the limits reached by this work.
Keywords: Data-organization, data-optimization, automatic knowledge discovery, Multi-Entities Bayesian networks, score merging.
Downloads 1610
9530 Adaptive Hierarchical Key Structure Generation for Key Management in Wireless Sensor Networks using A*
Authors: Jin Myoung Kim, Tae Ho Cho
Abstract:
Wireless sensor networks have a wide spectrum of civil and military applications that call for secure communication, such as terrorist tracking and target surveillance in hostile environments. For secure communication in these application areas, we propose a method for generating a hierarchical key structure for efficient group key management. In this paper, we apply the A* algorithm to generate a hierarchical key structure by considering the history data of the ratio of addition and eviction of sensor nodes in a location where sensor nodes are deployed. The key tree structure thus generated provides an efficient way of managing the group key in terms of energy consumption when addition and eviction events occur. The A* algorithm tries to minimize the number of messages needed for group key management based on the history data. Experimentation with the tree shows the efficiency of the proposed method.
Keywords: Heuristic search, key management, security, sensor network.
Downloads 1683
9529 Balancing Strategies for Parallel Content-based Data Retrieval Algorithms in a k-tree Structured Database
Authors: Radu Dobrescu, Matei Dobrescu, Daniela Hossu
Abstract:
The paper proposes a unified model for multimedia data retrieval which includes data representatives, content representatives, index structure, and search algorithms. The multimedia data are defined as k-dimensional signals indexed in a multidimensional k-tree structure. The benefits of using the k-tree unified model were demonstrated by running the data retrieval application on a test bed cluster of six networked nodes. The tests were performed with two retrieval algorithms: one that allows parallel searching using a single feature, and a second that performs a weighted cascade search for querying on multiple features. The experiments show a significant reduction of retrieval time while maintaining the quality of results.
Keywords: balancing strategies, multimedia databases, parallel processing, retrieval algorithms
Downloads 1422
9528 The Resource Description Framework (RDF) as a Modern Structure for Medical Data
Authors: Gabriela Lindemann, Danilo Schmidt, Thomas Schrader, Dietmar Keune
Abstract:
The amount and heterogeneity of data in biomedical research, notably in interdisciplinary fields, require new methods for the collection, presentation and analysis of information. Important data from laboratory experiments as well as patient trials are available but come from distributed resources. The Charité - University Hospital Berlin has established, together with the German Research Foundation (DFG), a new information service centre for kidney diseases and transplantation (Open European Nephrology Science Centre - OpEN.SC). Besides the collaborative aspect of creating new research groups, every partner or institution of this science information centre that makes its own data available is allowed to search the whole data pool of the various centres involved. A core task is the implementation of a non-restricting open data structure for the various data sources. We decided to use a modern RDF model and, in a first phase, transformed original data coming from the web-based Electronic Patient Record database TBase©.
Keywords: Medical databases, Resource Description Framework (RDF), metadata repository.
Downloads 2030
9527 Comparing Data Analysis, Communication and Information Technologies Expertise Levels in Undergraduate Psychology Students
Authors: Ana Cázares
Abstract:
The aims of this study are, first, to compare the expertise level in data analysis, communication and information technologies among undergraduate psychology students and, second, to verify the factor structure of E-ETICA (Escala de Experticia en Tecnologias de la Informacion, la Comunicacion y el Análisis, or Data Analysis, Communication and Information Expertise Scale), which had shown excellent internal consistency (α = 0.92) as well as a simple factor structure. Three factors, Complex, Basic Information and Communications Technologies and E-Searching and Download Abilities, explain 63% of the variance. In the present study, 260 students (119 juniors and 141 seniors) were asked to respond to ETICA (16 five-point Likert items, from 1: null domain to 5: total domain). The results show that junior and senior students report very similar expertise levels; however, E-ETICA presents a different factor structure for juniors, in which four factors also explain 63% of the variance: Information E-Searching, Download and Process; Data analysis; Organization; and Communication technologies.
Keywords: Data analysis, Information, Communications Technologies, Expertise Levels.
Downloads 1285
9526 Content-based Retrieval of Medical Images
Authors: Lilac A. E. Al-Safadi
Abstract:
With the advance of multimedia and diagnostic imaging technologies, the number of radiographic images is increasing constantly. The medical field demands sophisticated systems for search and retrieval of the multimedia documents produced. This paper presents ongoing research that focuses on the semantic content of radiographic image documents to facilitate semantic-based radiographic image indexing and a retrieval system. The proposed model divides a radiographic image document based on its semantic content and converts it into a logical structure or a semantic structure. The logical structure represents the overall organization of information. The semantic structure, which is bound to the logical structure, is composed of semantic objects with interrelationships in the various spaces in the radiographic image.
Keywords: Semantic Indexing, Content-Based Retrieval, Radiographic Images, Data Model
Downloads 1492
9525 Family Structure between Muslim and Santal Communities in Rural Bangladesh
Authors: Md. Emaj Uddin
Abstract:
Family structure, which is culturally constructed in every society, is the basic unit of social structure. The purpose of the study was to compare family structure, including marriage, residence, family size, type, role sharing, authority, and communication patterns, between Muslim and Santal communities in rural Bangladesh. We assumed that family structure, with these elements, differed significantly between the two communities. To test this, 288 active couples (145 Muslim and 143 Santal), selected by cluster random sampling, were intensively interviewed using a semi-structured questionnaire. The results of the Pearson chi-square test reveal significant differences in the family structure followed by the two communities in the study area. Further cross-cultural study should be done on why family structure varies between the communities in Bangladesh.
Keywords: Bangladesh, Cross-Cultural Comparison, Family Structure, Muslim, Santal.
Downloads 2798
9524 Hybrid Structure Learning Approach for Assessing the Phosphate Laundries Impact
Authors: Emna Benmohamed, Hela Ltifi, Mounir Ben Ayed
Abstract:
The Bayesian Network (BN) is one of the most efficient classification methods. It is widely used in several fields (e.g., medical diagnostics, risk analysis, bioinformatics research). The BN is defined as a probabilistic graphical model that represents a formalism for reasoning under uncertainty. This classification method has a high performance rate in the extraction of new knowledge from data. The construction of this model consists of two phases: structure learning and parameter learning. For solving this problem, the K2 algorithm is one of the representative data-driven algorithms, which is based on a score-and-search approach. In addition, integrating the expert's knowledge into the structure learning process makes it possible to obtain the highest accuracy. In this paper, we propose a hybrid approach combining an improvement of the K2 algorithm, called K2 algorithm for Parents and Children search (K2PC), with the expert-driven method for learning the structure of the BN. The evaluation of the experimental results, using well-known benchmarks, shows that our K2PC algorithm has better performance in terms of correct structure detection. The real application of our model shows its efficiency in the analysis of the phosphate laundry effluents' impact on the watershed in the Gafsa area (southwestern Tunisia).
Keywords: Classification, Bayesian network, structure learning, K2 algorithm, expert knowledge, surface water analysis.
Downloads 510
9523 Weakly Generalized Closed Map
Authors: R. Parimelazhagan, N. Nagaveni
Abstract:
In this paper we introduce a new class of mg-continuous mappings and study some of its basic properties. We obtain some characterizations of such functions. Moreover, we define sub minimal structure and further study certain properties of mg-closed sets.
Keywords: M-structure, mg-continuous mapping, minimal structure, mg T2 space, sub minimal structure, T12 space, mg-compact set.
Downloads 1535
9522 An Evaluation Method of Accelerated Storage Life Test for Typical Mechanical and Electronic Products
Authors: Jinyong Yao, Hongzhi Li, Chao Du, Jiao Li
Abstract:
Reliability of long-term storage products is related to the availability of the whole system, and the evaluation of storage life is of great necessity. These products are usually highly reliable and little failure information can be collected. In this paper, an analytical method based on data from accelerated storage life test is proposed to evaluate the reliability index of the long-term storage products. Firstly, singularities are eliminated by data normalization and residual analysis. Secondly, with the preprocessed data, the degradation path model is built to obtain the pseudo life values. Then by life distribution hypothesis, we can get the estimator of parameters in high stress levels and verify failure mechanism consistency. Finally, the life distribution under the normal stress level is extrapolated via the acceleration model and evaluation of the actual average life is available. An application example with the camera stabilization device is provided to illustrate the methodology we proposed.
Keywords: Accelerated storage life test, failure mechanism consistency, life distribution, reliability.
Downloads 2284
9521 Optimized Preprocessing for Accurate and Efficient Bioassay Prediction with Machine Learning Algorithms
Authors: Jeff Clarine, Chang-Shyh Peng, Daisy Sang
Abstract:
Bioassay is the measurement of the potency of a chemical substance by its effect on a living animal or plant tissue. Bioassay data and chemical structures from pharmacokinetic and drug metabolism screening are mined from and housed in multiple databases. Bioassay prediction is calculated accordingly to determine further advancement. This paper proposes a four-step preprocessing of datasets for improving the bioassay predictions. The first step is instance selection, in which the dataset is divided into training, testing, and validation sets. The second step is discretization, which partitions the data in consideration of accuracy vs. precision. The third step is normalization, where data are normalized to the range between 0 and 1 for subsequent machine learning processing. The fourth step is feature selection, where key chemical properties and attributes are generated. The streamlined results are then analyzed for the prediction of effectiveness by various machine learning algorithms including Pipeline Pilot, R, Weka, and Excel. Experiments and evaluations reveal the effectiveness of various combinations of preprocessing steps and machine learning algorithms in producing more consistent and accurate predictions.
Keywords: Bioassay, machine learning, preprocessing, virtual screen.
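The abstract states only that data are normalized between 0 and 1 and does not name the formula. A common choice for that range is min-max scaling, sketched below under that assumption; the function name `min_max_normalize` and the sample values are illustrative.

```python
def min_max_normalize(values):
    """Rescale a list of numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map every entry to 0.0 (a convention chosen here).
        return [0.0 for _ in values]
    span = hi - lo
    return [(v - lo) / span for v in values]


print(min_max_normalize([2.0, 4.0, 10.0]))  # [0.0, 0.25, 1.0]
```

In a train/test/validation split like the one described in the first step, the minimum and maximum would normally be taken from the training set only and reused for the other sets, so that no information leaks from held-out data.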
Downloads 980
9520 Protein Secondary Structure Prediction Using Parallelized Rule Induction from Coverings
Authors: Leong Lee, Cyriac Kandoth, Jennifer L. Leopold, Ronald L. Frank
Abstract:
Protein 3D structure prediction has always been an important research area in bioinformatics. In particular, the prediction of secondary structure has been a well-studied research topic. Despite the recent breakthrough of combining multiple sequence alignment information and artificial intelligence algorithms to predict protein secondary structure, the Q3 accuracy of various computational prediction algorithms has rarely exceeded 75%. In a previous paper [1], this research team presented a rule-based method called RT-RICO (Relaxed Threshold Rule Induction from Coverings) to predict protein secondary structure. The average Q3 accuracy on the sample datasets using RT-RICO was 80.3%, an improvement over comparable computational methods. Although this demonstrated that RT-RICO might be a promising approach for predicting secondary structure, the algorithm's computational complexity and program running time limited its use. Herein a parallelized implementation of a slightly modified RT-RICO approach is presented. This new version of the algorithm facilitated the testing of a much larger dataset of 396 protein domains [2]. Parallelized RT-RICO achieved a Q3 score of 74.6%, which is higher than the consensus prediction accuracy of 72.9% that was achieved for the same test dataset by a combination of four secondary structure prediction methods [2].
Keywords: data mining, protein secondary structure prediction, parallelization.
Downloads 1595
9519 Very-high-Precision Normalized Eigenfunctions for a Class of Schrödinger Type Equations
Authors: Amna Noreen, Kare Olaussen
Abstract:
We demonstrate that it is possible to compute wave function normalization constants for a class of Schrödinger type equations by an algorithm which scales linearly (in the number of eigenfunction evaluations) with the desired precision P in decimals.
Keywords: Eigenvalue problems, bound states, trapezoidal rule, Poisson resummation.
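The keywords mention the trapezoidal rule, which converges very quickly for smooth, rapidly decaying integrands such as bound-state wave functions. The sketch below is not the authors' algorithm; it only illustrates computing a normalization constant with the trapezoidal rule for an assumed test case, the unnormalized Gaussian ψ(x) = exp(−x²/2), whose exact constant is π^(−1/4). The function names, integration half-width, and step count are hypothetical choices.

```python
import math


def trapezoid_norm_constant(psi, half_width, n_steps):
    """Return N such that N * psi has unit L2 norm on [-half_width, half_width].

    The integral of psi(x)**2 is approximated with the composite
    trapezoidal rule on n_steps equal subintervals.
    """
    h = 2.0 * half_width / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, n_steps) else 1.0  # endpoints count half
        total += weight * psi(x) ** 2
    return 1.0 / math.sqrt(h * total)


def psi_unnormalized(x):
    """Assumed test wave function: a Gaussian with a known exact norm."""
    return math.exp(-0.5 * x * x)


norm_constant = trapezoid_norm_constant(psi_unnormalized, 10.0, 200)
exact_constant = math.pi ** -0.25
```

Even with a step of 0.1, the computed constant agrees with the exact value to near machine precision in this test case, which is the kind of behavior that makes the trapezoidal rule attractive for high-precision normalization.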
Downloads 2853
9518 Study and Evaluation of Added Stresses under Foundation due to Adjacent Structure
Authors: Alireza M. Goltabar, Issa Shooshpasha, Reza Shamstabar Kami, Mostafa Habibi
Abstract:
Added stresses due to an adjacent structure should be considered in foundation design and in controlling stress in the soil under the structure. This case receives less attention than others in design and calculation, even though the stresses that occur in practice are greater than the analytical stresses. Structural loads are transmitted to the ground by the foundation, whose role is to spread the load over the continuous, semi-infinite soil mass. This reduces the stresses to within the allowable strength of the soil. Researchers such as Boussinesq and Westergaard have studied this issue theoretically under certain assumptions. The aim of this paper is to study and evaluate the added stresses under a structure due to an adjacent structure. For this purpose, using assumptions, theoretical relations, and numerical methods, the effects of adjacent structures of 4 to 10 storeys on a main structure of 4 storeys are studied, and the effects of the parameters and the sensitivity to them are evaluated.
Keywords: stress, soil, adjacent structure, foundation, loading.
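The abstract cites Boussinesq's theory. For a vertical point load Q on the surface of a homogeneous elastic half-space, the classical Boussinesq result for the vertical stress increase at depth z and horizontal offset r is σ_z = 3Qz³ / (2πR⁵), with R = √(r² + z²). The sketch below applies that formula and sums contributions by superposition. It is an illustration only, not the paper's model: treating a neighbouring building as a few point loads, and the load values and offsets used, are assumptions.

```python
import math


def boussinesq_vertical_stress(load, r, z):
    """Vertical stress increase from a surface point load (Boussinesq).

    load: point load (e.g. kN), r: horizontal offset (m), z: depth (m).
    With kN and m the result is in kPa.
    """
    if z <= 0:
        raise ValueError("depth z must be positive")
    distance = math.hypot(r, z)  # R = sqrt(r^2 + z^2)
    return 3.0 * load * z ** 3 / (2.0 * math.pi * distance ** 5)


def added_stress(loads_with_offsets, z):
    """Sum the stress increases from several point loads by superposition."""
    return sum(boussinesq_vertical_stress(q, r, z) for q, r in loads_with_offsets)


# Hypothetical adjacent structure idealized as two point loads (kN) at
# horizontal offsets of 5 m and 8 m from the point of interest.
adjacent_loads = [(1000.0, 5.0), (1000.0, 8.0)]
extra_stress_at_2m = added_stress(adjacent_loads, 2.0)
```

Superposition is valid here because the underlying solution is linear elastic; real soils may depart from that assumption.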
Downloads 1452
9517 The Effect on Lead Times When Normalizing a Supply Chain Process
Authors: Bassam Istanbouli
Abstract:
Organizations operate in a very competitive and dynamic environment which is constantly changing. In order to achieve a high level of service, the products and processes of these organizations need to be flexible and evolvable. If supply chains are not modular and well designed, changes can bring combinatorial effects to most areas of a company, from its management, finance, documentation, and logistics to its information structure. Applying the normalized systems concept to segments of the supply chain may help in reducing those ripple effects, but it may also increase lead times. Lead times are important and can become a decisive element in gaining customers. Industries are always under pressure to provide good quality products, at competitive prices, when and how the customer wants them. Most of the time, customers want their orders now, if not yesterday. The above concept will be proven by examining lead times in a manufacturing example before and after applying the normalized systems concept to that segment of the chain. We will then show that although we can minimize the combinatorial effects when changes occur, the lead times will be increased.
Keywords: Supply chain, lead time, normalization, modular.
Downloads 602