Search results for: Extraction and data integration

8451 A Survey of Semantic Integration Approaches in Bioinformatics

Authors: Chaimaa Messaoudi, Rachida Fissoune, Hassan Badir

Abstract:

Technological advances in computer science and data analysis continuously produce huge volumes of biological data, which are available on the web. Such advances require powerful data integration techniques to extract the knowledge and information pertinent to a specific question. Biomedical exploration of these big data often requires complex queries across multiple autonomous, heterogeneous and distributed data sources. Semantic integration is an active area of research in several disciplines, such as databases, information integration, and ontology. We survey some approaches and techniques for integrating biological data, focusing on those developed in the ontology community.

Keywords: Semantic data integration, biological ontology, linked data, semantic web, OWL, RDF.

Downloads: 1777

8450 Use of Bayesian Network in Information Extraction from Unstructured Data Sources

Authors: Quratulain N. Rajput, Sajjad Haider

Abstract:

This paper applies Bayesian Networks to support information extraction from unstructured, ungrammatical, and incoherent data sources for semantic annotation. A tool has been developed that combines ontologies, machine learning, information extraction, and probabilistic reasoning techniques to support the extraction process. Data acquisition is performed with the aid of knowledge specified in the form of an ontology. Because the amount of information available varies across data sources, the extracted data often contains missing values for certain variables of interest. It is desirable in such situations to predict the missing values. The methodology presented in this paper first learns a Bayesian network from the training data and then uses it to predict missing data and to resolve conflicts. Experiments have been conducted to analyze the performance of the presented methodology. The results look promising, as the methodology achieves a high degree of precision and recall for information extraction and reasonably good accuracy for predicting missing values.

Keywords: Information Extraction, Bayesian Network, ontology, Machine Learning

Downloads: 2174

8449 Thermodynamic Study of Seed Oil Extraction by Organic Solvents

Authors: Zhila Safari, Ali Ashrafizadeh, Najaf Hedayat

Abstract:

The thermodynamic characterization of sesame oil extraction by acetone, hexane and benzene has been evaluated. The 120-hour experimental data were described by a simple mathematical model. According to the simulation results and the essential criteria, acetone is superior to the other solvents, but under certain conditions where oil extraction takes place hexane is the superior catalyst.

Keywords: Liquid-solid extraction, seed oil, thermodynamic study.

Downloads: 2019

8448 Biological Data Integration using SOA

Authors: Noura Meshaan Al-Otaibi, Amin Yousef Noaman

Abstract:

Nowadays scientific data is inevitably digital and stored in a wide variety of formats in heterogeneous systems. Scientists need access to an integrated view of remote or local heterogeneous data sources, with advanced tools for data access, analysis, and visualization. This research suggests the use of Service Oriented Architecture (SOA) to integrate biological data from different data sources. This work shows that SOA will solve the problems facing the integration process and that biologists can access the biological data more easily. There are several methods to implement SOA, but web services are the most popular. The Microsoft .NET Framework was used to implement the proposed architecture.

Keywords: Bioinformatics, Biological data, Data Integration, SOA and Web Services.

Downloads: 2416

8447 A Materialized Approach to the Integration of XML Documents: the OSIX System

Authors: H. Ahmad, S. Kermanshahani, A. Simonet, M. Simonet

Abstract:

The data exchanged on the Web differ in nature from those handled by classical database management systems. These data are called semi-structured because they do not have a regular and static structure like the data found in a relational database; their schema is dynamic and may contain missing data or types. This has created a need for further techniques and algorithms to exploit and integrate such data and to extract relevant information for the user. In this paper we present the system OSIX (Osiris based System for Integration of XML Sources). This system has a Data Warehouse model designed for the integration of semi-structured data and, more precisely, for the integration of XML documents. The architecture of OSIX relies on the Osiris system, a DL-based model designed for the representation and management of databases and knowledge bases. Osiris is a view-based data model whose indexing system supports semantic query optimization. We show that query processing on an XML source is optimized by the indexing approach proposed by Osiris.

Keywords: Data integration, semi-structured data, views, XML.

Downloads: 1541

8446 CFD Simulation of Dense Gas Extraction through Polymeric Membranes

Authors: Azam Marjani, Saeed Shirazian

Abstract:

This study presents a general methodology to predict the performance of a continuous near-critical fluid extraction process that removes compounds from aqueous solutions using hollow fiber membrane contactors. A comprehensive 2D mathematical model was developed to study the Porocritical extraction process. The system studied in this work is a membrane-based extractor of ethanol and acetone from aqueous solutions using near-critical CO2. Extraction percentages predicted by the simulations have been compared to the experimental values reported by Bothun et al. [5]. The simulated extraction percentages of ethanol and acetone differ from the experimental data by an average of 9.3% and 6.5%, respectively. The more accurate predictions for acetone could be explained by a better estimation of the transport properties in the aqueous phase, which controls the extraction of this solute.

Keywords: Solvent extraction, Membrane, Mass transfer, Dense gas, Modeling

Downloads: 1535

8445 A New Method for Rapid DNA Extraction from Artemia (Branchiopoda, Crustacea)

Authors: R. Manaffar, R. Maleki, S. Zare, N. Agh, S. Soltanian, B. Sehatnia, P. Sorgeloos, P. Bossier, G. Van Stappen

Abstract:

Artemia is one of the most conspicuous invertebrates associated with aquaculture. It can be considered a model organism, offering numerous advantages for comprehensive and multidisciplinary studies using morphologic or molecular methods. Since DNA extraction is an important step of any molecular experiment, a new and rapid method of DNA extraction from adult Artemia is described in this study. In addition, the efficiency of this technique was compared with two widely used alternative techniques, namely the Chelex® 100 resin and SDS-chloroform methods. Data analysis revealed that the new method is the easiest and most cost-effective of the methods compared and allows a quick and efficient extraction of DNA from the adult animal.

Keywords: APD, Artemia, DNA extraction, Molecular experiments

Downloads: 3139

8444 Integration of Image and Patient Data, Software and International Coding Systems for Use in a Mammography Research Project

Authors: V. Balanica, W. I. D. Rae, M. Caramihai, S. Acho, C. P. Herbst

Abstract:

Mammographic images and data analysis to facilitate modelling or computer aided diagnostic (CAD) software development should best be done using a common database that can handle various mammographic image file formats and relate these to other patient information. This would optimize the use of the data, as both primary reporting and enhanced information extraction of research data could be performed from the single dataset. One desired improvement is the integration of DICOM file header information into the database, as an efficient and reliable source of supplementary patient information intrinsically available in the images. The purpose of this paper was to design a suitable database to link and integrate different types of image files and gather common information that can be further used for research purposes. An interface was developed for accessing, adding, updating, modifying and extracting data from the common database, enhancing the possible future application of the data in CAD processing. Future developments envisaged include an advanced search function to select image files based on descriptor combinations. Results can be further used for specific CAD processing and other research. A user-friendly configuration utility for importing the required fields from the DICOM files still needs to be designed.

Keywords: Database Integration, Mammogram Classification, Tumour Classification, Computer Aided Diagnosis.

Downloads: 1902

8443 Wavelet and K-L Seperability Based Feature Extraction Method for Functional Data Classification

Authors: Jun Wan, Zehua Chen, Yingwu Chen, Zhidong Bai

Abstract:

This paper proposes a novel feature extraction method, based on Discrete Wavelet Transform (DWT) and K-L Seperability (KLS), for the classification of Functional Data (FD). This method combines the decorrelation and reduction property of DWT and the additive independence property of KLS, which helps to extract classification features of FD. It is an advanced version of the popular wavelet-based shrinkage method for functional data reduction and classification. A theoretical analysis is given in the paper to prove the consistent convergence property, and a simulation study compares the proposed method with earlier shrinkage methods. The experimental results show that this method improves classification efficiency, precision and robustness.

Keywords: classification, functional data, feature extraction, K-L seperability, wavelet.

Downloads: 1404

8442 Mechanisms of Ginger Bioactive Compounds Extract Using Soxhlet and Accelerated Water Extraction

Authors: M. N. Azian, A. N. Ilia Anisa, Y. Iwai

Abstract:

Understanding the mechanism of extraction of bioactive compounds from a plant matrix is essential for optimizing the extraction process. As a benchmark technique, Soxhlet extraction has been used to discuss the mechanism and has been compared with accelerated water extraction. The trends of both techniques show that the process involves extraction and degradation. The highest yields of 6-, 8-, 10-gingerols and 6-shogaol in Soxhlet extraction were 13.948, 7.12, 10.312 and 2.306 mg/g, respectively. The optimum yields of 6-, 8-, 10-gingerols and 6-shogaol from accelerated water extraction at 140 °C were 68.97±3.95 mg/g at 3 min, 18.98±3.04 mg/g at 5 min, 5.167±2.35 mg/g at 3 min and 14.57±6.27 mg/g at 3 min, respectively. The effect of temperature at 3 min shows that the concentration of 6-shogaol increased rapidly as the recovery of 6-gingerol decreased.

Keywords: Mechanism, bioactive compounds, soxhlet extraction, accelerated water extraction.

Downloads: 5296

8441 Design of a Service-Enabled Dependable Integration Environment

Authors: Fuyang Peng, Donghong Li

Abstract:

The aim of information systems integration is to integrate all the data sources, applications and business flows into the new environment so that unwanted redundancies are reduced and bottlenecks and mismatches are eliminated. Two issues have to be dealt with to meet such requirements: the software architecture that supports resource integration, and the adaptor development tool that helps with the integration and migration of legacy applications. In this paper, a service-enabled dependable integration environment (SDIE) is presented, which has two key components: a dependable service integration platform and a legacy application integration tool. For the dependable service integration platform, the service integration bus, the service management framework, the dependable engine for service composition, and the service registry and discovery components are described. For the legacy application integration tool, its basic organization, functionalities and the dependability measures taken are presented. Due to its service-oriented integration model, light-weight extensible container, service component combination-oriented p-lattice structure, and other features, SDIE has advantages in openness, flexibility, performance-price ratio and feature support over commercial products, and is better than most open source integration software in functionality, performance and dependability support.

Keywords: Application integration, dependability, legacy, SOA.

Downloads: 1133

8440 Analyzing the Technology Affecting on the Social Integration of Students at University

Authors: Sujit K. Basak, Simon Collin

Abstract:

The aim of this paper is to examine the effect of technology access and use on the social integration of local students at university. This aim is achieved by designing a structural equation model (SEM) in terms of integration with peers, integration with faculty, and faculty support, and by examining the socio-demographic impact on technology access and use. The collected data were analyzed using the WarpPLS 5.0 software. This study was survey based and was conducted at a public university in Canada. The results indicated that technology has a strong impact on integration with faculty and on faculty support, but does not have an impact on integration with peers. Socio-demographic factors also have an impact on technology access and use.

Keywords: Faculty, integration, peer, technology access and use.

Downloads: 1489

8439 A New Type of Integration Error and its Influence on Integration Testing Techniques

Authors: P. Prema, B. Ramadoss

Abstract:

Testing is an activity that is required both in the development and maintenance of the software development life cycle in which Integration Testing is an important activity. Integration testing is based on the specification and functionality of the software and thus could be called black-box testing technique. The purpose of integration testing is testing integration between software components. In function or system testing, the concern is with overall behavior and whether the software meets its functional specifications or performance characteristics or how well the software and hardware work together. This explains the importance and necessity of IT for which the emphasis is on interactions between modules and their interfaces. Software errors should be discovered early during IT to reduce the costs of correction. This paper introduces a new type of integration error, presenting an overview of Integration Testing techniques with comparison of each technique and also identifying which technique detects what type of error.

Keywords: Integration Error, Integration Error Types, Integration Testing Techniques, Software Testing

Downloads: 2140

8438 Motion Recognition Based On Fuzzy WP Feature Extraction Approach

Authors: Keun-Chang Kwak

Abstract:

This paper is concerned with motion recognition based on a fuzzy wavelet packet (WP) feature extraction approach applied to Vicon physical data sets. For this purpose, we use an efficient fuzzy mutual-information-based WP transform for feature extraction. This method estimates the required mutual information using a novel approach based on fuzzy membership functions. The physical action data set includes 10 normal and 10 aggressive physical actions that measure human activity. The data have been collected from 10 subjects using the Vicon 3D tracker. The experiments use running, seating, and walking as the physical activity motions, chosen from among the various activities. The experimental results revealed that the presented feature extraction approach showed good recognition performance.

Keywords: Motion recognition, fuzzy wavelet packet, Vicon physical data.

Downloads: 1595

8437 High Resolution Images: Segmenting, Extracting Information and GIS Integration

Authors: Erick López-Ornelas

Abstract:

As the world changes more rapidly, the demand for up-to-date information for resource management, environment monitoring, and planning is increasing exponentially. Integration of Remote Sensing with GIS technology will significantly improve the ability to address these concerns. This paper presents an alternative way of updating GIS applications using image processing and high resolution images. We show a method of high-resolution image segmentation using graphs and morphological operations, where a preprocessing step (watershed operation) is required. A morphological process is then applied using the opening and closing operations. After this segmentation we can extract significant cartographic elements such as urban areas, streets or green areas. The result of this segmentation and extraction is then used to update GIS applications. Some examples are shown using aerial photography.

Keywords: GIS, Remote Sensing, image segmentation, feature extraction.

Downloads: 1597

8436 Extraction of Data from Web Pages: A Vision Based Approach

Authors: P. S. Hiremath, Siddu P. Algur

Abstract:

With the explosive growth of information sources available on the World Wide Web, it has become increasingly difficult to identify the relevant pieces of information, since web pages are often cluttered with irrelevant content such as advertisements, navigation panels, and copyright notices surrounding the main content of the page. Hence, tools for mining data regions, data records and data items need to be developed in order to provide value-added services. Currently available automatic techniques to mine data regions from web pages are still unsatisfactory because of their poor performance and tag dependence. In this paper a novel method to extract data items from web pages automatically is proposed. It comprises two steps: (1) identification and extraction of the data regions based on visual clues; (2) identification of data records and extraction of data items from a data region. For step 1, a novel and more effective method is proposed, which uses visual clues to find the data regions formed by all types of tags. For step 2, a more effective method, namely Extraction of Data Items from web Pages (EDIP), is adopted to mine data items. The EDIP technique is a list-based approach in which the list is a linear data structure. The proposed technique is able to mine non-contiguous data records and can correctly identify data regions irrespective of the type of tag in which they are bound. Our experimental results show that the proposed technique performs better than the existing techniques.

Keywords: Web data records, web data regions, web mining.

Downloads: 1858

8435 Extraction Condition of Echinocactus grusonii

Authors: R. Oonsivilai, N. Chaijareonudomroung, Y. Huantanom, A. Oonsivilai

Abstract:

The optimal extraction condition of dried Echinocactus grusonii powder was studied. The three independent variables were raw material drying temperature, extraction temperature, and extraction time. The dependent variables were the yield percentage of crude extract and the total phenolic content of the crude extract, quantified as gallic acid equivalent. The experimental design was based on a central composite design. The highest yield percentage of crude extract was obtained at a raw material drying temperature of 60 °C, an extraction temperature of 15 °C, and an extraction time of 25 min. The crude extract with the highest phenolic content was obtained at a raw material drying temperature of 60 °C, an extraction temperature of 35 °C, and an extraction time of 25 min.

Keywords: Drying temperature, Extraction temperature, Optimal condition, Total phenolic

Downloads: 2132

8434 Optimization and Kinetic Study of Gaharu Oil Extraction

Authors: Muhammad Hazwan H., Azlina M.F., Hasfalina C.M., Zurina Z.A., Hishamuddin J

Abstract:

Gaharu, which is produced by Aquilaria spp., is classified as one of the most valuable forest products traded internationally, as it is a very resinous, fragrant and highly valuable heartwood. Gaharu has been widely used in aromatherapy, medicine, perfume and religious practices. This work aimed to determine the factors affecting solid-liquid extraction of gaharu oil using hexane as the solvent under experimental conditions. The kinetics of extraction was assumed to follow a second-order mechanism, and this assumption was verified. The effects of three main factors (temperature, reaction time and solvent-to-solid ratio) were investigated to achieve maximum oil yield. The optimum conditions were found to be a temperature of 65 °C, a reaction time of 9 hours and a solvent-to-solid ratio of 12:1, giving a 14.5% oil yield. The experimental kinetics data agree with and are well fitted by the second-order extraction model. The initial extraction rate (h) was 0.0115 g mL⁻¹ min⁻¹; the extraction capacity (Cs) was 1.282 g mL⁻¹; the second-order extraction constant (k) was 0.007 mL g⁻¹ min⁻¹; and the coefficient of determination (R²) was 0.945.
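The abstract does not state the model equations. As a minimal sketch, assuming the commonly used second-order extraction form dCt/dt = k (Cs - Ct)^2, the integrated concentration curve is Ct = Cs^2 k t / (1 + Cs k t) and the initial rate is h = k Cs^2. The function names below are illustrative only; the parameter values are the ones reported in the abstract, and under this assumed form they are mutually consistent (k Cs^2 is approximately 0.0115).

```python
def initial_rate(k, cs):
    """Initial extraction rate h = k * Cs^2 under the assumed second-order model."""
    return k * cs ** 2


def concentration(t, k, cs):
    """Oil concentration in solution at time t (min) under the assumed model.

    Integrated form of dCt/dt = k * (Cs - Ct)^2 with Ct(0) = 0.
    """
    return (cs ** 2 * k * t) / (1.0 + cs * k * t)


# Values reported in the abstract.
k = 0.007   # second-order extraction constant, mL g^-1 min^-1
cs = 1.282  # extraction capacity, g mL^-1

h = initial_rate(k, cs)
print(f"h = {h:.4f} g mL^-1 min^-1")

# Concentration after 9 hours (540 min), the reported optimum reaction time.
print(f"Ct(540 min) = {concentration(540, k, cs):.3f} g mL^-1")
```

The curve starts at zero and approaches Cs at long times, which is why Cs is read as the extraction capacity.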

Keywords: Gaharu, solid liquid extraction, optimization, kinetics.

Downloads: 3220

8433 Modeling and Prediction of Zinc Extraction Efficiency from Concentrate by Operating Condition and Using Artificial Neural Networks

Authors: S. Mousavian, D. Ashouri, F. Mousavian, V. Nikkhah Rashidabad, N. Ghazinia

Abstract:

The pH, temperature and extraction time of each stage, the agitation speed, and the delay time between stages all affect the efficiency of zinc extraction from concentrate. In this research, the efficiency of zinc extraction was predicted as a function of these variables by artificial neural networks (ANN). ANNs with different layer configurations were employed, and the results show that the network with 8 neurons in the hidden layer agrees well with the experimental data.

Keywords: Zinc extraction, Efficiency, Neural networks, Operating condition.

Downloads: 1542

8432 Thermodynamic Study of Uranium Extraction from Tunisian Wet Process Phosphoric Acid

Authors: N. Khleifia, A. Hannachi, N. Abbes

Abstract:

In the present paper, an experimental investigation was conducted to study the thermodynamics of uranium extraction from Tunisian wet process phosphoric acid using the synergistic solvent mixture of di-2-ethylhexyl phosphoric acid (DEHPA) and trioctyl phosphine oxide (TOPO) diluted in kerosene. The effects of different factors on the extraction process (temperature, TOPO and DEHPA concentrations) have been investigated. The data on the effect of temperature showed that the enthalpy change of the extraction is -35.8 kJ mol⁻¹. The slope analysis method was used to determine the stoichiometry of the extracted species.
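The abstract does not say how the enthalpy change was obtained from the temperature data. One common route, shown here only as a sketch, is a van't Hoff plot: assuming the distribution coefficient D is proportional to the equilibrium constant and the enthalpy change is constant over the temperature range, ln D is linear in 1/T with slope -ΔH/R. The data points below are synthetic, generated from the reported -35.8 kJ/mol with an arbitrary intercept, not measurements from the paper; the function name is hypothetical.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def enthalpy_from_vant_hoff(temps_k, dist_coeffs):
    """Least-squares slope of ln(D) against 1/T, converted to ΔH in kJ/mol.

    Assumes ln D = -ΔH / (R * T) + const (van't Hoff, constant ΔH).
    """
    xs = [1.0 / t for t in temps_k]
    ys = [math.log(d) for d in dist_coeffs]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return -slope * R / 1000.0


# Synthetic illustration only: D values generated from the reported ΔH
# and an arbitrary intercept, then fed back through the fit.
true_dh_j = -35.8e3
temps = [283.15, 293.15, 303.15, 313.15, 323.15]
synthetic_d = [math.exp(-true_dh_j / (R * t) - 12.0) for t in temps]

recovered_dh = enthalpy_from_vant_hoff(temps, synthetic_d)
print(f"Recovered enthalpy change: {recovered_dh:.1f} kJ/mol")
```

A negative ΔH means D falls as temperature rises, so the synthetic D values decrease along the temperature list.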

Keywords: DEHPA-TOPO, extraction, phosphoric acid, stoichiometry, uranium.

Downloads: 2390

8431 Information Extraction from Unstructured and Ungrammatical Data Sources for Semantic Annotation

Authors: Quratulain N. Rajput, Sajjad Haider, Nasir Touheed

Abstract:

The internet has become an attractive avenue for global e-business, e-learning, knowledge sharing, etc. Due to continuous increase in the volume of web content, it is not practically possible for a user to extract information by browsing and integrating data from a huge amount of web sources retrieved by the existing search engines. The semantic web technology enables advancement in information extraction by providing a suite of tools to integrate data from different sources. To take full advantage of semantic web, it is necessary to annotate existing web pages into semantic web pages. This research develops a tool, named OWIE (Ontology-based Web Information Extraction), for semantic web annotation using domain specific ontologies. The tool automatically extracts information from html pages with the help of pre-defined ontologies and gives them semantic representation. Two case studies have been conducted to analyze the accuracy of OWIE.

Keywords: Ontology, Semantic Annotation, Wrapper, Information Extraction.

Downloads: 2067

8430 GeNS: a Biological Data Integration Platform

Authors: Joel Arrais, João E. Pereira, João Fernandes, José Luís Oliveira

Abstract:

The scientific achievements coming from molecular biology depend greatly on the capability of computational applications to analyze laboratory results. A comprehensive analysis of an experiment typically requires the simultaneous study of the obtained dataset with data that is available in several distinct public databases. Nevertheless, developing centralized access to these distributed databases raises a set of challenges, such as what the best integration strategy is, how to resolve nomenclature clashes, how to handle data that overlaps between databases, and how to deal with huge datasets. In this paper we present GeNS, a system that uses a simple yet innovative approach to address several biological data integration issues. Compared with existing systems, the main advantages of GeNS are its maintenance simplicity and its coverage and scalability, in terms of the number of supported databases and data types. To support our claims we present the current use of GeNS in two concrete applications. GeNS currently contains more than 140 million biological relations, and it can be publicly downloaded or remotely accessed through SOAP web services.

Keywords: Data integration, biological databases

Downloads: 1593

8429 Effect of Enzyme and Heat Pretreatment on Sunflower Oil Recovery Using Aqueous and Hexane Extractions

Authors: E. Danso-Boateng

Abstract:

The effects of enzyme action and heat pretreatment on oil extraction yield from sunflower kernels were analysed using hexane extraction with Soxhlet, and aqueous extraction with an incubator shaker. Ground raw and heat-treated kernels, each with and without Viscozyme treatment, were used. Microscopic images of the kernels were taken to analyse the visible effects of each treatment on the cotyledon cell structure of the kernels. Heat pretreatment of the kernels before both extraction processes produced higher oil extraction yields than the control, with steam explosion the most efficient. In hexane extraction, applying a combination of steam explosion and Viscozyme treatments to the kernels before the extraction gave the maximum oil extractable in 1 hour, while for aqueous extraction, raw kernels treated with Viscozyme gave the highest oil extraction yield. Remarkable cotyledon cell disruption was evident in kernels treated with Viscozyme, whereas steam explosion and conventional heat treatment had similar effects.

Keywords: Enzyme-assisted aqueous and hexane extraction, heat pretreatment, sunflower cotyledon structure, sunflower oil extraction

Downloads: 3420

8428 Subcritical Water Extraction of Mannitol from Olive Leaves

Authors: S. M. Ghoreishi, R. Gholami Shahrestani, S. H. Ghaziaskar

Abstract:

Subcritical water extraction was investigated as a novel alternative technology in the food and pharmaceutical industry for the separation of Mannitol from olive leaves, and its results were compared with those of Soxhlet extraction. The effects of temperature, pressure, and flow rate of water, and of momentum and mass transfer dimensionless variables such as the Reynolds and Peclet numbers, on extraction yield and equilibrium partition coefficient were investigated. The water operating conditions were pressures of 30-110 bar, temperatures of 60-150 °C, and flow rates of 0.2-2 mL/min. The results revealed that the highest Mannitol yield was obtained at 100 °C and 50 bar. However, extraction of Mannitol was not influenced by variations in flow rate. Mathematical modeling of the experimental measurements was also investigated, and the model predicts the experimental measurements very well. In addition, the results indicated a higher extraction yield for subcritical water extraction than for the Soxhlet method.

Keywords: Extraction, Mannitol, Modeling, Olive leaves, Soxhlet extraction, Subcritical water.

Downloads: 3011

8427 XML Schema Automatic Matching Solution

Authors: Huynh Quyet Thang, Vo Sy Nam

Abstract:

Schema matching plays a key role in many different applications, such as schema integration, data integration, data warehousing, data transformation, E-commerce, peer-to-peer data management, ontology matching and integration, semantic Web, and semantic query processing. Manual matching is expensive and error-prone, so it is important to develop techniques to automate the schema matching process. In this paper, we present a solution to the XML schema automated matching problem which produces semantic mappings between corresponding schema elements of given source and target schemas. This solution contributes to solving the XML schema automated matching problem more comprehensively and efficiently. Our solution is based on combining the linguistic similarity, data type compatibility and structural similarity of XML schema elements. After describing our solution, we present experimental results that demonstrate the effectiveness of this approach.
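The abstract names three signals (linguistic similarity, data type compatibility, structural similarity) but not how they are computed or combined. The following is a rough sketch under stated assumptions, not the authors' algorithm: a weighted sum with hypothetical weights, string similarity from Python's `difflib`, a small hand-written type compatibility table, and Jaccard overlap of child element names as a stand-in for structural similarity. Every name, weight and table entry here is an assumption made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical weights; the paper's actual combination rule is not given.
WEIGHTS = {"linguistic": 0.5, "datatype": 0.2, "structural": 0.3}

# Hypothetical pairs of XSD types treated as partially compatible.
COMPATIBLE_TYPES = {
    frozenset({"xs:int", "xs:integer"}),
    frozenset({"xs:string", "xs:token"}),
}


def linguistic_similarity(name_a, name_b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()


def datatype_compatibility(type_a, type_b):
    """1.0 for identical types, 0.5 for listed compatible pairs, else 0.0."""
    if type_a == type_b:
        return 1.0
    if frozenset({type_a, type_b}) in COMPATIBLE_TYPES:
        return 0.5
    return 0.0


def structural_similarity(children_a, children_b):
    """Jaccard overlap of child element names (two leaves count as equal)."""
    a, b = set(children_a), set(children_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def match_score(src, tgt):
    """Weighted combination of the three similarity signals."""
    return (
        WEIGHTS["linguistic"] * linguistic_similarity(src["name"], tgt["name"])
        + WEIGHTS["datatype"] * datatype_compatibility(src["type"], tgt["type"])
        + WEIGHTS["structural"]
        * structural_similarity(src["children"], tgt["children"])
    )


source = {"name": "customerName", "type": "xs:string", "children": []}
candidates = [
    {"name": "CustomerName", "type": "xs:string", "children": []},
    {"name": "orderTotal", "type": "xs:decimal", "children": ["currency"]},
]
best = max(candidates, key=lambda tgt: match_score(source, tgt))
print(best["name"])
```

A real matcher would typically apply a threshold to the score and resolve conflicts between competing mappings; both are left out of this sketch.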

Keywords: XML Schema, Schema Matching, Semantic Matching, Automatic XML Schema Matching.

Downloads: 1783

8426 Investigating Relationship between Product Features and Supply Chain Integration

Authors: Saied Rasul Hosseini Baharanchi

Abstract:

This paper addresses integration issues in the supply chain and investigates how different aspects of integration are linked with certain product features. Integration in this study is interpreted as "internal", "upstream" (supply), and "downstream" (demand). Two product features, innovation and quality, are considered. To examine the relationships between these three types of supply chain integration and the product features, this research uses the survey method in the automotive industry. The results imply that upstream supply chain integration has a higher impact on product quality than internal and downstream integration. It is also found that the influence of downstream supply chain integration on product innovation is greater than that of the other variables. In brief, this study mainly addresses the importance of specific levels of supply chain integration and their effects on two product features.

Keywords: Supply chain upstream integration, supply chain downstream integration, internal integration, product features

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1784

8425 High Performance in Parallel Data Integration: An Empirical Evaluation of the Ratio Between Processing Time and Number of Physical Nodes

Authors: Caspar von Seckendorff, Eldar Sultanow

Abstract:

Many studies have shown that parallelization decreases efficiency [1], [2]. There are many reasons for these decrements. This paper investigates those that appear in the context of parallel data integration. Integration processes generally cannot be allocated to packages of identical size (i.e., tasks of identical complexity). The reason for this is unknown, heterogeneous input data, which results in variable task lengths. Process delay is defined by the slowest processing node, and it has a detrimental effect on the total processing time. With a real-world example, this study will show that while process delay initially increases with the introduction of more nodes, it ultimately decreases again after a certain point. The example will make use of the cloud computing platform Hadoop and be run inside Amazon's EC2 compute cloud. A stochastic model will be set up which can explain this effect.

Keywords: Process delay, speedup, efficiency, parallel computing, data integration, E-Commerce, Amazon Elastic Compute Cloud (EC2), Hadoop, Nutch.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1581
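
The parallel data integration abstract above states that tasks have variable lengths and that the slowest processing node determines the total processing time. The sketch below illustrates only that idea on made-up numbers. It is not the paper's stochastic model and does not reproduce its Hadoop/EC2 experiment; the round-robin assignment, the function names, and the task durations are assumptions, and efficiency is taken in its usual sense of speedup divided by the number of nodes.

```python
def node_loads(task_durations, num_nodes):
    """Assign tasks round-robin and return the total work on each node."""
    loads = [0.0] * num_nodes
    for i, duration in enumerate(task_durations):
        loads[i % num_nodes] += duration
    return loads


def total_processing_time(task_durations, num_nodes):
    """The job finishes only when the slowest (most loaded) node finishes."""
    return max(node_loads(task_durations, num_nodes))


def efficiency(task_durations, num_nodes):
    """Speedup divided by node count; 1.0 means perfectly balanced work."""
    serial_time = sum(task_durations)
    parallel_time = total_processing_time(task_durations, num_nodes)
    return serial_time / (num_nodes * parallel_time)


# Hypothetical task durations: one long task among several short ones.
tasks = [4, 1, 1, 1, 1]
for n in (1, 2, 4):
    print(n, total_processing_time(tasks, n), round(efficiency(tasks, n), 3))
# prints:
# 1 8.0 1.0
# 2 6.0 0.667
# 4 5.0 0.4
```

In this toy case the single long task dominates: adding nodes shortens the run only slightly, so efficiency falls as the other nodes sit idle waiting for the slowest one.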

8424 Effect of Wheat Flour Extraction Rates on Flour Composition, Farinographic Characteristics and Sensory Perception of Sourdough Naans

Authors: Ghulam Mueen-ud-Din, Salim-ur-Rehman, Faqir M. Anjum, Haq Nawaz, Mian A. Murtaza

Abstract:

The effect of wheat flour extraction rates on flour composition, farinographic characteristics and the quality of sourdough naans was investigated. The results indicated that as the extraction rate increased, the amounts of protein, fiber, fat and ash increased, whereas moisture content decreased. Farinographic characteristics such as water absorption and dough development time increased with flour extraction rate, but dough stability and tolerance index were reduced. Titratable acidity for both sourdough and sourdough naans also increased along with flour extraction rate. The study showed that the overall quality of sourdough naans was affected by both the flour extraction rate and the starter culture used. Sensory analysis of sourdough naans revealed that the desirable extraction rate for sourdough naan was 76%.

Keywords: Extraction rates, Farinographic characteristics, Flour composition, Sourdough naans, Wheat flour.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4629

8423 Using Automated Database Reverse Engineering for Database Integration

Authors: M. R. Abbasifard, M. Rahgozar, A. Bayati, P. Pournemati

Abstract:

One important problem in today's organizations is the existence of non-integrated information systems, with inconsistency and a lack of suitable correlations between legacy and modern systems. One main solution is to transfer the local databases into a global one. In this regard, we need to extract the data structures from the legacy systems and integrate them with the new technology systems. In legacy systems, huge amounts of data are stored in legacy databases. They require particular attention since more effort is needed to normalize, reformat and move them to modern database environments. Designing the new integrated (global) database architecture and applying reverse engineering requires data normalization. This paper proposes the use of database reverse engineering to integrate legacy and modern databases in organizations. The suggested approach consists of methods and techniques for generating the data transformation rules needed for data structure normalization.

Keywords: Reverse Engineering, Database Integration, System Integration, Data Structure Normalization

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1803

8422 Eclectic Rule-Extraction from Support Vector Machines

Authors: Nahla Barakat, Joachim Diederich

Abstract:

Support vector machines (SVMs) have shown superior performance compared to other machine learning techniques, especially in classification problems. Yet one limitation of SVMs is the lack of an explanation capability, which is crucial in some applications, e.g. in the medical and security domains. In this paper, a novel approach for eclectic rule-extraction from support vector machines is presented. This approach utilizes the knowledge acquired by the SVM and represented in its support vectors, as well as the parameters associated with them. The approach includes three stages: training, propositional rule-extraction, and rule quality evaluation. Results from four different experiments have demonstrated the value of the approach for extracting comprehensible rules of high accuracy and fidelity.

Keywords: Data mining, hybrid rule-extraction algorithms, medical diagnosis, SVMs

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1654