A System to Integrate and Manipulate Protein Database Using BioPerl and XML

Zurinahni Zainol; Rosalina Abdul Salam; Rosni Abdullah; Nur'Aini; Wahidah Husain

Abstract:

The size, complexity, and number of databases holding protein information have left bioinformatics lagging behind in its ability to handle this distributed information. Integrating the information from different databases into a single database is a challenging problem. The main aim of our research is to develop a tool that can be used to access and manipulate protein information from different databases. In our approach, we integrated several databases, namely Swiss-Prot, PDB, InterPro, and EMBL, and transformed them from flat-file format into relational form using XML and BioPerl. Our results show that the tool can search protein data sets of different sizes stored in the relational database, and that results are retrieved faster than from the flat-file databases. A web-based user interface allows users to access or search for protein information in the local database.
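The pipeline described in the abstract (flat file to XML to relational table, then search) can be illustrated with a minimal sketch. This is not the authors' implementation, which used BioPerl; it is a Python stand-in that uses only the standard library. The record below is a hypothetical, heavily simplified imitation of a Swiss-Prot-style entry, and the XML element names, the `protein` table schema, and all function names are assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, simplified flat-file record: a two-letter line code,
# three spaces, then the value; "//" terminates the record.
FLAT_RECORD = """\
ID   EXAMPLE_PROT
AC   X00001
DE   Hypothetical example protein
OS   Example organism
SQ   MKTAYIAKQR
//
"""

# Mapping from XML element name (assumed) to flat-file line code.
FIELDS = (("name", "ID"), ("description", "DE"),
          ("organism", "OS"), ("sequence", "SQ"))


def parse_flat_records(text):
    """Yield one dict per record, keyed by the two-letter line code."""
    record = {}
    for line in text.splitlines():
        if line.startswith("//"):
            if record:
                yield record
            record = {}
            continue
        code, value = line[:2], line[5:].strip()
        if not code.strip():
            continue  # skip blank lines
        # Repeated codes (e.g. multi-line descriptions) are joined.
        record[code] = f"{record[code]} {value}" if code in record else value


def record_to_xml(record):
    """Wrap a parsed record in an <entry> element."""
    entry = ET.Element("entry", accession=record.get("AC", ""))
    for tag, code in FIELDS:
        ET.SubElement(entry, tag).text = record.get(code, "")
    return entry


def load_into_db(conn, entries):
    """Create the (assumed) protein table and insert one row per entry."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS protein ("
        "accession TEXT PRIMARY KEY, name TEXT, description TEXT, "
        "organism TEXT, sequence TEXT)"
    )
    for entry in entries:
        row = [entry.get("accession")]
        row += [entry.findtext(tag) for tag, _ in FIELDS]
        conn.execute("INSERT OR REPLACE INTO protein VALUES (?, ?, ?, ?, ?)", row)
    conn.commit()


def find_by_accession(conn, accession):
    """Return (name, organism, sequence) for an accession, or None."""
    cur = conn.execute(
        "SELECT name, organism, sequence FROM protein WHERE accession = ?",
        (accession,),
    )
    return cur.fetchone()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    entries = [record_to_xml(r) for r in parse_flat_records(FLAT_RECORD)]
    load_into_db(conn, entries)
    print(ET.tostring(entries[0], encoding="unicode"))
    print(find_by_accession(conn, "X00001"))
```

Keying the table on the accession lets a lookup use the primary-key index rather than scanning every record, which is one plausible reason a relational store would answer such queries faster than a flat file.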

Keywords: Protein sequence database, relational database, integrated database.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1328634
