Commenced in January 2007
Frequency: Monthly
Edition: International
Tool for Metadata Extraction and Content Packaging as Endorsed in OAIS Framework

Authors: B. K. Murthy, Payal Abichandani, Rishi Prakash, Paras Nath Barwal


Information generated by computerization processes is a potentially rich source of knowledge for its designated community. Passing this information from generation to generation without altering its meaning is challenging. To preserve and archive data for future generations, it is essential to be able to prove the data's authenticity. This can be achieved by extracting metadata from the data, which can establish authenticity and build trust in the archived data. A further challenge is technology obsolescence, which metadata extraction and standardization can help address. Metadata can be broadly categorized at two levels: technical and domain. Technical metadata provides the information needed to understand and interpret a data record, but this level alone is not sufficient to establish trustworthiness. We have developed a tool that extracts and standardizes both technical and domain-level metadata. This paper describes the features of the tool and how it was developed.
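To make the two metadata levels concrete, the following is a minimal Python sketch and not the tool described in the paper. It assumes that technical metadata can be approximated by file-level facts (name, size, MIME type, SHA-256 checksum) and that domain metadata is supplied by the caller as key-value pairs. The function names and the XML element names (`metadata`, `technical`, `domain`) are hypothetical and do not come from the paper or from any OAIS schema.

```python
import hashlib
import mimetypes
import os
import xml.etree.ElementTree as ET


def technical_metadata(path):
    """Collect file-level facts used to interpret and verify a record (assumed fields)."""
    with open(path, "rb") as fh:
        digest = hashlib.sha256(fh.read()).hexdigest()
    mime, _ = mimetypes.guess_type(path)
    return {
        "fileName": os.path.basename(path),
        "sizeBytes": str(os.path.getsize(path)),
        "mimeType": mime or "application/octet-stream",
        "sha256": digest,  # fixity value that supports authenticity checks
    }


def package_metadata(path, domain):
    """Serialize technical and domain metadata into one XML string.

    `domain` is a dict whose keys must be valid XML element names
    (an assumption of this sketch, not a rule from the paper).
    """
    root = ET.Element("metadata")
    tech = ET.SubElement(root, "technical")
    for key, value in technical_metadata(path).items():
        ET.SubElement(tech, key).text = value
    dom = ET.SubElement(root, "domain")
    for key, value in domain.items():
        ET.SubElement(dom, key).text = value
    return ET.tostring(root, encoding="unicode")
```

Calling `package_metadata("record.txt", {"department": "Archives"})` on an existing file would return a single XML record with a `technical` and a `domain` section. Storing the checksum next to the domain fields is one simple way to let a later reader both verify and interpret the archived file.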

Keywords: Metadata, XML, Digital Preservation, OAIS, PDI



