
Tutorials:UsingDCSchema


Revision as of 23:22, 11 September 2008 by Scsong (talk | contribs)

Adding a new schema to adapt-xml

Overview

A Dublin Core schema has been added to adapt-xml to begin testing PAWN's ability to digest collections of digital media. The first collection to be used is the Human-Computer Interaction Laboratory's (HCIL) digital library. Dublin Core has been chosen as the schema for describing the digital objects in this collection.

Dublin Core Schema

Dublin Core is a simple metadata set of 15 elements used primarily for cataloging digital libraries and electronic information resources. In adapt-xml, the Dublin Core package is named gov.loc.dc and is used to incorporate Dublin Core into METS files. The package consists of 17 classes and 2 interfaces, as follows:

Interfaces
Namespace - contains schema location details; needed to validate Dublin Core elements.
QNameElement - used for writing the Dublin Core elements to a METS file.

Classes
DCElement - subclass of edu.umiacs.xml.BaseElement and parent class to all the classes below. Implements writeXMLData() which is used by the other classes to write the tags to the METS file.
DC - Tag used to enclose the Dublin Core elements and specify the schema location for validation.
Creator - person(s) or organization(s) responsible for the intellectual content of the resource.
Contributor - additional person(s) or organization(s) responsible for the intellectual content other than those specified in the Creator element.
Coverage - spatial location and temporal duration characteristics of the resource.
Date - date resource was made available in its present form.
Description - textual description of the content of the resource.
Format - data representation of the resource (e.g. text/html, ASCII, JPEG).
Identifier - string or number used to uniquely identify the resource.
Language - language(s) of the intellectual content of the resource.
Publisher - entity responsible for making the resource available.
Relation - relationship to other resources.
Rights - intended to be a link to a copyright notice or rights management statement.
Source - the work from which this resource is derived (if applicable).
Subject - topic, keywords, phrases or classification descriptors describing the subject of the resource.
Title - name given to the resource.
Type - category of the resource (e.g. home page, novel, poem).
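
The relationship described above, a parent class that implements writeXMLData() once for all of the element classes, can be sketched roughly as follows. This is a hypothetical illustration built on the standard javax.xml.stream API, not the adapt-xml source: the class names SketchElement, SketchTitle, SketchCreator and ElementSketchDemo, the tag() method, and the use of XMLStreamWriter as the writer object are all assumptions made for the example.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/** Hypothetical stand-in for a DCElement-style parent class; NOT the adapt-xml source. */
abstract class SketchElement {
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";

    /** Local tag name supplied by each subclass, e.g. "title". */
    abstract String tag();

    /** Opens the element as dc:<tag>. */
    void writeStart(XMLStreamWriter w) throws XMLStreamException {
        w.writeStartElement("dc", tag(), DC_NS);
    }

    /** Implemented once here and inherited by every subclass; the writer escapes the text. */
    void writeXMLData(XMLStreamWriter w, String value) throws XMLStreamException {
        w.writeCharacters(value);
    }

    /** Closes the most recently opened element. */
    void writeEnd(XMLStreamWriter w) throws XMLStreamException {
        w.writeEndElement();
    }
}

/** Each element class only has to name its tag. */
final class SketchTitle extends SketchElement {
    @Override String tag() { return "title"; }
}

final class SketchCreator extends SketchElement {
    @Override String tag() { return "creator"; }
}

final class ElementSketchDemo {
    /** Runs the start / data / end sequence for one element and returns the XML text. */
    static String render(SketchElement e, String value) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            e.writeStart(w);
            w.writeNamespace("dc", SketchElement.DC_NS); // declare the dc prefix on this element
            e.writeXMLData(w, value);
            e.writeEnd(w);
            w.flush();
            w.close();
        } catch (XMLStreamException ex) {
            throw new IllegalStateException(ex);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(new SketchTitle(), "Tom & Jerry"));
        System.out.println(render(new SketchCreator(), "Example author"));
    }
}
```

Keeping the text-writing logic in the parent means that adding another element in this sketch is a two-line subclass; whether adapt-xml's classes are organized exactly this way is not confirmed by this page.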

Usage

Only three methods are called for each class instance, in this order: writeStart(), writeXMLData(), and writeEnd(). For example:

DC.writeStart(mw);
  Title.writeStart(mw);
  Title.writeXMLData(mw, "Bm:display_title");
  Title.writeEnd(mw);
  Description.writeStart(mw);
  Description.writeXMLData(mw, "Bm:abstract");
  Description.writeEnd(mw);
  Contributor.writeStart(mw);
  Contributor.writeXMLData(mw, "Cs:name");
  Contributor.writeEnd(mw);
  Coverage.writeStart(mw);
  Coverage.writeXMLData(mw, "B:url");
  Coverage.writeEnd(mw);
  // ... remaining Dublin Core elements ...
DC.writeEnd(mw);

Assuming that the remaining elements are written between DC.writeStart() and DC.writeEnd(), the code above should produce the following:


[Figure: Dublin Core in METS]
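
The figure itself is not reproduced in this text. As a rough stand-in, the sketch below shows the conventional way Dublin Core is embedded in a METS descriptive metadata section (dmdSec / mdWrap with MDTYPE="DC" / xmlData). It uses only the standard javax.xml.stream API rather than adapt-xml, and the class name DcInMetsSketch, the ID value "dmd1", and the element values are placeholders chosen for illustration. The exact markup produced by adapt-xml, including its enclosing DC tag and schema location, may differ.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/** Illustrative only: writes a METS dmdSec fragment wrapping a few Dublin Core elements. */
final class DcInMetsSketch {
    static final String METS_NS = "http://www.loc.gov/METS/";
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";

    /** Writes one dc:<name> element; the writer escapes the text content. */
    static void dc(XMLStreamWriter w, String name, String value) throws XMLStreamException {
        w.writeStartElement("dc", name, DC_NS);
        w.writeCharacters(value);
        w.writeEndElement();
    }

    /** Builds the fragment as a string. A complete METS file would nest it under a mets root. */
    static String build(String title, String description, String contributor) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartElement("mets", "dmdSec", METS_NS);
            w.writeNamespace("mets", METS_NS);
            w.writeNamespace("dc", DC_NS);
            w.writeAttribute("ID", "dmd1"); // placeholder identifier
            w.writeStartElement("mets", "mdWrap", METS_NS);
            w.writeAttribute("MDTYPE", "DC"); // marks the wrapped metadata as Dublin Core
            w.writeStartElement("mets", "xmlData", METS_NS);
            dc(w, "title", title);
            dc(w, "description", description);
            dc(w, "contributor", contributor);
            w.writeEndElement(); // xmlData
            w.writeEndElement(); // mdWrap
            w.writeEndElement(); // dmdSec
            w.flush();
            w.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("Example title", "Example abstract", "Example contributor"));
    }
}
```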


Future Developments

With the package completed and available in adapt-xml, the next step is to access HCIL's MySQL database, which contains all of their digital books. For each digital book, the Dublin Core elements will store selected information from the database tables, and the rest of the METS file that incorporates the Dublin Core will point to the digital content representing the book.