:: Digital Libraries Columns


Library Journal "Digital Libraries" Columns 1997-2007, Roy Tennant

Please note: these columns are an archive of my Library Journal column from 1997-2007. They have not been altered in content, so please keep in mind some of this content will be out of date.

The $64,000 Question


   "What digital library software application should I buy?" asks the
   typical harried library staff member. It usually means that his (or
   her) boss decided that, to be modern, they must go digital. The
   librarian or library assistant then bravely surfs the Internet to find
   the appropriate software. It's not that simple.

   What's a digital library, after all? Is it a pile of licensed content or
   databases? Is it having your archival finding aids digitized and
   online? Is it having the content those finding aids describe digitized
   and online? Is it accepting and providing online access to preprints?
   Is it publishing books online?

   It is all that, and more. But there's no single software application to
   deal with the variety and complexity of digital library collections and

   Finding the tools
   Unfortunately, we're still not even close to finding the right tools.
   Except for online databases, we are mostly dealing with objects, at
   various levels of granularity. An archival finding aid describes a
   collection of objects. The objects described by a finding aid, such as
   individual historical photographs or a book, are discrete objects that
   may also be part of a logical collection. They can be described
   individually and placed within the context of a collection when
   appropriate. They are the "atom" of librarianship--an irreducible

   However, these "atoms" vary greatly. The description and structure of
   objects as diverse as photographs, journal articles, books, manuscripts
   (such as diaries), and datasets can vary significantly. How can we
   create one software application to manage and provide access to such a
   diverse group? One solution is to "encapsulate" all objects in a
   standardized descriptive structure so that software written to that
   specification will understand each type of object.

   Consider trying to make a handwritten diary available online in a
   complete and usable manner. This requires individual page images (to
   see the actual handwriting), a transcription of each page (so the diary
   is readable and perhaps searchable), and a way to navigate the diary.
   Any given item might comprise hundreds of individual files, and each
   image file would have to be matched up with its associated
   transcription. Also, these would all have to be appropriately slotted
   within the whole so a reader can page through the manuscript.
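
   As a rough illustration of the bookkeeping this implies, the sketch
   below models a diary as an ordered list of pages, each tying its image
   files to its transcription. The class names, field names, and file
   naming pattern are all assumptions made up for this example; they do
   not come from any particular project.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DiaryPage:
    """One page of the diary: its scanned images and its transcription."""
    label: str                 # human-readable label, e.g. "Page 1"
    images: Dict[str, str]     # resolution name -> image file name
    transcription: str         # file name of the transcribed text


@dataclass
class Diary:
    """A diary as an ordered sequence of pages (the reading order)."""
    title: str
    pages: List[DiaryPage] = field(default_factory=list)

    def page(self, number: int) -> DiaryPage:
        """Return a page by its 1-based position in the reading order."""
        if not 1 <= number <= len(self.pages):
            raise IndexError(f"no page {number}")
        return self.pages[number - 1]

    def next_page(self, number: int) -> Optional[int]:
        """Return the following page number, or None at the last page."""
        return number + 1 if number < len(self.pages) else None


def build_diary(title: str, page_count: int) -> Diary:
    """Build a diary whose files follow a hypothetical naming pattern."""
    pages = [
        DiaryPage(
            label=f"Page {n}",
            images={
                "thumbnail": f"p{n:03d}_thumb.gif",
                "full": f"p{n:03d}_full.jpg",
            },
            transcription=f"p{n:03d}.txt",
        )
        for n in range(1, page_count + 1)
    ]
    return Diary(title, pages)
```

   Even this toy version shows why a hundred-page manuscript turns into
   several hundred files that must be kept in step with one another.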

   Metadata (information that describes an object) is needed--and lots of
   it! More specifically, structural metadata is needed to specify which
   image file corresponds with "page one" and which text represents the
   transcription of that page. If we can agree on a standard method for
   encoding this structural metadata, and make this specification open and
   public, then anyone who wishes can write software to interact with
   these objects. This is the goal of the University of California at
   Berkeley's Making of America II Project (MOA II).
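
   To make the idea concrete, here is a minimal sketch of how software
   might read structural metadata of this kind. The XML element and
   attribute names below are invented for illustration and are not taken
   from the MOA II DTD; the point is only that once the encoding is agreed
   upon and public, a few lines of code can recover which files belong to
   which page.

```python
import xml.etree.ElementTree as ET

# Hypothetical structural metadata. The element and attribute names are
# assumptions for this example, NOT the actual MOA II encoding.
SAMPLE = """\
<object title="Example Diary">
  <page order="2" label="Page 2">
    <image resolution="full" href="p002_full.jpg"/>
    <text href="p002.txt"/>
  </page>
  <page order="1" label="Page 1">
    <image resolution="thumbnail" href="p001_thumb.gif"/>
    <image resolution="full" href="p001_full.jpg"/>
    <text href="p001.txt"/>
  </page>
</object>
"""


def read_structure(xml_text):
    """Return the pages in reading order, each with its associated files."""
    root = ET.fromstring(xml_text)
    # Sort on the declared order so the result does not depend on the
    # order in which pages happen to appear in the file.
    ordered = sorted(root.findall("page"), key=lambda p: int(p.get("order")))
    pages = []
    for page in ordered:
        pages.append({
            "label": page.get("label"),
            "images": {
                img.get("resolution"): img.get("href")
                for img in page.findall("image")
            },
            "text": page.find("text").get("href"),
        })
    return pages
```

   Calling read_structure(SAMPLE) yields one entry per page, so a viewer
   could look up the image and the transcription for "Page 1" without
   knowing anything else about how the files are stored.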

   Back to the future
   To see this in action, go to the MOA II web site and select the link
   for "The HTML-based MOA II Document Viewer." This leads to an
   application (a Java servlet) that lists a number of objects that have
   been encoded using the MOA II XML document type definition (DTD), which
   defines a method for encoding structural metadata.

   To see the system capabilities, select the "Patrick Breen Diary" (Breen
   was a member of the ill-fated Donner Party). If you click on a day in
   the left frame, the associated diary page appears in the right window.
   Drop-down menus above the right window let you choose among different
   resolutions of the page image or switch to the transcribed text.
   Buttons allow browsing forward or back. The lower left corner shows the
   XML source document.

   The MOA II web site offers more information about the MOA II DTD and
   how the project is producing objects encoded to that specification. The
   Digital Library Federation publication "The Making of America II
   Testbed Project: A Digital Library Service Model" provides some
   essential background.

   The latest work is moving the specification from a DTD model to an XML
   schema, called the "Metadata Encoding and Transmission Standard"
   (METS). This not only brings the MOA II work into an XML framework that
   allows more software flexibility, it also adds the ability to either
   embed metadata (e.g., author and title) into the object description
   itself, or to point to it in an outside source such as a separate

   Other questions remain, such as how the objects are discovered by users
   in the first place, since some of these objects are not represented in
   library catalog records. But by encapsulating digital objects in
   standardized ways, we can create new modes of discovery. For example, a
   library might write an application to crawl remote collections of MOA
   II objects, extract descriptive metadata from them, and index them for
   searching. When an item of interest is discovered, the record link
   would enable a user to retrieve it from the library holding it.
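
   The indexing half of such an application might look something like the
   sketch below. It assumes the crawling and metadata extraction have
   already happened, so the records are hard-coded; the field names, the
   sample records, and the example links are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical harvested records. A real crawler would fetch remote
# objects and extract these fields; they are hard-coded here so the
# sketch runs without a network connection.
RECORDS = [
    {
        "title": "Diary of an overland journey",
        "creator": "Example, A.",
        "link": "http://library-a.example/objects/1",
    },
    {
        "title": "Photographs of an overland trail",
        "creator": "Example, B.",
        "link": "http://library-b.example/objects/7",
    },
]


def tokenize(text):
    """Split text into lowercase words and numbers."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records):
    """Map each word in the descriptive metadata to the records using it."""
    index = defaultdict(set)
    for position, record in enumerate(records):
        for field_name in ("title", "creator"):
            for word in tokenize(record[field_name]):
                index[word].add(position)
    return index


def search(index, records, query):
    """Return the links of records that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return [records[i]["link"] for i in sorted(hits)]
```

   With the index built by build_index(RECORDS), a call such as
   search(index, RECORDS, "diary overland") returns only the link to the
   holding library's copy of the matching object.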

   We likely won't have a single "digital library application" for every
   library. But with projects like MOA II, we can avoid the problems
   caused when each library makes up its own rules for encapsulating and
   describing digital objects. This may not be the answer to the $64,000
   question, but it is close.


   The Making of America II Project

   The Making of America II Testbed Project: A Digital Library Service
   Model

   Metadata Encoding and Transmission Standard (METS)