Library Journal "Digital Libraries" Columns 1997-2007, Roy Tennant

Please note: this is an archive of my Library Journal columns from 1997-2007. The content has not been altered, so please keep in mind that some of it will be out of date.
Building a New Bibliographic Infrastructure

01/15/2004
   More than a year ago I called for the death of MARC (see LJ 10/15/02,
   p. 26ff.). That column sparked a lively discussion among
   librarians--especially catalogers. As I thought about it and discussed
   the issue with others, I decided I had convicted the wrong suspect. Let
   MARC die of old age rather than homicide.

   I thought that MARC (the MARC record syntax, MARC elements, and AACR2)
   was too limiting for modern library needs and opportunities. I now
   realize that with a robust bibliographic infrastructure we could
   profitably use any bibliographic metadata standard that we could
   imagine, including MARC.

   The point is we need to craft standards, software tools, and systems
   that can accept, manipulate, store, output, search, and display
   metadata from a wide variety of bibliographic or related standards. Our
   systems should be able to accept an ONIX record from a publisher, which
   contains basic bibliographic fields and elements like cover art, and
   use it as a prototype cataloging record or to enrich an existing
   record. Our systems should be able to output a record easily in Dublin
   Core for harvesting through the Open Archives Initiative's Protocol for
   Metadata Harvesting. In other words, we need a new bibliographic
   infrastructure that allows for the easy and effective sharing of
   various types of records.
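
   As a minimal sketch of the output side of this idea, the Python below
   serializes a record as unqualified Dublin Core in the oai_dc container
   that OAI-PMH harvesters expect. The function name to_oai_dc and the
   sample record are assumptions made for illustration (the subject terms
   in particular are invented), and a real system would first have to map
   its internal fields onto Dublin Core element names.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by the oai_dc metadata format in OAI-PMH.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def to_oai_dc(record):
    """Serialize {Dublin Core element name: value or list of values} as XML.

    Hypothetical helper: assumes the record is already keyed by
    Dublin Core element names such as "title" or "creator".
    """
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, values in record.items():
        if isinstance(values, str):
            values = [values]  # treat a single value as a one-item list
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative record only; the subject terms are made up for the example.
sample = {
    "title": "Building a New Bibliographic Infrastructure",
    "creator": "Tennant, Roy",
    "date": "2004-01-15",
    "subject": ["MARC", "Metadata"],
}
xml_out = to_oai_dc(sample)
print(xml_out)
```

   Repeatable elements are handled by accepting a list of values, since
   every Dublin Core element may occur more than once in a record.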

   This requires a transfer format or schema that can encapsulate
   everything you need to associate with a particular intellectual object.
   Luckily, such a schema is being developed: the Metadata Encoding and
   Transfer Syntax (METS). Some libraries are already using this schema to
   embed multiple bibliographic records from different schemata into one
   package.
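
   To make the packaging idea concrete, here is a rough Python sketch
   that places two descriptions of the same object, one in Dublin Core
   and one in MODS, into separate METS descriptive metadata sections.
   The helper wrap_in_mets, the section IDs, and the sample title are
   assumptions for illustration. A complete METS document needs more
   than this (a structural map, for example), so treat it as a sketch
   of the wrapping mechanism only.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
DC_NS = "http://purl.org/dc/elements/1.1/"
MODS_NS = "http://www.loc.gov/mods/v3"

ET.register_namespace("mets", METS_NS)
ET.register_namespace("dc", DC_NS)
ET.register_namespace("mods", MODS_NS)


def wrap_in_mets(descriptions):
    """Wrap (section_id, mdtype, element) tuples in METS dmdSec sections.

    Hypothetical helper: each description gets its own dmdSec, with the
    record embedded under mdWrap/xmlData and its schema named in MDTYPE.
    """
    mets = ET.Element(f"{{{METS_NS}}}mets")
    for section_id, mdtype, payload in descriptions:
        dmd = ET.SubElement(mets, f"{{{METS_NS}}}dmdSec", ID=section_id)
        wrap = ET.SubElement(dmd, f"{{{METS_NS}}}mdWrap", MDTYPE=mdtype)
        xml_data = ET.SubElement(wrap, f"{{{METS_NS}}}xmlData")
        xml_data.append(payload)
    return mets


# Illustrative Dublin Core description (a single title element).
dc_title = ET.Element(f"{{{DC_NS}}}title")
dc_title.text = "An example title"

# Illustrative MODS description of the same object.
mods = ET.Element(f"{{{MODS_NS}}}mods")
title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = "An example title"

package = wrap_in_mets([
    ("dmd-dc", "DC", dc_title),
    ("dmd-mods", "MODS", mods),
])
print(ET.tostring(package, encoding="unicode"))
```

   Because each record keeps its own namespace inside its own section,
   neither description has to be converted to the other's schema.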

   Supporting all records

   This is our future; we can no longer rely on only one record structure.
   We must be able to accept many different kinds of bibliographic record
   structures, from ONIX to Dublin Core to whatever else comes along that
   contains useful information. To use these record formats we will need
   rules and guidelines to follow in their application. We need both
   general rules and schema-specific rules, similar to the way we have
   used AACR2 to define what information we capture in MARC.

   Beyond rules and guidelines, best practices as developed by libraries
   working with this new infrastructure will help guide other libraries.
   "Crosswalks" or standard translations from one bibliographic schema to
   another will also be required, although in many cases it will be better
   to retain records in their original format and map elements into common
   fields upon indexing.
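
   The alternative to crosswalking described above, keeping each record
   in its original format and mapping elements into common fields at
   index time, might look something like the following Python sketch.
   The field maps are deliberately tiny and the flattened field names
   (such as "245a" or "TitleText") are simplifying assumptions, not a
   complete or authoritative mapping between these schemas.

```python
# Hypothetical field maps: schema name -> {source field: common index field}.
FIELD_MAPS = {
    "marc": {"245a": "title", "100a": "author", "020a": "isbn"},
    "dc": {"title": "title", "creator": "author", "identifier": "isbn"},
    "onix": {"TitleText": "title", "PersonName": "author", "ISBN": "isbn"},
}


def index_entry(schema, record):
    """Build an index entry while leaving the original record untouched.

    Fields without a mapping are simply not indexed; they remain
    available in the stored original.
    """
    mapping = FIELD_MAPS[schema]
    entry = {"schema": schema, "original": record}
    for source_field, value in record.items():
        common = mapping.get(source_field)
        if common is not None:
            entry.setdefault(common, []).append(value)
    return entry


def search(index, field, term):
    """Return entries whose common field contains the term (case-insensitive)."""
    term = term.lower()
    return [
        entry for entry in index
        if any(term in value.lower() for value in entry.get(field, []))
    ]


# Invented sample records in two different source formats.
index = [
    index_entry("marc", {"245a": "Cataloging basics", "100a": "Doe, Jane"}),
    index_entry("onix", {"TitleText": "Metadata basics", "PersonName": "Jane Doe"}),
]
hits = search(index, "title", "basics")
```

   A single query against the common "title" field finds both records,
   even though one arrived as MARC and the other as ONIX.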

   No more homogenization

   An undertaking of this type has many challenges. One of the first and
   most significant will be moving from a relatively homogeneous
   bibliographic environment (the MARC21/AACR2 hegemony) to a diverse one.
   We will need to find useful ways of ingesting records in a variety of
   formats, both by crosswalking and by indexing the original formats and
   virtually merging the diverse records.

   Moving to a new infrastructure would, at minimum, require upgrades to
   our existing integrated library systems. At worst, it could require
   migrating to entirely new systems. Neither of these solutions is
   without problems, as anyone who has switched systems will confirm. But
   migrating is unlikely to be as difficult as it will be to change us.
   Those of us who have only known a MARC world may find it difficult to
   learn how to build and use a diverse bibliographic environment
   effectively.

   Despite these challenges, such changes are both necessary and
   achievable. They are necessary to exploit new metadata opportunities
   and technologies like XML and the Internet. Our choice is to remake our
   bibliographic infrastructure and achieve new levels of service, or to
   maintain the status quo and risk becoming increasingly (and deservedly)
   marginalized.

   The challenge to systems, utilities

   Will our integrated library systems be up to the task? Although a
   number of vendors clearly see a future based on XML, it will be some
   time before their systems can easily accommodate records from a variety
   of formats. Our major bibliographic utilities, RLIN and WorldCat, are
   moving in the right direction. OCLC has remade WorldCat from the bottom
   up, employing XML and an in-house XML schema dubbed "XWC" that
   accommodates Dublin Core, MARC, and other formats. This is just the
   beginning of a rich bibliographic infrastructure that could employ
   metadata in just about any XML-encoded form.

   If you are intrigued by this vision for a bibliographic infrastructure,
   watch for my article "A Bibliographic Metadata Infrastructure for the
   21st Century" in an upcoming issue of Library Hi Tech. Meanwhile,
   consider where you think our bibliographic systems should be and let me
   know your thoughts.
     __________________________________________________________________

                                   LINK LIST
   METS  www.loc.gov/standards/mets
   MODS  www.loc.gov/standards/mods
   ONIX  www.editeur.org/onix.html