Library Journal "Digital Libraries" Columns 1997-2007, Roy Tennant
Please note: these columns are an archive of my Library Journal column from 1997-2007. They have not been altered in content, so please keep in mind some of this content will be out of date.
Not Your Mother's Union Catalog
04/15/2003
For years the largest bibliographic databases in the world, OCLC's WorldCat and RLG's Eureka, have remained nearly unchanged from the user's perspective--except for new records. That will change, dramatically altering how librarians think about their catalogs. OCLC is reengineering WorldCat so thoroughly that nothing remains unquestioned, including the MARC record. Meanwhile, RLG's RedLightGreen project shares many of the same aims. As RLG's James Michalko puts it, "This is the library community's chance to reinvent what a catalog is."
FRBRizing
In 1998, the International Federation of Library Associations and Institutions (IFLA) published a report on the Functional Requirements for Bibliographic Records (FRBR)--a revolutionary recasting of the bibliographic record on behalf of library users. It defined such principles as the hierarchical dimensions of a creative product: work (distinct creation), expression (realization of a work), manifestation (physical embodiment), and item (a single exemplar).
The revolutionary aspect is best understood as it is being applied by OCLC and RLG to their catalogs. When a user searches for a book in a catalog, they are faced with the variations of that work as multiple, separate records. Which one to choose? With FRBR, all manifestations can be collapsed into one virtual record, with methods for users to narrow in on the items they want (e.g., by language or form).
Draft screen designs for RLG's new system are compelling evidence of the benefit of "FRBRizing." One screenshot shows a list of search results. Under the title and author of an item is the notation "19 editions published between 1916-2001 in 5 languages." The full record for that item provides a way for the user to see only editions in a specific language or the two audio versions.
Blowing open the catalog
The most promising development is OCLC's re-creation of the WorldCat database.
WorldCat is no longer based on MARC but on an internal XML metadata schema devised by OCLC to accept nearly any type of metadata. It can also associate all kinds of related information with a book record, such as reviews and cover art, without squeezing it into an ill-suited format (i.e., MARC). WorldCat can crawl OAI-compliant repositories and incorporate those records as well; the world of metadata is now their oyster. Since RLG has similarly recast its MARC records in XML, it is likely that it will pursue opportunities to merge other kinds of metadata.
Data mining
Gathering metadata is a wonderful start, but even better is studying what you've got. Data mining is analyzing a database to discover information and linkages that are not immediately apparent. For example, the same book may have different subject headings at different libraries. By comparing the subjects, data mining software can infer relationships between those terms. Doing this across a large bibliographic database creates a web of related terms that can be profitably exploited for users.
Here's RLG's description of how its data mining, provided by software from Recommind, Inc., works: "A student might enter a search for the keywords 'Civil War' without specifying the American, Spanish, or other civil wars. Using Recommind, RedLightGreen can organize the results in clusters of related items, letting the student pick which civil war interests her." OCLC is conducting similar experiments.
Slices, skins, and grails
When searching WorldCat or RLG's Eureka, the user is faced with two issues: everything is seen, even items that can't easily be accessed, and the screen offers no localized information. But a project from OCLC addresses this. By summer, a Midwestern state will offer a "slice" of WorldCat to its citizens--holdings from libraries in the state. Although users will first receive local holdings, they can also search the broader WorldCat database.
In addition, what users see will be tailored by the state, allowing links to local services. This ability to layer on a different "skin," or user interface, is crucial to providing local services such as reference.
Taken together, this is revolutionary. Should library catalogs be allowed out of the back room that I relegated them to in an earlier column? (See Digital Libraries, LJ 2/15/03, p. 28.) Not quite, since these impressive systems are still not the unified information finding tool I envision. But with WorldCat becoming so flexible it can ingest virtually any metadata, along with the other features that OCLC and RLG are working on, we may be on the road to the Holy Grail of librarianship: one-stop search service for information, wherever it may be.
__________________________________________________________________
Link List
FRBR www.ifla.org/VII/s13/frbr/frbr.pdf
OAI www.openarchives.org
OCLC's Research Activities www.oclc.org/research/projects
ONIX www.editeur.org/onix.html
Recommind www.recommind.com
RedLightGreen www.rlg.org/redlightgreen