Library Journal "Digital Libraries" Columns 1997-2007, Roy Tennant
Please note: these columns are an archive of my Library Journal column from 1997-2007. They have not been altered in content, so please keep in mind some of this content will be out of date.
The Digital Library Divide
08/15/2001
Not all digital library projects are created equal. Some projects attract millions of dollars of research funding, while others create useful collections without a dime of outside support. Some projects serve only research purposes, while others produce specific online collections or services. Knowing the difference should help you determine which projects may help you today and which may help you tomorrow.

Research projects

Digital library research projects are usually conducted by university faculty, generally in computer science. The aim is to expand our collective knowledge. A research project can be termed successful even if the only outcome is a description of the research results (and a call for further research). Some research projects may also have goals that approach those of production projects (to create a useful collection or service), but they need not do so, and often they do not.

Most projects in the National Science Foundation's (NSF) Digital Libraries Initiative Phase One fall into this category. After more than $24 million in federal funding was spent over four years (1994-98), there was very little available that could be considered a library collection or service. Whether any of the technologies explored in these projects will ever result in real production digital library services is anyone's guess. But at least with the second phase of the Digital Libraries Initiative, which began in 1998, the NSF and the other funding agencies (which include the Library of Congress [LC]) supported production projects as well as research.

Also, ostensible digital library projects may not even have much to do with libraries as they are today, or as we might envision them in the near future. For example, see the research by the University of California, Berkeley, Computer Science department that went into "Blobworld," which retrieves images by querying shapes and colors.
I find it difficult to imagine a circumstance in which such an interface would be useful in a typical library setting. If you don't believe me, go there and try it out for yourself.

Production projects

Production digital library projects aim at meeting actual user needs today with existing technologies, and so should be more helpful to peer libraries. For example, the Colorado Digitization Project has specific production goals for digitizing content and making it available to the citizens of Colorado and beyond. Other clear examples of production digital library projects include LC's American Memory Project and the New York Public Library Digital Library Collection, among many others. Although these types of projects do not tend to be truly cutting edge, they often must set up new methods and procedures, from which other libraries can learn. LC has been particularly good at sharing its methods with others. (See the Building Digital Collections site.)

Librarians vs. computer scientists

There is a professional aspect to the digital library divide as well. Librarians as a whole will tend to find production projects more instructive for everyday library needs, at least in the short term, than research projects. The relatively small number of library schools in the United States means that the profession lacks a solid core of library-focused research efforts in digital libraries.

Since much of the digital library research funding is awarded to computer scientists, computer scientists set the research agenda. Generally, they lack the in-depth knowledge of library and user needs that librarians possess. Since this situation is unlikely to change dramatically, librarians should team up with computer scientists on research projects whenever possible. Funding sources for production-oriented digital library projects are also available.

Bridges across the divide

Thankfully, the divide is not without bridges.
The following are examples of projects that bring together librarians and computer scientists to create functioning services that also push the knowledge envelope. These projects demonstrate that research and production can go hand in hand, and when they do, they can be both groundbreaking and effective.

One of the first digital library projects that actually created a production digital library collection was the Networked Computer Science Technical Reference Library (NCSTRL, pronounced "ancestral"). NCSTRL is a worldwide network of computer science technical report repositories. All collections are searchable from any one repository, although when a particular paper is selected, it is retrieved from the remote location. As a production service, this project has library participation on a number of levels. A Stanford librarian helped create a record format for the bibliographic information, and several libraries (UC-Berkeley, for one) are now supporting the NCSTRL project on their campuses.

The Open Archives Initiative (OAI) (see "Open Archives: A Key Convergence," LJ 2/15/00) has had library participation from the outset, as it promises to provide a key piece of interoperability infrastructure. To some degree, it builds on the NCSTRL work in this area. Key library organizations involved in this effort include the Digital Library Federation and the Association of Research Libraries.

Since the purpose of the Dublin Core Metadata Initiative is to create a low-barrier syntax for encoding metadata about digital objects, it should come as no surprise that the library community led the way (via OCLC staff). But the effort now includes a number of computer science professionals and organizations worldwide, as well as participants from other disciplines. This emerging metadata standard is being used today to describe a wide variety of electronic resources in a number of different technical environments.

Building more bridges

The digital library divide is real, and it can present a barrier to communication and cooperation. But overcoming this barrier is important to both librarians and computer scientists. Librarians can learn much about new possibilities from current computer science research. Computer scientists, in turn, can better understand what library users want and how they expect to be able to use it. Working to bridge the digital library divide is not just for large research libraries. Some projects, like OAI, have fairly low barriers to participation. Free software such as the University of Southampton eprints package can give any library an Open Archives-compliant e-print repository if it has a staff member with a modicum of technical skill. We all must work to expand our horizons about what is possible and good for our users.

LINK LIST

Blobworld: elib.cs.berkeley.edu/photos/blobworld
Building Digital Collections: memory.loc.gov/ammem/ftpfiles.html
Collaboration Through the Colorado Digitization Project: www.firstmonday.org/issues/issue5_6/allen
Dublin Core: dublincore.org
eprints Software: www.eprints.org
NSF Digital Libraries Initiative Phase I: www.dli2.nsf.gov/dlione
NSF Digital Libraries Initiative Phase II: www.dli2.nsf.gov
NCSTRL: www.ncstrl.org
NYPL Digital Library Collection: digital.nypl.org
Open Archives Initiative: www.openarchives.org