Library Journal "Digital Libraries" Columns 1997-2007, Roy Tennant
Please note: these columns are an archive of my Library Journal column from 1997-2007. They have not been altered in content, so please keep in mind some of this content will be out of date.
Digital Potential and Pitfalls
With the recent influx of millions of dollars in grant funds and a fast-growing awareness of the potential of digital technologies to transform access to information, digital library hype has outrun reality. Scenarios of digital library futures sometimes imply that nirvana will have been reached by the early 21st century, wherein the whole of recorded knowledge (or at least enough for most people) will be no further away than the nearest Internet-connected computer.

Therefore I feel compelled to break my rule against predicting the future (an exercise best left to fools and geniuses) and assert:

* only a very small fraction of the millions of print items currently held by the world's libraries will ever be in digital form;
* digital library collections and services enhance traditional libraries, they don't replace them; and,
* the need for services provided by real libraries and those who staff them will grow, not disappear.

Yes, we still need real libraries and skilled people to staff them. But what those staff will be doing will be different, sometimes very different, from what they do today. This is not a prediction but a certainty. How many reference librarians can completely ignore online databases, CD-ROM databases, and the web and still do their job? Not many can, or should. But think back 20 years. Most reference librarians performed their jobs just fine without the very things that are now essential to adequate service. The game has changed.

How the game is changing is at the core of this column. Those who build digital libraries are helping to change the rules and by so doing forge a framework that will take libraries well into the next century. However, among the techniques, technologies, and draft standards that will serve as the foundation for these new libraries are research projects that will go nowhere, technologies that will never be adopted, and new paradigms that are a dime short.
I will try to steer you through the mess and highlight what should be on your horizon: projects and people to watch, technologies to learn about, and how to keep current.

Defining the digital library

There are almost as many definitions of "digital library" as there are projects using the term, but the Association of Research Libraries (ARL), in its "Definitions and Purposes of a Digital Library," has defined a digital library as having these qualities:

* The digital library is not a single entity;
* The digital library requires technology to link the resources of many;
* The linkages between the many digital libraries and information services are transparent to the end user;
* Universal access to digital libraries and information services is a goal;
* Digital library collections are not limited to document surrogates: they extend to digital artifacts that cannot be represented or distributed in printed formats.

ARL's broad definition of a digital library leaves a great deal of room for diversity, which is reflected in projects by the Library of Congress, the University of Virginia, the Fine Arts Museums of San Francisco, and many others that will be profiled in future columns. For now, the six projects funded with multimillion-dollar grants from NSF/DARPA/NASA can illustrate this diversity.

Carnegie Mellon is focusing on creating a multimedia library consisting of video, audio, images, and text. Its research involves speech recognition, image understanding, and natural language support. In one of its projects, a spoken query can result in the playback of an appropriate portion of a news broadcast. The project does not yet have anything online for public viewing.

The University of Illinois project is attempting to bring together disparate sources of scientific information and improve search results for federated (joined) databases.
Its prototype system, Desktop Link to Virtual Engineering Resources (or "DeLiver"), includes a number of engineering journals in full text. Anyone can search it, but to see the articles you must be an authorized user. About 40 percent of the articles are in Standard Generalized Markup Language (SGML), which requires the use of a free program for viewing (available only for MS Windows). The remainder of the articles are in Adobe Acrobat format, viewable on any platform with the appropriate free viewer.

Stanford is tackling the issue of interoperability, or how disparate databases can be treated as one by the user. A number of separate projects have been started to treat different aspects of the problem. Unfortunately, there is not yet much to see, just technical papers not for the faint of heart.

The University of Michigan aims to create intelligent agents to aid in locating information. Its prototype explores the subject domain of earth and space sciences. The project's prototype interface, "Artemis," is written in Java. Another component of Michigan's work is a "Teaching and Learning" project that "provides guidelines and design standards for teaching and learning materials to support science inquiry through online resources."

The Alexandria Digital Library at UC-Santa Barbara is trying to create an interface to distributed collections of spatially referenced information (that is, anything that can be associated with a particular region of the earth). Examples include aerial photographs, seismic data, and remotely sensed digital imagery. They could be used for environmental monitoring and development planning. Presently one must be authorized to gain access to the prototype interface, which is being written in Java.

The public gets digital

Out of all the projects, UC-Berkeley's clearly has the most actual content available to the public.
The project uses environmental information for subject content and thus includes environmental impact reports, a California dams database, a California flowers database, and a variety of other resources. In all there are presently over 55,000 images (but access to half of these, stock photos from Corel, requires a password), nearly 2000 documents, and almost 100,000 database records. The project uses Java quite a bit, and my advice is to avoid the Java applets until some serious performance issues are addressed. Should you investigate the site, first save your work in any other open applications. It is a "hard hat" area that can crash your computer.

Don't let these warnings keep you away from UC-Berkeley's site, or elsewhere in the digital world. Stick your neck out. Take the plunge. Leave your comfort zone and expand your horizons. Digital libraries are here to stay, and there has never been a better time to learn about what they will mean for you, your library, and your users.