Library Journal "Digital Libraries" Columns 1997-2007, Roy Tennant
Please note: these columns are an archive of my Library Journal column from 1997-2007. They have not been altered in content, so please keep in mind some of this content will be out of date.
The Grand Challenges
12/01/1997
Decades ago, our professional predecessors laid the foundation upon which so much of library technology rests. Through sheer hard work and determination, they created a metadata standard for the interchange of bibliographic data between computers called machine-readable cataloging (MARC). Now we are at a similar professional crossroads, wherein the particular technologies, standards, and models we choose to adopt will shape digital libraries for decades to come. But today the stakes are even higher. While MARC solved one particular problem, with digital libraries, we are re-creating libraries from the bottom up, with literally no function unaffected. How digital items are selected, acquired, organized, accessed, and preserved is all different from our familiar methods. Thus, making the wrong decisions or selecting the wrong standards or technologies can have disastrous consequences. This column looks at some main challenges; future columns will address them in more detail.
Interoperability
A popular vision of the digital library is that a library user at one location using one interface could seamlessly search the digital collections of hundreds of libraries. To make this vision a reality, the systems we build must be able to interoperate very effectively. This will require digital library systems nationwide and internationally to comply with certain key standards. In reality, we will probably see islands of noncompliance among regions of interoperability. Interoperability on a grand scale is the digital library Holy Grail; we are doomed forever to seek it without attainment. We have seen some initial successes in limited domains where draft standards are emerging.
Standards
Standards help create interoperability among different systems and a smooth migration path among technologies. One of the most important areas for standards development in digital libraries is metadata--information about information or, as librarians know it, cataloging.
Metadata is the information required to manage an item, organize it among other items, make it retrievable, and in some cases make it navigable. There are two basic reasons why we need metadata standards. First, we need metadata standards for situations in which we want to record much less metadata in a much simpler fashion than full MARC cataloging. For example, many digital library collections consist of thousands (and, soon, millions) of individual items (such as photographs), many of which it would be too labor-intensive to fully catalog. These items may need to be managed as a collection instead, which calls for a different style of metadata. Second, we need standards when we wish to record more metadata. While a book's catalog record includes the author, title, publisher, and some other key information, for the digital version we also may need to record information about how it was created (e.g., scanning resolution, bit depth, file format) and information that allows us to create navigation systems (e.g., which is the first page?).
Rights management
Librarians are quite familiar with rights management in the print world. We buy a book, and it's ours to lend as we choose. Photocopiers only gave us a brief pause, until we posted signs by our machines warning against copyright infringement. But in the digital environment we face some serious ambiguity that has yet to be decided by either statute or case law, although Congress is currently debating it (but then let the court battles begin!). What constitutes "fair use" in a digital environment? You will likely get very different answers from a librarian and a publisher. Given such a confusing situation, most digital libraries avoid any copyrighted material. Luckily, there is much material that falls outside copyright, largely manuscript and archival items.
Preservation
You will find no digital format among the currently accepted preservation formats.
Computers haven't been around long enough for us to determine whether any digital format can be preserved effectively. Rather, when we talk about digital preservation we typically talk about a strategy, the main parts of which--as outlined in the Preserving Digital Information report--are storage, refreshment, and migration. Storage must be backed up with recovery systems should a disk drive crash or a natural disaster occur. Refreshment deals with aging media: as particular instances of storage media start to fail (e.g., a CD-ROM begins to delaminate), the information must be moved to a fresh copy of the same medium. Migration is the most difficult problem. As entire technologies die (remember 8-track tape?), the information must be rescued in an entirely different technology, which may require entirely new or rescued access software. Much work remains to determine rescue procedures, failure prevention, and organizational structures that can manage these responsibilities. Look to the Research Libraries Group, the Coalition for Networked Information, and other large consortial organizations for leadership on this issue. Only by banding together will we be able to make headway on this problem.
Resources
We do not create digital libraries to save money. We create them to greatly expand access, increase usability and effectiveness, and establish entirely new ways for individuals to interact with information. Digital collections are not cheaper to create, maintain, and preserve than print collections; the evidence so far seems to indicate the opposite. Meanwhile, we are retaining and expanding our print collections. How can we create new models for sustaining an expanded operation within a budget climate of stable or shrinking resources? How can we recruit or retrain staff for digital libraries when the entire field is only a few years old?
User interface design
You know the problem. Library patrons line up behind a CD-ROM index to wait their turn while a better print index lies unused.
University students consider it a point of pride when they complete their paper solely from web resources and thus avoid entering the library. One of our most important yet most difficult tasks in creating digital libraries will be to build structures that do not leave print resources behind. We must fight our patrons' tendency to believe that everything on the net is now (or will be) free. Also, we must design intuitive yet powerful interfaces to a wide variety of systems and resources. And the resources to be discovered will require user environments that support sophisticated operations such as selecting and charting statistics, or selecting and overlaying geographic information system coverages on a base map.
Digital libraries are still possible without solutions to these grand challenges. But they will never achieve their potential if solutions aren't found. When our professional descendants look back, as we now look back at the creators of MARC, it is up to us to make sure they see that challenges were met with imagination, perseverance, and skill.
LINK LIST
Interoperability: http://www.ncstrl.org/
Metadata Resources (from IFLA): http://www.nlc-bnc.ca/ifla/II/metadata.htm
Copyright and Fair Use: http://fairuse.stanford.edu/
Preservation: http://sunsite.berkeley.edu/Preservation/
Preserving Digital Information: Final Report and Recommendations: http://www.rlg.org/ArchTF/