:: Digital Libraries Columns


Library Journal "Digital Libraries" Columns 1997-2007, Roy Tennant

Please note: these columns are an archive of my Library Journal column from 1997-2007. They have not been altered in content, so please keep in mind some of this content will be out of date.

Mass Digitization


   In my very first LJ Column, I wrote, "Only a very small fraction of the
   millions of print items currently held by the world's libraries will
   ever be in digital form" (LJ 11/15/97). In 1997, the company that would
   prove me wrong was starting up in a Stanford lab. Now, near-weekly
   announcements of massive digitization projects by Google and the Open
   Content Alliance (OCA) promise to change forever the way people find
   and use books.

   According to Google, we are only a few years away from having the
   entire collections of large research libraries completely digitized and
   searchable. Owing to copyright restrictions, only "snippets" of many
   books will be displayed, but they will be newly discoverable, and many
   full-text books will be browsable online.

   These efforts are likely to be transformational in ways we have yet to
   realize. Mass digitization, like many heralded new technologies, will
   take longer than predicted to make a serious mark and will create
   unforeseen impacts and enable unpredicted kinds of interactions with
   books. Whatever the outcome, libraries will be affected. We just don't
   know exactly how yet.


   Chip Nilges, vice president of new services for OCLC (and an LJ
   Mover & Shaker 2005), is working on getting information about digitized
   books into Open WorldCat to make them more discoverable. If a book is
   available at the Google Book Search web site in snippet view only,
   owing to copyright barriers, Open WorldCat stands ready to show users
   which nearby libraries hold the book.

   Down the line, OCLC also plans to provide related services. "I
   absolutely want to do this in such a way that libraries can download
   records [of digitized books] to their local catalog. It's definitely
   the next step," Nilges said. OCLC intends to work with the OCA to
   make sure its records are also in WorldCat. Libraries can also use
   OCLC's collection analysis service to discover which items in their
   collections have been digitized by these projects.


   Some books digitized by Google can be browsed in full-page image and
   downloaded in PDF format from the Google Book Search site, along with
   publisher-provided texts in either "snippet" or "limited preview."
   Since Google Book Search does not provide access to the thousands of
   public domain books available via Project Gutenberg and similar
   efforts, users are better off searching regular Google for them.

   The library catalog promises to provide access to digitized books as
   well as print books. The University of Michigan (UM) is the first
   library participating in the Google Library project to do so. MBooks
   allows users to search for books that have been scanned by Google and
   other content digitized by the UM libraries. Once users get to a
   specific digitized title, they can search within that item.

   Where Google makes the books it digitizes available for downloading
   only in Adobe Acrobat format, the OCA makes all the files available for
   downloading--from the raw images to the metadata. Books digitized by
   the OCA are freely available for browsing and downloading on the
   Internet Archive site.


   The libraries involved with mass digitization must deal with a
   logistical nightmare: thousands of books each day must be paged,
   packed, and shipped off-site for scanning. The same number must be
   received and reshelved. The scans themselves are sometimes poorly done
   or are missing pages. Projects scanning massive numbers of books
   sometimes sacrifice quality.

   In the end, the sheer number of books involved is amazing. Perry
   Willett at the University of Michigan estimates they now have around
   100,000 books from the Google project digitized and online. "I think we
   are seeing the first glimpses of what a real digital research
   collection will look like," he said. "It's definitely going to change
   the way we think about everything we do in libraries. I hope this
   starts a lot of discussion about 'what this all means,' because the
   potential is just limitless."

                                                  Link List
   Google Library Project
   Internet Archive
   MBooks
   Open Content Alliance
   Open WorldCat