roytennant.com :: Digital Libraries Columns

 

Library Journal "Digital Libraries" Columns 1997-2007, Roy Tennant

Please note: these columns are an archive of my Library Journal column from 1997-2007. They have not been altered in content, so please keep in mind some of this content will be out of date.

Is Metasearching Dead?


07/15/2005

   The best thing about Google Scholar, the beta Google service for
   searching scholarly information, is Anurag Acharya. Acharya, the
   architect of Google Scholar (Scholar.google.com), is approachable,
   bright, and focused on building a usable interface for those seeking
   scholarly information. And, mostly, he has been successful.

   Scholar successes

   Google Scholar crawls scholarly, web-based content--predominantly by
   targeting specific publishers with which Google has contractual
   arrangements. Open access material is crawled as well. These full-text
   materials, abstracts, and citations are being indexed, and citations
   from these papers to other sources are also extracted and indexed. So
   besides returning links to full-text articles, search results can also
   include citations to nonscholarly materials and to scholarly materials
   that Google has not crawled.

   Google Scholar claims to have agreements with all major publishers
   except Elsevier and the American Chemical Society to crawl their
   full-text content, but at least one other publisher is missing--the
   American Psychological Association.

   Search results are returned in rank order, using an algorithm that
   includes such criteria as where the search words appear (e.g., search
   words in an article title provide a higher rank than words in the body
   of the document) and how often an article has been cited. Searching
   Scholar, however, demonstrates that the ranking weight afforded to
   highly cited articles is in most cases the most compelling factor. The
   most highly cited articles consistently float to the top. From one
   perspective, this is useful since it tends to highlight the most
   historically important articles.
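
   To make the effect concrete, here is a minimal sketch of a ranking
   function that combines word placement with citation counts. It is
   not Google's algorithm, which is not public; the Article class, the
   three weights, and the sample records are all assumptions invented
   for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    body: str
    citations: int


# Hypothetical weights, chosen only for illustration.
TITLE_WEIGHT = 3.0     # a query word found in the title
BODY_WEIGHT = 1.0      # a query word found in the body
CITATION_WEIGHT = 2.0  # multiplier on the (log-damped) citation count


def score(article: Article, query: str) -> float:
    """Toy relevance score: word placement plus a citation boost."""
    terms = query.lower().split()
    title_words = article.title.lower().split()
    body_words = article.body.lower().split()
    title_hits = sum(1 for t in terms if t in title_words)
    body_hits = sum(1 for t in terms if t in body_words)
    # log1p dampens large citation counts, yet the boost can still
    # outweigh a perfect title match.
    citation_boost = CITATION_WEIGHT * math.log1p(article.citations)
    return TITLE_WEIGHT * title_hits + BODY_WEIGHT * body_hits + citation_boost


def rank(articles, query):
    """Return articles ordered from highest to lowest score."""
    return sorted(articles, key=lambda a: score(a, query), reverse=True)


articles = [
    Article("Protein folding dynamics", "a new study of folding", 0),
    Article("A survey of enzymes", "classic work touching on protein folding", 900),
]
ranked = rank(articles, "protein folding")
for a in ranked:
    print(f"{score(a, 'protein folding'):6.2f}  {a.title}")
```

   With these made-up weights, the heavily cited survey outranks the
   uncited article even though only the latter has both search words in
   its title.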

   A clear success of Google Scholar from the library perspective is that
   its staff have cooperated with libraries to implement a mechanism for
   library OpenURL links to appear in the search results. Libraries need
   to register for this service (see
   scholar.google.com/scholar/libraries.html) and provide holdings
   information to Scholar (usually done by a configuration setting in link
   resolver software). But there are problems, which is why Acharya is so
   good. He is up to the challenges.
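
   For readers who have not looked inside an OpenURL, the sketch below
   builds one in the key/value style of the OpenURL 1.0 standard
   (Z39.88-2004). The resolver address and the citation are
   hypothetical, and nothing here describes how Scholar itself
   constructs its links; it only shows the general shape of a link that
   a library's resolver receives.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical link resolver address; each library has its own.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def build_openurl(citation: dict, base: str = RESOLVER_BASE) -> str:
    """Build an OpenURL 1.0 key/value link for a journal article."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    }
    # Citation fields describing the referent carry the "rft." prefix.
    for key, value in citation.items():
        params[f"rft.{key}"] = value
    return f"{base}?{urlencode(params)}"


# A made-up citation, for illustration only.
link = build_openurl({
    "atitle": "An example article",
    "jtitle": "Journal of Examples",
    "volume": "12",
    "spage": "34",
    "date": "2004",
})
print(link)

# The resolver parses the same fields back out of the query string.
fields = parse_qs(urlsplit(link).query)
print(fields["rft.jtitle"])
```

   The resolver matches these fields against the library's holdings and
   sends the user to a licensed copy if one exists.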

   Scholar challenges

   One challenge is to serve multiple purposes and audiences effectively
   with a service that is Google-simple and therefore not very flexible.
   As noted, highly cited articles often appear at the top of the results.
   But for users like scientific researchers familiar with their field,
   these are the articles they don't want to see. They want the newest (as
   yet uncited) research, which sifts lower in the results. The Scholar
   interface provides no way to sort results by publication date.
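
   The missing option is simple to picture. The sketch below orders the
   same small result set two ways, by citation count and by publication
   date. The Result class and its records are invented for
   illustration; Scholar exposes no such data structure.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Result:
    title: str
    published: date
    citations: int


# Made-up records, for illustration only.
results = [
    Result("Foundational paper", date(1994, 3, 1), 1200),
    Result("Follow-up study", date(2001, 9, 15), 310),
    Result("New preprint", date(2005, 6, 30), 0),
]

# Citation-driven order: the historically important work comes first.
by_citations = sorted(results, key=lambda r: r.citations, reverse=True)

# Date order: the newest, as yet uncited, work comes first.
by_date = sorted(results, key=lambda r: r.published, reverse=True)

print([r.title for r in by_citations])
print([r.title for r in by_date])
```

   The two orderings are exact opposites here, which is the
   researcher's complaint in miniature.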

   The service is also plagued with timeliness issues. It can take months
   for articles that have appeared in PubMed to appear in Scholar.
   Scientific researchers will find such a lag time unacceptable. Acharya
   is aware of this issue, and the company is working on it.

   There are also searching anomalies that prevent articles from being
   found with standard techniques. If you search the entire title of an
   article as a phrase, the very length of the search string can cause the
   search to fail. Meanwhile, using selected words out of the title often
   returns the article far down in the results list, since full-text
   searching will often retrieve other articles more heavily cited.

   Library metasearching: RIP?

   Will Google Scholar replace the need for library-based metasearch
   services? Some of my colleagues believe so, but I don't, no matter how
   good Scholar gets (and it will get better).

   Unlike Acharya, who thinks ranking renders selection unimportant, I
   believe what you don't search can be as important as what you do.
   Search "Hamlet" on Google Scholar and you will be inundated with
   scientific articles by various Hamlets. Even limiting to words in the
   title (the most specific search one can do) results in many scientific
   articles interspersed among the literary.

   I believe in creating search interfaces crafted for a specific audience
   or purpose, and Scholar's one-stop shopping can be a
   less-than-compelling generic solution to some rather specific problems.

   Even if Google Scholar eventually gains access to a reasonably large
   collection of the scholarly record, librarians will still need to unify
   searching of two or more sources on behalf of their clientele. There
   will still be a need for metasearch services.
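
   At its simplest, a metasearch service sends one query to several
   selected sources and merges what comes back. The sketch below is a
   minimal illustration under stated assumptions: the two "databases"
   are hypothetical in-memory lists standing in for licensed sources,
   and duplicates are detected by title alone, which a production
   service would not rely on.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for two databases a librarian has selected.
LITERATURE_DB = [
    {"title": "Hamlet and the Ghost", "source": "literature"},
    {"title": "Staging Hamlet", "source": "literature"},
]
THEATRE_DB = [
    {"title": "Staging Hamlet", "source": "theatre"},
    {"title": "Hamlet on Film", "source": "theatre"},
]


def make_searcher(records):
    """Return a search function over one source's records."""
    def search(query):
        q = query.lower()
        return [r for r in records if q in r["title"].lower()]
    return search


def metasearch(query, searchers):
    """Query every source in parallel, then merge and deduplicate."""
    with ThreadPoolExecutor() as pool:
        # pool.map returns result lists in the same order as searchers.
        result_lists = list(pool.map(lambda s: s(query), searchers))
    merged, seen = [], set()
    for results in result_lists:
        for record in results:
            key = record["title"].lower()
            if key not in seen:  # keep the first copy of each title
                seen.add(key)
                merged.append(record)
    return merged


hits = metasearch(
    "hamlet",
    [make_searcher(LITERATURE_DB), make_searcher(THEATRE_DB)],
)
for hit in hits:
    print(hit["source"], "-", hit["title"])
```

   Because only the chosen sources are searched, no scientific articles
   by authors named Hamlet can turn up in the results.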

   In the end, Scholar is a tremendous advance for those who have little
   or no access to the licensed databases and content repositories that
   libraries provide. But for those who are served by large research
   libraries, it is very much an open question whether the generic Google
   Scholar can serve their needs better than services tailored
   specifically for them.