Library 2.0 and the catalogue

By Margaret Adolphus

Introduction

Libraries are increasingly using Web 2.0 technologies (such as media sharing, blogs, wikis and RSS) to involve the user in their services. However much libraries have evolved into something more than the sum of their collections, those collections are still an important part of their offering, and the catalogue is still the main means of discovery.

Yet catalogues are losing out as search tools to the Internet.

The main problem here is the way in which we have been spoilt by the Web. Historically, the catalogue has never been a particularly user-friendly tool (I can't speak from experience: I seem to have got by with a not totally undistinguished academic career without using it much), but its shortcomings are emphasized by the ease with which it is possible to search the Web.

Think of the single search box, ranked results and the fact that many commercial retailers such as Amazon embellish their search results with more than the desultory bibliographic information – you can, for example, see an item's cover, browse content and read reviews.

So why aren't libraries flocking en masse to follow Amazon's lead? The truth is, it's not that simple. Whereas adding some 2.0 features often involves small, relatively easy initiatives, such as uploading photos of events to Flickr or writing a blog, making changes to the catalogue is often more difficult, simply because it lies at the heart of the library's infrastructure. Thus it's often the Cinderella of the bold new Web 2.0 world, the service left well behind in the 20th century.

It is however critical that the catalogue is a user-friendly experience; it is the major patron discovery tool and the way in which users interact with the library's resources. Not only that, a number of web-based commercial reference sources have sprung up, such as Amazon's Askville, which actually threaten library reference services.

Eli Neiburger, associate director for information technology and production at Ann Arbor District Library, has this to say about the catalogue:

"It's crucial that library catalogues deliver as satisfying and rewarding of a web experience as the best commercial web products; libraries are in competition with these services whether we realize it or not. It's a critical time for libraries and it's very dangerous for our most prominent public interfaces to be weighed in the balance and found wanting."

What's wrong with the catalogue?

There has been a general consensus that the OPAC does not rate highly as a search tool. Wang and Lim (2009) comment that we can no more expect information seekers to confine themselves to the catalogue than we can expect patrons to come into the physical building and not use the virtual library.

The criticisms of the "old fashioned" catalogue tend to focus on the fact that it evolved to serve a world based on print technology, and one where search meant looking for a particular item. Neither assumption is suitable for an age where there is an abundance of electronic resources, and where most search engines offer intuitive interfaces suitable for exploring a wide range of possibilities.

More specific criticisms are:

Print-based systems are inadequate for digital resources.
While good for known item searching, they are not good research and discovery tools: keyword searching is often limited by the underlying index, and there is no spellcheck ("Did you mean ...?") or relevance algorithm. This compares poorly with the natural language, multi-term searching that we have got used to on the Internet. On the Internet we can browse as well as target, and we want to do this with the catalogue as well (Merčun and Žumer, 2008b).
If browsing is limited, so is discovery. In most catalogues, the user is presented with very little more than bibliographic information. This does not answer the question, does this fit my exact requirements? It would be helpful if information could be enriched with a picture of the cover, a table of contents, summaries, reviews, ratings, and other recommendations (Merčun and Žumer, 2008b).
Online searching can rank results, which helps us decide what to select. This does not happen with the traditional catalogue, leaving the user with the need to go through many items which are not clustered or prioritized. Results clustering (for example by theme) and faceted navigation, which allows the user to search an item in multiple ways (for example, advanced search where you can search by year, by subject, etc.), are helpful here (Merčun and Žumer, 2008b).
Not only is there a dearth of information and poor searching functionality, but many catalogues are not, compared with other search tools, intuitive. The end-user interface is limited and there is "too much focus on library administration in terms of data structures, design and workflows" (Wang and Lim, 2009: pp. 27-28).
There is limited support for multiple metadata standards, or for functional requirements for bibliographic records (FRBR) (Wang and Lim, 2009).
Many catalogues are limited to one collection, whereas users want access to a broad range of material (Wang and Lim, 2009).
They lack social networking facilities (Wang and Lim, 2009).
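
To make the missing spellcheck concrete, here is a minimal sketch of a "Did you mean ...?" suggestion. It is an illustration only, not how any particular catalogue does it: the vocabulary list, the function name did_you_mean and the similarity cutoff are all assumptions, and a real system would draw its terms from the catalogue's own index.

```python
from difflib import get_close_matches

# Hypothetical index vocabulary; a real catalogue would use terms from its own index.
INDEX_TERMS = ["tolstoy", "dostoevsky", "chekhov", "pushkin", "russia"]


def did_you_mean(query, vocabulary=INDEX_TERMS, cutoff=0.75):
    """Return the closest known term, or None if the query is already
    known or nothing in the vocabulary is similar enough."""
    term = query.strip().lower()
    if term in vocabulary:
        return None  # Exact hit: no suggestion needed.
    matches = get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    suggestion = did_you_mean("Tolstoi")
    if suggestion:
        print(f"Did you mean: {suggestion}?")
```

The cutoff is a judgment call: set too low it offers irrelevant suggestions, set too high it misses common misspellings.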

However, criticism of catalogues dates back to a time when the Web itself was hard to use. There has long been a need to improve their functionality around search, presentation and navigation, and to add such features as a keyword search box and results ranking (Merčun and Žumer, 2008a).

In fact, more than 20 years ago, Web 2.0 functionality for catalogues was anticipated when Hildreth wrote about "third generation catalogues" having natural language search, browsing and ranked results, as well as:

"expanded coverage and scope, relevance feedback methods ('more like this', 'not interested'), user-popularity tracking, and different aids (spelling corrections, synonyms, automatic term conversion)" (Hildreth, 1982, quoted in Merčun and Žumer, 2008a).

Closer to our time, Maness (2006, quoted in Wang and Lim, 2009) sees next generation catalogues as:

"a social network interface, a personalized OPAC that is user-centred, socially rich, and communally innovative, including access to instant messaging (IM), RSS feeds, wikis, tags and public and private profiles within the library's network" (Wang and Lim, 2009: p. 28).

All in all, a catalogue should offer swift, efficient, deep and intuitive searching, provide a rich selection of resources, and enable users to add to the catalogue (e.g. their views, ranking etc.) (Wang and Lim, 2009).

Some new developments in catalogues

Tanja Merčun and Maja Žumer are researchers in the Department of Library and Information Science and Book Studies at the University of Ljubljana in Slovenia, and carried out research on Web 2.0 functionality in six library catalogues. Although completed in 2008, it is worth a brief summary for the way in which it reveals how far ahead of the game some US public libraries are.

Their literature review revealed a number of desirable features:

search
results page and navigation
enriched content and recommendation lists
user participation
user profile and personalization
other trends.

They evaluated six library catalogues, including one traditional:

COBISS, a union catalogue that allows access to the bibliographic facilities of libraries within Slovenia and the surrounding area, as well as resources such as WorldCat.

And five more modern and innovative ones:

Merčun and Žumer summarized the research in two tables as follows, one on functionality and the other on participation and personalization (note, this represents the situation as of 2008, and findings may no longer be strictly representative of the current state of the catalogues).

© Merčun and Žumer 2008b, reproduced with kind permission

One of their conclusions was that catalogues tended either to concentrate on improving functionality, or on increasing user participation and personalization. The former were, they maintained, easier to use.

Using Merčun and Žumer's headings, I shall now examine some trends in catalogues which adopt Web 2.0 features.

Search and results

Merčun and Žumer (2008a) comment favourably on the search facility of Phoenix Library, which offers a single entry search box, a feature favoured by Wang and Lim (2009). There are possibilities for more advanced searching, with results ranked and opportunities for refining the search.

Screenshot: Phoenix Library search.

This search system is ideal for a more "browsing" sort of search, where a general interest can be narrowed down through the suggestion of possibilities. To test this, I typed in a very general term, Russia, and got the opportunity to refine my search according to subject, format, whether the item was in stock and where, as well as the opinions of others. Thus "Russia" can lead one to Russian food, Tolstoy, or Russian film.
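
The refinement described above can be sketched in a few lines. This is a hedged illustration of faceted navigation in general, not of the Phoenix catalogue's software: the records, field names and the functions facet_counts and refine are all hypothetical.

```python
from collections import Counter

# Hypothetical result set for a broad search such as "Russia".
RECORDS = [
    {"title": "War and Peace", "subject": "Fiction", "format": "Book", "available": True},
    {"title": "Russian Cooking", "subject": "Food", "format": "Book", "available": False},
    {"title": "Anna Karenina", "subject": "Fiction", "format": "DVD", "available": True},
    {"title": "Russian Ark", "subject": "Film", "format": "DVD", "available": True},
]


def facet_counts(records, fields=("subject", "format")):
    """Count how many records fall under each value of each facet field."""
    return {field: Counter(r[field] for r in records) for field in fields}


def refine(records, **selected):
    """Keep only the records that match every selected facet value."""
    return [r for r in records if all(r.get(k) == v for k, v in selected.items())]


if __name__ == "__main__":
    print(facet_counts(RECORDS))
    print([r["title"] for r in refine(RECORDS, subject="Fiction", format="Book")])
```

The counts are what lets an interface show the user how many items sit behind each refinement before they click on it.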

The single search box is popular because it mimics popular search engines such as Google and online retailers such as Amazon. It's an approach favoured by more and more university libraries, for example, Bristol University Library, Penn State University Libraries, and Exeter University Library (for more detail, see the article on academic search engines which describes how federated search engines are popular with libraries [and students] because they enable searching of multiple databases via a single search box, and offer options of "quick search" or "multisearch").

Not everyone wanting to make search user friendly would agree with this approach, however. Eli Neiburger of Ann Arbor District Library comments that the library has never really felt that the single search box is that important, and that few people have asked for it:

"There's more to a library than just the collection".

Ann Arbor Library is however currently looking to develop its catalogue to have a larger focus on keyword searching.

When the Danish National Library authority opened its union catalogue (http://bibliotek.dk) to the public, it decided to offer a feature-rich range of different search options for different users, which it termed the "Swiss army knife approach" (Larsen, 2007). The user can carry out a fairly simple search using bibliographic information, an advanced search which limits the search by format, language, year of publication or library type, or, for the more sophisticated searcher who can use parentheses, operators and truncation, there is the command mode.

Screenshot: bibliotek.dk catalogue search.

However good the search facility, it may still not reveal the results you want. A clever touch is to embed an e-mail or an IM facility into a "No search results page" so that frustrated users can get hold of a real, live librarian (see Levine, 2008).

Enhanced content and user participation

Libraries are beginning to offer more information on their holdings: for example, Ann Arbor offers (in some cases) a list of contents, a cover image, the opportunity to browse via Google Books, and some reviews. Hennepin offers a tabbed menu with details of holdings, a summary of contents, and some press reviews.

Getting users to generate, as well as consume, content is a key Web 2.0 trend. New Internet technologies are allowing the addition of user reviews, ratings, tags, and lists to the catalogue. However, one problem is that this works best when there is a critical mass of users who behave in this way, which many libraries may not yet have achieved.

Both Ann Arbor and Hennepin libraries offer users the opportunity to include reviews of books in their catalogues; Hennepin Library has "Bookspace", where you can leave a comment:

Screenshot: Hennepin Library Bookspace comments page.

Sutton Libraries in south London have signed up for ChiliFresh, which provides reviews from all over the world. Setting it up in their SirsiDynix iBistro OPAC proved very easy, as SirsiDynix have a partnership with ChiliFresh, and, according to Graham Dash from Sutton Libraries,

"the technical team at ChiliFresh sent very clear documentation for making the relevant additions to parts of the iBistro system".

Personalization

Personalization involves storing information about the user, and then providing information which is particularly relevant to their needs. Thus a store such as Amazon will make suggestions based on previous purchases. In the case of libraries, the suggestions are based on borrowing records.

More and more libraries have "My account" options, through which the user can manage items borrowed. Those with more options for user participation gather more information on the user (in the form of tags, etc.), and are therefore able to develop more "personalized" pages.
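
As a rough illustration of suggestions based on borrowing records, the sketch below recommends items borrowed by other patrons whose loans overlap with the user's own. The loan data, patron identifiers and the function suggest are invented for the example, and this simple overlap count is only one of many possible approaches.

```python
from collections import Counter

# Hypothetical loan history: patron id -> titles borrowed.
LOANS = {
    "p1": {"War and Peace", "Anna Karenina", "Crime and Punishment"},
    "p2": {"War and Peace", "Anna Karenina"},
    "p3": {"Anna Karenina", "Dead Souls"},
    "p4": {"Gardening Basics"},
}


def suggest(patron, loans=LOANS, limit=3):
    """Suggest titles borrowed by patrons who share loans with `patron`,
    weighted by how many loans they have in common."""
    mine = loans.get(patron, set())
    scores = Counter()
    for other, items in loans.items():
        if other == patron:
            continue
        overlap = len(mine & items)
        if overlap == 0:
            continue  # No shared loans, so no evidence of shared taste.
        for item in items - mine:
            scores[item] += overlap
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [title for title, _ in ranked[:limit]]


if __name__ == "__main__":
    print(suggest("p2"))
```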

The software

One of the main principles behind the new catalogues is that they are no longer solely reliant on one integrated library system: frequently, other proprietary systems are incorporated. For example, Google Scholar often appears as a search option in academic libraries.

Many library catalogues have a decoupled architecture where there is a separate platform that supports the user experience. This enriches not only search, but also discovery, as the new solution can harvest information from a number of different sources.

There are a number of new products and solutions which have recently come on the market and which help improve catalogue functionality: here is a brief overview of a selection.

AquaBrowser

One of the most popular is AquaBrowser, recently selected by Harvard University Library as its research and discovery platform. Its features include:

faceted navigation,
word cloud discovery,
relevance ranking,
an integrated authority file,
real-time availability of items, and
the ability to integrate with the ILS for patron services such as placing reservations and requests.

Several leading Web 2.0 libraries use AquaBrowser, including Santa Cruz Public Library, which for the past 18 months has been offering it as an alternative to the traditional online catalogue. Its webmaster commented:

"We wanted to offer a more intuitive interface to our patrons, but we also wanted to give them access to resources outside of our catalogue through the catalogue search interface. For example, we are dumping the content from several of our locally created databases into AquaBrowser every day so that patrons get results from our community information database, local newspaper index, and newspaper clipping file index in their AquaBrowser search results. Results also include other information from our Library website – booklists, local history articles, etc. I think this helps lead people to information that they might not otherwise have found.

"Patrons seem to really like all of the 'refine' options that AquaBrowser offers. There are some mixed reviews on the tag cloud – although those who like it really like it. The comment that patrons make most often is that the interface gives them something of the feeling of being able to browse the shelves (without actually being in the library). Of course, there are others who prefer to use the old interface because they feel it allows them to be more specific in their search."

Thus patrons are given a choice: they can search for something very specific, or they can just browse.

Screenshot: Santa Cruz Public Libraries catalogue search.

The "conventional" catalogue returns bibliographic details along with call numbers and availability. An AquaBrowser search, however, broadens rather than narrows the search, because of its use of folksonomies. These will be discussed further below, but are essentially tag clouds of related words, including spelling variations and associated terms. Using them opens up the possibilities of browsing and discovery by serendipity: in the above example, a patron may be thinking about getting a golden retriever type pet, but can also find out about similar dogs such as labradors.

Endeca

AquaBrowser is one of a range of search and discovery tools described by Wang and Lim (2009: p. 29). Another is Endeca, which replicates the physical aspect of browsing along the shelves by allowing the user the option of virtual subject browsing.

Koha and VuFind

Open source software has proliferated under Web 2.0, and several integrated library systems have been developed. Examples include Koha, and VuFind. The advantage of open source is flexibility, and the addition of features not generally found in more conventional systems, which enhance browsing or deepen search.

For example, Stow-Munroe Falls Library's catalogue enables users to refine searches, and save tags, and the main catalogue page announces that it will shortly be introducing "virtual shelves". Meanwhile, there is some facility to browse by format and genre (e.g. fiction video, junior reference, etc.).

Screenshot of Stow-Munroe Falls Library catalogue.

BiblioCommons

BiblioCommons is described as a "next generation discovery tool, a social network, and an OPAC" (Whitehead, 2008). Currently still being worked on, it's been described as "the first truly social online catalogue", and works on building connections between users and content. One user is Oakville Public Library: there are a large number of tools that people can use to add content, tag and share, as is demonstrated in the screenshot below. However, the function suffers from lack of critical mass (a point mentioned earlier) and it's quite hard to find comments on all but the most well-known items.

Screenshot: Oakville public library catalogue. Reproduced by courtesy of Oakville Public Library.

SOPAC 1.0

The Ann Arbor Library catalogue is an interesting case of incremental software development, with social features added on. John Blyberg, who now blogs at blyberg.net and works at Darien Library, was lured away from Lotus to work on the initial software development in Drupal in 2005. Blyberg subsequently developed SOPAC 1.0, a set of social extensions to the catalogue, which was launched in January 2007.

The whole idea behind the catalogue, according to Eli Neiburger, is to have

"a completely integrated user experience",

and that,

"the catalogue and the website should not be separate products and that the line between the systems remain purely infrastructural, and never noticed by the user".

Also that the catalogue should be,

"more of a living product that our patrons could participate in, which was the impetus of SOPAC".

Blyberg (2008) describes SOPAC 1.0 as

"an expansive Drupal module that completely integrates all online catalog and patron activity".

The result is a product which is highly customizable: you can make your catalogue look just the way you want it to. It has two independent software libraries: one (Locum) is for all catalogue-related activity, the other is for social data (Insurge). Blyberg has subsequently developed SOPAC 2.0, to which Ann Arbor is upgrading.

The power of the crowd

Much of the software discussed in the previous section is based on folksonomies, a new, user-centred way of classifying knowledge.

Traditional library classification systems, such as Dewey, organize knowledge hierarchically. Folksonomies, on the other hand, are taxonomies of web content generated by the user from natural language. The difference between them can be described as follows (Noruzi, 2007):

Table showing Noruzi's description of differences between library classification systems and folksonomies
Library classification systems	Folksonomies
Are hierarchical, for example siamese cat -> cat -> felis -> felidae	Are flat structured, so a work on Siamese cat could be tagged "pets", or "detective fiction" for Lilian Jackson Braun's delightful books
Are compiled by librarians, using their own keywords or a formal classification system such as Dewey or Library of Congress, according to the purpose	Are chosen by the user, and can reflect their own interests, values, vocabulary, etc.
Require item to be put in a fixed place in the library	Item can be placed in more than one place

The benefit of folksonomies is that they represent a democratic approach to knowledge organization, and that they shift responsibility for the vast task of cataloguing the Web away from the librarian. Spiteri (2007) believes that allowing users to tag resources makes a considerable contribution to public libraries.

One contributor to the Library 2 Ning network comments favourably on the possible addition of folksonomic tags to "normal records":

"I think that it would allow books that discuss several topics to have a better chance of being found. If, for example, you were looking for a non-war novel that took place in the civil-war era, if you searched civil war, you would get only civil war books. With folksonomies, you get books such as Little Women, which wouldn't necessarily have civil war in its record" (Layton, 2009).

The problem of folksonomies, however, is lack of standardization, which can cause some ambiguity, particularly in relation to:

Plurals: if both the singular and plural form are used (cat and cats) the search will not necessarily retrieve both.
Polysemy: words with more than one meaning, for example "apple" can mean both a fruit and a computer.
Synonymy: different words with the same meaning; again, a search on one term may not retrieve items tagged with another.

Another problem with folksonomies is the depth to which the resource has been tagged: for example, is it at the level of the title, or are sections and paragraphs also included?

The problem of ambiguity can be partly overcome with the use of controlled vocabulary and thesauri – agreed or preferred terms used as guidelines. Spiteri (2007) refers to the National Information Standards Organisation guidelines on thesaurus construction, and points out that the major folksonomy-based sites such as Delicious adhere to these. She also advises that if libraries decided to incorporate folksonomies, then they should provide clear guidelines, for example on the use of singular/plural, creation of multiterms with more than one word (e.g. do you leave a space or use an underscore?).
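
A minimal sketch of what such guidelines might look like when applied automatically is given below. It is an assumption-laden illustration rather than anything taken from the NISO guidelines or from a real catalogue: the synonym list, the choice of underscores for multiword tags, and the very naive plural rule are all hypothetical. It addresses plurals and synonymy only; polysemy ("apple" the fruit versus the computer) cannot be resolved by string handling alone.

```python
import re

# Hypothetical preferred terms, standing in for a controlled vocabulary.
SYNONYMS = {"kitten": "cat", "feline": "cat", "puppy": "dog"}


def normalize_tag(tag, synonyms=SYNONYMS):
    """Lower-case a tag, join multiword tags with underscores,
    strip a simple trailing plural and map synonyms to a preferred term."""
    words = [w for w in re.split(r"[\s_\-]+", tag.strip().lower()) if w]
    if not words:
        return ""
    # Naive singularization of the last word only (an assumption for the
    # sketch): it leaves "glass" alone but will mishandle many real words.
    last = words[-1]
    if len(last) > 3 and last.endswith("s") and not last.endswith("ss"):
        words[-1] = last[:-1]
    normalized = "_".join(words)
    return synonyms.get(normalized, normalized)


if __name__ == "__main__":
    for raw in ["Cats", "cat", "Kittens", "Civil War", "civil-war"]:
        print(raw, "->", normalize_tag(raw))
```

With tags normalized in this way, "cat" and "Cats" retrieve the same items, and "Civil War" and "civil-war" are treated as one tag.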

The power of networks of institutions

One of the main principles behind Web 2.0 is sharing, and this applies to institutions as much as individuals. Many libraries share their collections with other libraries through union catalogues: one such example is COBISS (discussed earlier), which covers Slovenia and the surrounding area.

The Danish union catalogue, bibliotek.dk, has also already been mentioned, and is a service whereby people can search the holdings of Danish libraries. It is currently planning various improvements to its services, including the ability to renew and cancel reservations, using the NISO Circulation Interchange Protocol (NCIP) to access local catalogues; users will also be able to rate and comment on items.

By far the largest union catalogue is WorldCat, which describes itself as a "global network of library content and services". It has members in 112 countries (as of March 2009) and holdings in every conceivable format, including those of several national libraries, and its records conform to high standards of metadata.

WorldCat can be searched online through www.worldcat.org, and users have the opportunity to create their own accounts, search for items and locate them in a nearby library. There is plenty of opportunity for participation and personalization: users can create lists, save searches, add their favourite libraries, write reviews, and create profiles.

WorldCat is also the underlying database to FirstSearch, an academic search interface which can be integrated within library web pages and allows you to link directly to your own resources and those of other libraries.

Google Book Search enables search for books online, and in some cases the full text of an item is available; it is particularly useful in the case of books which are rare or out of print. The objective of the Google Books Library Project is to:

"work with publishers and libraries to create a comprehensive, searchable, virtual card catalog of all books in all languages that helps users discover new books and publishers discover new readers" (see http://books.google.com/googlebooks/library.html).

Libraries which include their collections can share basic catalogue information, go into more detail about the content, and provide extracts from the text.

Conclusion

In this article I have examined ways in which libraries can augment their catalogues to include information from participants, and to enhance the search process by providing a browse facility. Users want powerful catalogues with access to large numbers of resources, which can help them work efficiently; they want fast searches that deliver the results in a sensible order; they want to tap into community views, and they want tools such as instant messaging and RSS feeds.

John Blyberg, the technical brains behind the Ann Arbor catalogue, reminds us that what he terms "social OPACs" are part of the community responsibility of librarians. When someone comes looking for a resource on teenage pregnancy, domestic abuse, or cystic fibrosis, the richer the search capacity of the catalogue, the more likely they are to find the resource that meets their need at that moment:

"It may be that in that moment when a patron is about to turn away from the library, something catches their eye – a tag, a comment, some marginalia, perhaps, that puts the patron in front of the material they truly need.

"The key component in growing social OPACs is community. Once you put the community you service into the process of delivering content back out into the very same community, you initiate a loop that will become exponentially richer over time as those neural connections glom on to each other. Findability is not the goal, but the activity and the experience which is why I say that OPACs have the potential to be fascinating places to visit and browse. They will not embody the comforting, muffled presence of the old card catalog. No, they'll be their own individual entities--borderless, shapeless creatures that somehow fit the people they represent.

"That's a goal truly worth striving for" (Blyberg, 2006).

References

Blyberg, J. (2006), "Why bother: the impact of social OPACs", blyberg.net, March 20, available at: http://www.blyberg.net/2006/03/20/why-bother-the-impact-of-social-opacs [accessed April 2 2009].

Blyberg, J. (2008), "SOPAC 2.0: What to expect", blyberg.net, August 16, available at: http://www.blyberg.net/2008/08/16/sopac-20-what-to-expect [accessed April 2 2009].

Larsen, K. (2007), "bibliotek.dk: opening the Danish union catalogue to the public", Interlending & Document Supply, Vol. 35 No. 4, pp. 205-210.

Layton, D. (2009), "User tags in library catalogues", Library 2 Ning network, response posted February 24, available at: http://library20.ning.com/forum/topics/user-tags-in-library [accessed April 2 2009].

Levine, J. (2008), "More undead ends", The Shifted Librarian, February 12, available at: http://theshiftedlibrarian.com/archives/2008/02/12/more-undead-ends.html [accessed April 8 2009].

Merčun, T. and Žumer, M. (2008a), "New generation of catalogues for the new generation of users", Program: electronic library and information systems, Vol. 42 No. 3, pp. 243-261.

Merčun, T. and Žumer, M. (2008b), "Education and training in digital libraries & reference in digital environments", in Selthofer, J., Aparac-Jelusic, T. and Krtalic, M. (Eds), Proceedings of LIDA 2008 (Libraries in the Digital Age) conference, Dubrovnik and Mljet, Croatia, June 2-7, pp. 79-85.

Noruzi, A. (2007), "Folksonomies: Why do we need controlled vocabulary?", Webology, Vol. 4 No. 2, June, available at: http://www.webology.ir/2007/v4n2/toc.html [accessed April 2 2009].

Spiteri, L.F. (2007), "Structure and form of folksonomy tags: The road to the public library catalogue", Webology, Vol. 4 No. 2, June, available at: http://www.webology.ir/2007/v4n2/a41.html [accessed April 2 2009].

Wang, J. and Lim, A. (2009), "Local touch and global reach: The next generation of network-level information discovery and delivery services in a digital landscape", Library Management, Vol. 30 No. 1/2, pp. 25-34.

Whitehead, M. (2008), "User-generated content and social discovery in the academic library catalogue: Findings from user research – access 2008", RSS4Lib, available at: http://www.rss4lib.com/2008/10/usergenerated_content_and_soci.html [accessed April 2 2009].