Thursday, September 21, 2006

Description and Access?

The revision of the library cataloging rules that is underway is being called "Resource Description and Access" or RDA. Although it is undoubtedly an unpopular viewpoint, I would like to suggest that description and access are two very different functions and that they should not be covered by a single set of rules, nor should they necessarily be performed by a single metadata record.

The pairing of description and access is functionality based on card catalog technology. A main purpose of the 19th and 20th century card catalog was access. Indeed, great discussion took place in the late 19th century about the provision of a public access point for library users: access through authors, titles, and subjects. The descriptive element, the main body of the card, was essentially a bibliographic surrogate helping users make their decision on whether to go over to the shelf to look for the book. Before easy reproduction of cards, that is, before LoC began selling card sets early in the 20th century, access cards did not carry the full description of the book. Instead, all catalog cards except the main entry card had a brief entry that would allow the user to find the main entry card which had the full description.

The combination of description and access is a habit that has carried over from the card catalog and has left a legacy that to many of us is so natural that we have trouble seeing it for what it is. For example, it is because of this combination that we create artificial "headings" that cause us to display author names in the famed "last name first" order. The heading is designed for access in a system where the means of finding items is through a linear alphabetical order, which even in library systems is no longer the predominant finding method. These headings set library systems apart from popular information systems such as Amazon, Barnes & Noble, and Google Books. As a matter of fact, you can find examples of library catalogs that attempt a popular display by displaying the title and statement of responsibility as the main display, hiding the now odd-looking headings. What these headings say to anyone who is tech-savvy is that libraries are hindered by an obsolete technology. Libraries still create these contorted headings when markup of data can make display and ordering of data flexible and user-friendly.
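To make the last point concrete, here is a minimal sketch in Python of how marked-up name data could drive both display and ordering without an inverted heading. The PersonName class, its field names, and the sample authors are all hypothetical illustrations, not part of any cataloging standard or library system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonName:
    """Hypothetical structured name: the parts are stored separately."""
    given: str
    family: str

    def display(self) -> str:
        # Natural reading order for the user-facing display.
        return f"{self.given} {self.family}"

    def sort_key(self) -> tuple:
        # Alphabetical ordering by family name, derived rather than stored.
        return (self.family.casefold(), self.given.casefold())


# Illustrative sample data only.
authors = [
    PersonName("Virginia", "Woolf"),
    PersonName("Jane", "Austen"),
    PersonName("Charles", "Dickens"),
]

# Order by family name, but show each name in natural order.
ordered = sorted(authors, key=PersonName.sort_key)
display_list = [name.display() for name in ordered]
print(display_list)
```

The list comes out in family-name order while each name still reads naturally, so no "last name first" string ever has to be created or shown. Real names are messier than a given/family pair, so this is only an illustration of the principle.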

Not only does our use of arcane headings set libraries apart from more popular information resources, our concepts of "description" and "access" are not serving our users. The description provided by libraries might serve to identify the work bibliographically (something that matters to libraries for collection development purposes but is not of great interest to library users); however, it doesn't describe the work to users in a way that can help them make a selection. We need at least reviews, thumbnails of images, sample chapters, and even local commentary ("Required reading for Professor Smith's class in European History"). And as for access, we know that the library-assigned subject headings are woefully inadequate discovery tools.

RDA claims that its purpose in the area of description "should enable the user to: a) identify the resource described...." Yet today we are in dire need of machine-to-machine identification, which RDA does not address. Increasingly our catalogs are interacting with other sources of discovery, such as web sites, search engines, and courseware. "Identification" that must be interpreted by a human being is going to be less and less useful as we go forward into an increasingly digital and networked information environment.

We are also greatly in need of an ability to share our data with systems that are not based on library cataloging. Each rule that varies from what would be common practice moves libraries further from the information world that our users occupy in their daily life. It is somewhat ironic that many pages of rules instruct catalogers on the choice of the "title proper," which is then marred by the addition of the statement of responsibility, a bit of library arcana that no one else considers to be part of the title of the work. And who else would create a title heading "I [heart symbol] New York"?

All this to say that the next generation library catalog cannot succeed if it is to be based on a set of rules that still carry artifacts from the days of the physical card catalog. It's time to get over the concepts of description and access that were developed in the 19th century. Let's move on, for goodness' sake.

Wednesday, September 13, 2006

WiFi and the children

CNET has an article about some concerns arising around ubiquitous WiFi and children's access to the Internet. There's no mention of libraries or what they went through, but it will be interesting to see if cities that are providing open WiFi will face some of the challenges that libraries did. And if not, why not? Isn't ubiquitous, open WiFi Internet access the "worst case scenario" for those who so hotly opposed open Internet access in libraries?

I see that some libraries that offer WiFi are only allowing WiFi access to "older" children. Loudoun County, VA, has a rule: "Patrons age 17 and under must have a parent or guardian sign the necessary forms." Others, like Lansing Public Library, are making their WiFi entirely open: "Our wireless Internet access is open to patrons of all ages; parents or guardians of children under the age of 18 are responsible for supervising and guaranteeing their child's proper and safe use of the Internet." This brings up all kinds of interesting questions in my mind about the differences between accessing the Internet via the library's computer stations and accessing it from your own laptop. Is the obligation to protect children related to who provides the hardware? Are schools providing wireless, and if so are they using filters?

Thursday, September 07, 2006

Google Books and Federal Documents

The Google Books blog today announces with some fanfare that Diane Publishing, a publishing house that specializes in (re)publishing Federal documents, is making all of its documents available for full viewing. The publisher states:
"The free flow of government information to a democratic society is utmost in our mind."
So I did a publisher search on Google and found a publication called "Marijuana Use in America" -- which is a reprint of a 104th Congress hearing, and on each page there is a watermark that says:
"Copyrighted material"

Now you all know that this is wrong, because Federal documents are in the public domain, but nowhere does the Google blog or the publisher mention the "PD" word. This troubles me because it will now require effort to undo this misinformation.

And, of course, just to add salt to the wound, I was easily able to find a book that the Diane Publishing company sells for $30 that you can get from GPO for $2.95. This really hurts.

Tuesday, September 05, 2006

MARC - We Can Do It!

It seems pretty clear that there is interest in exploring the mur... uh, morphing of MARC. There are now some active email discussions (on the MARC list and on the ngc4lib list). However, neither the blog format nor email discussion is going to move us forward very well. We need a collaborative workplace where we can all add to lists of requirements, where we can share "prior art" links, and where we can mock up solutions. I cannot easily run a wiki on my web site, so this is a call for a donation of a well-run sandbox where we can take these ideas a bit further, or information as to where one can do this.

Meanwhile, I'll gather up comments from the various discussions and post a summary.

Thank you all!

Friday, September 01, 2006

Murdering MARC

It's been almost four years now since Roy Tennant's rallying cry of "MARC Must Die" and little has been done to further that goal. It seems pretty clear that the MARC format will not expire of its own accord, so it may be time to contemplate murder. (I'm not usually given to violent actions. Perhaps I've been reading a bit too much medieval history of late.)

There's understandably a great reluctance to tackle a change that will have such a wide-ranging effect on our everyday library operations. However, like all large tasks, it becomes more manageable when it has been analyzed into a number of smaller tasks, and I'm convinced that if we put our minds to it we can move on to a bibliographic record format that meets our modern needs. I'm also convinced that we can transition systems from the current MARC format to its successor without having to undergo a painful revolution.

The alternative to change is that library systems will cobble on kludge after kludge in their attempts to provide services that MARC does not support. It will be very costly for us to interact with non-library services and we'll continue to be viewed as adhering to antiquated technology. Since I don't like this alternative, I propose that we begin the "Death to MARC" process ASAP. It should start with a few analysis tasks, some of which are underway:

a) Analysis of the data elements in the MARC record. I have done some work on this, although as yet unpublished. But I will share here some of the data I have gathered over the past few years.
b) Analysis of how the MARC record has actually been used. This is underway at the University of North Texas where Bill Moen and colleagues are studying millions of OCLC records to discover the frequency with which data elements are actually used. This data is important because we absolutely will have to be able to transition our current bibliographic data to the new format or formats. (Yes, I said "formats.") Another aspect of this would be to investigate how library systems have made use of the data elements in the MARC record, with the hope of identifying ones that may not be needed in the future, or whose demise would have a minimum impact.
c) A functional analysis of library systems. There's a discussion taking place on a list called ngc4lib about the "Next Generation Catalog" for libraries. In that discussion it quickly becomes clear that the catalog is no longer (if it ever was) a discrete system, and our "bibliographic record" is really serving a much broader role than providing an inventory of library holdings. This was the bee in my bonnet when I wrote my piece for the Library Hi Tech special issue on the future of MARC. I'm not sure I could still stand by everything I said in that article, but the issues that were bugging me then are still bugging me now. If we don't understand the functions we need to serve, our basic bibliographic record will not further our goals.
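The kind of usage count described in (b) can be sketched in a few lines of Python. This is only an illustration of the idea, not the method used in the University of North Texas study: the in-memory record layout (a list of tag and value pairs) and the three sample records are assumptions made up for the example, although 100, 245, and 650 are real MARC tags for the personal name main entry, the title statement, and the topical subject heading.

```python
from collections import Counter

# Hypothetical sample data: each record is a list of (tag, value) pairs.
records = [
    [("100", "Austen, Jane"), ("245", "Emma"), ("650", "England -- Fiction")],
    [("245", "Beowulf"), ("650", "Epic poetry"), ("650", "Monsters -- Poetry")],
    [("100", "Woolf, Virginia"), ("245", "Orlando")],
]


def tag_usage(records):
    """Return, for each tag, the fraction of records that use it at least once."""
    counts = Counter()
    for record in records:
        # A set counts a repeated tag (e.g. two 650s) only once per record.
        counts.update({tag for tag, _ in record})
    total = len(records)
    return {tag: n / total for tag, n in counts.items()}


usage = tag_usage(records)
for tag in sorted(usage):
    print(tag, f"{usage[tag]:.0%}")
```

Run over millions of real records, a table like this would show which data elements are nearly universal and which are rare enough that their demise would have minimal impact.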

If there's interest in this topic, perhaps we can get some discussion going that will lead to action. I'm all ears (in the immortal words of Ross Perot).