Saturday, December 25, 2010

Signs of success

Either this:


Or this:

(reduced to using a raw IP address)

Tuesday, December 14, 2010

OCLC Motion to Dismiss, Pt II

Continuing on...


Here's a somewhat extended quote from the Motion that quotes the original complaint:
"At other points in the Complaint, without addressing the text of the records use policy, Plaintiffs characterize the policy as placing broad restriction on a library's use of its own records. ([Complaint] paras. 34-36) However, these conclusory allegations are belied by the actual terms of the records use policy pled above. For example, Plaintiffs claim that 'a member library may not transfer or share records of its own holdings with commercial firms' ([complaint] para 35), but the records use policy states no such thing. Throughout these allegations, moreover, Plaintiffs confuse and obscure the terms 'OCLC records' and 'library records.' In reality, the situation is simple: OCLC does not prohibit a library from sharing its original cataloging records with whomever it pleases; it does, consistent with the fact that the WorldCat database is copyright, claim a legal right to the unique identifier information used to link and make usable records in WorldCat." (Motion, pp 7-8)
"Again, at most, the Complaint pleads only that libraries cannot share OCLC's records, not that they cannot share the records they themselves created." (Motion, p. 14)
This is a very interesting set of statements. First, it plays with the ambiguity in talking about "library records," denying that libraries cannot convey records of their holdings, as stated in the Complaint, then stating that they can share their original cataloging records, which is not what most in the library world would consider equivalent to "library holdings." What it comes down to is the ownership of the records in the library catalogs that represent the holdings of the library. By "the holdings of the library" I understand not just some holdings, but either all of the holdings or some useful set of those holdings. The set of records that were originally cataloged by the library is a somewhat random set, and not useful as "library holdings." OCLC claims ownership in all records in a library's catalog that were not created as original cataloging by that library. Although this is a distinction, it is not one that relates to any particular functionality or to any useful project a library might undertake with its holdings. It's useless nonsense, is what it is, nitpicky, and proof that OCLC was boxed into a corner as it tried to claim ownership over the millions of records created by libraries around the world.

OCLC also states in the second quote above that those records in the library data are "OCLC's records" and are not records that the libraries created. Here, "created" is a key verb. Any library that has done significant modification and upgrading to a record can probably claim at least some amount of co-creation with other libraries. The claim that those records belong to OCLC is an insult to the libraries that have put so much effort into the shared pool of bibliographic data. Of course, OCLC would counter that the libraries and OCLC are one and the same. The unilateral actions of OCLC around the record use policy definitively shattered that view.

Equally interesting is the claim of copyright on the database, a claim that has not been challenged and that might not survive a challenge. A database of bibliographic data may just be seen as a compilation of facts, essentially sweat of the brow rather than a creative output. Add to that the fact that much of the sweat was not OCLC's but was on the part of thousands of libraries, and the copyright claim looks thin. Ditto the claim to the OCLC number, which is purely a sequential number assigned to records as they enter the system. The claim that the OCLC identifier makes OCLC records usable is not defensible, IMO, in that every database assigns numbers to things as part of the mechanical database management process. There's nothing new or creative about the fact that OCLC records have OCLC database numbers.

Remember, though, that these statements are not meant for you and me; they are addressed to a court that may have very little knowledge in these matters. Obfuscation of the facts is undoubtedly part of the trial process, and on the part of all parties involved. Unfortunately, OCLC's motion goes beyond obfuscation -- it gets nasty.

Sarcasm and Nastiness

I've only read the legal documents for a few cases that I'm particularly interested in, so my experience here is limited. However, I would assume that a court case would best be won on cleverness, wily strategies and the ability to outwit one's opponent. In this as in other professional and public endeavors, I would expect the participants to affect a tone of detached politeness, even while skewering their rival. The OCLC motion plummets into sarcasm and nastiness. Here are some quotes:
"...Plaintiffs have thrown a plethora of allegations of OCLC's purportedly anticompetitive actions into the Complaint to see if any stick..." (Motion, pp. 1-2)
"While OCLC denies that either of these libraries has suffered as the result of anything other than purchasing the Plaintiff's inferior cataloging software..." (Motion, p. 17)

"... vigorous competition against a company offering less expensive, but inferior products, is perfectly lawful." (Motion, p. 1)
"Nevertheless, what is sauce for the goose is sauce for the gander -- having pled a fiction that undercuts the existence of any claims they can pursue, Plaintiffs cannot claim to have been injured..." (Motion, p. 4, footnote)
"Nothing in the antitrust laws requires OCLC to subsidize SkyRiver's inferior product by setting its pricing for registering holdings into WorldCat as low as possible." (Motion, p. 28)
I find these statements to be embarrassingly unprofessional in nature, although for all I know this is the norm in legal arguments.

Separate Realities

I suppose that one of the main skills for legal argumentation is the ability to present "facts" in ways that benefit your client, regardless of the facts. (If I were a judge and had to listen to this stuff, I'm sure I'd be driven to homicide.) Here are some examples from the motion to dismiss:

1. The named libraries, Michigan State and Cal State Long Beach, were not harmed by OCLC; they simply declined to purchase OCLC's record upload service. This is cited as proof that they were not coerced into making a purchase (which appears to be one of the antitrust offenses). (p. 29) There is no mention that the libraries could not afford the price that OCLC offered, that the price changed without warning, etc.

2. WorldCat Local is not a competitor to ILS systems because it exists in addition to the ILS system. The Motion of course completely fails to connect WC Local, OCLC's attempt to limit use of the bibliographic data, and the upcoming "in the cloud" library systems platform. Are they worried that it might actually look like improper use of the WorldCat database?

3. SkyRiver does have bibliographic records, so OCLC cannot be accused of having a monopoly on bibliographic records. (As if any bunch of bibliographic records will do.) Elsewhere in the document they boast of having the largest bibliographic database. Are we back to the Goose and the Gander?


These are just a few of the topics in the Motion, and just the ones that I found most interesting. They may not even be the most relevant topics relating to the lawsuit. I suggest that you read the Motion and other documents for yourself.

OCLC Motion to Dismiss, Pt I

OCLC has filed a motion to dismiss in the anti-trust lawsuit brought by SkyRiver/III. I presume that this is Standard Operating Procedure in cases of this type. As someone who is not versed in the complexities of antitrust law, I have no idea if OCLC makes a good case in its motion. My impression is that the OCLC lawyers are quite adept, and that bodes well for OCLC in the case.

I will comment on some interesting text and subtext of the motion. Since this will get long, here is a quick summary of what follows:

  • The motion states that SkyRiver has so far offered little proof of harm due to OCLC's business practices.
  • The motion may play on the court's ignorance of the library world and of OCLC's definitions.
  • OCLC makes some interesting claims to rights.
  • The motion makes claims that twist the words of SkyRiver's complaint.
  • The motion contains some unfortunate use of sarcasm and nastiness.
  • The motion undermines some previous OCLC claims as to the force of the Record Use policy.

Little Proof

The motion claims that the SkyRiver complaint contains few hard facts that could be used to back up the anti-trust claims. (Although I have no idea how detailed such a complaint is supposed to be.) It doesn't explain the library market and OCLC's role in it. What I find particularly lacking is that there is no comparison of pricing for record uploads between the libraries that moved to SkyRiver for cataloging and other libraries that upload records to OCLC. (According to the 2009 annual report, only 12% of records added to WorldCat were added via cataloging on OCLC; the rest were batch loaded.)

Ignorance and Definitions

OCLC plays heavily on the confusion between WorldCat, the database, and the records in libraries' catalogs. This is not an easy concept to grasp, and it is not explained well in the SkyRiver complaint. Wherever SkyRiver's complaint refers to "library records" OCLC counters using "WorldCat" in its place. It makes a huge difference to be talking about the records in a library's catalog vs. the entire WorldCat database. OCLC claims that SkyRiver is demanding that OCLC make all of WorldCat available for free to competitors. What is actually said is:
"Library records should be freely and openly available for use and re-use either in the public domain or by reasonable means of access for all, including for-profit library services firms." (Complaint, para. 76)

But OCLC re-words this in its response as:
"... (a) library records should be free, regardless of OCLC's investment in aggregating, normalizing, enhancing, maintaing (sic), and delivering services based on them..." (Motion, p. 10)
OCLC also says:
"Plaintiffs pled, at most, only that libraries cannot share OCLC records, not that they are prevented from sharing records they created." (Motion, p. 21)
What is clear here, as it is throughout the motion document, is that SkyRiver is talking about the records that are in library catalogs, and OCLC is talking about "OCLC" or "WorldCat" records. By referring to the records in library catalogs as "OCLC" records, OCLC thus claims ownership to those records. In the former meaning, the libraries are prevented from making use of the records in their catalogs as they wish; in the latter, OCLC is the owner of a database and claims are being made against that database. Unless these definitions are cleared up, the two parties are just talking past each other, and no member of the court is going to make sense of it all. That, of course, would probably be to OCLC's advantage.

Record Use Policy

The original complaint cites the OCLC record use policy as a means by which OCLC maintains
"strict control over its members' access and use of the WorldCat database...". (Complaint, para. 33)
OCLC's motion first complains that SkyRiver did not attach a copy of the Policy to its original filing (although SkyRiver did attach one to its response to the Motion to Transfer). This is irrelevant to the case, I believe, and therefore is a bit of sniping at SkyRiver's lawyers, hinting that they aren't doing a good job. Anyway, here's how OCLC replies to the Complaint's claim:
"The nature of these documents is not pled: it is not claimed that these documents are anything other than 'guidelines' OCLC publishes or that OCLC has ever used these documents to prevent a library from providing its catalog records to Plaintiffs or any other entity." (Motion, p. 7)
There's more, but let's first examine this statement. During the big brouhaha about the policy, Karen Calhoun published "Notes on OCLC's updated Record Use Policy" on the OCLC blog, and stated:
"The updated policy is a legal document. Being a player on the Web, working on behalf of libraries, requires that the policy be a legal document."
That is of course the opposite of what is said in the motion.
(See comment below by Jennifer Younger: "The new 2010 policy is correctly characterized in OCLC's Motion to Dismiss as a code of good practice to guide members' choices about how they share their copies of WorldCat records.")

What is sad, however, is the statement, true as far as I know, that OCLC has never used these documents to prevent libraries from sharing their records. It hasn't had to, because the mere threat has been enough to prevent libraries from acting. The libraries that have released their records have done so unscathed, but they are few. There are of course two ways to interpret this: either libraries are afraid to release their records, fearing retribution, or libraries agree with OCLC's argument that WorldCat would be endangered should library records be openly shared.

I'll pause here and take this up again shortly.

Monday, December 06, 2010

Online 2010 and SWIB

I'm just back from a lengthy trip that ended at the Semantic Web in Bibliotheken (SWIB) (#swib10) conference in Cologne, Germany, followed by Online Information 2010 in London (#online2010). These are some thoughts from those events.


I saw two examples of uses of FRBR that do not follow the structure provided in the FRBR documentation and both made good sense to me.
  • The Bibliotheque Nationale of France (BNF) is working to export its data in a linked data format. They are linking the Manifestation directly to the Work and to the Expression, rather than following the M -> E -> W order that is defined in FRBR. I need to think about this some more, but it seems to remove some of the rigidity of the linear WEMI.
  • The Deutsche Nationalbibliothek is using an identifier method that seems to resolve the (long) discussion I instigated on the FRBR list about identifying WEMI with a single identifier. They give an identifier to the single WEMI group (one work, one expression, one manifestation, and presumably one Item, but no one seems to be talking about items.) There is also an identifier for each W, E, M, I. This works well for input and output (and sharing). When a matching W or WE is found, a "merged" identifier is coined for the FRBR units. I couldn't follow the presentation, as it was in German, but from the slides it looked to me that all of these identifiers could co-exist, and therefore would represent different views simultaneously of the bibliographic data that would depend on the function in play (e.g. export of data about a book v. support of shared cataloging).
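
To make the BNF variation concrete, here is a minimal sketch in Python that uses plain tuples as triples. Everything named in it is an assumption for illustration: the identifiers (m1, e1, w1) and the property names (embodies, realizes, manifests_work) are made up, and are neither the BNF's actual vocabulary nor official FRBR property names. It only shows the structural difference between walking the M -> E -> W chain and following a direct M -> W link.

```python
# Hypothetical identifiers and property names, for illustration only.

# Linear FRBR: the Manifestation reaches its Work only through the Expression.
chained = {
    ("m1", "embodies", "e1"),
    ("e1", "realizes", "w1"),
}

# Variant described above: the Manifestation links to the Expression
# and also directly to the Work.
direct = {
    ("m1", "embodies", "e1"),
    ("m1", "manifests_work", "w1"),
}


def objects(triples, subject, predicate):
    """Return every object for the given subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def work_of_chained(triples, manifestation):
    """Find the Work by hopping through each Expression (two lookups)."""
    works = set()
    for expression in objects(triples, manifestation, "embodies"):
        works |= objects(triples, expression, "realizes")
    return works


def work_of_direct(triples, manifestation):
    """Find the Work with a single lookup on the Manifestation."""
    return objects(triples, manifestation, "manifests_work")


print(work_of_chained(chained, "m1"))
print(work_of_direct(direct, "m1"))
```

In the direct form, a record with no usable Expression data can still point at its Work, which may be part of why the approach feels less rigid.
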
The key thing that I learned, though, was that there is a plethora of semantic web activities in libraries in Europe. Among these, the British Library has released the National Bibliography (1956-); the BNF will soon make data available, as will the German National Library. What do these libraries have in common? Among other things, their data is not bound by the OCLC record policy, so they are able to make it freely available.

Online 2010

I was the opening speaker on a panel about the Semantic Web at this conference and unfortunately that was the only bit of the conference I was able to attend other than the exhibits. Online Info is a combined publisher/library conference, with the publishing side being primary. At the conference one of the three tracks was "Exploiting Open and Linked Data." In the exhibits the term "semantic" was everywhere. I would like to attend this conference in full (because I can't really say that I have) to get a view of linked data from another industry's perspective.

My co-speakers were Sarah Bartlett of Talis, and Martin Malmsten from the Swedish National Library. Sarah did something that had never occurred to me, but now I just think "Doh!" it's so obvious. Her talk walked through a literary, rather than bibliographic, view of some library materials. She showed how you could use linked data to support the humanities. It was, as the British say, brilliant. It's also a great way to teach people about linked data, and she advised everyone to come up with something they have a passion for and use it as an exercise in linking. Now I want to come up with some fun linking exercises for teaching purposes.

Martin talked about the motivation for making LIBRIS, the Swedish union catalog, open as linked data since 2008. He and I agreed that we really need a good linked data app that would allow people to explore the linked data space. He quoted Corey Harper saying that the killer app for linked data will probably be created by a 13-year-old, someone for whom the idea of open linking is neither novel nor new. I am really interested to see what the "linked open data" generation comes up with!

Response to JPW

Note: John Price Wilkin of Michigan wrote a post on the Open Knowledge Foundation blog that is very critical of the library linked data movement and the creation of numerous disjoint files of bib data in linked data formats. I admit that it isn't clear to me what he thinks should happen, but it seems to be something like this photo, which I took at the Online 2010 exhibit hall. This is OCLC's booth.

A separate cloud for libraries. Totally the wrong idea.

I must say that I see things quite differently from JPW. Although I agree that a bunch of static bibliographic files do not open library linked data make, my view is:

1) Each file represents a person or group who got interested in transforming library data and went through the learning process of actually doing it. Therefore each file is a contribution to our collective knowledge about linked data. When we add these files to heterogeneous stores like Open Library or Freebase, we exercise that knowledge.

2) These files are the fodder for further experimentation with mixing library data and non-library data, which to me is one of the main points of linked library data. We are in the "training wheels" stage of this change, and like training wheels these early files may end up being discarded when we finally learn to ride. I see no harm in that.

3) This experimentation is taking place primarily outside of the US in places where the OCLC record use policy does not apply. The British Library, the National Library of Sweden, soon the Bibliotheque Nationale, and a handful of German libraries are at the forefront of this. If you cannot release your bibliographic data openly, you cannot participate in the linked data movement.

4) I do think that we will have library systems that make use of a different data format to the one we have today, but those are not the same as linked data, and are definitely not the linked open data that is the main focus of the linked data activity. How we manage our data for ourselves may well be different from how we share it with the world. We do need a well-ordered library data universe where we do our bibliographic work. That should exist in parallel with open sharing that reaches beyond the library cataloging community.