Beyond OAI-PMH Report

Thanks to everyone who participated in the Beyond OAI-PMH session this morning. There seem to be a number of places where others have posted their session summaries, but I thought it would be useful to include one here on the LOD-LAM site as well.

This is my brief (and somewhat tardy) account of our meeting based on the notes that I jotted down.  Comments and responses are welcome and any misrepresentations are my own.

Two main themes:

1. We need to leverage the existing OAI-PMH installation base for Linked Data, because after all it does fit within the basic requirements (three stars) of Linked Data goals.
We should acknowledge current OAI-PMH providers for their existing contributions to Linked Open Data and emphasize that they already are participating in LOD through OAI-PMH. While there are some additional things we can do to make the metadata we share more LOD friendly, LOD is not a completely new idea. Additional documentation about how OAI-PMH has succeeded and failed (and what lessons that holds for the future of LOD) would be welcome.
2. We don’t necessarily need an OAI-PMH 3.0
It would be better for the community to look towards broadly adopted web standards.  Repositories need to provide what users want in multiple serializations, not limited to XML (let alone a specific XML schema).
Some suggestions for alternatives:
  • Sitemaps
  • Open Search
  • Atom
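To make the Sitemaps suggestion a little more concrete, here is a minimal sketch (my own illustration, not something worked out in the session) of a repository publishing item URLs with a `lastmod` date, and a harvester picking out what has changed since its last visit. The example.org URLs and the function names are hypothetical, and it assumes date-only `lastmod` values, although the Sitemaps format also allows full timestamps.

```python
from datetime import date
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(records):
    """Serialize (url, last_modified) pairs as a sitemap document.

    `records` is an iterable of (str, datetime.date) tuples.
    """
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url, modified in records:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}lastmod").text = modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


def changed_since(sitemap_xml, since):
    """Return the URLs whose lastmod date is on or after `since`."""
    ns = {"sm": SITEMAP_NS}
    root = ET.fromstring(sitemap_xml)
    changed = []
    for entry in root.findall("sm:url", ns):
        loc = entry.findtext("sm:loc", namespaces=ns)
        lastmod = entry.findtext("sm:lastmod", namespaces=ns)
        # Assumption: lastmod is a plain YYYY-MM-DD date.
        if loc and lastmod and date.fromisoformat(lastmod) >= since:
            changed.append(loc)
    return changed


if __name__ == "__main__":
    xml = build_sitemap([
        ("http://example.org/item/1", date(2011, 5, 1)),
        ("http://example.org/item/2", date(2011, 6, 3)),
    ])
    print(changed_since(xml, date(2011, 6, 1)))
```

This only covers the "what's new" part of harvesting; a plain sitemap says nothing about deleted records or about which collection an item belongs to.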
While there wasn’t a strong sense that a new OAI standard was needed, there is a recognized need to provide the existing repositories some guidance about alternative approaches. Such guidance should be promoted by funders to help new and existing projects understand how they can contribute to the Linked Data cloud. There was also a sense that some features of the current OAI protocol might be included in the development of web services:
  • ability to acquire incremental sets (what’s changed, what’s new)
  • an understanding of the “scope” of what’s provided (OAI sets/collections)
  • a minimal set of shared properties (a stub) that is linked directly to richer representations
  • some consensus around shared service models to make discovery and use easier
  • ability to request sets based on supplied criteria (“search” not just pre-constructed sets)
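For reference, the first two features in this list are what OAI-PMH calls selective harvesting: a `ListRecords` request can carry `from`/`until` datestamps and a `set` argument. A minimal sketch of building such a request follows. The base URL and set name are hypothetical, and the helper function is my own illustration rather than part of any OAI library. The last feature (search by arbitrary criteria) has no counterpart here, since OAI-PMH sets are defined in advance by the repository.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None,
                     until_date=None, set_spec=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    from_date / until_date limit the harvest to records changed in that
    window (incremental harvesting); set_spec limits it to one set.
    """
    if resumption_token:
        # resumptionToken is an exclusive argument: when continuing a
        # partial response, only the verb and the token are sent.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date
        if until_date:
            params["until"] = until_date
        if set_spec:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical repository and set name, for illustration only.
    print(list_records_url("http://example.org/oai",
                           from_date="2011-06-01", set_spec="maps"))
```

Any web-friendly replacement would need some equivalent of these three arguments (a change window, a scope, and a way to page through large results) even if the syntax looks nothing like this.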
Of course, the devil is in the details. During the session we had several tangential conversations about the technical details of how to implement some of these alternatives, which I haven’t fully captured here. To me this indicates that further discussion of these options, and of how they might be shaped into a common framework, is needed and would be valuable guidance for the community.
Additional Comment
There was also a suggestion that OAI-PMH may still be the best way to share large sets of “records” between partners. Rather than worry about making OAI-PMH more LOD friendly, LAMs may wish to focus their energy on providing other kinds of data as LOD (use cases?).

Beyond OAI-PMH

C & NW RR, a general view of a classification yard at Proviso Yard, Chicago, Ill. (LOC)
OAI-PMH is a great way to ship large "collections" of records between repositories.

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is the foundation on which the IMLS Digital Collections and Content project and the companion Opening History aggregation are built.* Although small increases in the use of OAI-PMH were seen over the course of the project, fewer than a quarter of IMLS National Leadership grant projects provide item-level metadata using OAI-PMH [1, 2]. In some cases, the absence of projects in the missing 75% is legitimate: they are not collections with readily available item-level metadata (e.g. narrative exhibits, interactives/games, etc.). But this still leaves many projects/collections out of a broader network of resources. OCLC/RLG found a higher percentage (48%) of member organizations using OAI-PMH, but it is unclear how much of their metadata was shared this way [3]. While recognizing that OAI-PMH has been successful at making millions of descriptions available, it’s worth pausing to wonder if 25-50% adoption is good enough.

In light of the rapid growth of LOD in the last few years, I’ve been wondering how a large-scale aggregation like IMLS DCC might fit into this environment. Here are a few questions to discuss at #LODLAM:

  • What are the lessons from OAI-PMH that will be important for LOD-LAM?
  • How is the lack of one, common protocol for sharing data a benefit and/or a danger?
  • Will Linked Open Data be “low barrier” for some, but untouchable for many?
  • Can/should we build LOD on top of existing OAI-PMH installations? (see [4, 5])
  • Should we abandon OAI in favor of more web-friendly approaches? (See @edsu Digital Public Library as a Generative Platform)
  • What are the lessons from the Museums and the Machine-Processable Web and Europeana for U.S. organizations?
  • One reason OAI-PMH succeeded was support from funders. What should funding agencies tell projects about implementing LOD?
"Nottingham School at the Interstates edge," in Teaching & Learning Cleveland
LOD offers the opportunity to move smaller units of information, quickly, to more access points.
  1. Palmer, C., Zavalina, O., Mustafoff, M. (2007). Trends in Metadata Practices: A Longitudinal Study of Collection Federation. Pre-print available at: http://imlsdcc.grainger.illinois.edu/docs/JCDL07_final.pdf
  2. Jett, J.G. (2010). Supplementing OAI-PMH in the IMLS Digital Collections & Content Aggregation. Master's thesis. Available at: http://goo.gl/SaWPE
  3. Ayers, L., Camden, B. P., German, L., Johnson, P., Miller, C., and Smith-Yoshimura, K. (2009). What we’ve learned from the RLG partners metadata creation workflows survey. Retrieved March 2, 2009 from http://www.oclc.org/programs/publications/reports/2009-04.pdf
  4. Haslhofer, B., & Schandl, B. (2008). The OAI2LOD Server: Exposing OAI-PMH metadata as linked data. International Workshop on Linked Data on the Web (LDOW2008), co-located with WWW. Available at: http://events.linkeddata.org/ldow2008/papers/03-haslhofer-schandl-oai2lod-server.pdf
  5. Haslhofer, Bernhard and Schandl, Bernhard (2010) Interweaving OAI-PMH Data Sources with the Linked Data Cloud. Int. J. Metadata, Semantics and Ontologies, 1 (5). pp. 17-31. Available at: http://eprints.cs.univie.ac.at/73/1/ijmso2010_haslhofer_schandl.pdf

* Disclaimer: these opinions are my own and may not reflect official opinions of the project or my colleagues.

This post has been cross-posted on Inherent Vice.