Apologies for the brevity of this post; I want to make sure it goes up before LOD-LAM.
So, archival description.
Archival records are hard to find. They’re often in large bodies of records, difficult to browse through and generally less cut-and-dried than publications, which are intended for public consumption. Archival finding aids are the researcher’s traditional first point of contact, providing background biographical information on the organization and/or personal creator(s), as well as a description of how the records are arranged and of the various levels of the organizational hierarchy. They’re useful!
But they’re also a bit old-fashioned, at least as typically implemented. The finding aid structure poses a few issues for linked open data applications.
I see two[1] major problems with current archival description:
- They’re hierarchical
Most countries’ archival description standards are based on a strict hierarchy from higher levels of description (fonds, etc.) to more precise levels of description (series, sub-series, file, item) with fairly rigidly prescribed relationships between items. The finding aid also assumes a “paper” whole-body approach, rather than a linking approach. This is kind of non-webby, and imposes a stricter order on documents than their creators may have had, in many cases.
(The Australians, of course, are a few steps ahead of the rest of us already.)
Perhaps an even bigger problem, though, is that:
- They’re imprecise
This is the real issue, or at least the most immediate issue. Archival descriptions are designed for human eyes in a paper world, and so they’re often encoded with a level of ambiguity that’s difficult for machines to extract. (LOCAH has been doing a great job of identifying points of concern and trying to route around them.)
Archival descriptions have some inherent ambiguity because interpretation of archival holdings is not always cut and dried, but that doesn’t mean that we have to be ambiguous in how we create those descriptions. We can be precise about the ways in which our collections are ambiguous.
I’d love to get a conversation going about revising descriptive standards to enhance precision in finding aids in order to enhance the ability to use them as computer-readable metadata. I can see a number of areas for improvement:
- More strongly-typed data fields, rather than “fuzzy” fields that can hold a variety of types of subjectively-defined data
- More focus on “globally-scoped” names rather than “locally scoped” (as pointed out by Pete@LOCAH here)
- A stricter, clearer inheritance model rather than ISAD(G)’s rule of non-repetition (Thanks to Pete again)
- Certainly more, which we can talk about at LOD-LAM!
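To make the first three points a bit more concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions, not anything drawn from ISAD(G), EAD or LOCAH: the `Unit`, `DateRange` and `Certainty` names, the `resolve` method and the example.org URI are all hypothetical. It shows a typed date field that records its own uncertainty, a creator identified by a URI rather than a local string, and an explicit rule for how lower levels inherit values from higher ones.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Certainty(Enum):
    """How sure the archivist is about a date (hypothetical vocabulary)."""
    EXACT = "exact"
    APPROXIMATE = "approximate"  # what a finding aid might write as "ca. 1920"
    INFERRED = "inferred"        # what a finding aid might write as "[1920?]"


@dataclass(frozen=True)
class DateRange:
    """A typed date field: the ambiguity is recorded, not hidden in free text."""
    start_year: int
    end_year: int
    certainty: Certainty = Certainty.EXACT


@dataclass
class Unit:
    """One level of description (fonds, series, file, item)."""
    level: str
    title: str
    creator_uri: Optional[str] = None   # globally scoped name (assumed URI)
    dates: Optional[DateRange] = None
    parent: Optional["Unit"] = None

    def resolve(self, attr: str):
        """Return this unit's own value, else the nearest ancestor's value."""
        node = self
        while node is not None:
            value = getattr(node, attr)
            if value is not None:
                return value
            node = node.parent
        return None


# Hypothetical example data.
fonds = Unit("fonds", "Jane Doe fonds",
             creator_uri="http://example.org/agents/jane-doe",
             dates=DateRange(1900, 1950))
series = Unit("series", "Correspondence", parent=fonds)
letters = Unit("file", "Letters to family",
               dates=DateRange(1920, 1925, Certainty.APPROXIMATE),
               parent=series)

print(letters.resolve("creator_uri"))  # inherited from the fonds
print(letters.resolve("dates"))        # the file's own, approximate, dates
```

The point of `resolve` is that a machine reading the file-level record never has to guess whether a missing creator means "unknown" or "same as above": the inheritance rule is stated once and applied the same way everywhere.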
The extent to which all this can be implemented will depend on the organization, of course – retrofitting older archival descriptions for all of this would be time-consuming, if practical at all. But I think there are a lot of benefits to be gained by changing practices going forward, and I see this as an enhancement to current descriptive standards/practices that can benefit more than just linked open data applications.
[1] Probably more than two, but for now I’ll focus on these.
This is one of the major goals of the group that is currently authoring the next major revision of EAD. Unfortunately Mark Matienzo isn’t going to be at LOD-LAM (last I heard), so he can’t tell you more about what the EAD working group is up to. In developing EADitor, an XForms-based approach to creating really complex finding aids using next-gen web forms, one of my chief goals is controlled vocabulary service interactivity. So far users can tap into LCSH and GeoNames. I think it’s important to establish firm controlled vocabularies out of the box so that relationships between objects and collections can be made more easily, especially if you have a user interface that can use these terms as search facets. The open-endedness of @type in EAD can be a blessing or a curse, depending on usage. You’re definitely right about that.
Mark (@anarchivist) tweeted a link to this paper, which I think does a nice job of introducing some of the current problems with EAD and how OAI-ORE might address them.
Archival Description in OAI-ORE
http://journals.tdl.org/jodi/article/view/1814/1769
I agree that archival finding aids and referencing need some work. A concern is that some organizational procedures have not kept up and insist on referencing a document as ‘RG 45, Box 22, H’. This really means ‘look in Box 22 of Section RG 45 and open sub-folder H’ to the archivist, who expects another form with a free-hand description of the desired document so he knows what to look for.
Not that the current batch of OAI, Dublin Core and XML MARC are the answer to everything. This blog post [1] hammers the problem of implementing them, but how can we provide a means for people to reference specific objects within a collection using linked open data?
[1] http://reprog.wordpress.com/2010/09/02/bibliographic-data-part-1-marc-and-its-vile-progeny-2/
Ethan, that’s great news! I wasn’t aware that a new EAD was being worked on, and it sounds like that will help a great deal. That said, though, I’m not convinced that the technical structure alone is the solution to fixing description. It helps, but many of the problems (rigid hierarchical structure, tolerance for loosely typed data, reliance on unstructured data fields) come from the content standards too – ISAD(G), RAD, whatevs. I think we need to look at refining practice, or the standards, to ensure consistency in how the people inputting and creating finding aids work.
Richard, thanks! I saw him post the link on Twitter.