Curation of LOD

These are the session notes (sketchy I’m afraid) for the discussion on curation of linked open data on day 1 of the 2013 LODLAM summit in Montreal.  There are multiple ways to look at curation, and that shows in the different slants brought into the mix – curation of the data that the agency or person has (and its state or fitness for reuse and supply) and curation of the data that it is desirable to link to (why, and what does that mean).  It is no surprise that questions of control and authority emerged, along with questions around reliance and co-contribution.  What is the perfect combination, and how long will those combinations of data complement each other?

Moules, frites, bière
CC-BY Ingrid Mason

 

The wording in (brackets) is mine from recall.  Please feel free to comment and correct me if I’ve misinterpreted the notes.

  • Who to link to? (whose data to link the data you have to)
  • Why link to them? (is there a working relationship, how much prior collaboration, does this matter?)
  • How good are they? (what is the quality of the LOD you want to use and its relevance to your data?)
  • Who to trust may change over time?
  • Multiple suppliers of data (what to choose?)
  • Ecosystem (developing and changing)
  • Engagement of the curator in the ecosystem
  • Mediator, editor and value add through curation (to the use of LOD)
  • Mappings between different ontologies not just controlled vocabularies
  • Identity – automated linking (issues?)
  • Is VIAF a big enough grid? c/- IFLA hosted by OCLC
  • Wide reliance in (North American) libraries, e.g. OCLC example (Australia has the NLA People Australia service and there is ORCID too)
  • Linking is curation!
  • Is shared curation possible?
  • Institutional support – local, national and global linkages (follow culture, history, economics, language, trade routes and politics and there will be links?)
  • Whose requirements are being met?
  • Who pays for curation?
  • Who or what is a curator (of LOD)?
  • Curating what? (is it the data and the meaning or the interfaces too and the user experience of search and discovery too?)
  • Persistent URIs exist as long as the web exists
  • Quid pro quo – get it (LOD) out quick to get it improved (co-contribution of correction or uptake for testing?)
  • !! Editorial decisions of the consuming organisation !! (of LOD) (this is curation?)
  • “publishing (LOD) with the authority of the institution” (surely this is curation?)
  • Some access is better than no access (is that always true?)
  • Data always links with a person (?) (multiple links to data sources provides diversity and useful redundancy?)
  • Open curation to the masses
  • Curation ups the quality but needs good processes to help with cleaning or correction
  • Pressure on public institutions to participate in the commons
  • There is a social dimension between the curator, the community and the LOD ecosystem
  • Can use redundancy (see it as an opportunity) to track errors, support consensus, and enable self-help
  • Unattributed assertions (how to manage these, whether to integrate these, or not to allow them?)
  • Bidirectional (is this always the case, you link to me, I link to you?)
  • Embrace messiness and get over control issues (provide notices where the data hasn’t been checked or gone through curation process?)
  • (Use LOD) to provide supplementary information (see BBC Music)
  • Encode linking and curation as LOD, use W3C PROV-O ontology for provenance
  • Social quality – link Geodata – use: ID, City, Picture, Depiction
  • Example: OpenStreetMap
  • Buddy up with citizen curator (akin to citizen scientists)
  • BBC Wildlife’s trust of Wikipedia content: it filled in the gaps
  • See: Connecting the Smithsonian American Art Museum to Linked Data Cloud (US artists)
  • Flavours of LOD from well maintained and quality controlled provenance data to anonymous
  • Issues around how you present your LOD
  • Consumers may trust organisations and may not always want to trace it (the LOD)
  • Attribution and usage (don’t conflate these two concepts for dealing with rights)
  • CC0 is “no rights reserved”, effectively releasing the work into the public domain, whereas CC-BY-NC is an acknowledgement of copyright and defines the nature of the use (as a licence), requiring attribution and non-commercial use
  • Note CC0 likely does not apply under Australian law and possibly also not New Zealand
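
To make the note above about encoding linking and curation as LOD with PROV-O a little more concrete, here is a minimal sketch in Python. It is my own illustration and not something shown in the session. It assumes the link is an owl:sameAs statement, that the statement is described using plain RDF reification (named graphs would be another reasonable choice), and that every example.org URI and the function name link_with_provenance are hypothetical. The PROV-O terms used (prov:Entity, prov:Agent, prov:wasAttributedTo, prov:generatedAtTime) come from the W3C vocabulary.

```python
from datetime import datetime, timezone

# Vocabulary namespaces (these are the published W3C namespaces).
PROV = "http://www.w3.org/ns/prov#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


def iri(value):
    """Wrap a URI in angle brackets, as N-Triples expects."""
    return f"<{value}>"


def link_with_provenance(link_uri, local_uri, remote_uri, curator_uri, when):
    """Return N-Triples lines for one curated link plus who asserted it and when.

    Hypothetical helper: 'when' is assumed to be a timezone-aware UTC datetime.
    """
    stamp = when.strftime("%Y-%m-%dT%H:%M:%SZ")
    triples = [
        # The link itself: local record and remote record describe the same thing.
        (iri(local_uri), iri(OWL_SAME_AS), iri(remote_uri)),
        # Describe that link as a reified statement so provenance can attach to it.
        (iri(link_uri), iri(RDF + "type"), iri(RDF + "Statement")),
        (iri(link_uri), iri(RDF + "subject"), iri(local_uri)),
        (iri(link_uri), iri(RDF + "predicate"), iri(OWL_SAME_AS)),
        (iri(link_uri), iri(RDF + "object"), iri(remote_uri)),
        # Provenance: the link is an entity attributed to a curator at a given time.
        (iri(link_uri), iri(RDF + "type"), iri(PROV + "Entity")),
        (iri(link_uri), iri(PROV + "wasAttributedTo"), iri(curator_uri)),
        (iri(link_uri), iri(PROV + "generatedAtTime"),
         f'"{stamp}"^^{iri(XSD_DATETIME)}'),
        (iri(curator_uri), iri(RDF + "type"), iri(PROV + "Agent")),
    ]
    return [f"{s} {p} {o} ." for s, p, o in triples]


if __name__ == "__main__":
    # All URIs and the date below are placeholders for illustration only.
    for line in link_with_provenance(
        "http://example.org/link/1",
        "http://example.org/person/123",
        "http://example.org/authority/456",
        "http://example.org/staff/curator",
        datetime(2013, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
    ):
        print(line)
```

Recording the curator and the time alongside each link is one way to handle the unattributed assertions question above: a consuming organisation could choose to ingest only links that carry an attribution it trusts.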

Making the Case for LOD

These are the session notes (rough I’m afraid) for the discussion on making the case for linked open data on day 1 of the 2013 LODLAM summit in Montreal.  At some point I’d really like to summarise these ideas better or maybe get to a point where it is possible to tell success stories and cautionary tales so that those interested in making or reusing LOD can pick up and expand on the precious work done thus far.

Gold leaf floating caught on the wind
CC-BY Ingrid Mason

The wording in (brackets) is mine from recall.  Please feel free to comment and correct me if I’ve misinterpreted the notes.

  • What are the pain points? (also who feels the pain)
  • Should the O in LOD be K for knowledge and have it rebadged? (perhaps LOD isn’t the terminology for everyone to understand what LOD can do)
  • Explain LOD so people understand it (keep it simple, smarty-pants)
  • Different elevator pitches to stakeholders to get support (headlines for execs perhaps and technical speak for techs?)
  • Internal use case (who will invest and put their support behind you in a LOD project in your organisation)
  • Public use case (who are the public stakeholders and are there any general or specific needs that could be filled with LOD)
  • Listening (to stakeholders, to others’ experience, etc)
  • Benefits? (work out what these are and who will value what you do)
  • Responsibility? (who leads this work and/or needs to be involved to make it a success)
  • Demystifying LOD for stakeholders (non-tech speak and maybe outcomes in lay terms)
  • Keep LOD ‘under the hood’ (see slide 80, ALIAOnline Practical Linked (Open) Data for Libraries, Archives and Museums, to see how the web view and the underlying linked data are presented)
  • Who for? (make sure it is clear who the audience is for the LOD project)
  • Why? (be clear about the goals for a LOD project)
  • What? (have a good think about what data to generate and integrate, and why)
  • Issues? Backlogs of wobbly data (this is very common and often underestimated, so perhaps including this in a LOD project outline ensures this doesn’t turn into a SNAFU)
  • Type of project – demo or BAU? (depends on how much traction with key supporters and how experimental a LOD project is)
  • Creative Commons (0), revenue risk (something to do with pressure around capacity to generate income if data isn’t CC0, which is valid in US but not Australia or NZ btw)
  • Focus on your own data – less risk and less cost
  • Example, BBC Music – point out (use other LOD)
  • Users – what are their drivers?
  • Find ways to communicate to them (the users) e.g. via discovery
  • Scale – take care with this – ecosystem grows
  • Metrics e.g. AustLit.edu.au  (to justify investment and uptake)
  • What legal or funding requirements need to be surmounted to enable the data to be released as LOD?
  • Upfront deal with rights and costs (sic and offer value or benefits)
  • Attribution – how to deal with this or ask for it
  • Galaxy Zoo and gamification of the classification of galaxies
  • Work acknowledgement (perhaps rather than at triple level, which seems quite insane)
  • Figshare as an example (of the strength of openness in support of scholarly communication)
  • Scholarly practice and new practices of tagging (as part of a LOD project?)
  • Some ideas based on experience with e-artexte by artexte (small non-profit)
  • Problem: (how to get moving and get support)
  • Agree to be a guinea pig (this is a perfect idea)
  • Find advocates in the community
  • Publishing and visibility (catalogues online via website) (LOD apparent in search interface too?)
  • Work with a partner (Concordia), extension of library service (piggy back)
  • Solution: (what they did)
  • Open access repository (see news release)
  • Lots of outreach (getting buy-in and engagement by long term partners and supporters)
  • Next steps: (building on success)
  • Research projects (taking on new ideas)
  • Success stories (these are needed for LOD projects that hit the spot!)
  • Ways to work with technophobes “helps me do something I already do” (solve a problem with LOD?)
  • Works for open data (Wikimedia), can work with linked open data
  • Who to convince? (what do you need: money, permission, technical partners, registrar time?)
  • Who to trust? (what and who are you relying on and have you relied on them before?)
  • How to manage the question of authority? (publish your own LOD because you created it and monitor that which you integrate or ingest externally)
  • Deliver to core user stories (don’t go off into the wilds unless you’ve been funded to)
  • Prototype stage (is this Agile, i.e. make sure if you have key stakeholders they’re fully engaged)
  • Keep (iterating and checking?)
  • Talk about enhancement of services (competition?)
  • Kickbacks, and feedback loops (look at how to make the most from what you have?)
  • Need to be able to demonstrate (keep the focus and make the scope small)
  • Social – embedding your knowledge (into the LOD?)
  • Embed LOD in the tools people are already using
  • Attach LOD and allow it to emerge by stealth (trickery)
  • We need to consolidate stories for each to use (write these up)
  • Use the design pattern library