Crowdsourcing LOD-LAM – LODLAM Summit 2011

The ‘linked’ part of LOD can eat up a lot of time and resources. Machine processing might be able to match up places and subjects to established LOD sources, but disambiguating people and events can be trickier. Crowdsourcing may be one way in which these more complex or subtle relationships can be defined. It’s a topic that is perhaps worth some time at LOD-LAM.

It seems to me that there are at least 4 ways in which crowdsourced data might enrich the LOD offerings of LAMs…

Specifically designed projects
There are lots of examples now of projects created by institutions to enlist the public in extracting structured data from unstructured sources, or adding to existing descriptive data relating to objects, photographs or documents. There are fewer that seek to use crowdsourcing to define relationships between items, or between items and other entities. Any examples?

Machine tags
You can allow users to add semantics to the tags they add to collection items, creating machine tags (or triple tags). This allows people to refer to standard vocabularies in their tags, or define relationships with entities outside of your collection database. Flickr, for example, supports machine tags (you can browse them all here).
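As a rough illustration of what a machine tag looks like in practice, here is a minimal Python sketch that builds and parses tags of the form namespace:predicate=value. The pattern used for the namespace and predicate is a simplifying assumption rather than Flickr's exact rule, and the function names and the example.org URI are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class MachineTag(NamedTuple):
    namespace: str
    predicate: str
    value: str


# Assumed pattern: namespace and predicate start with a letter and contain
# only letters, digits and underscores. The value is everything after '='.
_MACHINE_TAG = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):([A-Za-z][A-Za-z0-9_]*)=(.+)$")


def format_machine_tag(namespace: str, predicate: str, value: str) -> str:
    """Build a namespace:predicate=value string."""
    return f"{namespace}:{predicate}={value}"


def parse_machine_tag(tag: str) -> Optional[MachineTag]:
    """Return the three parts of a machine tag, or None for an ordinary tag."""
    match = _MACHINE_TAG.match(tag.strip())
    if match is None:
        return None
    return MachineTag(*match.groups())


# Hypothetical example: link a photo to a person record using a Dublin Core term.
tag = format_machine_tag("dc", "creator", "http://example.org/people/1")
parsed = parse_machine_tag(tag)
```

An ordinary tag such as "sunset" returns None, so the same function can be used to separate machine tags from plain ones.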

Machine tags are of course meant to be read by machines, so they’re not all that human friendly. If you wanted to encourage their use you’d probably want to create tools that simplified their construction, and perhaps some feedback mechanism to demonstrate their significance. That’s basically what I was experimenting with in the Flickr Machine Tag Challenge. People can generate machine tags automatically using my Identity Browser (based on People Australia), add them to Flickr, and keep track of their work via the FMTC scoreboard. Similarly I created a simple tool for generating machine tags from the NLA’s newspapers database.

Distributed linking
One of the good things about Linked Data is that it’s linked! There’s no reason why all the activity has to happen on institutional websites. It may be that the best way of enmeshing your collection in the cloud is to provide clear, persistent URIs and to help people and projects that use your stuff to publish their own research as LOD. Make it a co-operative endeavour rather than a ‘come to our site and help us’ project. This is what I have in mind for Invisible Australians, but we haven’t got very far yet.

Meta linking (I need some better names for these categories)
I’m sure there’s already something like this, but I can’t think of any examples right now (it’s late!). In wondering about where to go with the FMTC, I started thinking about a meta-level biographical linker, which would allow people to define and publish relationships between resources about people on the web. Sort of a semantic bookmarker, RDFa generator, biographical register… Perhaps using tools like LORE or even Zotero.

The point being, of course, that the links or annotations can exist completely separately from the resources they’re describing.
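To make that concrete, here is a small Python sketch of a link that lives on its own, outside either of the resources it connects, and can be published as a line of N-Triples. It is only an illustration under stated assumptions: the Link class, the to_ntriples function and the example.org and example.com URIs are all hypothetical, and owl:sameAs is just one predicate that could be used to say two resources describe the same person.

```python
from dataclasses import dataclass

# A real predicate from the OWL vocabulary; other vocabularies would work too.
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


@dataclass(frozen=True)
class Link:
    """A relationship between two web resources, stored apart from both."""
    subject: str
    predicate: str
    obj: str


def to_ntriples(link: Link) -> str:
    """Serialise one link as a single N-Triples statement (URIs only)."""
    return f"<{link.subject}> <{link.predicate}> <{link.obj}> ."


# Hypothetical URIs standing in for two records about the same person.
link = Link("http://example.org/people/1", SAME_AS, "http://example.com/bio/42")
line = to_ntriples(link)
```

Neither of the two sites needs to change anything for the statement to be published and harvested by someone else.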

Anyway, I know there are people coming to LOD-LAM with much more experience than me on the crowdsourcing front, so I’d be really interested in having a discussion along these sorts of lines.

3 thoughts on “Crowdsourcing LOD-LAM”

Hi Tim

Looks like a great session. We’ve been thinking a little about crowdsourcing to match names and subjects in our Locah project, but we haven’t really progressed it. I’ll be interested in discussing this more. Ade

I’m really excited about this topic!

re machine tags – as with many things, I think it’s easier to convince people to invest in them when they can see the results.

I worked with curators at the Science Museum to add machine tags containing museum accession numbers to images on Flickr and posts on the ‘Stories from the Stores’ blog (http://sciencemuseumdiscovery.com/blogs/collections/). The idea was to use the machine tags to pull in posts and images from WordPress and Flickr onto the main museum collections site. The curators didn’t have any trouble working out machine tags, but I suspect they’ve stopped doing it now, as we never had the resources to make a widget that used the tags, so they never got to see how the tags created links between content in different systems.

And I would say this, but crowdsourcing games are a great way to add various sorts of structured and unstructured data to records, including links between resources and categories grouping resources.

I find this fascinating too – am especially interested in the possible applications in and for school-age audiences (& their teachers).

Harnessing the natural drive many children have to classify, share, describe and question, plus playful, gaming-led and competition drivers, could be a fantastic way to engage kids’ and teens’ interest. And to boost ‘discoverability’ of collections for this under-served audience, not to mention providing us and their teachers with insights into the way children see, interpret and learn about their cultural heritage. It would also be cross-curricular, skills- and knowledge-based, ticking lots of boxes for those aforementioned teachers.

Anyway, back to organising, see you all on Thursday…
