Why is the Communities Lab asking people to read old news?
Following Ruth’s post introducing the various ‘Labs’ within the project, this is a quick summary of the first outcomes the Communities Lab is working towards. In keeping with our interest in integrating public engagement with research activities, one of our first outcomes will be a crowdsourcing project to collect historical newspaper articles about industrial (or workplace) accidents. Crowdsourcing in the context of cultural heritage means asking the public to help with tasks that contribute to a shared, significant goal related to cultural heritage collections or knowledge. As a form of online volunteering, the activities and/or goals should be inherently rewarding for participants. Crowdsourcing is based on the idea that humans are (still) better than machines at some tasks.
Our initial task will have two parts. The first will be to look at an article (selected with keyword searches) and classify it as being about an industrial accident, some other kind of accident (domestic, transport-related) or not about an accident at all. The second, for articles where an accident is described, will ask participants for additional information about any named individuals, organisations, locations, machinery etc., and for a summary of the accident. These tasks can’t currently be done reliably with software, so we’re relying on people’s ability to categorise and summarise information.
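For readers who think in code, here is a minimal sketch of how a single volunteer response to the two-part task might be recorded. It is an illustration only, not our actual data model: the names `Category`, `AccidentDetails`, `Response` and `needs_second_part` are all hypothetical, and we are assuming that the second part applies to any article describing an accident, industrial or otherwise.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Category(Enum):
    """The three answers to the first part of the task (labels are assumed)."""
    INDUSTRIAL_ACCIDENT = "industrial accident"
    OTHER_ACCIDENT = "other accident"  # e.g. domestic, transport-related
    NOT_AN_ACCIDENT = "not an accident"


@dataclass
class AccidentDetails:
    """Second part: extra information, only for articles describing an accident."""
    people: List[str] = field(default_factory=list)
    organisations: List[str] = field(default_factory=list)
    locations: List[str] = field(default_factory=list)
    machinery: List[str] = field(default_factory=list)
    summary: str = ""


@dataclass
class Response:
    """One volunteer's answer for one article (hypothetical structure)."""
    article_id: str
    category: Category
    details: Optional[AccidentDetails] = None

    def __post_init__(self) -> None:
        # Details make no sense for an article that is not about an accident.
        if self.category is Category.NOT_AN_ACCIDENT and self.details is not None:
            raise ValueError("details only apply to articles describing an accident")


def needs_second_part(category: Category) -> bool:
    """Assumption: any kind of accident triggers the follow-up questions."""
    return category is not Category.NOT_AN_ACCIDENT
```

Keeping the category separate from the optional details mirrors the two parts of the task: every article gets a classification, and only some go on to the more detailed questions.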
The task will be open to any member of the public, and we’re hoping that we’ll find some people keen to help us understand the impact of mechanisation as represented in news stories about accidents and injuries. Genealogists and historians used to the distracting delights of local newspapers should enjoy having a great excuse for spending time looking at old newspapers, and we hope that it’ll give people new to 19th-century history some first-hand experience with our research topics.
We’re interested to learn whether the data created shows any trends in descriptions of accidents over different periods and locations. It will also support various other Labs in their research. For example, we could work with 3I to develop machine learning models to find other articles about accidents in the wider corpus of study, based on the initial dataset. We could work with Languages on their questions about the ‘agency’ of machines, with Sources on questions about the ways in which different types of newspapers talked about accidents, and Space and Time on tracing the lives of people, families and communities mentioned in articles about accidents.
We’ll work on questions like: how do we design human computation interfaces that support sustained public engagement with historical collections? How much information do participants need to successfully understand the task and feel confident of their ability to undertake it? What forms of feedback or visualisations help motivate contributors? Can the data created be used to train machine learning models to find more articles about accidents? How much randomness or serendipity is required to keep a task engaging rather than boring?
In later stages, we want to integrate machine learning into our crowdsourcing activities as a step towards ‘human computation’ systems. There’s a lot of work on this in the commercial sector, but we want to build systems that encourage curiosity and deeper engagement with collections – values important to us at the British Library – alongside more efficient interfaces and workflows that allow us to scale up our enquiry. Ultimately, we want to produce recommendations on best practice for integrating human computation with academic research questions based on complex digitised collections.
Update, October 2019: our beta prototype is live! Have a go at Zooniverse Living with Machines and let us know what you think.
Want to find out more, or get in line to take part? Sign up to our newsletter for updates.